
The Cambridge Encyclopedia of the Language Sciences

Have you lost track of developments in generative linguistics, finding yourself unsure about
the distinctive features of Minimalism? Would you like to know more about recent advances in
the genetics of language, or about right hemisphere linguistic operation? Has your interest in
narrative drawn you to question the relation between stories and grammars? The Cambridge
Encyclopedia of the Language Sciences addresses these issues, along with hundreds of others.
It includes basic entries for those unfamiliar with a given topic and more specific entries for
those seeking more specialized knowledge. It incorporates both well-established findings and
cutting-edge research as well as classical approaches and new theoretical innovations. The volume is aimed at readers who have an interest in some aspect of language science but wish to
learn more about the broad range of ideas, findings, practices, and prospects that constitute
this rapidly expanding field, a field arguably at the center of current research on the human
mind and human society.
Patrick Colm Hogan is a professor in the Department of English and the Program in Cognitive
Science at the University of Connecticut. He is the author of ten books, including Cognitive
Science, Literature, and the Arts: A Guide for Humanists and The Mind and Its Stories: Narrative
Universals and Human Emotion (Cambridge University Press, 2003).

Advance Praise for


The Cambridge Encyclopedia of the Language Sciences

For both range and depth of exposition and commentary on the diverse disciplinary angles
that exist on the nature of language, there is no single volume to match this fine work of
reference.
Akeel Bilgrami, Columbia University
The Cambridge Encyclopedia of the Language Sciences is a very welcome addition to the field
of language sciences. Its comprehensiveness is praiseworthy, as is the quality of its entries and
discussions.
Seymour Chatman, University of California, Berkeley
This ambitious and comprehensive work, and the very high quality of the editors and contributors, ensure that it will be a valuable contribution to the understanding of language and
its uses, for both professionals and a more general audience.
Noam Chomsky, Massachusetts Institute of Technology

THE CAMBRIDGE ENCYCLOPEDIA OF

THE LANGUAGE SCIENCES


Edited by
PATRICK COLM HOGAN
University of Connecticut

CAMBRIDGE UNIVERSITY PRESS


Cambridge, New York, Melbourne, Madrid, Cape Town, Singapore,
São Paulo, Delhi, Dubai, Tokyo, Mexico City
Cambridge University Press
32 Avenue of the Americas, New York, NY 10013-2473, USA
www.cambridge.org
Information on this title: www.cambridge.org/9780521866897
© Cambridge University Press 2011
This publication is in copyright. Subject to statutory exception
and to the provisions of relevant collective licensing agreements,
no reproduction of any part may take place without the written
permission of Cambridge University Press.
First published 2011
Printed in the United States of America
A catalog record for this publication is available from the British Library.
Library of Congress Cataloging in Publication data
The Cambridge encyclopedia of the language sciences / edited by Patrick Colm Hogan.
p. cm.
Includes bibliographical references and index.
ISBN 978-0-521-86689-7 (hardback)
1. Linguistics – Encyclopedias. I. Hogan, Patrick Colm. II. Title.
P29.C28 2009
410.3–dc22
2008041978
ISBN 978-0-521-86689-7 Hardback
Cambridge University Press has no responsibility for the persistence or
accuracy of URLs for external or third-party Internet Web sites referred to in
this publication and does not guarantee that any content on such Web sites is,
or will remain, accurate or appropriate.

GENERAL EDITOR
Patrick Colm Hogan
University of Connecticut, Storrs

ADVISORY EDITORIAL BOARD


Florian Coulmas
German Institute of Japanese Studies
and Duisburg-Essen University

Barbara Lust
Cornell University

William Croft
University of New Mexico

Lee Osterhout
University of Washington

Lyle Jenkins
Biolinguistics Institute

James Pustejovsky
Brandeis University

CONSULTING EDITORIAL BOARD


Mark Baker
Rutgers University

Howard Lasnik
University of Maryland

Deborah Cameron
University of Oxford

Loraine Obler
City University of New York

Nigel Fabb
University of Strathclyde

William O'Grady
University of Hawaii

Carol Ann Fowler
Haskins Laboratories
and University of Connecticut

Susan Pintzuk
University of York

Ronald Geluykens
University of Oldenburg

Eleanor Rosch
University of California, Berkeley

Margaret Harris
Oxford Brookes University

Jay Rueckl
University of Connecticut

Zoltán Kövecses
Eötvös Loránd University

Mark Turner
Case Western Reserve University

To the memory of B. N. Pandit (1916–2007)


philosopher, Sanskritist, father-in-law
Puruṣa-artha-śūnyānāṁ guṇānāṁ pratiprasavaḥ kaivalyaṁ
sva-rūpa-pratiṣṭhā vā citi-śaktir-iti
Patañjali

CONTENTS

List of Entries  page xi
A Note on Cross-References and the Alphabetization of the Entries  xv
Preface: On the Very Idea of Language Sciences  xvii
Acknowledgments  xxiii

Language Structure in Its Human Context: New Directions for the Language Sciences in the Twenty-First Century
William Croft

The Psychology of Linguistic Form  12
Lee Osterhout, Richard A. Wright, and Mark D. Allen

The Structure of Meaning  23
James Pustejovsky

Social Practices of Speech and Writing  35
Florian Coulmas

Explaining Language: Neuroscience, Genetics, and Evolution  46
Lyle Jenkins

Acquisition of Language  56
Barbara Lust

Elaborating Speech and Writing: Verbal Art  65
Patrick Colm Hogan

ENTRIES  77

List of Contributors  941
Index  953

ENTRIES

A
Abduction
Absolute and Statistical Universals
Accessibility Hierarchy
Acoustic Phonetics
Adaptation
Ad Hoc Categories
Adjacency Pair
Age Groups
Aging and Language
Agreement
Agreement Maximization
Alliteration
Ambiguity
Amygdala
Analogy
Analogy: Synchronic and Diachronic
Analyticity
Anaphora
Animacy
Animal Communication and Human Language
Aphasia
Areal Distinctness and Literature
Art, Languages of
Articulatory Phonetics
Artificial Languages
Aspect
Auditory Processing
Autism and Language
Autonomy of Syntax
B
Babbling
Basal Ganglia
Basic Level Concepts
Bilingual Education
Bilingualism, Neurobiology of
Bilingualism and Multilingualism
Binding

Biolinguistics
Birdsong and Human Language
Blended Space
Blindness and Language
Bounding
Brain and Language
Broca's Area
C
Cartesian Linguistics
Case
Categorial Grammar
Categorization
Causative Constructions
C-Command
Cerebellum
Charity, Principle of
Children's Grammatical Errors
Chirographic Culture
Clitics and Cliticization
Codeswitching
Cognitive Architecture
Cognitive Grammar
Cognitive Linguistics and Language Learning
Cognitive Linguistics, Language Science, and Metatheory
Cognitive Poetics
Coherence, Discourse
Coherence, Logical
Colonialism and Language
Color Classification
Communication
Communication, Prelinguistic
Communicative Action
Communicative Intention
Comparative Method
Competence
Competence and Performance, Literary
Compositionality

Computational Linguistics
Concepts
Conceptual Blending
Conceptual Development and Change
Conceptual Metaphor
Conduit Metaphor
Connectionism and Grammar
Connectionism, Language Science, and Meaning
Connectionist Models, Language Structure, and Representation
Consciousness and Language
Consistency, Truth, and Paradox
Constituent Structure
Constraints in Language Acquisition
Construction Grammars
Contact, Language
Context and Co-Text
Control Structures
Conversational Implicature
Conversational Repair
Conversation Analysis
Cooperative Principle
Core and Periphery
Corpus Callosum
Corpus Linguistics
Creativity in Language Use
Creoles
Critical Discourse Analysis
Critical Periods
Culture and Language
Cycle, The

D
Deconstruction
Definite Descriptions
Deixis
Descriptive, Observational, and Explanatory Adequacy
Dhvani and Rasa

Dialect
Dialogism and Heteroglossia
Diffusion
Digital Media
Diglossia
Discourse Analysis (Foucaultian)
Discourse Analysis (Linguistic)
Discrete Infinity
Disorders of Reading and Writing
Division of Linguistic Labor
Dyslexia

E
Ellipsis
Embodiment
Emergentism
Emergent Structure
Emotion and Language
Emotion, Speech, and Writing
Emotion Words
Emplotment
Encoding
Énoncé/Statement (Foucault)
Essentialism and Meaning
Ethics and Language
Ethnolinguistic Identity
Event Structure and Grammar
Evidentiality
Evolutionary Psychology
Exemplar
Exemplar Theory
Extinction of Languages

F
Family Resemblance
Feature Analysis
Felicity Conditions
Field (Bourdieu)
Film and Language
Filters
Focus
Foregrounding
Forensic Linguistics
Formal Semantics
Forms of Life
Frame Semantics
Framing Effects
Frontal Lobe
Functional Linguistics

G
Games and Language
Gender and Language
Gender Marking
Generative Grammar
Generative Poetics
Generative Semantics
Generic- and Specific-Level Metaphors


Genes and Language


Gesture
Government and Binding
Grammaticality
Grammaticality Judgments
Grammaticalization
Grooming, Gossip, and Language

H
Habitus, Linguistic
Head-Driven Phrase Structure Grammar
Hippocampus
Historical Linguistics
Historical Reconstruction
Holophrastic Stage, The
Homologies and Transformation Sets

I
Icon, Index, and Symbol
Ideal Speech Situation
Identity, Language and
Ideology and Language
Idioms
Idle Talk and Authenticity
Ijtihād (Interpretive Effort)
I-Language and E-Language
Illocutionary Force and Sentence Types
Image Schema
Implicational Universals
Indeterminacy of Translation
Indexicals
Inequality, Linguistic and Communicative
Infantile Responses to Language
Information Structure in Discourse
Information Theory
Innateness and Innatism
Integrational Linguistics
Intension and Extension
Intentionality
Internal Reconstruction
Interpretation and Explanation
Interpretive Community
Intertextuality
Intonation
Irony

L
Language, Natural and Symbolic
Language Acquisition Device (LAD)
Language Change, Universals of
Language Families
Language-Game
Language-Learning Environment
Language of Thought
Language Policy
Laws of Language
Learnability

Left Hemisphere Language Processing


Legal Interpretation
Lexical Acquisition
Lexical-Functional Grammar
Lexical Learning Hypothesis
Lexical Processing, Neurobiology of
Lexical Relations
Lexical Semantics
Lexicography
Linguistic Relativism
Literacy
Literariness
Literary Character and Character Types
Literary Universals
Literature, Empirical Study of
Logic and Language
Logical Form
Logical Positivism

M
Mapping
Markedness
Market, Linguistic
Marxism and Language
Meaning and Belief
Meaning and Stipulation
Meaning Externalism and Internalism
Media of Communication
Memes and Language
Memory and Language
Mental Models and Language
Mental Space
Merge
Metalanguage
Metaphor
Metaphor, Acquisition of
Metaphor, Information Transfer in
Metaphor, Neural Substrates of
Metaphor, Universals of
Meter
Methodological Solipsism
Methodology
Metonymy
Minimalism
Mirror Systems, Imitation, and Language
Modality
Modern World-System, Language and the
Modularity
Montague Grammar
Mood
Morpheme
Morphological Change
Morphological Typology
Morphology
Morphology, Acquisition of
Morphology, Evolution and

Morphology, Neurobiology of
Morphology, Universals of
Motif
Movement
Music, Language and

N
Narrative, Grammar and
Narrative, Neurobiology of
Narrative, Scientific Approaches to
Narrative Universals
Narratives of Personal Experience
Narratology
Nationalism and Language
Natural Kind Terms
Necessary and Sufficient Conditions
Negation and Negative Polarity
Network Theory
Neurochemistry and Language
Neuroimaging
Number

O
Occipital Lobe
Optimality Theory
Oral Composition
Oral Culture
Ordinary Language Philosophy
Origins of Language
Overregularizations

P
Parable
Paralanguage
Parameters
Parietal Lobe
Parsing, Human
Parsing, Machine
Passing Theories
Performance
Performative and Constative
Perisylvian Cortex
Perlocution
Person
Philology and Hermeneutics
Phoneme
Phonetics
Phonetics and Phonology, Neurobiology of
Phonological Awareness
Phonology
Phonology, Acquisition of
Phonology, Evolution of
Phonology, Universals of
Phrase Structure
Pidgins
Pitch
Poetic Form, Universals of

Poetic Language, Neurobiology of


Poetic Metaphor
Poetics
Point of View
Politeness
Politics of Language
Possible Worlds Semantics
Possible Worlds Semantics and Fiction
Pragmatic Competence, Acquisition of
Pragmatics
Pragmatics, Evolution and
Pragmatics, Neuroscience of
Pragmatics, Universals in
Pragmatism and Language
Predicate and Argument
Preference Rules
Prestige
Presupposition
Primate Vocal Communication
Priming, Semantic
Principles and Parameters Theory
Principles and Parameters Theory and Language Acquisition
Print Culture
Private Language
Projectibility of Predicates
Projection (Blending Theory)
Projection Principle
Proposition
Propositional Attitudes
Prototypes
Proverbs
Psychoanalysis and Language
Psycholinguistics
Psychonarratology
Psychophysics of Speech

Q
Qualia Roles
Quantification
Quantitative Linguistics

R
Radical Interpretation
Reading
Realization Structure
Rectification of Names (Zheng Ming)
Recursion, Iteration, and Metarepresentation
Reference and Extension
Reference Tracking
Register
Regularization
Relevance Theory
Religion and Language
Representations
Rhetoric and Persuasion
Rhyme and Assonance

Rhythm
Right Hemisphere Language Processing
Role and Reference Grammar
Rule-Following

S
Schema
Scripts
Second Language Acquisition
Self-Concept
Self-Organizing Systems
Semantic Change
Semantic Fields
Semantic Memory
Semantic Primitives (Primes)
Semantics
Semantics, Acquisition of
Semantics, Evolution and
Semantics, Neurobiology of
Semantics, Universals of
Semantics-Phonology Interface
Semantics-Pragmatics Interaction
Semiotics
Sense and Reference
Sentence
Sentence Meaning
Sexuality and Language
Signed Languages, Neurobiology of
Sign Language, Acquisition of
Sign Languages
Sinewave Synthesis
Situation Semantics
Socially Distributed Cognition
Sociolinguistics
Source and Target
Specific Language Impairment
Speech-Acts
Speech Anatomy, Evolution of
Speech-Language Pathology
Speech Perception
Speech Perception in Infants
Speech Production
Spelling
Spreading Activation
Standardization
Standard Theory and Extended Standard Theory
Stereotypes
Story and Discourse
Story Grammar
Story Schemas, Scripts, and Prototypes
Stress
Structuralism
Stylistics
Stylometrics
Subjacency Principle
Suggestion Structure
Syllable

Synchrony and Diachrony
Syntactic Change
Syntax
Syntax, Acquisition of
Syntax, Evolution of
Syntax, Neurobiology of
Syntax, Universals of
Syntax-Phonology Interface

T
Teaching Language
Teaching Reading
Teaching Writing
Temporal Lobe
Tense
Text
Text Linguistics
Thalamus
Thematic Roles
Theory of Mind and Language Acquisition
Tone
Topicalization
Topic and Comment
Traces

Transformational Grammar
Translation
Truth
Truth Conditional Semantics
Two-Word Stage
Typology

U
Underlying Structure and Surface Structure
Universal Grammar
Universal Pragmatics
Universals, Nongenetic
Usage-Based Theory
Use and Mention

V
Vagueness
Verbal Art, Evolution and
Verbal Art, Neuropsychology of
Verbal Display
Verbal Humor
Verbal Humor, Development of
Verbal Humor, Neurobiology of
Verbal Reasoning
Verbal Reasoning, Development of
Verifiability Criterion
Verse Line
Voice
Voice Interaction Design

W
Wernicke's Area
Word Classes (Parts of Speech)
Word Meaning
Word Order
Word Recognition, Auditory
Word Recognition, Visual
Words
Working Memory and Language Processing
Writing, Origin and History of
Writing and Reading, Acquisition of
Writing and Reading, Neurobiology of
Writing Systems

X
X-Bar Theory

A NOTE ON CROSS-REFERENCES AND THE ALPHABETIZATION OF THE ENTRIES

Cross-references are signaled by small capitals (boldface when implicit). They are designed
to indicate the general relevance of the cross-referenced entry and do not necessarily imply
that the entries support one another. Note that the phrasing of the cross-references does not
always match the entry headings precisely. In order to minimize the disruption of reading,
entries often use shortened forms of the entry headings for cross-references. For example, "this
process involves parietal structures" points to the entry "Parietal Lobe." In some cases, a
cross-reference may refer to a set of entries. For example, "architectures of this sort are found
in connectionism" alerts the reader to the presence of entries on connectionism generally,
rather than to a single entry. Finally, a cross-reference may present a heading in a different
word order. For example, the target entry for "here we see another universal of phonology" would be listed as "Phonology, Universals of."
In general, entries with multiword headings are alphabetized under their main language
term. Thus, the entry for "Universals of Phonology" is listed as "Phonology, Universals of." The
main exceptions to this involve the words "language" and "linguistic" or "linguistics," where another
term in the heading seemed more informative or distinctive in the context of language sciences
(e.g., "Linguistic Market" is listed as "Market, Linguistic").


PREFACE: ON THE VERY IDEA OF LANGUAGE SCIENCES

A title referring to language sciences tacitly raises at least three


questions. First, what is a science? Second, what is language?
Finally, what is a language science? I cannot propose answers
to these questions in a short preface. Moreover, it would not be
appropriate to give answers here. The questions form a sort of
background to the essays and entries in the following pages,
essays and entries that often differ in their (explicit or implicit)
answers. However, a preface of this sort can and should
indicate the general ideas about science and language that
governed the development of The Cambridge Encyclopedia of
the Language Sciences.

WHAT IS SCIENCE?
Philosophers of science have often been concerned to define a
demarcation criterion, separating science from nonscience. I
have not found any single criterion, or any combination of criteria, compelling in the sense that I have not found any argument that, to my satisfaction, successfully provides necessary
and sufficient conditions for what constitutes a science.
In many ways, one's acceptance of a demarcation criterion is
guided by what one already considers to be a science. More
exactly, one's formulation of a demarcation criterion tends to
be a function of what one takes to be a paradigmatic science or,
in some cases, an exemplary case of scientific practice.
The advocates of strict demarcation criteria meet their mirror opposites in writers who assert the social construction of
science, writers who maintain that the difference between
science and nonscience is simply the difference between distinct positions within institutions, distinct relations to power.
Suppose we say that one discipline or theory is a science and
another is not. This is just to say that the former is treated as
science, while the latter is not. The former is given authority in
academic departments, in relevant institutions (e.g., banks, in
the case of economics), and so on.
Again, this is not the place for a treatise on the philosophy
of science. Here it is enough to note that I believe both sides
are partially correct and partially incorrect. First, as already
noted, I do not believe that there is a strict, definitive demarcation criterion for science. However, I do believe that there is

a complex of principles that roughly define scientific method.


These principles do not apply in the same way to chemical
interactions and group relations and that is one reason why
narrow demarcation criteria fail. However, they are the same
general principles across different domains. Very simply,
scientific method involves inter alia the following practices:
1) the systematic study of empirically ascertainable patterns
in a given area of research; 2) the formulation of general principles that govern those patterns; 3) the attempt to uncover
cases where these principles do not govern observed patterns; 4) the attempt to eliminate gaps, vagueness, ambiguity,
and the like from ones principles and from the sequences of
principles and data that produce particular explanations; and
5) the attempt to increase the simplicity of ones principles
and particular explanations. Discourses are scientific to the
extent that they routinely involve these and related practices.
Note that none of this requires, for example, strict falsification or detailed prediction. For example, social phenomena
are most often too complex to allow for significant prediction,
in part because one cannot gather all the relevant data beforehand. This does not mean that they are closed to systematic
explanations after the fact, as more data become available.
Of course, following such methodological guidelines is not
all there is to the actual practice of science. There are always
multiple options for formulating general principles that fit
the current data. The evaluation of simplicity is never entirely
straightforward. Theories almost invariably encounter anomalous data in some areas and fail to examine other areas.
Moreover, in many cases, the very status of the data is unclear.
Despite all this, we hierarchize theories. We teach some and do
not teach others. Agencies fund some and do not fund others.
The very nature of the enterprise indicates that even in ideal circumstances, this cannot be purely meritocratic. Moreover, real
circumstances are far from ideal. Thus, in the real world, adherence to methodological principles may be very limited (see, for
example, Faust 1984, Mahoney 1977, and Peters and Ceci 1982).
This is where social constructionism enters. It seems undeniable that relations of institutional power, the political economy
of professions, and ideologies of nation or gender guide what is
institutionalized, valued, funded, and so forth.

In putting together a volume on science, then, I have tried
to incorporate the insights of both the more positive views
of science and the more social constructionist views. Put in
a way that may seem paradoxical, I have tried to include all
approaches that fit the loose criteria for science just mentioned.
I believe that these loose criteria apply not only to paradigmatic
sciences themselves but also to many social critiques of science that stress social construction. I have therefore included a
wide range of what are sometimes called the "human sciences."
Indeed, the volume could be understood as encompassing the
language-relevant part of the human sciences, which leads to
our second question.


WHAT IS LANGUAGE?
Like science, one's definition of language depends to a great
extent on just what the word calls to mind. One's view of language is likely to vary if one has in mind syntax or semantics, hearers or speakers, dialogues or diaries, brain damage
or propaganda, storytelling or acoustic phonetics. A first
impulse may be to see one view of language as correct and the
others as false. And, of course, some views are false. However,
I believe that our understanding of language can and, indeed,
should sustain a great deal of pluralism.
In many ways, my own paradigm for human sciences is cognitive science. Cognitive science brings together work from a
remarkable array of disciplines, literally from Anthropology
to Zoology. Moreover, it sustains a range of cognitive architectures, as well as a range of theories within those architectures. Thus, it is almost by its very nature pluralistic. Of course,
some writers wish to constrain this pluralism, insisting that one
architecture is right and the others are wrong. Certainly, one
can argue that particular architectures are wrong. However,
perhaps the most noteworthy aspect of cognitive science is
that it sustains a series of types of cognitive architecture. In
Cognitive Science, Literature, and the Arts (2003), I argued that
these types capture patterns at different levels of analysis. Thus,
all are scientifically valuable.
More exactly, we may distinguish three levels of cognitive
analysis: bodies, minds, and groups or societies. These levels stand in a hierarchical relation such that bodies are more
explanatorily basic than minds, and minds are more explanatorily basic than groups or societies. Lower levels provide
causally necessary principles for higher levels. Minds do not
operate without brains. People without minds do not interact
in groups. In other words, lower levels explain higher levels.
However, higher-level patterns provide interpretive principles
for understanding lower levels (see interpretation and
explanation). We explain the (mental) feeling of fear by reference to the (bodily) operation of the amygdala. But, at the
same time, we understand amygdala activity as fear because
we interpret that activity in terms of the mental level.
In the analysis of cognition, the most basic, bodily cognitive architecture is provided by neurobiology. However, due
to the intricate particularity of neurobiology, we often draw
on more abstract associative models at this level. These models serve to make the isolation and explanation of patterns
less computationally complex and individually variable. The
most important architectures of the latter sort are found in
connectionism.

Figure 1. Levels of cognitive analysis: Society (group dynamics, individual interactions), Mind (mental representations, intentions), and Body (associative networks, brains). Between the levels, black arrows represent the direction of explanation, while hollow arrows represent the direction of interpretation. Within the levels, the superior items are more computationally tractable or algorithmically specifiable models of the inferior items, either singly (in the case of brains and intentions) or collectively (in the case of individual interactions). Tractability may be produced by simplification (as in the case of bodily architectures), by systematic objectification (as in the case of mental architectures), or by statistical abstraction (as in the case of social analysis).
As a wide range of writers have stressed, the distinctive feature of mind, our second level of analysis, is intentionality. However, intentionality, as subjective and experiential, is
often not well suited for scientific study. Many theorists have
therefore sought to systematize and objectify our understanding of mind. Most cognitive treatments of the mental level
have their roots in folk psychology, a minimal, commonsense objectification of intention in terms of beliefs and aims.
But these cognitive treatments draw on empirical research
and principles of scientific method to develop models of the
human mind that are sometimes very far removed from folk
psychology. Specifically, they most often replace belief by
mental representations and algorithmically specifiable
operations on those representations. We may therefore refer to
these models as representational. Representationalism serves
to make intention more tractable through a mentalistic architecture that is precisely articulated in its structures, processes,
and contents.
Finally, our treatment of societies may be loosely divided
into the more intentional or mental pole of individual interaction and the less subjective, more broadly statistical pole of
group dynamics. (See Figure 1.)
These divisions apply to language no less than they apply
to other areas of human science. We draw on our representational account of syntax to understand certain brain processes in the perisylvian cortex. Conversely, we explain
the impairment of (mental) syntactic capacities by reference
to (bodily) lesions in that area. For our purposes, the crucial
part of this analysis is its implication that language includes
all three levels and that the sciences of language should
therefore encompass brains, associative networks, intentions,
mental representations, individual interactions, and group
dynamics. This takes us to our third question.

WHAT IS A SCIENCE OF LANGUAGE?


The preceding sections converge on a broad, pluralistic but
not indiscriminate account of what constitutes a language
science. Specifically, a language science is the application
of general principles of scientific method to language phenomena at any level. At the level of brain, we have neurolinguistics (see brain and language). At the level of
associative networks, we have connectionism. Intentionalism
leads us to certain forms of ordinary language philosophy. Representational architectures are particularly well
developed, including Noam Chomsky's various theories (see,
for example, minimalism), cognitive linguistics, and
other approaches. Personal interaction and group dynamics
are taken up in pragmatics, linguistic discourse analysis, and sociolinguistics. Just as language encompasses
patterns at all these levels, language science necessarily
includes systematic accounts of language at all these levels.
Again, the levels of language are interrelated without being
reducible. Similarly, the various sciences are interrelated
systematically interrelated through upward explanation
and downward understanding or interpretation without
being reducible.
The preceding points should serve to clarify something that
is obvious, but rather vague, in ordinary speech: Language
science is not the same as language. Language science is a
systematic treatment of language that seeks to provide both
explanation and understanding. Thus, an encyclopedia of
the language sciences does not present the same material as
an encyclopedia of language. It presents the current state of
theoretical explanation and understanding (along with some
historical background that is important for contextualizing
current theories). It does not present the current state of knowledge about particular features of particular languages except
insofar as these features enter into research programs that aim
toward broader explanatory accounts or principles of more
general understanding. Thus, the phonology of Urdu, the
morphology of Quechua, the metrical principles of English
verse lines, and the story and discourse structure of
Chinese narratives enter into the following essays and entries
only insofar as they enter into larger theoretical concerns.
Of course, to say this is only to mark out a general area for an
encyclopedia of the language sciences. It does not determine
precisely what essays and/or entries should make up such a
work. This leads to our final concern.

THE STRUCTURE OF THE VOLUME


The preceding view of language science guided the formulation
of topics for the entries and the organization of the introductory

essays. However, it was not the only factor. In language sciences,


and indeed in human sciences generally, we need to add two
further considerations. The preceding analysis implicitly treats
language patterns as if they are comparable to any patterns
isolated in the natural sciences. However, there are two differences between patterns in language and, say, the patterns isolated by physicists. First, language patterns are mutable. They
are mutable in three ways: at the level of groups, at the level of
individual minds or brains, and at the level of common genetic
inheritance. Insofar as language patterns change at the level
of groups, this mutability is comprehended by group dynamics and related processes (most obviously in historical linguistics). But mental and neurobiological theories do not
necessarily treat the other two sorts of mutability, for such theories tend to focus on steady states of language. We therefore
account for changes in the individual mind or brain by reference to development or acquisition (see phonology, acquisition of; syntax, acquisition of ; and so on). We account
for changes in common genetic properties through the evolution of language (see phonology, evolution of; syntax,
evolution of; and so on).
The second difference between patterns in language and patterns isolated by physicists is related to the first. Just as we may
be insufficient in language, we may be more than sufficient. In
other words, there is a difference between ordinary usage and
skilled usage. Rocks do not fall well or badly. They simply fall,
and they do so at the same rate. People, however, speak well
or badly, effectively or ineffectively, in a manner that is clichd
or unusually creative (see creativity in language use).
The point is most obvious in verbal art which leads us to the
most sweet and pleasing sciences of poetry, as Cervantes put
it (1950, 426).
In keeping with the preceding analysis, then, the main topics in language science are treated initially in a series of seven
overview essays. The first essay provides a general introduction to the study of language. Its purpose is to orient readers
toward the field as a whole. The second and third essays turn to
the mental level of language, since this is the most widely analyzed. Due to the amount of work in this area, and due to the
diversity of approaches, the treatment of this level is divided
into two chapters. The first addresses formal aspects of language: syntax, phonology, and so forth. The second takes up
meaning. The fourth and fifth chapters address the other two
levels of language: society (at the top) and the brain (at the
bottom). The latter also addresses the topics of genetics and
evolution, integrating these with the treatment of the brain. The
sixth chapter takes up language acquisition. Thus, it turns from
the evolution of the general language capacities of the human
brain to the development of the particular language competence of individual human minds. Finally, the seventh chapter considers the nonordinary use of language in verbal art.
The subsequent entries specify, elaborate, augment, and
revise the ideas of these essays. Here, of course, the editor of
a volume on language sciences faces the problem of just what
entries should be included. In other words, if language sciences
encompass the language-related part of neuroscience, social
science, and so forth, just what is that language-related part?
What does it include, and what does it exclude? One might define

this part very narrowly as including only phenomena that are
necessarily bound up with oral speech, sign languages,
or writing. More theoretically, one might define this part as
comprising neurobiological, mental, or social phenomena that
occur only in connection with distinctive properties of speech,
signing, or writing.
Certainly, an encyclopedia treating language will focus
on phenomena that are inseparable from speech, sign languages, and/or writing and on such distinctive aspects of natural language as syntax. However, here, too, I believe it would
be a mistake to confine language sciences within a narrowly
defined domain. Therefore, I have adopted a looser criterion.
The volume centrally addresses distinctive properties of natural language. However, it takes up a wider range of phenomena
that are closely connected with the architectural or, even more
importantly, the functional features of speech, sign languages,
and writing.
There are several cognitive operations for which speech,
signing, and writing appear to have direct functional consequences. One is referential: the specification, compilation,
and interrelation of intentional objects (see the entries on
reference). Here I have in mind phenomena ranging from
the division of the color spectrum to the elaboration of causal
relations. A second area is mnemonic: the facilitation and partial organization of memory (see, for example, encoding). A
third is inferential: the derivation of logical implications. A
fourth is imaginative: the expansion and partial structuring
of simulation. One could think of the first and second functions as bearing on direct, experiential knowledge of present
or past objects and events. The third and fourth functions bear,
rather, on indirect knowledge of actual or possible objects and
events. Two other functions are connected with action rather
than with knowledge. The first is motivational: the extension
or elaboration of the possibilities for emotional response (see
emotion and language). A final area is interpersonal: the
communication of referential intents, memories, inferences,
simulations, and motivations.
In determining what should be included in the volume,
I have taken these functions into account, along with architectural considerations. Thus I see issues of interpretation
and emplotment (one of the key ways in which we organize
causal relations) as no less important than phonology or syntactic structure. Of course, we have more fi rmly established
and systematic knowledge in some areas than in others. Thus
some entries will necessarily be more tentative, and make reference to a broader variety of opinion or a more limited research
base. But that is not a reason to leave such entries aside. Again,
the purpose of an encyclopedia of language science is not to
present a compilation of well-established particular facts, but
rather to present our current state of knowledge with respect to
explanation and understanding.
In keeping with this, when generating the entries (e.g.,
"Phonology," "Syntax," "Neurobiology of Phonology,"
"Neurobiology of Syntax," "Acquisition of Phonology," and so
on), I have tried to be as systematic as possible. Thus the volume includes some topics that have been under-researched
and under-theorized. For example, if neurobiology does in fact
provide a causal substrate for higher levels, then there should
be important things to say, not only about the neurobiology
of syntax, but also about the neurobiology of pragmatics and the neuropsychology of verbal art. The first has
certainly been more fully researched than the second or third.
But that is only more reason to stress the importance of the second and third, to bring together what research has been done,
and to point to areas where this research could be productively
extended.
While it is possible to be systematic with research areas, one
cannot be systematic with theories. Theories are more idiosyncratic. They differ from one another along many axes and cannot be generated as a set of logical possibilities. I have sought to
represent theories that have achieved some level of acceptance
in scientific communities. Given limitations of space, decisions
on this score have often been difficult, particularly because
social constructionist and related analyses show that acceptance in scientific communities is by no means a straightforward function of objective scientific value.
This leads us once again to the issue of the validity of theories. It should come as no surprise that my view of the issue in
effect combines a pluralistic realism with a roughly Lakatosian
advocacy of research programs and a Feyerabend-like practical
anarchism (Feyerabend 1975; Lakatos 1970). Specifically, I take
it that some theories are true and others are not. However, I do
not believe that only one theory is true. Different theories may
organize the world in different ways. There is no correct way of
organizing the world (though some ways will be more useful
than others for particular purposes). On the other hand, once
the world is organized in a certain way, then certain accounts of
the world are correct and certain accounts are incorrect. To take
a simple example, we may divide the color spectrum in different ways (see color classification). No division is correct
or incorrect. But once we have a division, there are facts about
the color of particular objects. (This view is related to Donald
Davidson's (1984) argument that truth is not relative to a conceptual scheme, though it is, of course, relative to the meaning of one's words. It also may have some similarity to Hilary
Putnam's (1981) "internal realism," depending on how that is
interpreted.)
Moreover, even for one organization of the world, we can
never definitively say that a given theory is or is not true. Note
that this means we cannot even strictly falsify a theory. We can
refer to the ongoing success of a research program, and that is
important. Yet I do not share Imre Lakatos's (1970) optimism
about research programs. To some extent, research programs
appear to succeed insofar as they have powerful institutional
support, often for not very good intellectual reasons. Here, then,
I agree with Paul Feyerabend (1975) that orthodoxy in theorization is wrong. It is wrong not only in explicitly or implicitly
identifying institutional support with validity. Thus, it is wrong
not only for social constructionist reasons. It is wrong also for,
so to speak, positivist reasons. It is wrong in diminishing the
likelihood of intellectual progress, the likelihood of increasing
the validity of our theories, which is to say the scope of explanation and understanding produced by these theories.
Whether or not this very brief sketch points toward a
good philosophy of science, it does, I believe, point toward
a good philosophy for an encyclopedia of science, perhaps
particularly language science. I have tried to follow this philosophy throughout the volume. Specifically, I have sought to
present a range of theoretical ideas (as well as more theory-independent topics), placing them together in such a way as
to encourage a mutual sharpening of ideas and insights. To
borrow M. M. Bakhtin's terms (1981), I have not set out to provide a monological source of authoritative discourse. Rather, I
have sought to present a heteroglot volume with which readers may interact dialogically (see dialogism and heteroglossia), hopefully to produce more intellectually adequate
theories later. Toward this end, I have encouraged authors to
be open about their own judgments and attitudes. There is a
common view that a piece of writing is biased if the speaker
frankly advocates one point of view. But, in fact, the opposite is
the case. A piece of writing is biased if a speaker acts as though
he or she is simply reporting undisputed facts, when in fact he
or she is articulating a partisan argument. Being open, dialogical, and multivocal does not mean being bland. Indeed, insight
is more likely to be produced through the tension among ideas
and hypotheses that are clearly delineated in their differences.
This is no less true in the language sciences than elsewhere.
Indeed, that is one reason why this volume treats language sciences, not the science of language.
Patrick Colm Hogan

WORKS CITED AND SUGGESTIONS FOR FURTHER READING


Bakhtin, M. M. 1981. The Dialogic Imagination: Four Essays. Ed.
Michael Holquist. Trans. Caryl Emerson and Michael Holquist.
Austin: University of Texas Press.
Cervantes Saavedra, Miguel de. 1950. The Adventures of Don Quixote.
Trans. J. M. Cohen. New York: Penguin.
Davidson, Donald. 1984. On the very idea of a conceptual scheme.
In Inquiries into Truth and Interpretation, 183–98. Oxford: Oxford
University Press.
Faust, David. 1984. The Limits of Scientific Reasoning. Minneapolis:
University of Minnesota Press.
Feyerabend, Paul. 1975. Against Method: Outline of an Anarchistic
Theory of Knowledge. London: Verso.
Hogan, Patrick Colm. 2003. Cognitive Science, Literature, and the Arts:
A Guide for Humanists. New York: Routledge.
Lakatos, Imre. 1970. Falsification and the methodology of scientific
research programmes. In Criticism and the Growth of Knowledge,
ed. Imre Lakatos and Alan Musgrave, 91–195. Cambridge: Cambridge
University Press.
Mahoney, Michael. 1977. Publication prejudices: An experimental
study of confirmatory bias in the peer review system. Cognitive
Therapy and Research 1: 161–75.
Peters, Douglas, and Stephen Ceci. 1982. Peer-review practices of psychological journals: The fate of published articles, submitted again.
Behavioral and Brain Sciences 5.2: 187–95.
Putnam, Hilary. 1981. Reason, Truth, and History. Cambridge:
Cambridge University Press.


ACKNOWLEDGMENTS

First of all, I must thank Phil Laughlin, who (inspired by the exemplary MIT Encyclopedia of
the Cognitive Sciences) suggested the project initially and invited me to make a more formal
proposal. Without Phil's initial idea and subsequent encouragement, this volume would not
now exist. Eric Schwartz took over from Phil, then Simina Calin took over from Eric; both were
supportive and helpful, as were the several editorial assistants, most recently April Potenciano,
Christie Polchowski, and Jeanie Lee. Regina Paleski and Mark Fox ably shepherded this complex project through the production process; Phyllis Berk worked with devotion on copy editing the manuscript; and Robert Swanson took on the tough job of indexing.
The members of the editorial board kindly provided comments on the list of entries and
suggested possible authors. They also served as second readers for most of the entries. I am
indebted to them all. It is difficult and unrewarding work, but extremely valuable. Some entries
were evaluated by specialists not on the editorial board. I am deeply grateful to the following
scholars who agreed to read and comment on entries: J. Abutalebi, E. Ahlsén, A. Aikhenvald,
S. Anderson, A. Atkin, S. Barker, J. Beall, D. Beaver, H. Béjoint, H. Ben-Yami, A. Berger,
D. Bickerton, A. Bilgrami, S. Blackmore, J. Blommaert, C. Bowern, E. Brenowitz, J. Bybee,
J. Carroll, T. Deacon, M. DeGraff, J.-L. Dessalles, A. Edgar, C. Elgin, R. Ferrer i Cancho, J. Field,
H. Filip, D. Finkelstein, J. Forrester, R. Gibbs, R. Gibson, R. Giora, R. Gleave, K. Glüer-Pagin,
M. Goral, M. Hashimoto, J. Heath, D. Herman, R. Hilpinen, J. Hintikka, K. Hoffman, K. Holyoak,
P. Hutchinson, J. Hyun, P. Indefrey, M. Israel, K. Johnson, M. Johnson, J. Kane, P. Kay, A. Kibort,
S. Kiran, C. Kitzinger, W. Labov, B. Lafford, C. Larkosh, A. Libert, P. Livingston, K. Ludwig,
M. Lynch, J. Magnuson, G. Marcus, R. May, J. McGilvray, A. Mehler, S. Mills, D. Moyal-Sharrock,
K. Oatley, B. O'Connor, L. Pandit, B. Partee, J. Pennebaker, P. Portner, C. Potts, J. Robinson,
S. Rosen, S. Ross, J. Saul, R. Schleifer, M. Shibatani, R. Skousen, S. Small, W. Snyder, M. Solms,
F. Staal, P. Stockwell, L. Talmy, H. Truckenbrodt, J. P. Van Bendeghem, W. van Peer, S. Wheeler,
and L. Zhang. Thanks to M. Cutter for help with the illustrations.
For some time, a graduate assistant, Karen Renner, took care of many secretarial duties.
This work was supported by the English Department at the University of Connecticut, with
some added funding from the University of Connecticut Research Foundation. The support of
the English Department was due to the kindness and commitment of our department head,
Bob Tilton; without his help, this project would not have been possible.
Work for the entry on Ad Hoc Categories was supported by National Science Foundation
Grant BCS-0212134 and DARPA Contract FA865005-C-7256 to Lawrence W. Barsalou.
The entry on Dyslexia was prepared with support from a British Academy Research
Readership to Margaret J. Snowling.
Preparation of the manuscript for Speech Production was supported by NIDCD A-93 and
NIDCD grant DC-03782, both to Haskins Laboratories.
Research for Paisley Livingston's entries benefited from financial support from the Research
and Postgraduate Studies Committee of Lingnan University, Hong Kong.


1
LANGUAGE STRUCTURE IN ITS HUMAN
CONTEXT: NEW DIRECTIONS FOR THE LANGUAGE
SCIENCES IN THE TWENTY-FIRST CENTURY
William Croft

The science of language in the twenty-first century is likely to
expand its scope compared to that of the twentieth century.
The twentieth-century science of language focused its attention on the analysis of language structure: the sound system
of a language (phonology) and its grammatical system
(morphology and syntax). The analysis of linguistic structure, or form, is central to the science of language. After the
middle of the twentieth century, however, greater attention
was placed on the relationship between language form and its
psychological and social context.
The analysis of linguistic structure will remain central to
the science of language. However, understanding language
in context will undoubtedly be a crucial feature of language
science in the twenty-first century. This essay focuses on the
basic principles that have emerged in research on language in
its social and cognitive context, the ways that this context constrains language structure and use, and the new directions in
research implied by the integration of language structure and
context. This essay is necessarily selective in the topics covered, and the selection represents a particular way to integrate
language form and its context. It also brings together theories
that have originated in philosophy, psychology, and sociology,
as well as different branches of linguistics. Such effort is necessary in order to treat language as a unitary phenomenon, and
also to relate central questions of linguistic analysis to other
scientific domains. Language structure cannot be fully understood without situating it with respect to current theories of
joint action, social cognition, conceptualization of experience,
memory and learning, cultural transmission and evolution,
shared knowledge and practice in communities, and demographic processes in human history.

WHY TALK? THE PRAGMATICS OF LANGUAGE


Why do we talk? Why does language exist? It is only by answering these questions that we can understand how language fits
in its context. The answer is that language plays an essential
role in social interaction, fundamentally at the level of joint
action between two or more individuals (Clark 1996; Tomasello
2008). What makes a joint action joint is that it is more than just
the sum of individual actions performed by separate persons;
in particular, each individual involved must take into consideration the other individuals' beliefs, intentions, and actions in
a way that can be described as cooperative. A shared cooperative activity between two individuals can be defined in terms
of a set of attitudes held by the cooperating individuals and as
a way of carrying out the individual action (Bratman 1992). The
attitudes are as follows:
(a) Each individual participant intends to perform the
joint action. That is, each participant's intention is not
directed simply toward his/her individual action but
toward the joint action that is carried out by both participants together.
(b) Each participant intends to perform the joint action
in accordance with and because of each one's meshing
subplans. That is, each participant's individual actions are
intended to mesh with the other participant's actions in
order to successfully achieve the joint action.
(c) Neither participant is coercing the other.
(d) Each participant has a commitment to mutual support. That is, each one will help the other to carry out the
subplans; each participant is thus responsible for more than
just execution of his/her own subplan.
(e) All of (a)–(d) are common ground, or shared knowledge between the individuals. The concept of common
ground plays a central role in understanding the function
of language in social interaction; it is discussed more fully
toward the end of this essay.
Finally, in addition to these mental attitudes on the part
of the participants, there must be mutual responsiveness in
action. That is, the participants will coordinate their individual
actions as they are executed in order to ensure that they mesh
with each other and, hence, that the joint action will be successfully carried out (to the best of their abilities). Coordination
is essential in carrying out joint actions successfully, and this is
where language plays a central role in joint actions.
The social cognitive abilities necessary for shared cooperative activity appear to be unique to humans, providing what
Michael Tomasello (2008) calls the social cognitive infrastructure necessary for the evolution of the capacity for modern
human language. Other species than humans have a capacity
for imitative learning of complex vocalizations (see animal
communication and human language). This has not
been sufficient to lead to the evolution of human-like language
among these species. Nonhuman primates have the ability to
plan actions and to recognize regularities in behavior of other
creatures, enough to manipulate their behavior. These abilities
are preconditions for executing complex actions such as joint
actions, but they are not sufficient for doing so.
Research on primate behavior in natural and experimental settings suggests that some primates even have the ability
to recognize conspecifics as beings with intentional states like
themselves in some circumstances (Tomasello 2008; this ability develops in humans only at around nine months of age).
Nevertheless, it has not been demonstrated that nonhuman primates have the ability to engage in shared cooperative activity
as already defined. Tomasello (ibid.) suggests that helpfulness in particular, Michael Bratman's condition (d), may be critical to the evolution of the ability to carry out joint actions.
The final condition for joint action is that the individual
actions must be coordinated in accordance with the shared
attitudes of the participants. Any joint action poses coordination problems between the participants (Lewis 1969). Any
means that is used to solve the coordination problem on a
particular occasion is a coordination device. There are various coordination devices that human beings use to solve the
coordination problems of joint actions, of which the simplest is
joint attention to jointly salient properties of the environment
(Tomasello 1999, 2003). But by far the most effective coordination device is for the participants to communicate with each
other: By communicating their mental states, the participants
greatly facilitate the execution of any joint action.
communication is itself a joint action, however. The
speaker and hearer must converge on a recognition of the
speaker's intention by the hearer (see communicative
intention; see also cooperative principle). This is H.
Paul Grice's definition of meaning ([1948] 1989), or Herbert
Clark's informative act (Clark 1992; see the next section). And
this joint action poses coordination problems of its own. The
essential problem for the joint action of communication is that
the participants cannot read each others minds. Language is
the primary coordination device used to solve the coordination problem of communication, which is in turn used to solve
the coordination problem for joint actions in general. Indeed,
that is the ultimate purpose of language: to solve the coordination problem for joint actions, ranging from the mundane to the
monumental (Clark 1999). This fact is essential for understanding the structure of discourse and the linguistic expressions
used in it, as Clark (1992, 1996) has shown for many aspects of
conversational interaction, and it also accounts for many fundamental properties of linguistic structure.

LANGUAGE, COMMUNICATION, AND CONVENTION


A language can be provisionally described as a conventional
system for communication (this definition is modified later in
this section). David Lewis (1969) and Clark (1996, Chapter 5)
define convention as follows:
(i) A regularity in behavior
(ii) that is partly arbitrary (that is, we could have equally
chosen an alternative regularity of behavior),
(iii) that is common ground in the community,
(iv) as a coordination device
(v) for a recurrent coordination problem.

In other words, conventions can emerge when members of
the community have shared knowledge that a certain repeated
behavior can act among them as a coordination device for a
recurrent coordination problem. This definition of convention
is general: It applies to conventions such as shaking hands (or
kissing on the cheek) for greeting, or driving on the right (left)
side of the road. The definition also applies straightforwardly to
language: A string of sounds (i.e., a word or morpheme, such as
butterfly) or a grammatical construction (such as the Modifier-Head construction for English noun phrases) emerges as a convention when it becomes a regularly used means for solving the
recurrent coordination problem of referring to a specific experience that is to be communicated.

Linguistic convention actually operates at two levels: the
grammatical level of words and constructions, at which the
speaker's intentions are formulated; and the phonological level
of the articulation and perception of the sounds that make up
the grammatical units. This is the phenomenon described as
duality of patterning in language (Hockett 1960). One could
imagine in principle that linguistic convention possessed only
one level: perceivable sounds (or gestures or written images,
depending on the medium), corresponding to part (i) in the
definition of convention, that directly conveyed the speaker's
intention (the recurrent coordination problem) as a whole, corresponding to part (v) in the definition of convention. These
exist in interjections with specific functions such as Hello and
Thanks. However, most linguistic expressions are complex,
consisting of discrete, meaningful units. Complex linguistic
expressions evolved for two reasons: First, the number of different speaker intentions to be communicated grew to be indefinitely large; and second, a speaker's intended message came to
be broken down into recurrent conceptual parts that could be
recombined to produce the indefinite variety of messages.

Again, one could imagine that each conventional linguistic unit consisted of a unique sound (gesture, image). But languages have distinct meaningful units that are made up of
different sequences of the same sounds: bat, sat, Sam, same,
tame, time, etc. This system has evolved for the same two reasons: the increasing number of meaningful units (even the
recurring ones) necessary to convey the indefinitely large number of speaker intentions, and an ability to break down a sound
signal (or gesture, or image) into parts that can be recombined
as a sequence of sounds (or gestures or images). Thus, the duality of patterning characteristic of human language has evolved
to accommodate the huge number of speaker intentions that
people want to convey, and to exploit the facts that intentions
can be broken down into recombinable conceptual units and
that the medium of expression can be broken down into recombinable units as well.

Language is therefore a joint action that operates simultaneously at four levels (Clark 1996). The higher-numbered levels
are dependent on the lower-numbered levels; the individual
actions of the interlocutors are given in italics:

(4) proposing and taking up a joint project (joint action);
(3) signaling and recognizing the communicative intention;
(2) formulating and identifying the proposition;
(1) producing and attending to the utterance.

The highest level corresponds to the illocutionary
force in speech-act theory (Austin 1962); the next level to
Gricean meaning, or the informative act (Clark 1992); the next
level to the propositional act (Searle 1969); and the lowest level to
the utterance act (Austin 1962; Searle 1969). Each level enables
the level(s) above it, and succeeds only if the level(s) below has
been successfully achieved (e.g., one cannot recognize the
communicative intention if one did not pay attention to the
utterance produced).

THE INCOMPLETENESS OF CONVENTION


The model of language as joint action describes the social cognitive system that must have evolved in the human species for
modern human language to have emerged. It describes what
appears to be a stable system that led to the emergence of highly
complex cooperative activity among humans, namely, what is
called society or culture. But it is not a complete picture of the
nature of language in social interaction.
Linguistic convention can function as a coordination device
for communication because there are recurrent coordination
problems in communication: People have repeatedly wished
to convey similar intentions formulated in similar concepts.
Convention, linguistic or otherwise, is a regularity of behavior that emerges in a community or society. But convention
must emerge from previous successful communication events
where a convention did not previously exist. In other words,
there must be a precedent: You and I use a coordination device
because we used it before (or observed it used before), and it
worked. Following a precedent is a coordination device, but it
is not (yet) convention; it is based not on regular behavior that
is mutually known in the community but only on previous successful uses that we are aware of (Lewis 1969).
Following a precedent cannot be the ultimate root of convention either. It always requires a successfully coordinated
communicative act as a precedent. The ultimate coordination
device is joint salience: Each participant can assume that in a
particular situation, certain features are salient to both participants (Lewis 1969). Joint salience is possible because humans
have the social cognitive capacity for joint attention to their
environment (Tomasello 2003). Joint attention forms a basis for
common ground, as discussed later in this article.
Linguistic convention, however, is not perfect; it does not
trump or replace the nonconventional coordination devices of
precedent and joint salience in the act of communication. This
is partly because of the kind of conventions found in language,
and partly because of the nature of convention itself.
Linguistic conventions are incomplete because of the phenomena of indexicality and ambiguity (Clark 1996). A linguistic convention such as hat or find represents a type, but on
a particular occasion of use, we often intend to convey a particular token of the category. Thus, I found the hat communicates
a particular finding event involving a specific hat. In order to
identify which finding event and which hat, the interlocutors
must rely on joint salience in the context, facilitated in part by
the past tense of find and the article the combined with hat, to
coordinate successfully on the right finding event and the right
hat. Linguistic shifters, such as the pronoun I, more explicitly
require joint salience, namely, who is the speaker in the context. Proper names denote tokens, but even a proper name such
as William Croft may be (and is) used for more than one individual, for example, the contemporary linguist and the English
Baroque musical composer.
Most words are also highly ambiguous; that is, the same
regularity of behavior is used as a coordination device to solve

different recurrent coordination problems. For example, patient
is ambiguous between the linguistic semantic role (The patient
in sentence 25 is Roland) and a role in the domain of medicine
(The patient in room 25 is Roland). Linguistic convention alone
cannot tell which meaning is intended by the speaker. Only
joint salience, provided in the example sentences by the meanings of the other words and the broader context of conversation, will successfully solve the coordination problem of what
is meant by patient.
Indexicality and ambiguity are so pervasive in language that
no utterance can be successfully conveyed without recourse to
nonconventional coordination devices. But convention itself is
also incomplete. This is because every situation being communicated is unique and can be construed as the recurrence of
different coordination problems. The simplest example of this
phenomenon is that different words can be used to describe
the current situation, each representing a different construal
of the current situation in comparison to prior situations. For
example, one can refer to an individual as the prime minister,
Tony Blair, the Labour Party leader, my friend, that guy, he, etc.;
each expression construes reference to the current person as
the recurrence of a different coordination problem.
The need to use nonconventional coordination devices as
well as linguistic convention in communication is not generally
a problem for successful joint actions by cooperative human
beings. However, in some contexts, successful coordination
is quite difficult. For example, scholarly discourse on abstract
theoretical concepts often leads to alternative construals of
what is intended by particular scholars. What do we take Plato
to have meant? This changes over time and across persons.
Alternative construals, not always accurately described as
misunderstandings, occur in more everyday circumstances
as well, as readers can verify for themselves.
In addition, human beings are not always cooperative. The
complexity of language as joint action here leaves open many
possible means of language abuse. For example, lying abuses
linguistic convention in its role of helping coordinate a shared
cooperative activity, namely, coming to a shared belief. Other
types of language abuse exploit nonconventional coordination
devices. For example, in one lawsuit, the courts ordered a government agency to destroy certain documents, intending the
term to denote their information content; the agency destroyed
the documents, that is, the physical objects, after making copies of them (Bolinger 1980). Here, the ambiguity of documents
requires recourse to joint salience, but the agency abused this
nonconventional coordination device (the lawsuit was about
privacy of information). Finally, the fact that a current situation
can be construed as an instance of different recurrent coordination problems leads to alternative framings of the situation,
such as referring to an entity as a fetus or an unborn baby. These
alternative framings bias the conceptualization of the current
situation in ways that invite certain inferences and courses of
action, rather than others.

THE LINGUISTIC SYSTEM IN CONTEXT


In the preceding sections, language was described as a conventional system for communication, and the role of convention in language and of language in communication was discussed.
In this section, the linguistic system is described in broad outline. Linguistic structure has been intensively studied over the
past century ever since Ferdinand de Saussure inaugurated
the modern analysis of linguistic structure, Structuralism
(Saussure [1916] 1966). This section focuses on those aspects of
linguistic structure that are generally agreed upon and shows
the extent to which they emerge from the principles that have
been presented in the preceding section.
The most fundamental structuralist principle is the centrality of the linguistic sign or symbol, that is, the notion that language pairs form and meaning, and that particular linguistic
forms convey particular meanings. This principle fits directly
with the definition of convention. The regularity in behavior in
part (i) of the definition of convention is the expression of a linguistic form by a speaker; the recurrent coordination problem
in part (v) of the definition is the communication of a meaning
between the interlocutors.
Also central to the structural analysis of language is the
arbitrariness of the linguistic sign. That is, arbitrariness exists
in the particular form and meaning that are paired. This conforms with part (ii) of the definition of convention, namely,
that the convention is partly arbitrary. Arbitrariness is usually defined in structuralist analysis as the principle that one
cannot entirely predict the form used from the meaning that is
intended. From a communicative point of view, arbitrariness
means that another choice could have served approximately
equally well. For example, the choice of producing the string
of sounds butterfly for a particular meaning could have been
replaced with the choice of producing the string of sounds
Schmetterling, a choice made by members of the German
speech community. The two different choices are communicatively equivalent in that neither choice is preferred for the
meaning intended, and that is usually because the choice of
one expression over the other is arbitrary in the structuralist
sense.
Another principle that can be traced back to Saussure is
the distinction between the paradigmatic and syntagmatic
contrast of linguistic units. In a complex (multiword or multimorpheme) grammatical construction, such as The cat sat on
the mat, each word enters into two different types of contrast.
For example, the first word the contrasts with the word cat in
that the's role in the construction (determiner) contrasts with
cat's role (head noun). This is a syntagmatic contrast. But the
also contrasts with another possible filler of the same role in
the construction, such as a in A cat sat on the mat; and cat contrasts with hamster, parakeet, etc. in the same way. These are
paradigmatic contrasts.
More recent grammatical theories represent paradigmatic
contrast in terms of a set of elements belonging to a grammatical category. Thus, the and a belong to the category determiner,
and cat, hamster, parakeet, etc. belong to the category noun.
Syntagmatic contrasts are represented by contrasting roles in
the syntactic structure or constructions used in the utterance.
For example, the determiner category is functioning as a modifier of the noun category in a noun phrase construction. Finally,
the syntagmatic-paradigmatic distinction also applies to phonology (sound structure): Paradigmatic contrast is represented
by phonological categories, and syntagmatic contrasts by the
phonological structure of words and larger prosodic units.
The syntagmatic-paradigmatic distinction is the most basic
way to describe the fact that the linguistic system allows a (re-)
combination of meaningful units in different ways. The adaptive motivation for the emergence of such a communication
system was described previously: The number of intentions to
be communicated is so great that a set of simple (atomic) symbols will not suffice, but experience is such that it can be broken
down into recurrent parts for which conventional linguistic
expressions can develop. The same motivations gave rise to the
syntagmatic-paradigmatic distinction in phonology as well.
Paradigmatic principles of structure in grammar and phonology are represented in terms of linguistic categories, phonological and grammatical. These abstract linguistic categories
can be mapped onto the substantive categories of the actual
phonetic realization (for phonology) and of utterance meaning
(for grammar). Linguistic typology (Comrie 1989; Croft 2003),
which takes a cross-linguistic perspective on grammatical
analysis, has demonstrated that the ways in which phonological categories are mapped onto phonetic space, and grammatical or lexical categories are mapped onto conceptual space, are
not unlimited. For example, phonetic similarities and conceptual similarities constrain the structure of phonological and
grammatical categories, respectively.
Syntagmatic principles of structure are represented in various ways, but all such representations reflect another basic
principle, the hierarchical organization of the structure of
utterances. Sentences are organized in a hierarchical structure, representing groupings of words at different levels. So
The cat sat on the mat is not just a string of roles that contrast
syntagmatically, as in [Determiner Noun Copula Preposition
Determiner Noun]. Instead, it is a set of nested groupings of
words: [[Determiner Noun] [Copula] [Preposition [Determiner
Noun]]]. The nested groupings are frequently represented in a
variety of ways, such as the syntactic trees of phrase (constituent) structure analysis. They can also be represented as dependency diagrams (for example, the determiner is related to the
noun as its modifier, which in turn is related to the copula as
its subject), and representations combining constituency and
dependency also exist.
The structure of a construction often appears to be motivated,
though not entirely predicted, by the structure of the meaning
that it is intended to convey. For example, the syntactic groupings in [[The cat] is [on [the mat]]] are motivated semantically;
the in the cat modifies cat semantically as well as syntactically
(indicating that the cat's identity is known to both speaker and
hearer). The (partial) motivation of syntactic structure by its
meaning is captured by general principles in different theories.
These principles can be described as variants of the broader
principle of diagrammatic iconicity (Peirce 1932): roughly, that
the abstract structure of the linguistic expression parallels the
abstract structure of the meaning intended, to a great extent.
It is difficult to evaluate the structure of meaning independently of the structure of linguistic form. However, different
speech communities settle on a similar range of constructions
to express the same complex meaning: the regularities discovered in linguistic typology (see, for example, the studies
published in Typological Studies in Language and the Oxford
Studies in Typology and Linguistic Theory). This fact suggests
that there are regularities in the meaning to be conveyed that
are then reflected in the grammatical constructions used to
express them.

GRAMMAR AND THE VERBALIZATION OF EXPERIENCE


The preceding sections have described the general context of
language use and the basic principles of language structure.
The grammars of particular languages conform to the basic
principles of language structure in the preceding section. But
the grammars of particular languages, while diverse in many
ways, are similar to a much greater degree than would be predicted from the general principles in the preceding section, or
even the context of language use described in the earlier sections. For example, all languages have structures like clauses
in which some concept (prototypically an action concept,
usually labeled a verb) is predicated on one or more concepts
that are referred to (prototypically an object or person, usually
labeled a noun). The noun-like expressions are in turn organized into phrases with modifiers. Clauses are related to each
other by varying degrees of grammatical integration. Certain
semantic categories are repeatedly expressed across languages
as grammatical inflections or function words (e.g., articles,
prepositions, auxiliaries) that combine with the major lexical
categories of words in sentences.
These universal patterns in grammar are attributable to the
way that experience is verbalized by human beings. The fundamental problem of verbalization is that each experience that
a speaker wishes to verbalize is a unique whole. But a linguistic utterance is unlike an experience: An utterance is broken
down into parts, and these parts are not unique; they have
been used before in other utterances. (This latter point is the
fact of convention; a particular linguistic form is used regularly
and repeatedly for a recurrent coordination problem.)
The process by which the unique whole of experience is
turned into a linguistic utterance made up of reusable parts has
been described by Wallace Chafe (1977). The first step is that
the speaker subchunks the experience into smaller parts, each
also a unique Gestalt, similar in this way to the original experience. The subchunking process may be iterated (in later work,
Chafe emphasizes how consciousness shifts from one chunk
to another in the experience to be verbalized). A subchunk of
the experience is then propositionalized; this is the second
step. Propositionalizing involves breaking up an experience by
extracting certain entities that are (at least prototypically) persistent, existing across subchunks. These entities are the referents that function as arguments of the predicate; the predicate
is what is left of the subchunk after the arguments have been
separated. Propositionalizing therefore breaks down the experience into parts (arguments and the predicate) that are not of
the same type as the original experience (i.e., not a Gestalt).
Once the whole has been broken down into these parts,
the parts must be categorized, that is, assigned a category that
relates the parts of the current experience to similar parts of
prior experiences. Categorizing is the third step in the verbalization process. These categories are what are expressed by
content words, such as nouns and verbs. In this way, the speaker
has transformed the unique whole of the original experience
into parts that can be expressed by language.
This is not the end of the verbalization process. Content
words denote only general categories of parts of the experience to be verbalized. In order to communicate the original
experience, the speaker must tie down the categories to the
unique instances of objects, events, and so forth in the experience, and the speaker must assemble the parts into a structure representing the original whole that the speaker intends
to verbalize. That is to say, corresponding to the categorizing
step in verbalizing the parts of the experience, there is a particularizing step that indicates the unique parts; and corresponding to the steps of propositionalizing and subchunking
are integrative steps of structuring and cohering, respectively
(Croft 2007). These latter three steps give rise to grammar in the
sense of grammatical constructions, inflections, and particles,
and the semantic commonalities among grammatical categories across languages.
The particularizing step takes a category (a type) and selects
an instance (token) or set of tokens, and also identifies it by situating it in space and time. For object concepts, selecting can
be accomplished via the inflectional category of number, and
via the grammatical categories of number and quantification
(three books, an ounce of gold). For action concepts, selecting
is done via grammatical aspect, which helps to individuate
events in time (ate vs. was eating), and via agreement with subject and/or object, since events are also individuated by the
participants in them (I read the paper and She read the magazine describe different reading events). Objects and events can
be situated in space via deictic expressions and other sorts of
locative expressions (this book, the book on the table). Events
and some types of objects can be situated in time via tense and
temporal expressions (I ate two hours ago; ex-mayor). Events
and objects can also be situated relative to the mental states
of the interlocutors: The article in the book indicates that the
particular object is known to both speaker and hearer, and the
modal auxiliary in She should come indicates that the event
exists not in the real world but in the attitude of obligation in
the mind of the speaker.
The structuring step takes participants and the predicated
event in a clause and puts them together, reassembling the
predicate and the argument(s) into the subchunk from which
they were derived by propositionalizing. Grammatically this is
a complex area. It includes the expression of grammatical relations in what is called the argument structure of a predicate, so
that She put the clock on the mantle indicates which referent is
the agent (the subject), which the thing moved (the object), and
which the destination of the motion (the prepositional phrase).
But it also includes alternative formulations of the same event,
such as The clock was put on the mantle (the passive voice construction) and It was the mantle where she put the clock (a cleft
construction). The alternative constructions function to present the information in the proposition in different ways to the
hearer, depending on the way the discourse is unfolding; they
are referred to as information structure or discourse function.
Finally, the cohering step takes the clauses (subchunks) and
reassembles them into a whole that evokes the original whole
experience for the hearer. This step can be accomplished by
various clause-linking devices, including subordination of various kinds, coordination, and other clause-linking constructions found in the world's languages. Coherence of clauses
in discourse is also brought about by discourse particles and
reference tracking, that is, grammatical devices, such as
pronouns or ellipsis, which show that an event is related to
another event via a shared participant (Harry filled out the form
and _ mailed it to the customs office).
The three steps of particularizing, structuring, and cohering
result in a grammatical structure that evokes a reconstituted
version of the original unique whole. These six steps in verbalization are not necessarily processed sequentially or independently. The steps in the verbalization process are dependent on
the grammatical resources available in the language, which
constrain the possibilities available to the speaker. For example, when a speaker takes a subchunk and extracts participants
from it, there must be a construction available in the language
to relate the participants to the predicate, as with put in the
earlier example. Thus, subchunking must be coordinated with
propositionalizing and structuring. Also, the steps may not be
overtly expressed by grammatical inflections or particles. For
example, The book fell does not overtly express the singular
number of book, or that the event is situated in the real world
rather than a nonreal mental space of the speaker.
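The verbalization steps described above (subchunking, propositionalizing, categorizing, particularizing, structuring) can be caricatured as a toy pipeline. The event representation, the category mappings, and the output below are invented for illustration; this is not a rendering of Chafe's or Croft's actual formalism.

```python
# A "subchunk" of experience: one event with its persistent participants
# (a made-up representation for illustration only).
subchunk = {
    "participants": [
        {"id": 1, "kind": "feline", "known_to_hearer": True},
        {"id": 2, "kind": "floor_covering", "known_to_hearer": True},
    ],
    "relation": "sitting_on",
}

# Categorizing: map unique parts of experience onto reusable content words.
noun_categories = {"feline": "cat", "floor_covering": "mat"}
verb_categories = {"sitting_on": "sat on"}

def particularize(participant):
    # Particularizing: tie the category to the unique instance, here only
    # via the definiteness signalled by the article.
    article = "the" if participant["known_to_hearer"] else "a"
    return f"{article} {noun_categories[participant['kind']]}"

def structure(chunk):
    # Structuring: reassemble predicate and arguments into a clause.
    agent, location = chunk["participants"]
    verb = verb_categories[chunk["relation"]]
    return f"{particularize(agent)} {verb} {particularize(location)}"

print(structure(subchunk))  # -> "the cat sat on the mat"
```

Even this caricature shows the division of labor: content words come from categorizing, while articles and word order come from particularizing and structuring, which is where grammar in the sense described here arises.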
Finally, the reconstituted experience evoked by the linguistic utterance is not the same as the unique whole with which
the speaker began. The cognitive processes humans use in verbalization do not simply carry out one or more of the six steps
described. They also conceptualize the experience in different
ways, depending on the speaker's choices. These choices range
from the subtle difference between describing something as
leaves or foliage to the more dramatic framing differences
between fetus and unborn baby referred to previously. There
is a wide range of conceptualization processes or construal
operations that have been identified in language (see, e.g.,
Langacker 1987; Talmy 2000). The construal operations can be
accounted for by processes familiar from cognitive psychology: attention, comparison, perspective, and Gestalt (Croft
and Cruse 2004, Chapter 4). These psychological processes are
part of the meaning of all linguistic units: words, inflections,
and constructions. As a consequence, every utterance presents
a complex conceptualization of the original experience that
the speaker intends to verbalize for the hearer. The conventionalized conceptualizations embodied in the grammatical
resources of a language represent cultural traditions of ways to
verbalize experience in the speech community.

VARIATION AND THE USAGE-BASED MODEL


One of the results of recent research on language structure
and language use is the focus on the ubiquity of variation in
language use, that is, in the verbalization of experience and its
phonetic realization. The ubiquity of variation in language use
has led to new models of the representation of linguistic knowledge in the mind that incorporate variation as an essential
characteristic of language. These models are more developed in
phonetics and phonology. The phonological model is described
first, and then recent proposals to apply it to grammar (syntax
and semantics) are examined.
One of the major results of instrumental phonetics is the
discovery that phonetic variation in speech is ubiquitous.
Variation in the realization of phonemes is found not just
across speakers but also in the speech of a single speaker. There
are at least two reasons why such variation in the speech signal
would exist. Presumably, the level of neuromuscular control
over articulatory gestures needed for identical (invariant) productions of a phoneme is beyond a speakers ability. At least as
important, the variation in the speech signal does not prevent
successful communication (or not enough of the time to lead
to the evolution of even fi ner neuromuscular control abilities
in humans).
There is evidence, moreover, that the mental representation of phonological categories includes the representation of
individual tokens of sounds and the words that contain them.
Speakers retain knowledge of fine-grained phonetic detail
(Bybee 2001; Pierrehumbert 2003). Also, there are many frequency effects on phonological patterns (Bybee 2001). For
example, higher-frequency forms tend to have more reduced
phonetic realizations of phonemes than lower-frequency
forms.
Finally, human beings are extremely good pattern detectors
from infancy on into adulthood. Infants are able to develop
sensitivity to subtle statistical patterns of the phonetic signals
they are exposed to. This type of learning, which occurs without actively attending to the stimulus or an intention to learn,
is called implicit learning (Vihman and Gathercole, unpublished manuscript). It contrasts with explicit learning, which
takes place under attention from the learner (particularly
joint attention between an infant learning language and an
adult) and is involved in the formation of categories and symbolic processing. There is neuroscientific evidence that implicit
learning is associated with the neocortex and explicit learning
with the hippocampus (ibid.).
A number of researchers have proposed a usage-based or
exemplar model of phonological representation to account
for these patterns (Bybee 2001; Pierrehumbert 2003). In this
model, phonological categories are not represented by specific
phonetic values for the phoneme in the language, but by a cluster of remembered tokens that form a density distribution over
a space of phonetic parameters. The phonetic space represents
the phonetic similarities of tokens of the phonological category.
This model includes properties of implicit learning (the cluster of individual tokens) and explicit learning (the labeling of
the density distribution as representing tokens of, say, /e/ and
not /i/). Consolidation of token memories also takes place:
individual tokens decay in memory, highly similar tokens are
merged, and the distribution of tokens can be restructured;
but new tokens are constantly being incorporated into the representation and influencing it.
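The exemplar idea just described can be caricatured in a few lines of code: each phonological category is a cloud of remembered tokens in a phonetic space, a new token is labeled by the density of each cloud around it, and old tokens decay. The formant values, bandwidth, and capacity below are invented for illustration; this is a sketch of the idea, not Bybee's or Pierrehumbert's actual implementation.

```python
import math

class ExemplarCategory:
    """A phonological category as a cloud of remembered (F1, F2) tokens.
    The phonetic space here is the first two vowel formants in Hz
    (hypothetical values chosen for illustration)."""

    def __init__(self, label, bandwidth=100.0, capacity=50):
        self.label = label
        self.tokens = []            # remembered exemplars
        self.bandwidth = bandwidth  # kernel width: spread of a token's influence
        self.capacity = capacity

    def store(self, token):
        """Incorporate a new token; the oldest tokens decay (are forgotten)."""
        self.tokens.append(token)
        if len(self.tokens) > self.capacity:
            self.tokens.pop(0)      # crude decay: drop the oldest exemplar

    def activation(self, token):
        """Mean Gaussian-kernel similarity of a token to the stored cloud:
        an estimate of exemplar density near the token."""
        f1, f2 = token
        total = 0.0
        for (e1, e2) in self.tokens:
            d2 = (f1 - e1) ** 2 + (f2 - e2) ** 2
            total += math.exp(-d2 / (2 * self.bandwidth ** 2))
        return total / max(len(self.tokens), 1)

def classify(token, categories):
    """The explicit-learning side: label a token by the densest cloud."""
    return max(categories, key=lambda c: c.activation(token)).label

# Hypothetical formant clouds: /i/ with low F1 and high F2, /e/ higher F1.
i_cat, e_cat = ExemplarCategory("/i/"), ExemplarCategory("/e/")
for t in [(280, 2250), (300, 2300), (310, 2200), (290, 2350)]:
    i_cat.store(t)
for t in [(450, 2050), (480, 2000), (460, 2100), (500, 1950)]:
    e_cat.store(t)

print(classify((295, 2280), [i_cat, e_cat]))  # -> "/i/"
```

The point of the sketch is that no single phonetic value defines the category: the label emerges from the distribution of stored tokens, and shifting the cloud (by storing new tokens) shifts the category boundary.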
Marilyn Vihman and S. Kunnari (2006) propose three types
of learning for an exemplar model. First, there is an initial
implicit learning of statistical regularities of the sensory input.
Second, explicit learning of linguistic categories, such as the
words that are templates containing the sound segments, takes
place. Finally, a second layer of implicit learning of statistical
regularities gives rise to probability distributions for each linguistic (phonological and lexical) category. The result of this
last layer of learning is the exemplar or usage-based model
described by Janet Pierrehumbert and Joan Bybee.
The application of the usage-based/exemplar model to
grammar is more complex. Most research in this area has
compared the range of uses of a particular word or grammatical construction. However, this does not represent the process
of language production (that is, verbalization), analogous to
the phonetic variation found in the production of phonemes.
Studies of parallel verbalizations of particular scenes demonstrate that variation in the verbalization of the same scene by
speakers in similar circumstances is ubiquitous, much like the
phonetic realization of phonological categories (Croft 2010).
There is also substantial evidence for frequency effects in
grammar. For example, English has a grammatical category of
auxiliary verb that has distinctive syntax in negation (I can't
sing vs. I didn't sing) and questions (Can he sing? vs. Did he sing?).
These syntactic patterns are actually a relic of an earlier stage
of English when word order was freer; they have survived in the
auxiliaries of modern English because of their higher token
frequency (Bybee and Thompson 1997), as well as their semantic coherence. Frequency plays a central role in the historical
process of Grammaticalization (Hopper and Traugott 2003), in
which certain constructions develop a grammatical function
(more precisely, they are recruited to serve the particularizing,
structuring, and cohering steps of the verbalization process).
Part of the grammaticalization process is that the construction
increases in frequency; it therefore undergoes grammatical
and phonological changes, such as fixation of word order, loss
of syntactic flexibility, and phonetic reduction (Bybee 2003). A
well-known example is the recruitment of the go + Infinitive
construction for the future tense: She is going (to Sears) to buy
a food processor becomes future She's going to buy a food processor, with no possibility of inserting a phrase between go and
the infinitive, and is finally reduced to She's gonna buy a food
processor.
Finally, early syntactic acquisition is driven by implicit
learning of patterns in the linguistic input (Tomasello 2003).
The process of syntactic acquisition is very gradual and inductive, involving an interplay between detection of statistical regularities and the formation of categories that permit productive
extension of grammatical constructions. Children occasionally
produce overregularization errors, and these are also sensitive
to frequency (more frequent forms are more likely to be produced correctly, and less frequent forms are more likely to be
subject to regularization).
A usage-based model of grammatical form and meaning
is gradually emerging from this research. An exemplar model
of grammatical knowledge would treat linguistic meanings as
possessing a frequency distribution of tokens of remembered
constructions used for that meaning. Those constructions
would be organized in a multidimensional syntactic space
ordered by structural similarity (e.g., Croft 2001, Chapter 8),
whose dimensions reflect the function played
by the construction in the verbalization process. The meanings of constructions are themselves organized in a conceptual
space whose structure can be inferred empirically via cross-linguistic comparison of the meanings expressed by grammatical categories and constructions. The typological approach to
grammar has constructed conceptual spaces for a number of
semantic domains using techniques such as the semantic map
model (see Haspelmath 2003 for a survey of recent studies) and
multidimensional scaling (Croft and Poole 2008).
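One core prediction of the semantic map model can be stated computationally: the set of functions expressed by any single grammatical category should form a connected region of the conceptual space. The sketch below checks that prediction on an invented fragment of a dative-like conceptual space; the node labels, edges, and the claim that English to covers exactly these two functions are simplifying assumptions for illustration, not a published map.

```python
from collections import deque

# A fragment of a conceptual space: functions as nodes, with edges
# linking conceptually adjacent functions (hypothetical, simplified).
conceptual_space = {
    "direction":   {"recipient"},
    "recipient":   {"direction", "beneficiary"},
    "beneficiary": {"recipient", "possessor"},
    "possessor":   {"beneficiary"},
}

def is_connected_region(functions, space):
    """Check that a category's functions form a connected subgraph of the
    conceptual space, using a breadth-first search restricted to them."""
    functions = set(functions)
    if not functions:
        return True
    start = next(iter(functions))
    seen, queue = {start}, deque([start])
    while queue:
        node = queue.popleft()
        for neighbor in space[node] & functions:
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return seen == functions

# English "to" (assumed here to cover direction and recipient):
# a connected region, consistent with the hypothesis.
print(is_connected_region({"direction", "recipient"}, conceptual_space))

# A gram covering direction and possessor but nothing in between
# would violate the connectivity hypothesis.
print(is_connected_region({"direction", "possessor"}, conceptual_space))
```

Cross-linguistic data then constrain the space itself: if no language's category groups two functions without also covering some path between them, those functions are placed far apart (or unlinked) in the map.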
To sum up, the usage-based/exemplar model can be applied
to both phonological patterns in words and grammatical structures in constructions. A speaker's knowledge of language
is the result of the interplay between two learning processes.
One learning process is the tallying of statistical regularities of
tokens of words and constructions with a particular phonetic
realization, performing a particular communicative act in a
specific social interaction. The other is the organization of these
tokens into categories and the formation of generalizations that
allow the reuse or replication of these grammatical structures
to solve future coordination problems in communication.

VARIATION AND CHANGE: AN EVOLUTIONARY APPROACH


The view of language described in the preceding sections roots
both language structure and a speakers linguistic knowledge
in the individual acts of linguistic behavior that a speaker has
engaged in and will engage in. It is a dynamic view of language
in that linguistic behavior is essentially historical: a temporal series of utterances, each one linked to prior utterances as
repeated behavior to solve recurrent coordination problems in
social interaction. Each member of a speech community has a
history of his or her participation in linguistic events, either as
speaker or hearer. This history is remembered in the exemplar-based representation of that member's linguistic knowledge,
but also consolidated and organized in such a way that each
unique experience is broken down and categorized in ways
that allow for reuse of words and constructions in future communication events.
Each time a speaker produces an utterance, he or she replicates tokens of linguistic structures (sounds, words, and
constructions) based on the remembering of prior tokens of
linguistic structures, following the principles of convention
and verbalization described earlier. However, the replication
process is never perfect: Variation is generated all of the time,
as described in the preceding section. The variation generated
in the process of language use can be called first-order variation. Variation in replication is the starting point for language
change. Language change is an instance of change by replication (rather than inherent change); change by replication is the
domain of an evolutionary model of change (Hull 1988; Croft
2000).
Change by replication is a two-step process. The first step is
the generation of variation in replication. This requires a replicator and a process of replication by which copies are produced
that preserve much of the structure of the original. In biological evolution, the canonical replicator is the gene, and the
process of replication takes place in meiosis (which in sexual
organisms occurs in sexual reproduction). Copies of the gene
are produced, preserving much of the structure of the original
gene. Variation is generated by random mutation processes
and by recombination in sexual reproduction.

In language, replication occurs in language use. The replicators are tokens of linguistic structures in utterances (called
linguemes in Croft 2000). These tokens are instances of linguistic behavior. The process of language change is therefore
an example of cultural transmission, governed by principles
of evolutionary change. The replication process in language
change is governed by the principle of convention. As we have
seen in the preceding section, variation is generated in the process of verbalization, including the recombination of linguistic
forms. This represents innovation in language change. First-order variation is the source of language change. Experiments
in phonological perception and production indicate that sound
change is drawn from a pool of synchronic variation (the title
of Ohala 1989). Indeterminacy in the interpretation of a complex acoustic signal can lead to reanalysis of the phonological
categories in that signal. Likewise, it appears that grammatical change is also drawn from a pool of synchronic variation,
namely, variation in verbalization. There is an indeterminacy
in the understanding of the meaning of a word or construction
because we cannot read each other's minds, our knowledge of
linguistic conventions differs because we have been exposed
to different exemplars, and every situation is unique and can
be construed in different ways. This indeterminacy gives rise
to variation in verbalization (Croft 2010), and can lead to the
reanalysis of the mapping of function into grammatical form
(Croft 2000).
The second step of the evolutionary process is the selection
of variants. Selection requires an entity other than the replicator, namely, the interactor. The interactor interacts with its
environment in such a way that this interaction causes replication to be differential (Hull 1988). In biological evolution, the
canonical interactor is the organism. The organism interacts
with its environment. In natural selection, some organisms
survive to reproduce and therefore replicate their genes while
others do not; this process causes differential replication.
In language, selection occurs in language use as well. The
interactor is the speaker. The speaker has variant linguistic
forms available and chooses one over others based on his or
her environment. In language, the most important environmental interaction is the social relationship between speaker
and hearer and the social context of the speech event. This is, of
course, the realm of sociolinguistics (see, e.g., Labov 2001, and
the following section). Selection goes under the name of propagation in language change.
Selection (propagation) is a function of the social value that
variants acquire in language use. First-order variation does
not have a social value. Socially conditioned variation is second-order variation. Once a variant is propagated in a speech
community, it can lead to third-order variation, that is, variation in linguistic conventions across dialects and languages.
Linguistic diversity is the result of language change.
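The two-step dynamic just described (innovation as first-order variation, then socially weighted selection) can be sketched as a toy simulation. Everything numerical here, including the community size, learning rate, and prestige weighting, is an invented assumption for illustration, not part of the models cited above.

```python
import random

random.seed(1)

N_SPEAKERS = 50        # assumed community size
N_INNOVATORS = 5       # assumed high-prestige subgroup that seeds the variant
PRESTIGE = 2.0         # assumed extra influence of that subgroup
LEARN_RATE = 0.05      # assumed accommodation rate per interaction
ROUNDS = 2000

# Each speaker's probability of producing the innovative variant.
p_innov = [0.9 if i < N_INNOVATORS else 0.05 for i in range(N_SPEAKERS)]
influence = [PRESTIGE if i < N_INNOVATORS else 1.0 for i in range(N_SPEAKERS)]

for _ in range(ROUNDS):
    speaker, hearer = random.sample(range(N_SPEAKERS), 2)
    produced = 1.0 if random.random() < p_innov[speaker] else 0.0
    # Selection: hearers shift toward what they hear, more strongly
    # when the speaker carries social prestige (differential replication).
    p_innov[hearer] += LEARN_RATE * influence[speaker] * (produced - p_innov[hearer])
    p_innov[hearer] = min(1.0, max(0.0, p_innov[hearer]))

baseline = [p for i, p in enumerate(p_innov) if i >= N_INNOVATORS]
print(f"mean use among initially conservative speakers: {sum(baseline)/len(baseline):.2f}")
```

With these assumed settings, the variant seeded in the small high-prestige subgroup spreads well beyond its initial users; changing the prestige weighting changes how far it propagates.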
The evolutionary model requires a revision to the definition of language offered near the beginning of this essay. In the
evolutionary model, a language is a population of utterances,
the result of the employment of linguistic conventions in a
speech community. The linguistic system is the result of the
ways in which speakers have consolidated the uses of language
in which they have participated into their knowledge of the

conventions of the speech community. Each speaker's systematic knowledge of his or her language is different, because of
differences in the range of language use to which each speaker
is exposed.

SPEECH COMMUNITIES AND COMMON GROUND


Language in this revised sense is the product of a speech community: the utterances produced by communicative interactions among speakers. A speech community is defined by its
social interactions involving language: Members of the speech
community communicate with one another, and the community is defi ned by communicative isolation from other communities. Communicative isolation is relative, of course, and
in fact the structure of human speech communities is far more
complex than the structure of biological populations.
Two related phenomena serve to define communities: common ground and shared practice. Common ground plays an
essential role in defining joint action and convention, both central to understanding the nature of language. Common ground
consists of knowledge, beliefs, and attitudes presumed by two or
more individuals to be shared between them. Common ground
can be divided into two types: personal common ground and
communal common ground (Clark 1996, Chapter 4). Personal
common ground is shared directly in face-to-face interaction
by the persons. Personal common ground has two bases. The
first is the perceptual basis: We share knowledge of what is in
our shared perceptual field. The perceptual basis is provided
by virtue of joint attention and salience, as mentioned earlier.
A shared basis for common ground has the following properties: The shared basis provides information to the persons
involved that the shared basis holds; the shared basis indicates
to each person that it provides information to every person that
the shared basis holds; and the shared basis indicates the proposition in the common ground (Clark 1996, 94). A basis for common ground varies in how well it is justified; hence, we may not
always be certain of what is common ground or not.
The second basis for personal common ground is a discourse basis. When I report on situations I have experienced
to you in conversation, and vice versa, these become part of
our personal common ground. Although we did not experience them perceptually together, we did experience the
reporting of them linguistically together. The discourse basis
thus involves joint attention (to the linguistic signal), as well as
the common ground of a shared language. The discourse basis
and the perceptual basis both require direct interaction by the
interlocutors. They correspond to social networks, which are
instrumental in language maintenance and change (Milroy
1987).
The other type of common ground is communal common
ground. Communal common ground is shared by virtue of
common community membership. A person can establish
common ground with a stranger if they both belong to a common community (e.g., Americans, linguists, etc.). Some communities are quite specialized while other communities are
very broad and even all-encompassing, such as the community
of human beings in this world, which gives rise to the possibility of communication in the first place.

Language Structure in Its Human Context


Clark argues that the basis of communal common ground
is shared expertise. Étienne Wenger, on the other hand, defines
communities of practice in terms of shared practice: Individuals
engage in joint actions together, and this gives them common
ground and creates a community (Wenger 1998). Wenger's definition of a community of practice, therefore, requires face-to-face interaction, like personal common ground. However,
shared practice can be passed on as new members enter the
community and share practice with remaining current members. This is cultural transmission and can lead to individuals
being members of the same community through a history of
shared practice, even if they do not interact directly with every
other member of the community.
Since communities are defined by shared practice, and
human beings engage in a great variety of joint actions with
different groups of people, the community structure of human
society is very complex. Every society is made up of multiple
communities. Each person in the society is a member of multiple communities, depending on the range of shared activities
he or she engages in. The different communities have only partially overlapping memberships.
As a consequence, the structure of a language is equally
complex. A linguistic structure (a pronunciation, a word, a construction) is associated with a particular community, or
set of communities, in a society. A pronunciation is recognized as an accent characteristic of a particular community.
Words will have different meanings in different communities (e.g., "subject" is a grammatical relation for linguists but a person in an experiment for psychologists). The same concept will have different forms in different communities (e.g., "Zinfandel" for the general layperson, "Zin" to a wine aficionado).
Thus, a linguistic convention is not just a symbol (a pairing of form and meaning) but includes a third part, the community in which it is conventional. This is part (iii) of the definition of convention given in an earlier section. Finally, each
individual has a linguistic repertoire that reflects his or her
knowledge and exposure to the communities in which he or
she acts.
The choice of a linguistic form on the part of a speaker is
an act of identification with the community that uses it. This is the chief mechanism for selection (propagation) in language change: The propagation of variants reflects the dynamics of
social change. More recent work in sociolinguistics has argued
that linguistic acts of social identity are not always passive: Individuals institute linguistic conventions to construct
an identity as well as to adopt one (Eckert 2000).

LANGUAGE DIVERSITY AND ITS ENDANGERMENT


Variation in language can lead to language change if it is propagated through a speech community. Social processes over
human history have led to the enormous linguistic diversity
we find today, a diversity that newer social processes also
threaten. The basic social process giving rise to linguistic
diversity is the expansion and separation of populations into
distinct societies. As groups of people divide for whatever reason, they become communicatively isolated, and the common
language that they once spoke changes in different directions,

leading to distinct dialects and eventually to mutually unintelligible languages.


This ubiquitous demographic process is reflected in the family trees of languages that have been constructed by linguists
working on genetic classification. These family trees allow for
the possibility of reconstructing not just protolanguages but
also the underlying social processes that are traced in them.
Even sociolinguistic situations that obscure family trees leave
linguistic evidence of other social processes. Extensive borrowing indicates a period of intensive social contact. Difficulty
in separating branches of a linguistic family tree indicates an
expansion through a new area but continued low-level contact between the former dialects. These can be seen in the dialect continua found in much of Europe, where the Romance,
Germanic, and Slavic peoples expanded over a mostly continuous terrain (Chambers and Trudgill 1998). Shared typological
(structural) traits may be due to intimate contact between languages with continued language maintenance, or to a major
language shift by a social group, resulting in a large proportion
of non-native speakers at one point in a language's history.
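The grouping logic behind such family trees can be sketched computationally. Below is a minimal sketch using invented word lists (not real linguistic data) and a greedy merge-by-similarity loop, a crude stand-in for the comparative method, which relies on systematic sound correspondences rather than identical surface forms.

```python
# Toy illustration of grouping "languages" into a family tree by shared
# vocabulary. The four word lists are invented stand-ins, not real data.

def similarity(a, b):
    """Fraction of meaning slots filled by an identical form."""
    return sum(x == y for x, y in zip(a, b)) / len(a)

langs = {
    "A": ["mano", "agua", "sol", "luna"],
    "B": ["mano", "agua", "sol", "mond"],
    "C": ["hand", "wasser", "sonne", "mond"],
    "D": ["hand", "water", "sun", "mond"],
}

clusters = {name: [name] for name in langs}   # each language starts alone
reps = dict(langs)                            # one representative list per cluster

# Greedy agglomeration: repeatedly merge the most similar pair of clusters.
# The merge order (closest pairs first) is what sketches the tree's branches.
while len(clusters) > 1:
    score, x, y = max((similarity(reps[x], reps[y]), x, y)
                      for x in clusters for y in clusters if x < y)
    clusters[x].extend(clusters[y])
    del clusters[y], reps[y]
    print(f"merged at similarity {score:.2f}: {clusters[x]}")
```

Here A and B merge first (they share most forms), then C and D, mirroring how closely related dialects group before distant relatives; shared borrowings (like the toy form "mond") are exactly the kind of noise that makes real branch separation difficult.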
The spread of human beings across the globe led to the creation of a huge number of distinct societies whose languages
diverged. The number of distinct languages that have survived
until the beginning of the twenty-first century is about 6,000.
Most linguists generally accept the hypothesis that modern
human language evolved just once in human history, probably no later than 70,000 to 100,000 years ago. So in principle,
all modern human languages may have originated in a single
common ancestor. Tracing back the actual lineages of contemporary languages deep into human prehistory appears to
be extremely difficult, if not impossible. Nevertheless, there is
no doubt that contemporary linguistic diversity is extremely
ancient in human history. What we can discover about linguistic history by the comparison of existing languages can potentially shed important light on human history and prehistory.
There are linguistic descriptions of a small proportion of existing human languages, though descriptive work
has increased and the overall quality of descriptions has
improved dramatically, thanks to advances in linguistic science throughout the twentieth century. It would be safe to say
that the diversity of linguistic structure, and how that structure is manifested in phonetic reality on the one hand and in
the expression of meaning on the other, is truly remarkable
and often unexpected. Many proposed universals of language
have had to be revised or even abandoned as a consequence,
although systematic analysis of existing linguistic descriptions
by typologists has revealed many other language universals
that appear to be valid. Linguistic diversity has revealed alternative ways of conceptualizing experience in other societies, as
well as alternative methods of learning and alternative means
for communication for the accomplishment of joint actions.
But the single most important fact about the diversity of
human language is that it is severely endangered. Of the 6,000
different languages extant today, 5,000 are spoken by fewer
than 100,000 people. The median number of speakers for a
language is only 6,000 (Crystal 2000). Many languages are no
longer spoken by children in the community, and therefore
will go extinct in another generation. The loss for the science
of language, and more generally for our understanding of
human history, human thought, and human social behavior, is immense. But the loss is at least as great for the speakers themselves. Language use is a mode of social identity, not
just in terms of identifying with a speech community but as
the vehicle of cultural transmission. The loss of languages, like
other linguistic phenomena, is a reflection of social processes.
The most common social processes leading to language loss are
disruption, dislocation, or destruction of the society (language
loss rarely occurs via genocide of its speakers). The enormous
consequences of language loss have led to a shift in linguistic
fieldwork from mere language description and documentation
to language revitalization in collaboration with members of
the speech community. But reversing language shift ultimately
requires a change in the social status of the speech community
in the local and global socioeconomic system.

SUMMARY
The scientific study of language in its pragmatic, cognitive,
and social context beginning in the latter half of the twentieth century is converging on a new perspective on language
in the twenty-fi rst century. Linguistic conventions coordinate
communication, which in turn coordinates joint actions. The
fragility of social interaction by individuals leads to creativity,
variation, and dynamism in the verbalization and vocalization
of language. Individual linguistic knowledge (the linguistic
system) reflects the remembered history of language use and
mediates processes of language change. The continually changing structure of society, defined by common ground emerging
from shared practices (joint actions), guides the evolution of
linguistic conventions throughout its history. Human history
in turn has spawned tremendous linguistic diversity, which
reflects the diversity of human social and cognitive capacity.
But the unchecked operation of contemporary social forces is
leading to the destruction of speech communities and the mass
extinction of human languages today.
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Austin, J. L. 1962. How to Do Things with Words. Cambridge: Harvard
University Press.
Bolinger, Dwight. 1980. Language, the Loaded Weapon. London:
Longmans.
Bratman, Michael. 1992. Shared cooperative activity. Philosophical
Review 101: 327–41.
Bybee, Joan L. 2001. Phonology and Language Use. Cambridge:
Cambridge University Press.
. 2003. Mechanisms of change in grammaticalization: The role
of frequency. In Handbook of Historical Linguistics, ed. Brian Joseph
and Richard Janda, 602–23. Oxford: Blackwell.
Bybee, Joan L., and Sandra A. Thompson. 1997. Three frequency effects
in syntax. In Proceedings of the 23rd Annual Meeting of the Berkeley
Linguistics Society, ed. Matthew L. Juge and Jeri O. Moxley, 378–88.
Berkeley: Berkeley Linguistics Society.
Chafe, Wallace. 1977. The recall and verbalization of past experience. In Current Issues in Linguistic Theory, ed. Peter Cole, 215–46.
Bloomington: Indiana University Press.
Chambers, J. K., and Peter Trudgill. 1998. Dialectology. 2d ed.
Cambridge: Cambridge University Press.
Clark, Herbert H. 1992. Arenas of Language Use. Chicago and Stanford: University of Chicago Press and the Center for the Study of Language and Information.
. 1996. Using Language. Cambridge: Cambridge University Press.
Clark, Herbert H. 1999. On the origins of conversation. Verbum 21: 147–61.
Comrie, Bernard. 1989. Language Universals and Linguistic Typology.
2d ed. Chicago: University of Chicago Press.
Croft, William. 2000. Explaining Language Change: An Evolutionary Approach. Harlow, Essex: Longman.
. 2001. Radical Construction Grammar: Syntactic Theory in
Typological Perspective. Oxford: Oxford University Press.
. 2003. Typology and Universals. 2d ed. Cambridge: Cambridge
University Press.
. 2007. The origins of grammar in the verbalization of experience. Cognitive Linguistics 18: 339–82.
. 2010. The origins of grammaticalization in the verbalization of
experience. Linguistics 48: 1–48.
Croft, William, and D. Alan Cruse. 2004. Cognitive Linguistics.
Cambridge: Cambridge University Press.
Croft, William, and Keith T. Poole. 2008. Inferring universals from
grammatical variation: Multidimensional scaling for typological
analysis. Theoretical Linguistics 34: 1–37.
Crystal, David. 2000. Language Death. Cambridge: Cambridge
University Press.
Eckert, Penelope. 2000. Linguistic Variation as Social Practice: The
Linguistic Construction of Identity in Belten High. Oxford: Blackwell.
Grice, H. Paul. [1948] 1989. Meaning. In Studies in the Way of Words,
213–23. Cambridge: Harvard University Press.
Haspelmath, Martin. 2003. The geometry of grammatical meaning: Semantic maps and cross-linguistic comparison. In The New
Psychology of Language. Vol. 2. Ed. Michael Tomasello, 211–42.
Mahwah, NJ: Lawrence Erlbaum Associates.
Hockett, Charles F. 1960. The origin of speech. Scientific American
203: 88–96.
Hopper, Paul, and Elizabeth Traugott. 2003. Grammaticalization. 2d
ed. Cambridge: Cambridge University Press.
Hull, David L. 1988. Science as a Process: An Evolutionary Account of the
Social and Conceptual Development of Science. Chicago: University
of Chicago Press.
Labov, William. 2001. Principles of Linguistic Change. Vol. 2. Social
Factors. Oxford: Blackwell.
Langacker, Ronald W. 1987. Foundations of Cognitive Grammar. Vol. 1.
Theoretical Prerequisites. Stanford: Stanford University Press.
Lewis, David. 1969. Convention. Cambridge, MA: MIT Press.
Milroy, Lesley. 1987. Language and Social Networks. 2d ed. Oxford: Basil
Blackwell.
Ohala, John. 1989. Sound change is drawn from a pool of synchronic variation. In Language Change: Contributions to the Study of Its Causes, ed. Leiv Egil Breivik and Ernst Håkon Jahr, 173–98. Berlin: Mouton de Gruyter.
Peirce, Charles Sanders. 1932. Ground, object and interpretant. In
Collected Papers of Charles Sanders Peirce. Vol. 2: Elements of Logic,
ed. Charles Hartshorne and Paul Weiss, 134–55. Cambridge: Harvard
University Press.
Pierrehumbert, Janet B. 2003. Probabilistic phonology: Discrimination
and robustness. In Probabilistic Linguistics, ed. Rens Bod, Jennifer
Hay, and Stefanie Jannedy, 177228. Cambridge, MA: MIT Press.
Saussure, Ferdinand de. [1916] 1966. Cours de linguistique générale. Ed.
Ch. Bally and A. Sechehaye. (Course in General Linguistics. Trans.
Wade Baskin. New York : McGraw-Hill.)
Searle, John R. 1969. Speech Acts: An Essay in the Philosophy of Language.
Cambridge: Cambridge University Press.
Talmy, Leonard. 2000. Toward a Cognitive Semantics. Vol. 1. Concept Structuring Systems. Cambridge, MA: MIT Press.
Tomasello, Michael. 1999. The Cultural Origins of Human Cognition.
Cambridge: Harvard University Press.
. 2003. Constructing a Language: A Usage-Based Theory of
Language Acquisition. Cambridge: Harvard University Press.
. 2008. The Origins of Human Communication. Cambridge,
MA : MIT Press.

Vihman, Marilyn M., and V. M. Gathercole. Language Development. Unpublished manuscript.
Vihman, Marilyn M., and S. Kunnari. 2006. The sources of phonological knowledge: A crosslinguistic perspective. Recherches
Linguistiques de Vincennes 35: 133–64.
Wenger, Étienne. 1998. Communities of Practice: Learning, Meaning
and Identity. Cambridge: Cambridge University Press.

THE PSYCHOLOGY OF LINGUISTIC FORM

Lee Osterhout, Richard A. Wright, and Mark D. Allen

Humans can generate and comprehend a stunning variety of conceptual messages, ranging from sophisticated types of mental representations, such as ideas, intentions, and propositions, to more primal messages that satisfy demands of the immediate environment, such as salutations and warnings. In order for these messages to be transmitted and received, however, they must be put into a physical form, such as a sound wave or a visual marking. As noted by the Swiss linguist Ferdinand de Saussure ([2002] 2006), the relationship between mental concepts and physical manifestations of language is almost always arbitrary. The words cat, sat, and mat are quite similar in terms of how they sound but are very dissimilar in meaning; one would expect otherwise if the relationship between sound and meaning were principled instead of arbitrary. Although the relationship between linguistic form and meaning is arbitrary, it is also highly systematic. For example, changing a phoneme in a word predictably also changes its meaning (as in the cat, sat, and mat example).

Human language is perhaps unique in the complexity of its linguistic forms (and, by implication, the system underlying these forms). Human language is compositional; that is, every sentence is made up of smaller linguistic units that have been combined in highly constrained ways. A standard view (Chomsky 1965; Pinker 1999) is that units and rules of combination exist at the levels of sound (phonemes and phonology), words (morphemes and morphology), and sentences (words and phrases and syntax). Collectively, these rules comprise a grammar that defines the permissible linguistic forms in the language. These forms are systematically related to, but distinct from, linguistic meaning (semantics).

Linguistic theories, however, are based on linguistic description and observation and therefore have an uncertain relation to the psychological underpinnings of human language. Researchers interested in describing the psychologically relevant aspects of linguistic form require their own methods and evidence. Furthermore, psychological theories must describe not only the relevant linguistic forms but also the processes that assemble these forms (during language production) and disassemble them (during language comprehension). Such theories should also explain how these forms are associated with a speaker's (or hearer's) semantic and contextual knowledge. Here, we review some of what we have learned about the psychology of linguistic form, as it pertains to sounds, words, and sentences.

SOUNDS

Sound units. Since the advent of speech research, one of the most intensively pursued topics in speech science has been
the search for the fundamental sound units of language. Many
researchers have found evidence for phonological units that
are abstract (i.e., generalizations across any number of heard
utterances, rather than memories of specific utterances) and
componential (constituent elements that operate as part of a
combinatorial system). However, there is other evidence for
less abstract phonological forms that may be stored as whole
words. As a result, two competing hypotheses about phonological units have emerged: an abstract componential one versus
a holistic one.
The more widespread view is the componential one. It posits abstract units that typically relate either to abstract versions
of the articulatory gestures used to produce the speech sounds
(Liberman and Mattingly 1985, Browman and Goldstein 1990),
or to ones derived from descriptive units of phonological theory. Such descriptive units include the feature (see feature
analysis), an abstract subphonemic unit of contrast; the phoneme, an abstract unit of lexical contrast that is either a consonant or a vowel; the phone or allophone, a surface variant of the
phoneme; the syllable, a timing unit that is made up of a vowel
and one or more of its flanking consonants; the prosodic word,
the rhythmic structure that relates to patterns of emphasized
syllables; or various structures that relate to tone, the lexically contrastive use of the voice's pitch, and intonation, the
pitch-based tune that relates to the meaning of a sentence (for
reviews, see Frazier 1995; Studdert-Kennedy 1980).
In the holistic view, the word is the basic unit, whereas
other smaller units are considered to be epiphenomenal (e.g.,
Goldinger, Pisoni, and Logan, 1991). Instance-specific memory traces of particular spoken words are often referred to as
episodes. Proponents of this view point out that while abstract
units are convenient for description and relate transparently
to segment-based writing systems, such as those based on
the alphabet, there is evidence that listeners draw on a variety of highly detailed and instance-specific aspects of a word's
pronunciation in making lexical decisions (for reviews, see
Goldinger and Azuma 2003; Nygaard and Pisoni 1995).
Some researchers have proposed hybrid models in which
there are two layers of representation: the episodic layer, in which
highly detailed memory traces are stored, and an abstract layer
organized into features or phones (Scharenborg et al. 2005). The
proponents of hybrid models try to capture the instance-specific effects in perception that inspire episodic approaches, as
well as the highly abstracted lexical contrast effects.
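One way to see what is at stake between the episodic and abstract layers is a toy categorization example. The numeric "tokens" below are invented one-dimensional stand-ins for acoustic measurements, and the two classifiers are schematic illustrations, not implementations of any published model.

```python
# Stored instance-specific traces ("episodes") for two hypothetical words.
# The values are invented stand-ins for a single acoustic dimension.
episodes = {
    "bit": [400, 420, 390, 640],   # includes one atypical fast-speech token
    "bet": [580, 600, 560, 620],
}

def classify_exemplar(token):
    """Episodic view: label by the single closest stored trace."""
    return min(((abs(token - v), word)
                for word, values in episodes.items() for v in values))[1]

def classify_prototype(token):
    """Abstract view: label by distance to each category's abstracted mean."""
    means = {w: sum(v) / len(v) for w, v in episodes.items()}
    return min(means, key=lambda w: abs(token - means[w]))

# A token near the atypical stored trace: the two layers disagree.
print(classify_exemplar(635), classify_prototype(635))
```

The disagreement is the point: the exemplar classifier is swayed by a single remembered instance, while the prototype classifier abstracts it away, which is roughly the tension the hybrid models above try to reconcile.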
PROCESSES. Speech production refers to the process by
which the sounds of language are produced. The process necessarily involves both a planning stage, in which the words and
other linguistic units that make up an utterance are assembled
in some fashion, and an implementation stage, in which the
various parts of the vocal tract (for example, the articulators)
execute a motor plan to generate the acoustic signal. See Carol
A. Fowler (1995) for a detailed review of the stages involved in
speech production. It is worth noting here that even if abstract
phonological units such as features are involved in planning
an utterance, at some point the linguistic string must be implemented as a motor plan and a set of highly coordinated movements. This has motivated gestural representations that include
movement plans, rather than static featural ones (Browman
and Goldstein 1990; Fowler 1986, 1996; Saltzman and Munhall
1989; Stetson 1951).
Speech perception is the process by which human listeners identify and interpret the sounds of language. It, too,
necessarily involves at least two stages: 1) the conversion of the
acoustic signal into an electrochemical response at the auditory periphery and 2) the extraction of meaning from the neurophysiological response at the cortical levels. Brian C. J. Moore
(1989) presents a thorough review of the physiological processes
and some of the issues involved in speech perception. A fundamental point of interest here is perceptual constancy in the face
of a massively variable signal. Restated as a question, how is
it that a human listener is able to perceive speech sounds and
understand the meaning of an utterance, given the massive
variability created by physiological idiosyncrasies and contextual variation? The various answers to this question involve
positing some sort of perceptual units, be they individual segments, subsegmental features, coordinated speech gestures, or
higher-level units like syllables, morphemes, or words.
It is worth noting here that the transmission of linguistic
information does not necessarily rely exclusively on the auditory channel; the visible articulators, the lips and to a lesser
degree the tongue and jaw, also transmit information. A listener
presented with both auditory and visual stimuli will integrate
the two signals in the perceptual process (e.g., Massaro 1987).
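Audiovisual integration of this kind is often modeled multiplicatively. The sketch below is in the spirit of Massaro's fuzzy logical model of perception for a two-alternative (/ba/ vs. /da/) case; the support values are invented for illustration.

```python
def integrate(audio_ba, visual_ba):
    """Multiplicative integration of per-channel support for /ba/,
    with /da/ taking the complementary support on each channel."""
    ba = audio_ba * visual_ba
    da = (1 - audio_ba) * (1 - visual_ba)
    return ba / (ba + da)

# An ambiguous acoustic signal (support 0.5) combined with clearly
# visible lips favoring /ba/ (support 0.9): the visual channel dominates.
print(round(integrate(0.5, 0.9), 2))
```

When one channel is ambiguous (support near 0.5), the multiplicative rule lets the clearer channel decide, which is one way to capture the visual dominance effects described above.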
When the information in the visual signal is unambiguous (as
when the lips are the main articulators), the visual signal may
even dominate the acoustic one (e.g., McGurk and MacDonald 1976). Moreover, writing systems convey linguistic information, albeit in a low-dimensional fashion. Most strikingly, sign
languages are fully as powerful as speech-based communication systems and are restricted to the visual domain. Despite
the differences between signed and spoken languages in terms
of the articulators and their perceptual modalities, they draw
on the same sorts of linguistic constituents, at least so far as
the higher-level units are concerned: syllable, morpheme,
word, sentence, and prosodic phrase (e.g., Brentari 1998). Some
linguists have also proposed the decomposition of signed languages into smaller units using manual analogs of phonological features, despite the obvious differences in the articulators
and the transmission media (for a review, see Emmorey 2002).
The parallel of signed and spoken language structure despite
the differences in transmission modalities is often interpreted
as evidence for abstract phonological units at the level of the
mental lexicon (Meier, Cormier, and Quinto-Pozos 2002).
THE HISTORY OF THE DEBATE: EARLY PHONOLOGICAL UNITS. The
current debate about how to characterize speech sounds has

its roots in research that dates back over a century. Prior to


the advent of instrumental and experimental methods in the
late nineteenth century, it was commonly accepted that the
basic units of speech were discrete segments that were alphabetic in nature and serially ordered. While it was recognized
that speech sounds varied systematically depending on the
phonetic context, the variants themselves were thought to be
static allophones of an abstract and lexically contrastive sound
unit, that is, a phoneme. Translated into modern terminology,
phonological planning involved two stages: 1) determining the
contextually determined set of discrete surface variants, given
a particular lexical string, and 2) concatenating the resulting
allophones. The physiological implementation of the concatenated string was thought to result in a series of articulatory
steady states, or postures. The only continuous aspects of sound
production were believed to be brief transitional periods created by articulatory transitions from one state to the next. The
transitional movements were thought to be wholly predictable
and determined by the physiology of a particular speaker's
vocal tract. Translated again into modern terminology, perception (when considered) was thought to be simply the process
of translating the allophones back into their underlying phonemes for lexical access. The earliest example of the phoneme-allophone relationship is attributed to Pāṇini, around 500 B.C.E.,
whose sophisticated system of phonological rules and relationships influenced structuralist linguists of the early twentieth
century, as well as generative linguists of the late twentieth
century (for a review, see Anderson 1985; Kiparsky 1979).
The predominant view at the end of the nineteenth century
was typified by Alexander M. Bell's (1867) descriptive work on
English pronunciation. In it, he presented a set of alphabet-inspired symbols whose shapes and orientations were intended
to encode both the articulatory steady states and their resulting steady-state sounds. A fundamental assumption in the
endeavor was that all sounds of human language could be
encoded as a sequence of universal articulatory posture complexes whose subcomponents were shared by related sounds.
For example, all labial consonants (p, b, m, f, v, w, etc.) shared
a letter shape and orientation, while all voiced sounds (b, d, g,
v, z, etc.) shared an additional mark to distinguish them from
their voiceless counterparts (p, t, k, f, s, etc.). Bell's formalization of a set of universal and invariant articulatory constituents,
aligned as an alphabetic string, influenced other universal
transcription systems such as Henry Sweet's (1881) Romic
alphabet, which laid the foundation for the development of the
International Phonetic Alphabet (Passy 1888). It also foreshadowed the use of articulatory features, such as those proposed
by Noam Chomsky and Morris Halle (1968) in modern phonology, in that each speech sound, and therefore each symbol, was
made up of a set of universal articulatory components.
A second way in which Bells work presaged modern research
was the connection between perception and production.
Implicit in his system of writing was the belief that perception
of speech sounds was the process of extracting the articulations
that produced them. Later perceptual models would incorporate this relationship in one way or another (Chistovich 1960;
Dudley 1940; Fowler 1986, 1996; Joos 1948; Ladefoged and
McKinney 1963; Liberman and Mattingly 1985; Stetson 1951).

THE HISTORY OF THE DEBATE: EARLY EXPERIMENTAL
RESEARCH. Prior to the introduction of experimental methods into phonetics, the dominant methodologies were introspection about one's own articulations and careful but
subjective observations of others' speech, and the measurement units were letter-based symbols. Thus, the observer and
the observed were inextricably linked while the resolution of
the measurement device was coarse. This view was challenged
when a handful of phoneticians and psychologists adopted
the scientific method and took advantage of newly available
instrumentation, such as the kymograph, in the late 1800s.
They discovered that there were no segmental boundaries in
the speech stream and that the pronunciation of a particular
sound varied dramatically from one instance to the next (for
a review of early experimental phonetics, see Kühnert and
Nolan 1999 and Minifie 1999). In the face of the new instrumental evidence, some scholars, like Eduard Sievers (1876), P.-J.
Rousselot (1897), and Edward Wheeler Scripture (1902), proposed that the speech stream, and the articulations that produced it, were continuous, overlapping, and highly variable,
rather than being discrete, invariant, and linear. For them, the
fundamental sound units were the syllable or even the word or
morpheme. Rousselot's research (1897–1901) revealed several
articulatory patterns that were confirmed by later work (e.g.,
Stetson 1951). For example, he observed that when sounds that
are transcribed as sequential are generated by independent
articulators (such as the lips and tongue tip), they are initiated and produced simultaneously. He also observed that one
articulatory gesture may significantly precede the syllable it
is contrastive in, thereby presenting an early challenge to the
notion of sequential ordering in speech.
Laboratory researchers like Raymond H. Stetson (1905,
1951) proposed that spoken language was a series of motor
complexes organized around the syllable. Stetson also fi rst proposed that perception was the process of perceiving the articulatory movements that generate the speech signal. However,
outside of the experimental phonetics laboratory, most speech
researchers, particularly such phonologists as Leonard
Bloomfield (1933), continued to use phonological units that
remained abstract, invariant, sequential, and letter-like. Three
events that occurred in the late 1940s and early 1950s changed
this view dramatically. The first event was the application to
speech research of modern acoustic tools like the spectrogram
(Potter 1945), sophisticated models of vocal tract acoustics
(e.g., House and Fairbanks 1953), reliable articulatory instrumentation, such as high-speed X-ray cinefluorography (e.g.,
Delattre and Freeman 1968), and electromyographic studies of
muscle activation (Draper, Ladefoged, and Whitteridge 1959).
The second was the advent of modern perception research in
which researchers discovered complex relationships between
speech perception and the acoustic patterns present in the signal (Delattre, Liberman, and Cooper 1955). The third was the
development of distinctive feature theory in which phonemes
were treated as feature matrices that captured the relationships between sounds (Jakobson 1939; Jakobson, Fant, and
Halle 1952).
When researchers began to apply modern acoustic and
articulatory tools to the study of speech production, they
rediscovered and improved on the earlier observation that the
speech signal and the articulations that create it are continuous, dynamic, and overlapping. Stetson (1951) can be seen as
responsible for introducing kinematics into research on speech
production. His research introduced the notions of coproduction, in which articulatory gestures were initiated simultaneously, and gestural masking, in which the closure of one
articulatory gesture hides another, giving rise to the auditory
percept of deletion. Stetson's work provided the foundation for
current language models that incorporate articulatory gestures and their movements as the fundamental phonological
units (e.g., Browman and Goldstein 1990; Byrd and Saltzman
2003; Saltzman and Munhall 1989).
In the perceptual and acoustic domains, the identification
of perceptual cues to consonants and vowels raised a series
of questions that remain at the heart of the debate to this day.
The coextensive and covarying movements that produce the
speech signal result in acoustic information that exhibits a
high degree of overlap and covariance with information about
adjacent units (e.g., Delattre, Liberman, and Cooper 1955). Any
single perceptual cue to a particular speech sound can also be a
cue to another speech sound. For example, the onset of a vowel
immediately following a consonant provides the listener with
cues that identify both the consonant and vowel (Liberman et
al. 1954). At the same time, multiple cues may identify a single
speech sound. For example, the duration of a fricative (e.g., s),
the fricative's noise intensity, and the duration of the preceding
vowel all give information about whether the fricative is voiced
(e.g., z) or voiceless (e.g., s) (Soli 1982). Finally, the cues to
one phone may precede or follow cues to adjacent phones. The
many-to-one, the one-to-many, and the nonlinear relationships between acoustic cues and their speech sounds pose a
serious problem for perceptual models in which features or
phones are thought to bear a linear relationship to each other.
More recently, researchers studying perceptual learning have
discovered that listeners encode speaker-specific details and
even utterance-specific details when they are learning new
speech sounds (Goldinger and Azuma 2003). The latest set of
findings poses a problem for models in which linguistic sounds
are stored as abstract units.
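The many-to-one cue relationships described above can be made concrete with a toy sketch. In the following Python fragment, the direction of each cue reflects the regularities cited in the text (fricative duration, frication noise intensity, preceding-vowel duration), but the thresholds and the equal weighting are invented for illustration; real perceptual models estimate such parameters from listener data.

```python
# Toy illustration of cue integration for fricative voicing (/s/ vs. /z/).
# Thresholds and weights are assumptions, not values from Soli (1982).

def voicing_score(fricative_ms, noise_db, preceding_vowel_ms):
    """Combine three overlapping cues into a single evidence score for 'voiced'."""
    score = 0.0
    score += 1.0 if fricative_ms < 120 else -1.0        # voiced fricatives tend to be shorter
    score += 1.0 if noise_db < 60 else -1.0             # and have weaker frication noise
    score += 1.0 if preceding_vowel_ms > 150 else -1.0  # vowels lengthen before voiced codas
    return score

def classify(fricative_ms, noise_db, preceding_vowel_ms):
    return "z" if voicing_score(fricative_ms, noise_db, preceding_vowel_ms) > 0 else "s"

print(classify(100, 55, 180))  # → z (all three cues point to voiced)
print(classify(150, 70, 120))  # → s
```

No single cue decides the outcome; each contributes evidence, which is the point the text makes about the one-to-many and many-to-one mappings between acoustic cues and speech sounds.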
In distinctive feature theory, each phoneme is made up
of a matrix of binary features that encodes both the distinctions and the similarities between one class of sounds and
the next in a particular language (Jakobson, Fant, and Halle
1952; Chomsky and Halle, 1968). The features are thought to be
drawn from a language universal set, and thus allow linguists
to observe similarities across languages in the patterning of
sounds. Moreover, segmenting the speech signal into units
that are hierarchically organized permits a duality of patterning of sound and meaning that is thought to give language its
communicative power. That is, smaller units such as phonemes
may be combined according to language-specific phonotactic
(sound combination) constraints into morphemes and words,
and words may be organized according to grammatical constraints into sentences. Th is means that with a small set of
canonical sound units, together with recursion, the talker
may produce and the hearer may decode and parse a virtually
unbounded number of utterances in the language.
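The idea that each phoneme is a matrix of binary features, and that natural classes of sounds fall out of shared feature values, can be sketched as follows. This is an illustrative toy, not a claim about any particular feature inventory; the feature names are simplified assumptions.

```python
# Each phoneme is represented as a matrix of binary features, so classes of
# sounds (e.g., the voiced stops) emerge as shared feature values.
# The three-feature inventory here is a simplification for illustration.

PHONEMES = {
    "p": {"voiced": False, "labial": True,  "stop": True},
    "b": {"voiced": True,  "labial": True,  "stop": True},
    "t": {"voiced": False, "labial": False, "stop": True},
    "d": {"voiced": True,  "labial": False, "stop": True},
}

def natural_class(**features):
    """Return all phonemes matching the given feature values."""
    return sorted(
        sym for sym, fm in PHONEMES.items()
        if all(fm[name] == value for name, value in features.items())
    )

print(natural_class(voiced=True))                 # → ['b', 'd']
print(natural_class(voiced=False, labial=True))   # → ['p']
```

Stating a sound pattern once over a feature value, rather than once per phoneme, is what lets linguists capture cross-linguistic regularities in the patterning of sounds.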

The Psychology of Linguistic Form


WORDS
In this section, we focus on those representations of form that
encode meaning and other abstract linguistic content at the
most minimally analyzable units of analysis, namely, words
and morphemes. As such, we give a brief overview of the study
of lexical morphology, investigations in morphological processing, and theories about the structure of the mental lexicon.
LEXICAL FORM. What is the nature of a representation at the
level of lexical form? We limit our discussion here largely to
phonological codes, but recognize that a great many of the theoretical and processing issues we raise apply to orthographic
codes as well. It is virtually impossible for the brain to store
exact representations for all possible physical manifestations of
linguistic tokens that one might encounter or produce. Instead,
representations of lexical form are better thought of as somewhat abstract structured groupings of phonemes (or graphemes) that are stored as designated units in long-term memory,
either as whole words or as individual morpheme constituents,
and associated with any other sources of conceptual or linguistic content encoded in the lexical entries that these form representations map onto. As structured sequences of phonological
segments, then, these hypothesized representational units
of lexical form must be able to account for essentially all the
same meaning-to-form mapping problems and demands that
individual phonological segments themselves encounter during on-line performance, due to idiosyncratic variation among
speakers and communicative environments. More specifically,
representations of morphemes and words at the level of form
must be abstract enough to accommodate significant variation
in the actual physical energy profiles produced by the motor
systems of individual speakers/writers under various environmental conditions. Likewise, in terms of language production,
units of lexical form must be abstract enough to accommodate random variation in the transient shape and status of the
mouth of the language producer.
FORM AND MEANING: INDEPENDENT LEVELS OF LEXICAL REPRESENTATION. The previous description of words and morphemes
to some degree rests on the assumption that lexical form is represented independently from other forms of cognitive and linguistic information, such as meaning and lexical syntax (e.g.,
lexical category, nominal class, gender, verbal subcategory,
etc.). Many theories of the lexicon have crucially relied on the
assumption of separable levels of representation within the
lexicon. In some sense, as explained by Allport and Funnell
(1981), this assumption follows naturally from the arbitrariness of mapping between meaning and form, and would thus
appear to be a relatively noncontroversial assumption.
The skeptical scientist, however, is not inclined to simply
accept assumptions of this sort at face value without considering alternative possibilities. Imagine, for example, that the
various types of lexical information stored in a lexical entry
are represented within a single data structure of highly interconnected, independent, distributed features. This sort of
arrangement is easy to imagine within the architecture of a
connectionist model (McClelland and Rumelhart 1986).

Using the lexical entry cat as an example, imagine a connectionist system in which all the semantic features associated
with cat, such as [whiskers], [domestic pet], and so on (which
are also shared with all other conceptual lexical entities bearing those features, such as <lion>, <dog>, etc.), are directly
associated with the phonological units that comprise its word
form /k/, /ae/, /t/ (which are likewise shared with all other
word forms containing these phonemes) by means of individual association links that directly tie individual semantic features with individual phonological units (Rueckl et al. 1997).
One important consequence of this hypothetical arrangement
is that individual word forms do not exist as free-standing representations. Instead, the entire lexical entry is represented as
a vector of weighted links connecting individual phonemes to
individual lexical semantic and syntactic features. It logically
follows from this model, then, that if all or most of the semantic
features of the word cat, for example, were destroyed or otherwise made unavailable to the processor, then the set of phonological forms /k/ /ae/ /t/, having nothing to link to, would
have no means for mental representation, and would therefore
not be available to the language processor. We will present
experimental evidence against this model that, instead, favors
models in which a full phonological word (e.g., /kaet/) is represented in a localist fashion and is accessible to the language
processor, even when access to its semantic features is partially
or entirely disrupted.
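The contrast at stake can be sketched in a few lines. In this toy fragment (all structures and weights are expository assumptions, not a published simulation), the distributed arrangement represents cat only as links between shared semantic features and shared phonemes, so a total semantic lesion leaves the phonology unreachable, whereas a localist form node survives the same lesion.

```python
# Distributed arrangement: "cat" exists only as weighted links between
# shared semantic features and shared phonemes (weights invented).
distributed_cat = {
    # (semantic feature, phoneme) -> association weight
    ("whiskers", "k"): 0.8, ("whiskers", "ae"): 0.7, ("whiskers", "t"): 0.9,
    ("domestic_pet", "k"): 0.6, ("domestic_pet", "ae"): 0.5, ("domestic_pet", "t"): 0.7,
}

def distributed_form(entry, intact_semantics):
    """Phonemes remain reachable only via surviving semantic features."""
    return sorted({ph for (sem, ph) in entry if sem in intact_semantics})

# With all semantic features lesioned, nothing links to the phonology:
print(distributed_form(distributed_cat, intact_semantics=set()))   # → []

# Localist arrangement: /kaet/ is a free-standing form node, so it
# survives even a complete semantic lesion.
localist_lexicon = {"cat": {"form": "/kaet/", "semantics": {"whiskers", "domestic_pet"}}}
localist_lexicon["cat"]["semantics"].clear()   # simulate the semantic deficit
print(localist_lexicon["cat"]["form"])         # → /kaet/
```

The experimental evidence discussed next (word meaning deafness) bears on exactly this difference in predictions.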
Several of the most prominent theories of morphology and
lexical structure within formal linguistics make explicit claims
about modularity of meaning and form. Ray Jackendoff (1997),
for example, presents a theory that has a tripartite structure, in
which words have separate identities at three levels of representation (form, syntax, and meaning) and that these three levels
are sufficient to encode the full array of linguistic information
encoded by each word. His model provides further details in
which it is proposed that our ability to store, retrieve, and use
words correctly, as well as our ability to correctly compose
morphemes into complex words, derives from a memorized
inventory of mapping functions that picks out the unique representations or feature sets for a word at each level and associates these elements with one another in a given linguistic
structure.
While most psycholinguistic models of language processing have not typically addressed the mapping operations
assumed by Jackendoff, they do overlap significantly in terms
of addressing the psychological reality of his hypothetical tripartite structure in the mental lexicon. Although most experimental treatments of the multilevel nature of the lexicon have
been developed within models of language production, as will
be seen, there is an equally compelling body of evidence for
multilevel processing from studies of language comprehension
as well.
The most influential lexical processing models over the
last two decades make a distinction between at least two levels: the lemma level, where meaning and syntax are stored,
and the lexeme level, where phonological and orthographic
descriptions are represented. These terms and the functions
associated with them were introduced in the context of a computational production model by Gerard Kempen and Pieter
Huijbers (1983) and received further refinement with respect to
human psycholinguistic performance in the foundational lexical production models of Merrill F. Garrett (1975) and Willem
Levelt (1989). Much compelling evidence for a basic lemma/
lexeme distinction has come from analyses of naturally occurring speech errors generated by neurologically unimpaired
subjects, including tip-of-the-tongue phenomena (Meyer and
Bock 1992), as well as from systematic analyses of performance
errors observed in patients with acquired brain lesions. A more
common experimental approach, however, is the picture-word
interference naming paradigm, in which it has been shown
that lemma- and lexeme-level information can be selectively
disrupted during the course of speech production (Schriefers,
Meyer, and Levelt 1990).
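The two-level architecture can be sketched minimally as follows. The entry structure and the tip-of-the-tongue illustration are expository assumptions rather than any particular published model's data format.

```python
# Two-level lemma/lexeme sketch: the lemma holds meaning and syntax,
# the lexeme holds the phonological description (notation invented).

LEXICON = {
    "velcro": {
        "lemma":  {"meaning": "hook-and-loop fastener", "category": "Noun"},
        "lexeme": {"phonology": "/velkroh/"},
    },
}

def produce(concept, lexeme_accessible=True):
    """Retrieve the lemma first, then the word form it maps to."""
    entry = next(e for e in LEXICON.values()
                 if e["lemma"]["meaning"] == concept)
    if not lexeme_accessible:
        # Tip-of-the-tongue state: lemma (meaning and syntax) retrieved,
        # lexeme (sound form) temporarily unavailable.
        return entry["lemma"]["category"], None
    return entry["lemma"]["category"], entry["lexeme"]["phonology"]

print(produce("hook-and-loop fastener"))                           # → ('Noun', '/velkroh/')
print(produce("hook-and-loop fastener", lexeme_accessible=False))  # → ('Noun', None)
```

The second call mimics the tip-of-the-tongue evidence cited above: speakers often know a word's meaning, and even its grammatical gender or category, while the form remains out of reach.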
In terms of lexical comprehension models, perhaps the
most straightforward sources of evidence for a meaning/form
distinction have come from analyses of the performance of
brain-damaged patients. A particularly compelling case for
the independence of meaning and form might be demonstrated if an individual with acquired language pathology
were to show an intact ability to access word forms in his/
her lexicon, yet remain unable to access meaning from those
form representations. This is precisely the pattern observed in
patients designated as suffering from "word meaning deafness."
These patients show a highly selective pattern of marked deficit in comprehending word meanings, but with perfect or near-perfect access to word forms. A good example is patient WBN
as described in Mark D. Allen (2005), who showed an entirely
intact ability to access spoken word-form representations. In
an auditory lexical decision task, WBN scored 175/182 (96%)
correct, which shows that he could correctly distinguish real
words from nonwords (e.g., flag vs. flig), presumably relying
on preserved knowledge of stored lexemes to do so. However,
on tasks that required WBN to access meaning from spoken
words, such as picture-to-word matching tasks, he performed
with only 40%–60% accuracy (at chance in many cases).
LEXICAL STRUCTURE: COMPLEX WORDS. A particularly important issue in lexical representation and processing concerns
the cognitive structure of complex words, that is, words composed of more than one morpheme. One of the biggest debates
surrounding this issue stems from the fact that in virtually all
languages with complex word structures, lexical information
is encoded in consistent, rule-like structures, as well as in idiosyncratic, irregular structures. This issue can be put more concretely in terms of the role of morphological decomposition in
single-word comprehension theories within psycholinguistics.
Consider the written word wanted, for example. A question for
lexical recognition theories is whether the semantic/syntactic
properties of this word [WANT, Verb, +Past, ] are extracted
and computed in a combinatorial fashion each time wanted
is encountered, by accessing the content associated with
the stem want- [WANT, Verb] and combining it with the content extracted from the affix -ed [+Past], or whether, instead,
a single whole-word form wanted is stored at the lexeme level
and associated directly with all of its semantic/syntactic content. To understand the plausibility that a lexical system could
in principle store whole-word representations such as wanted,
one must recognize that in many other cases, such as those
involving irregularly inflected words, such as taught, the system cannot store a stem and affix at the level of form, as there
are no clear morpheme boundaries to distinguish these constituents, but must instead obligatorily store it as a whole word
at the lexeme level.
Many prominent theories have favored the latter, nondecompositional hypothesis for all words, including irregular words like taught, as well as regular compositional words
like wanted (Bybee 1988). Other influential processing models
propose that complex words are represented as whole-word
units at the lexeme level, but that paradigms of inflectionally
related words (want, wants, wanted) map onto a common representation at the lemma level (Fowler, Napps, and Feldman
1985). In addition to this, another class of models, which has
received perhaps the strongest empirical support, posits full
morphological decomposition at the lexeme level whenever
possible (Allen and Badecker 1999). According to these fully
decompositional models, a complex word like wanted is represented and accessed in terms of its decomposed constituents want- and -ed at the level of form, such that the very same
stem want- is used during the recognition of want, wants, and
wanted. According to these models, then, the recognition routines that are exploited by morphological decomposition at the
level of form resemble those in theoretical approaches to sentence processing, in which meaning is derived compositionally
by accessing independent units of representation of form and
combining the content that these forms access into larger linguistic units, according to algorithms of composition specified
by the grammar.
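A minimal sketch of a "decompose wherever possible" recognizer, in the spirit of the fully decompositional models just described, might look as follows. The tiny stem and affix inventories and the feature labels are assumptions for illustration.

```python
# Affix-stripping sketch: decompose at the level of form whenever a stem
# and affix can be identified; fall back on whole-word storage for
# irregulars like "taught", which offer no morpheme boundary to exploit.

STEMS = {"want": {"WANT", "Verb"}, "kid": {"KID", "Noun"}}
AFFIXES = {"ed": {"+Past"}, "s": {"+Plural"}}
WHOLE_WORDS = {"taught": {"TEACH", "Verb", "+Past"}}  # stored whole at the lexeme level

def recognize(word):
    if word in WHOLE_WORDS:
        return WHOLE_WORDS[word]
    for affix, affix_content in AFFIXES.items():
        stem = word[: -len(affix)]
        if word.endswith(affix) and stem in STEMS:
            # The same stem entry serves want, wants, and wanted.
            return STEMS[stem] | affix_content
    return STEMS.get(word, set())

print(sorted(recognize("wanted")))  # → ['+Past', 'Verb', 'WANT']
print(sorted(recognize("taught")))  # → ['+Past', 'TEACH', 'Verb']
```

Note that wanted and taught come out with parallel content by different routes: composition at the level of form in one case, direct whole-word lookup in the other.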
While there is compelling empirical support for decompositional models of morphological processing, researchers are
becoming increasingly aware of important factors that might
limit decomposition. These factors are regularity, formal and
semantic transparency, and productivity.
Regularity refers to the reliability of a particular word-formation process. For example, the plural noun kids expresses
noun plurality in a regular, reliable way, while the plural noun
children does not.
Formal transparency refers to the degree to which the morpheme constituents of a complex structure are obvious from
its surface form. For example, morpheme boundaries are fairly
obvious in the transparently inflected word wanted, compared to those of the opaquely (and irregularly) inflected word
taught.
Although an irregular form like taught is formally opaque,
as defined here, it is nonetheless semantically transparent,
because its meaning is a straightforward combination of the
semantics of the verb teach and the feature [+Past]. In contrast,
an example of a complex word that is formally transparent yet
semantically opaque is the compound word dumbbell, which
is composed of two recognizable morphemes, but the content
associated with these two surface morphemes does not combine
semantically to form the meaning of the whole word.
Productivity describes the extent to which a word-formation
process can be used to form new words freely. For example, the
suffix -ness is easily used to derive novel nouns from adjectives
(e.g., nerdiness, awesomeness, catchiness), while the ability to
form novel nouns using the analogous suffix -ity is awkward at
best (?nerdity) if not impossible.
Another phenomenon associated with these lexical properties is that they tend to cluster together in classes of morphologically complex word types across a given language, such that
there will often exist a set of highly familiar, frequently used
forms that are irregular, formally opaque, and nonproductive,
and also a large body of forms that are morphologically regular,
formally transparent, and productive. Given the large variety of
complex word types found in human languages with respect to
these dimensions of combinability, as well as the idiosyncratic
nature of the tendency for these dimensions to cluster together
from language to language, it would appear that empirical evidence for morphological decomposition must be established
on a case-by-case basis for each word-formation type within
each language. This indeed appears to be the direction that
most researchers have taken.
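The example words discussed in this subsection can be placed on the dimensions just defined. The Boolean values below simply restate the discussion above, and the heuristic at the end is a crude expository stand-in for the case-by-case empirical work the text calls for.

```python
# The section's example words on three of the dimensions defined above.
# Values restate the text's discussion; the layout is only an illustration.

WORDS = {
    #             regular  formally_transparent  semantically_transparent
    "wanted":    (True,    True,                 True),    # want + -ed
    "kids":      (True,    True,                 True),    # kid + -s
    "children":  (False,   False,                True),    # irregular plural
    "taught":    (False,   False,                True),    # opaque form, transparent meaning
    "dumbbell":  (True,    True,                 False),   # transparent form, opaque meaning
}

def likely_decomposed(word):
    """Crude heuristic in the spirit of 'decompose wherever possible':
    decomposition at the level of form requires, at minimum, formal
    transparency (visible morpheme boundaries)."""
    regular, formal, semantic = WORDS[word]
    return formal

print([w for w in WORDS if likely_decomposed(w)])  # → ['wanted', 'kids', 'dumbbell']
```

The table also shows why the dimensions must be kept apart: taught and dumbbell are opaque in different senses, and only empirical work can say how each type is actually processed.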

SENTENCES
On the surface, a sentence is a linear sequence of words. But in
order to extract the intended meaning, the listener or reader
must combine the words in just the right way. That much is
obvious. What is not obvious is how we do that in real time,
as we read or listen to a sentence. Of particular relevance to
this essay are the following questions: Is there a representational level of syntactic form that is distinct from the meaning
of a sentence? And if so, exactly how do we extract the implicit
structure in a spoken or written sentence as we process it? One
can ask similar questions about the process of sentence production: When planning a sentence, is there a planning stage
that encodes a specifically syntactic form? And if so, how do
these representations relate to the sound and meaning of the
intended utterance?
For purely practical reasons, there is far more research on
extracting the syntactic form during sentence comprehension (a process known as parsing; see parsing, human) than
on planning the syntactic form of to-be-spoken sentences.
Nonetheless, research in both areas has led to substantive
advances in our understanding of the psychology of sentence
form.
SYNTAX AND SEMANTICS. A fundamental claim of a generative grammar is that syntax and semantics are clearly distinct. A fundamental claim of a cognitive grammar is that
syntax and semantics are so entwined that they cannot be easily
separated. This debate among linguists is mirrored by a similar
debate among researchers who study language processing. A
standard assumption underlying much psycholinguistic work
is that a relatively direct mapping exists between the levels of
knowledge posited within generative linguistic theories and
the cognitive and neural processes underlying comprehension
(Bock and Kroch 1989). Distinct language-specific processes
are thought to interpret a sentence at each level of analysis,
and distinct representations are thought to result from these
computations. But other theorists, most notably those working
in the connectionist framework, deny that this mapping exists
(Elman et al. 1996). Instead, the meaning of the sentence is
claimed to be derived directly, without an intervening level of
syntactic structure.
The initial evidence of separable syntactic and semantic processing streams came from studies of brain-damaged patients
suffering from aphasia, in particular the syndromes known as
Broca's and Wernicke's aphasia. Broca's aphasics typically produce
in meaning but very disordered in terms of sentence structure.
Many syntactically important words are omitted (e.g., the, is),
as are the inflectional morphemes involved in morphosyntax
(e.g., -ing, -ed, -s). Wernicke's aphasics, by contrast, typically
produce fluent, grammatical sentences that tend to be incoherent. Initially, these disorders were assumed to reflect deficits in sensorimotor function; Broca's aphasia was claimed to
result from a motoric deficit, whereas Wernicke's aphasia was
claimed to reflect a sensory deficit. The standard assumptions
about aphasia changed in the 1970s, when theorists began to
stress the ungrammatical aspects of Broca's aphasics' speech;
the term agrammatism became synonymous with Broca's
aphasia. Particularly important in motivating this shift was evidence that some Broca's aphasics have a language-comprehension problem that mirrors their speech-production problems.
Specifically, some Broca's aphasics have trouble understanding
syntactically complex sentences (e.g., John was finally kissed by
Louise) in which the intended meaning is crucially dependent
on syntactic cues, in this case the grammatical words was
and by (Caramazza and Zurif 1976). This evidence seemed to
rule out a purely motor explanation for the disorder; instead,
Broca's aphasia was viewed as fundamentally a problem constructing syntactic representations, both for production and
comprehension. By contrast, Wernicke's aphasia was assumed
to reflect a problem in accessing the meanings of words.
These claims about the nature of the aphasic disorders are
still quite influential. Closer consideration, however, raises
many questions. Pure functional deficits affecting a single linguistically defined function are rare; most patients have a mixture of problems, some of which seem linguistic but others of
which seem to involve motor or sensory processing (Alexander
2006). Many of the Broca's patients who produce agrammatic
speech are relatively good at making explicit grammaticality
judgments (Linebarger, Schwartz, and Saffran 1983), suggesting that their knowledge of syntax is largely intact. Similarly, it
is not uncommon for Broca's aphasics to speak agrammatically
but to have relatively normal comprehension, bringing into
question the claim that Broca's aphasia reflects damage to an
abstract syntax area used in production and comprehension
(Miceli et al. 1983). Taken together, then, the available evidence
from the aphasia literature does not provide compelling evidence for distinct syntactic and semantic processing streams.
Another source of evidence comes from neuroimaging
studies of neurologically normal subjects. One useful method
involves the recording of event-related brain potentials (ERPs)
from a person's scalp as he or she reads or listens to sentences.
ERPs reflect the summed, simultaneously occurring postsynaptic activity in groups of cortical pyramidal neurons. A particularly fruitful approach has involved the presentation of
sentences containing linguistic anomalies. If syntactic and
semantic aspects of sentence comprehension are segregated
into distinct streams of processing, then syntactic and semantic anomalies might affect the comprehension system in distinct ways. A large body of evidence suggests that syntactic
and semantic anomalies do in fact elicit qualitatively distinct
ERP effects, and that these effects are characterized by distinct
and consistent temporal properties. Semantic anomalies (e.g.,
The cat will bake the food) elicit a negative wave that peaks
at about 400 milliseconds after the anomalous word appears
(the N400 effect) (Kutas and Hillyard 1980). By contrast, syntactic anomalies (e.g., The cat will eating the food) elicit a large
positive wave that onsets at about 500 milliseconds after presentation of the anomalous word and persists for at least half a
second (the P600 effect [Osterhout and Holcomb 1992]). In some
studies, syntactic anomalies have also elicited a negativity over
anterior regions of the scalp, with onsets ranging from 100 to
300 milliseconds. These results generalize well across types
of anomaly, languages, and various methodological factors.
The robustness of the effects seems to indicate that the human
brain does in fact honor the distinction between the form and
the meaning of a sentence.
SENTENCE COMPREHENSION. Assuming that sentence processing involves distinct syntactic and semantic processing
streams, the question arises as to how these streams interact
during comprehension. A great deal of evidence indicates that
sentence processing is incremental, that is, that each successive word in a sentence is integrated into the preceding sentence material almost immediately. Such a strategy, however,
introduces a tremendous amount of ambiguity, that is,
uncertainty about the intended syntactic and semantic role of a
particular word or phrase. Consider, for example, the sentence
fragment The cat scratched. There are actually two ways to
parse this fragment. One could parse it as a simple active sentence, in which the cat is playing the syntactic role of subject
of the verb scratched and the semantic role of the entity doing
the scratching (as in The cat scratched the ratty old sofa). Or one
could parse it as a more complex relative clause structure, in
which the verb scratched is the start of a second, embedded
clause, and the cat is the entity being scratched, rather than
the one doing the scratching (as in The cat scratched by the raccoon was taken to the pet hospital). The ambiguity is resolved
once the disambiguating information (the ratty sofa or by the
raccoon) is encountered downstream, but that provides little
help for a parser that assigns roles to words as soon as they are
encountered.
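The state of an incremental parser at this point of ambiguity can be sketched as follows. The toy grammar fragment recognizes only the two analyses discussed in the text and is an expository assumption, not a real parsing algorithm.

```python
# Toy sketch of incremental ambiguity: after "The cat scratched", both the
# main-clause and the reduced-relative analysis remain live; material
# arriving downstream disambiguates.

def possible_analyses(words):
    """Return the analyses compatible with the input so far (toy grammar)."""
    analyses = []
    if words[:3] == ["The", "cat", "scratched"]:
        # Main clause: "the cat" = subject and agent of "scratched".
        analyses.append("main-clause: cat is the scratcher")
        # Reduced relative: "scratched" opens an embedded clause; cat = patient.
        analyses.append("reduced-relative: cat is the one scratched")
    if len(words) > 3:
        if words[3] == "by":   # "... scratched by the raccoon ..."
            analyses = [a for a in analyses if "reduced-relative" in a]
        else:                  # "... scratched the ratty old sofa"
            analyses = [a for a in analyses if "main-clause" in a]
    return analyses

print(len(possible_analyses(["The", "cat", "scratched"])))   # → 2 (ambiguous)
print(possible_analyses(["The", "cat", "scratched", "by"]))  # → reduced relative only
```

A parser that commits to a single analysis at word three, as the garden-path models discussed next propose, must sometimes backtrack when the disambiguating word arrives.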
How does an incremental sentence-processing system
handle such ambiguities? An early answer to this question
was provided by the garden-path (or modular) parsing models
developed in the 1980s. The primary claim was that the initial
parse of the sentence is controlled entirely by the syntactic cues
in the sentence (Ferreira and Clifton 1986). As words arrive in
the linguistic input, they are rapidly organized into a structural
analysis by a process that is not influenced by semantic knowledge. The output of this syntactic process then guides semantic
interpretation. This model can be contrasted with interactive
models, in which a wide variety of information (e.g., semantics and conceptual/world knowledge) influences the earliest
stages of sentence parsing. Initial results of numerous studies
(mostly involving the measurement of subjects' eye movements
as they read sentences) indicated that readers tended to read
straight through such syntactically simple sentences as The cat
scratched the ratty old sofa but experienced longer eye fixations
and more eye regressions when they encountered by the raccoon in the more complex sentences. When confronted with
syntactic uncertainty, readers seemed to immediately choose
the simplest syntactic representation available (Frazier 1987).
When this analysis turned out to be an erroneous choice (that
is, when the disambiguating material in the sentence required
a more complex structure), longer eye fixations and more
regressions occurred as the readers attempted to reanalyze
the sentence.
A stronger test of the garden-path model, however, requires
the examination of situations in which the semantic cues in
the sentence are clearly consistent with a syntactically complex parsing alternative. A truly modular, syntax-driven
parser would be unaffected by the semantic cues in the sentence. Consider, for example, the sentence fragment The
sofa scratched. Sofas are soft and inanimate and therefore
unlikely to scratch anything. Consequently, the semantic cues
in the fragment favor the more complex relative clause analysis, in which the sofa is the entity being scratched (as in The
sofa scratched by the cat was given to Goodwill). Initial results
seemed to suggest that the semantic cues had no effect on the
initial parse of the sentence; readers seemed to build the syntactically simplest analysis possible, even when it was inconsistent with the available semantic information. Such evidence
led to the hypothesis that the language processor is composed
of a number of autonomously functioning components, each of
which corresponds to a level of linguistic analysis (Ferreira and
Clifton 1986). The syntactic component was presumed to function independently of the other components.
The modular syntax-first model has been increasingly challenged, most notably by advocates of constraint-satisfaction
models (Trueswell and Tanenhaus 1994). These models propose that all sources of relevant information (including statistical, semantic, and real-world information) simultaneously and
rapidly influence the actions of the parser. Hence, the implausibility of a sofa scratching something is predicted to cause
the parser to initially attempt the syntactically more complex
relative-clause analysis. Consistent with this claim, numerous studies have subsequently demonstrated compelling
influences of semantics and world knowledge on the parser's response to syntactic ambiguity (ibid.).
There is, however, a fundamental assumption underlying
most of the syntactic ambiguity research (regardless of theoretical perspective): that syntax always controls combinatory
processing when the syntactic cues are unambiguous. Recently,
this assumption has also been challenged. The challenge centers on the nature of thematic roles, which help to define
the types of arguments licensed by a particular verb (McRae,
Ferretti, and Amyote 1997; Trueswell and Tanenhaus 1994).
Exactly what is meant by thematic role varies widely, especially
with respect to the amount of semantic and conceptual content
it is assumed to hold (McRae, Ferretti, and Amyote 1997). For
most syntax-first proponents, a thematic role is limited to a few
syntactically relevant selectional restrictions, such as animacy

The Psychology of Linguistic Form


(Chomsky 1965); thematic roles are treated as (largely meaningless) slots to be filled by syntactically appropriate fillers. A
second view is that there is a limited number of thematic roles
(agent, theme, benefactor, and so on), and that a verb selects a
subset of these (Fillmore 1968). Although this approach attributes a richer semantics to thematic roles, the required generalizations across large classes of verbs obscure many subtleties
in the meaning and usage of these verbs.
Both of these conceptions of thematic roles exclude knowledge that people possess concerning who tends to do what to
whom in particular situations. Ken McRae and others have proposed a third view of thematic roles that dramatically expands
their semantic scope: Thematic roles are claimed to be rich,
verb-specific concepts that reflect a person's collective experience with particular actions and objects (McRae, Ferretti,
and Amyote 1997). These rich representations are claimed to
be stored as a set of features that define gradients of typicality (situation schemas), and to comprise a large part of each verb's meaning. One implication is that this rich knowledge will become immediately available once a verb's meaning has been retrieved from memory. As a consequence, the plausibility of a particular word combination need not be evaluated by means of a potentially complex inferential process, but rather can be evaluated immediately in the context of the verb's meaning. One might therefore predict that semantic and conceptual
knowledge of events will have profound and immediate effects
on the way in which words are combined during sentence processing. McRae and others have provided evidence consistent
with these claims, including semantic influences on syntactic
ambiguity resolution.
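On this third view, a verb's thematic roles amount to graded, verb-specific expectations, which might be sketched as follows (the verb, fillers, and typicality numbers are invented for illustration, not taken from McRae and colleagues' norms):

```python
# Thematic roles as rich, verb-specific concepts: each verb stores graded
# typicality values for candidate role fillers (all numbers invented).
schemas = {
    "scratch": {
        "agent": {"cat": 0.9, "sofa": 0.05},
        "theme": {"sofa": 0.9, "cat": 0.3},
    },
}

def plausibility(verb, role, filler):
    """How typical is this filler for this verb's role? (0.0 if unknown)"""
    return schemas.get(verb, {}).get(role, {}).get(filler, 0.0)

# A sofa is a poor scratcher but a fine "scratchee", so a comprehender with
# this knowledge can favor the relative-clause reading of "The sofa scratched ..."
print(plausibility("scratch", "agent", "sofa"))  # 0.05
print(plausibility("scratch", "theme", "sofa"))  # 0.9
```

Because the ratings are stored with the verb, the relative plausibility of a word combination can be read off directly rather than inferred.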
The most compelling argument against the absolute primacy of syntax, however, would be evidence that semantic and
conceptual knowledge can take control of sentence processing even when opposed by contradicting and unambiguous
syntactic cues. Recent work by Fernanda Ferreira (2003) suggests that this might happen on some occasions. She reported
that when plausible sentences (e.g., The mouse ate the cheese)
were passivized to form implausible sentences (e.g., The mouse
was eaten by the cheese), participants tended to name the wrong
entity as "do-er" or "acted-on," as if coercing the sentences to
be plausible. However, the processing implications of these
results are uncertain, due to the use of postsentence ruminative responses, which do not indicate whether semantic influences reflect the listener's initial responses to the input or some
later aspect of processing.
Researchers have also begun to explore the influence of
semantic and conceptual knowledge on the on-line processing of syntactically unambiguous sentences. An illustrative
example is a recent ERP study by Albert Kim and Lee Osterhout
(2005). The stimuli in this study were anomalous sentences that
began with an active structure, for example, The mysterious crime
was solving. The syntactic cues in the sentence require that
the noun crime be the Agent of the verb solving. If syntax drives
sentence processing, then the verb solving would be perceived
to be semantically anomalous, as crime is a poor Agent for the
verb solve, and therefore should elicit an N400 effect. However,
although crime is a poor Agent, it is an excellent Theme (as in
solved the crime). The Theme role can be accommodated simply by changing the inflectional morpheme at the end of the verb to a passive form (The mysterious crime was solved).
Therefore, if meaning drives sentence processing in this situation, then the verb solving would be perceived to be in the wrong
syntactic form (-ing instead of -ed), and should therefore elicit a
P600 effect. Kim and Osterhout observed that verbs like solving
elicited a P600 effect, showing that a strong semantic attraction between a predicate and an argument can determine
how words are combined, even when the semantic attraction
contradicts unambiguous syntactic cues. Conversely, in anomalous sentences with an identical structure but with no semantic attraction between the subject noun and the verb (e.g., The
envelope was devouring), the critical verb elicited an N400
effect rather than a P600 effect. These results demonstrate that
semantics, rather than syntax, can drive word combinations
during sentence comprehension.
SENTENCE PRODUCTION. Generating a sentence requires the
rapid construction of novel combinations of linguistic units,
involves multiple levels of analysis, and is constrained by a variety of rules (about word order, the formation of complex words,
word pronunciation, etc.). Errors are a natural consequence of
these complexities (Dell 1995). Because they tend to be highly
systematic, speech errors have provided much of the data upon
which current models of sentence production are based. For
example, word exchanges tend to obey a syntactic category rule,
in that the exchanged words are from the same syntactic category
(for example, two nouns have been exchanged in the utterance
Stop hitting your brick against a head wall). The systematicity of
speech errors suggests that regularities described in theories of
linguistic form also play a role in the speech-planning process.
The dominant model of sentence production is based
on speech error data (Dell 1995; Garrett 1975; Levelt 1989).
According to this model, the process of preparing to speak a
sentence involves three stages of planning: conceptualization,
formulation, and articulation, in that order. During the conceptualization stage, the speaker decides what thought to express
and how to order the relevant concepts sequentially. The formulation stage begins with the selection of a syntactic frame to
encode the thought; the frame contains slots that act as place-holders for concepts and, eventually, specific words. Once the slots are filled with the selected words, their phonological forms are retrieved; the resulting phonological string is translated into a string of phonological features, which then drive the motor plan manifested in articulation.
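The three planning stages can be caricatured as a pipeline; the frame, the toy lexicon, and the concept labels below are invented stand-ins for exposition, not components of any published production model:

```python
# Conceptualization -> formulation -> articulation, as a toy pipeline.
LEXICON = {"CHILD": "child", "EAT-PAST": "ate", "COOKIE": "cookie"}

def conceptualize(thought):
    # Decide what to express and order the concepts sequentially.
    return [thought["agent"], thought["action"], thought["theme"]]

def formulate(concepts):
    # Select a syntactic frame whose slots are place-holders for words.
    frame = ["DET", None, None, "DET", None]   # "the __ __ the __"
    frame[1], frame[2], frame[4] = [LEXICON[c] for c in concepts]
    return ["the" if w == "DET" else w for w in frame]

def articulate(words):
    # Stand-in for translating phonological forms into a motor plan.
    return " ".join(words)

thought = {"agent": "CHILD", "action": "EAT-PAST", "theme": "COOKIE"}
print(articulate(formulate(conceptualize(thought))))  # the child ate the cookie
```

The point of the sketch is the ordering: the frame exists, with empty slots, before any particular words are inserted into it.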
This model, therefore, posits the existence of representations of syntactic structure that are distinct from the representations of meaning and sound. Other evidence in support of this view comes from the phenomenon of syntactic priming: Having
heard or produced a particular syntactic structure, a person
is more likely to produce sentences using the same syntactic
structure (Bock 1986). Syntactic priming occurs independently
of sentence meaning, suggesting that the syntactic frames are
independent forms of representation that are quite distinct
from meaning.

CONCLUSIONS
Collectively, the evidence reviewed in this essay indicates that
psychologically relevant representations of linguistic form
exist at all levels of language, from sounds to sentences. At each
level, units of linguistic form are combined in systematic ways
to form larger units of representation. For the most part, these
representations seem to be abstract; that is, they are distinct
from the motor movements, sensory experiences, and episodic
memories associated with particular utterances. However, it
is also clear that more holistic (that is, nondecompositional)
representations of linguistic form, some of which are rooted
in specific episodic memories, also play a role in language
processing.
It also seems to be true that linguistic forms (e.g., the morphological structure of a word or the syntactic structure of a
sentence) are dissociable from the meanings they convey.
At the same time, semantic and conceptual knowledge can
strongly influence the processing of linguistic forms, as exemplified by semantic transparency effects on word decomposition and thematic role effects on sentence parsing.
These conclusions represent substantive progress in our
understanding of linguistic form and the role it plays in language processing. Nonetheless, answers to some of the most
basic questions remain contentiously debated, such as the precise nature of the rules of combination, the relative roles of
compositional and holistic representations, and the pervasiveness of interactions between meaning and form.
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Alexander, M. P. 2006. Aphasia I: Clinical and anatomical issues. In Patient-Based Approaches to Cognitive Neuroscience (2d ed.), ed. M. J. Farah and T. E. Feinberg, 165–82. Cambridge, MA: MIT Press.
Allen, Mark D. 2005. The preservation of verb subcategory knowledge in a spoken language comprehension deficit. Brain and Language 95: 255–64.
Allen, Mark, and William Badecker. 1999. Stem homograph inhibition and stem allomorphy: Representing and processing inflected forms in a multilevel lexical system. Journal of Memory and Language 41: 105–23.
Allport, D. A., and E. Funnell. 1981. Components of the mental lexicon. Philosophical Transactions of the Royal Society of London B 295: 397–410.
Anderson, Stephen R. 1985. Phonology in the Twentieth Century: Theories of Rules and Theories of Representations. Chicago: The University of Chicago Press.
Bell, Alexander M. 1867. Visible Speech: The Science of Universal Alphabetics. London: Simpkin, Marshall.
Bloomfield, Leonard. 1933. Language. New York: H. Holt & Co.
Bock, J. Katherine. 1986. Syntactic persistence in language production. Cognitive Psychology 18: 355–87.
Bock, J. K., and Anthony S. Kroch. 1989. The isolability of syntactic processing. In Linguistic Structure in Language Processing, ed. G. N. Carlson and M. K. Tanenhaus, 157–96. Boston: Kluwer Academic.
Brentari, Dianne. 1998. A Prosodic Model of Sign Language Phonology. Cambridge, MA: MIT Press.
Browman, Catherine P., and Louis Goldstein. 1990. Gestural specification using dynamically-defined articulatory structures. Journal of Phonetics 18: 299–320.
Bybee, Joan. 1988. Morphology as lexical organization. In Theoretical Morphology: Approaches in Modern Linguistics, ed. M. Hammond and M. Noonan, 119–41. San Diego, CA: Academic Press.
Byrd, Dani, and Elliot Saltzman. 2003. The elastic phrase: Modeling the dynamics of boundary-adjacent lengthening. Journal of Phonetics 31: 149–80.
Caplan, David. 1995. Issues arising in contemporary studies of disorders of syntactic processing in sentence comprehension in agrammatic patients. Brain and Language 50: 325–38.
Caramazza, Alfonso, and Edgar Zurif. 1976. Dissociations of algorithmic and heuristic processes in language comprehension: Evidence from aphasia. Brain and Language 3: 572–82.
Chistovich, Ludmilla A. 1960. Classification of rapidly repeated speech sounds. Akusticheskii Zhurnal 6: 392–98.
Chomsky, Noam. 1957. Syntactic Structures. The Hague: Mouton.
———. 1965. Aspects of the Theory of Syntax. Cambridge, MA: MIT Press.
Chomsky, Noam, and Morris Halle. 1968. The Sound Pattern of English. New York: Harper and Row.
Delattre, Pierre, and Donald Freeman. 1968. A dialect study of American r's by x-ray motion picture. Linguistics 44: 29–68.
Delattre, Pierre C., Alvin M. Liberman, and Franklin S. Cooper. 1955. Acoustic loci and transitional cues for consonants. Journal of the Acoustical Society of America 27: 769–73.
Dell, Gary S. 1995. Speaking and misspeaking. In An Invitation to Cognitive Science: Language. Cambridge, MA: MIT Press.
Draper, M., P. Ladefoged, and D. Whitteridge. 1959. Respiratory muscles in speech. Journal of Speech and Hearing Research 2: 16–27.
Dudley, Homer. 1940. The carrier nature of speech. Bell System Technical Journal 14: 495–515.
Elman, Jeffrey L. 1990. Representation and structure in connectionist models. In Cognitive Models of Speech Processing, ed. G. T. M. Altmann, 227–60. Cambridge, MA: MIT Press.
Elman, Jeffrey L., Elizabeth A. Bates, A. Karmiloff-Smith, D. Parisi, and K. Plunkett. 1996. Rethinking Innateness. Cambridge, MA: MIT Press.
Emmorey, Karen. 2002. Language, Cognition, and the Brain: Insights from Sign Language Research. Mahwah, NJ: Lawrence Erlbaum Associates.
Ferreira, Fernanda. 2003. The misinterpretation of noncanonical sentences. Cognitive Psychology 47: 164–203.
Ferreira, Fernanda, and Charles Clifton, Jr. 1986. The independence of syntactic processing. Journal of Memory and Language 25: 348–68.
Fillmore, Charles. 1968. The case for case. In Universals in Linguistic Theory, ed. E. Bach, 1–80. New York: Holt, Rinehart, & Winston.
Fowler, Carol A. 1986. An event approach to the study of speech perception from a direct-realist perspective. Journal of Phonetics 14: 3–28.
———. 1995. Speech production. In Speech, Language, and Communication, ed. J. L. Miller and P. D. Eimas, 29–61. New York: Academic Press.
———. 1996. Listeners do hear sounds, not tongues. Journal of the Acoustical Society of America 99: 1730–41.
Fowler, Carol, Susan Napps, and Laurie Feldman. 1985. Relations among regular and irregular morphologically related words in the lexicon as revealed by repetition priming. Memory and Cognition 13: 241–55.
Franklin, S., J. Lambon Ralph, J. Morris, and P. Bailey. 1996. A distinctive case of word meaning deafness? Cognitive Neuropsychology 13: 1139–62.
Frazier, L. 1987. Sentence processing: A tutorial review. In Attention and Performance XII: The Psychology of Reading, ed. M. Coltheart, 3–30. Hillsdale, NJ: Lawrence Erlbaum Associates.
———. 1995. Representation in psycholinguistics. In Speech, Language, and Communication, ed. J. L. Miller and P. D. Eimas, 1–27. New York: Academic Press.
Garrett, Merrill F. 1975. The analysis of sentence production. In The Psychology of Learning and Motivation, ed. G. Bower, 133–77. New York: Academic Press.
Goldinger, Stephen D., and Tamiko Azuma. 2003. Puzzle-solving science: The quixotic quest for units of speech perception. Journal of Phonetics 31: 305–20.
Goldinger, Stephen D., David B. Pisoni, and John S. Logan. 1991. On the nature of talker variability effects on recall of spoken word lists. Journal of Experimental Psychology: Learning, Memory and Cognition 17: 152–62.
Hall, D. A., and M. J. Riddoch. 1997. Word meaning deafness: Spelling words that are not understood. Cognitive Neuropsychology 14: 1131–64.
Hillis, Argye E. 2000. The organization of the lexical system. In What Deficits Reveal about the Human Mind/Brain: Handbook of Cognitive Neuropsychology, ed. B. Rapp, 185–210. New York: Psychology Press.
House, Arthur S., and Grant Fairbanks. 1953. The influence of consonant environment upon the secondary acoustical characteristics of vowels. Journal of the Acoustical Society of America 25: 105–13.
Jackendoff, Ray. 1997. The Architecture of the Language Faculty. Cambridge, MA: MIT Press.
Jakobson, Roman. 1939. Observations sur le classement phonologique des consonnes. Proceedings of the 3rd International Conference of Phonetic Sciences, 34–41. Ghent.
Jakobson, Roman, Gunnar Fant, and Morris Halle. 1952. Preliminaries to Speech Analysis. Cambridge, MA: MIT Press.
Joos, Martin. 1948. Acoustic Phonetics. Language Monograph 23, Supplement to Language 24: 1–136.
Kempen, Gerard, and Pieter Huijbers. 1983. The lexicalization process in sentence production and naming: Indirect election of words. Cognition 14: 185–209.
Kim, Albert, and Lee Osterhout. 2005. The independence of combinatory semantic processing: Evidence from event-related potentials. Journal of Memory and Language 52: 205–25.
Kiparsky, Paul. 1979. Panini as a Variationist. Cambridge, MA: MIT Press.
Kühnert, Barbara, and Francis Nolan. 1999. The origin of coarticulation. In Coarticulation: Theory, Data and Techniques, ed. B. Rapp, 7–30. Cambridge: Cambridge University Press.
Kutas, Marta, and Steven A. Hillyard. 1980. Reading senseless sentences: Brain potentials reflect semantic anomaly. Science 207: 203–5.
Ladefoged, P., and N. McKinney. 1963. Loudness, sound pressure, and subglottal pressure in speech. Journal of the Acoustical Society of America 35: 454–60.
Lambon Ralph, M., K. Sage, and A. Ellis. 1996. Word meaning blindness: A new form of acquired dyslexia. Cognitive Neuropsychology 13: 617–39.
Levelt, Willem. 1989. Speaking: From Intention to Articulation. Cambridge, MA: MIT Press.
Liberman, Alvin M., Pierre C. Delattre, Franklin S. Cooper, and Lou J. Gerstman. 1954. The role of consonant-vowel transitions in the perception of the stop and nasal consonants. Journal of Experimental Psychology 52: 127–37.
Liberman, Alvin M., and Ignatius G. Mattingly. 1985. The motor theory of speech perception revised. Cognition 21: 1–36.
Linebarger, Marcia, Myrna Schwartz, and Eleanor Saffran. 1983. Sensitivity to grammatical structure in so-called agrammatic aphasics. Cognition 13: 361–93.
Massaro, Dominic W. 1987. Speech Perception by Ear and Eye: A Paradigm for Psychological Inquiry. Hillsdale, NJ: Lawrence Erlbaum Associates.

McClelland, James, and David Rumelhart. 1986. Parallel Distributed Processing: Explorations in the Microstructure of Cognition. Vol. 1. Cambridge, MA: MIT Press.
McGurk, Harry, and John Macdonald. 1976. Hearing lips and seeing voices. Nature 264: 746–8.
McRae, Ken, Todd R. Ferretti, and Liane Amyote. 1997. Thematic roles as verb-specific concepts. Language and Cognitive Processes 12.2: 137–76.
Meier, Richard P., Kearsy Cormier, and David Quinto-Pozos. 2002. Modality and Structure in Signed and Spoken Languages. Cambridge: Cambridge University Press.
Meyer, Antje, and Kathryn Bock. 1992. Tip-of-the-tongue phenomenon: Blocking or partial activation? Memory and Cognition 20: 715–26.
Miceli, G., L. Mazzuchi, L. Menn, and H. Goodglass. 1983. Contrasting cases of Italian agrammatic aphasia without comprehension disorder. Brain and Language 19: 65–97.
Miller, George A., and Patricia E. Nicely. 1955. An analysis of perceptual confusions among some English consonants. Journal of the Acoustical Society of America 27: 329–35.
Minifie, Fred D. 1999. The history of physiological phonetics in the United States. In A Guide to the History of the Phonetic Sciences in the United States: Issued on the Occasion of the 14th International Congress of Phonetic Sciences, San Francisco, 1–7 August 1999, ed. J. Ohala, A. Bronstein, M. Grazia Busà, J. Lewis, and W. Weigel. Berkeley: University of California Press.
Moore, Brian C. J. 1989. An Introduction to the Psychology of Hearing. 3d ed. London: Academic Press.
Nygaard, Lynn C., and David B. Pisoni. 1995. Speech perception: New directions in research and theory. In Speech, Language, and Communication, ed. J. Miller and P. Eimas, 63–96. New York: Academic Press.
Osterhout, Lee, and Philip J. Holcomb. 1992. Event-related brain potentials elicited by syntactic anomaly. Journal of Memory and Language 31: 785–806.
Passy, Paul. 1888. Our revised alphabet. The Phonetic Teacher, 57–60.
Pinker, Steven. 1999. Words and Rules: The Ingredients of Language. New York: Basic Books.
Potter, Ralph K. 1945. Visible patterns of sound. Science 102: 463–70.
Rousselot, P.-J. 1897–1901. Principes de Phonétique Expérimentale. Paris: H. Welter.
Rueckl, Jay, Michelle Mikolinski, Michal Raveh, Caroline Miner, and F. Mars. 1997. Morphological priming, fragment completion, and connectionist networks. Journal of Memory and Language 36: 382–405.
Saltzman, Elliot L., and Kevin G. Munhall. 1989. A dynamical approach to gestural patterning in speech production. Ecological Psychology 1: 333–82.
Saussure, Ferdinand de. [2002] 2006. Écrits de linguistique générale, ed. Simon Bouquet and Rudolf Engler. Paris: Gallimard. English translation: Writings in General Linguistics. Oxford: Oxford University Press.
Scharenborg, O., D. Norris, L. ten Bosch, and J. M. McQueen. 2005. How should a speech recognizer work? Cognitive Science 29: 867–918.
Schriefers, Herbert, Antje Meyer, and Willem Levelt. 1990. Exploring the time course of lexical access in language production: Picture-word interference studies. Journal of Memory and Language 29: 86–102.
Scripture, Edward Wheeler. 1902. The Elements of Experimental Phonetics. New York: Charles Scribner's Sons.
Sievers, Eduard. 1876. Grundzüge der Lautphysiologie zur Einführung in das Studium der Lautlehre der indogermanischen Sprachen. Leipzig: Breitkopf und Härtel.
Soli, Sig D. 1982. Structure and duration of vowels together specify fricative voicing. Journal of the Acoustical Society of America 72: 366–78.
Stetson, Raymond H. 1905. A motor theory of rhythm and discrete succession II. Psychological Review 12: 293–350.
———. 1951. Motor Phonetics: A Study of Speech Movement in Action. 2d ed. Amsterdam: North-Holland.
Studdert-Kennedy, Michael. 1980. Speech perception. Language and Speech 23: 45–65.
Sweet, Henry. 1881. Sound notation. Transactions of the Philological Society: 177–235.
Trueswell, John C., and Michael K. Tanenhaus. 1994. Toward a lexicalist framework of constraint-based syntactic ambiguity resolution. In Perspectives on Sentence Processing, ed. C. Clifton, L. Frazier, and K. Rayner, 155–80. Hillsdale, NJ: Lawrence Erlbaum Associates.

3
THE STRUCTURE OF MEANING

James Pustejovsky

1 INTRODUCTION
Semantics is the systematic study of meaning in language. As a discipline, it is directed toward the determination of how humans reason with language, and more specifically, discovering the patterns of inference we employ through linguistic expressions. The study of semantics has diverse traditions, and the current literature is quite heterogeneous and divided on approaches to some of the basic issues facing the field (cf. semantics). While most things in the world have meaning to us, they do not carry meaning in the same way as linguistic expressions do. For example, they do not have the properties of being true or false, or ambiguous or contradictory. (See Davis and Gillon [2004] for discussion and development of this argument.) For this and other reasons, this overview essay addresses the question of how linguistic expressions carry meaning and what they denote in the world.
Where syntax determines the constituent structure of a sentence along with the assignment of grammatical and thematic relations, it is the role of semantics to compute the deeper meaning of the resulting expression. For example, the two sentences in (1) differ in their syntactic structures (through their voice), but they mean essentially the same thing; that is, their propositional content is identical.

(1)
a. The child ate a cookie.
b. A cookie was eaten by the child.

Early on, such observations led philosophers and linguists to distinguish meaning from the pure structural form of a sentence (Saussure [1916] 1983; Russell 1905). Semantic theories in linguistics assume that some sort of logical form is computed from the constituent structure associated with a sentence, and it is this meaning representation that allows us to make categorical and truth-conditional judgments, such as the equivalence in meaning of the two sentences in (1).
Another role played by semantics is in the computation of inferences from our utterances, such as entailments, implicatures, and presuppositions. For example, consider the various notions of entailment. From the logical form (LF) of the sentence in (2a), semantics enables us to infer (2b) as a legitimate inference.

(2)
a. The girl laughed and sang.
b. The girl laughed.

This is an example of structural entailment, because it is the structure itself that allows the inference (i.e., if someone does both A and B, then someone does A). This particular rule is essentially the classical inference rule of conjunction elimination from propositional logic; that is,

(3) A ∧ B ⊢ A

While this relies on a largely syntactic notion of entailment, semantics should also explain how (4b) is a legitimate inference from (4a).

(4)
a. The drought killed the crops.
b. The crops died.

Such lexical entailments involve an inference that is tied directly to the meaning of a word, namely, the verb kill; that is, when
something is killed, it dies. Hence, the role of lexical information in the construction of logical forms and the inferences we
can compute from our utterances is an important area of linguistics, and one we return to in Section 3.5 below.
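The two notions of entailment just surveyed can be given schematic logical forms; the meaning postulate for kill below is the standard textbook formulation rather than a quotation from this essay:

```latex
% Structural entailment (conjunction elimination), as in (2)-(3):
\frac{\textit{laugh}(g) \land \textit{sing}(g)}{\textit{laugh}(g)}

% Lexical entailment, licensing (4b) from (4a):
\forall x \forall y\,[\textit{kill}(x,y) \rightarrow \textit{die}(y)]
```

The first inference goes through in virtue of logical structure alone; the second requires a postulate keyed to the meaning of a particular lexical item.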
There is an important distinction in semantics among propositions, sentences, and utterances. We can think of an
utterance as a speech-act, situated in time and space; that is, one that happens at a particular time and location. A sentence, on the other hand, is an expression that is inherently linguistic, and
can be expressed on multiple occasions by multiple utterances.
The notion of a proposition is more complex and contentious, but
it is that object that is traditionally taken as being true or false,
expressed by the sentence when uttered in a specific context.

1.1 Historical Remarks


The study of meaning has occupied philosophers for centuries,
beginning at least with Plato's theory of forms and Aristotle's
theory of meaning. Locke, Hume, and Reid all pay particular
attention to the meanings of words in composition, but not until
the late nineteenth century do we see a systematic approach to
the study of logical syntax emerge, with the work of Bertrand
Russell and Gottlob Frege. Russell and Frege were not interested
in language as a linguistic phenomenon, but rather as a medium
through which judgments can be formed and expressed. Frege's focus lay in formulating the rules that create meaningful expressions in a compositional manner, while also introducing an important distinction between an expression's sense and its reference (cf. sense and reference, reference and extension). Russell's work on the way in which linguistic expressions
denote introduced the problem of definite descriptions
and referential failure, and what later came to be recognized as
the problem of presupposition (cf. pragmatics).
Ferdinand de Saussure ([1916] 1983), working within an
emerging structuralist tradition, developed relational techniques for linguistic analysis, which were elaborated into a
framework of componential analysis for language meaning.
The idea behind componential analysis is the reduction of a
word's meaning into its ultimate contrastive elements. These
contrastive elements are structured in a matrix, allowing for
dimensional analysis and generalizations to be made about
lexical sets occupying the cells in the matrix.
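A fragment of such a matrix can be sketched with the familiar textbook items man/woman/boy/girl; the feature inventory below is a standard illustration, not drawn from Saussure himself:

```python
# A miniature componential matrix: word meanings as bundles of contrastive
# features (feature inventory chosen for illustration).
matrix = {
    "man":   {"human": True, "adult": True,  "male": True},
    "woman": {"human": True, "adult": True,  "male": False},
    "boy":   {"human": True, "adult": False, "male": True},
    "girl":  {"human": True, "adult": False, "male": False},
}

def contrast(w1, w2):
    """Return the features on which two lexical entries differ."""
    return {f for f in matrix[w1] if matrix[w1][f] != matrix[w2][f]}

print(contrast("man", "woman"))   # {'male'}
print(contrast("man", "girl"))    # differs in 'adult' and 'male'
```

Generalizations over lexical sets then fall out of the matrix: the pairs differing only in one feature form the dimensional oppositions the text describes.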
This technique developed into a general framework for linguistic description called distinctive feature analysis. This is essentially the inspiration for J. Katz and J. Fodor's 1963 theory of lexical semantics within transformational grammar. On
this theory, usually referred to as markerese, a lexical entry in
the language consists of grammatical and semantic markers,
and a special feature called a semantic distinguisher. In the
subsequent discussion by U. Weinreich (1972) and many others, this model was demonstrated to be far too impoverished to
characterize the compositional mechanisms inherent in language. In the late 1960s and early 1970s, alternative models of
word meaning emerged (Fillmore 1968 [frame semantics];
Lakoff [1965] 1970 [generative semantics]; Gruber 1976;
Jackendoff 1972), which respected the relational structure of
sentence meaning while encoding the named semantic functions in lexical entries. In D. R. Dowty (1979), a model theoretic
interpretation of the decompositional techniques of G. Lakoff,
J. D. McCawley, and J. R. Ross was developed.
In the later twentieth century, montague grammar
(Montague 1973, 1974) was perhaps the most significant development in the formal analysis of linguistic semantics, as it brought
together a systematic, logically grounded theory of compositionality, with a model theoretic interpretation. Subsequent work
enriched this approach with insights from D. Davidson (1967),
H. P. Grice (1969), Saul Kripke ([1972] 1980), David Lewis (1976),
and other philosophers of language (cf. Partee 1976; Davidson
and Harman 1972).
Recently, the role of lexical-syntactic mapping has become
more evident, particularly with the growing concern over projection from lexical semantic form, the problem of verbal alternations
and polyvalency, and the phenomenon of polysemy. The work of
R. Jackendoff (1983, 1997) on conceptual semantics has come to
the fore, as the field of lexical semantics has developed into a more
systematic and formal area of study (Pustejovsky and Boguraev
1993; Copestake and Briscoe 1995, 15–67).
Finally, one of the most significant developments in the
study of meaning has been the dynamic turn in how sentences
are interpreted in discourse. Inspired by the work of Irene Heim
(1982) and H. Kamp (1981), the formal analysis of discourse has
become an active and growing area of research, as seen in the
works of Jeroen Groenendijk and Martin Stokhof (1991), Kamp and
U. Reyle (1993), and Nicholas Asher and Alex Lascarides (2003).
In the remainder of this essay, we examine the basic principle of how meanings are constructed. First, we introduce
the notion of compositionality in language. Since words are
the building blocks of larger meanings, we explore various
approaches to lexical semantics. Then, we focus on how units
of meaning are put together compositionally to create propositions. Finally, we examine the meaning of expressions above
the level of the sentence, within a discourse.

1.2 Compositionality
Because semantics focuses on how linguistic expressions come
to have meaning, one of the most crucial concepts in the field
is the notion of compositionality (cf. compositionality). As
speakers of language, we understand a sentence by understanding its parts and how they are put together. The principle
of compositionality characterizes how smaller units of meaning are put together to form larger, more meaningful expressions in language. The most famous formulation of this notion
comes from Frege, paraphrased as follows:
The meaning of an expression is a function of the meanings of
its parts and the way they are syntactically combined. (Partee
1984)

This view has been extremely influential in semantics
research over the past 40 years. If one assumes a compositional
approach to the study of meaning, then two things immediately
follow: 1) One must specify the specific meaning of the basic
elements of the language, and 2) one must formulate the rules
of combination for how these elements go together to make
more complex expressions. The first aspect includes determining what words and morphemes mean, that is, lexical semantics, which we address in the next section. The second aspect
entails defining a calculus for how these elements compose to
form larger expressions, that is, argument selection and modification. Needless to say, in both of these areas, there is much
divergence of opinion, but semanticists generally agree on the
basic assumptions inherent in compositionality.
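These two requirements can be made concrete in a minimal sketch (the toy lexicon and the single combination rule below are invented for the illustration): word meanings are listed in a lexicon, and a rule of combination computes the meaning of a transitive sentence from the meanings of its parts and their syntactic arrangement.

```python
# A minimal illustration of compositionality: sentence meaning is
# computed from (1) a lexicon assigning meanings to words and
# (2) a rule for combining those meanings according to the syntax.

# Toy model: proper names denote individuals; a transitive verb
# denotes a function from two individuals to a truth value.
LOVES = {("john", "mary")}  # the pairs standing in the 'love' relation

lexicon = {
    "John": "john",
    "Mary": "mary",
    "loves": lambda subj, obj: (subj, obj) in LOVES,
}

def interpret(sentence):
    """Combine word meanings according to the syntax [Subj V Obj]."""
    subj, verb, obj = sentence.split()
    return lexicon[verb](lexicon[subj], lexicon[obj])

print(interpret("John loves Mary"))  # True
print(interpret("Mary loves John"))  # False
```

The same lexicon yields different truth values for the two word orders because the combination rule is sensitive to how the parts are syntactically arranged, which is the content of Frege's principle.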

2 LEXICAL MEANING
Semantic interpretation requires access to knowledge about
words. The lexicon of a grammar must provide a systematic and
efficient way of encoding the information associated with words
in a language. lexical semantics is the study of what words
mean and how these meanings are structured. The lexicon is
not merely a collection of words with their semantic forms, but
rather a set of structured objects that participate in larger operations and compositions, both enabling syntactic environments
and acting as signatures to semantic entailments and implicatures in the context of larger discourse.
There are four basic questions in modeling the semantic
content and structure of the lexicon: 1) What semantic information goes into a lexical entry? 2) How do lexical entries
relate semantically to one another? 3) How is this information exploited compositionally by the grammar? 4) How is this
information available to semantic interpretation generally?
The lexicon and lexical semantics have traditionally been
viewed as the most passive modules of language, acting in the service of the more dynamic components of the grammar. This view
has its origins in the generative tradition (Chomsky [1955] 1975)
and has been an integral part of the notion of the lexicon ever
since. While the Aspects-model of selectional features (Chomsky
1965) restricted the relation of selection to that between lexical
items, work by McCawley (1968) and Jackendoff (1972) showed
that selectional restrictions must be available to computations
at the level of derived semantic representation rather than at
deep structure. Subsequent work by Joan Bresnan (1982), Gerald
Gazdar et al. (1985), and C. Pollard and I. Sag (1994) extends the
range of phenomena that can be handled by the projection and
exploitation of lexically derived information in the grammar.

The Structure of Meaning


Recently, with the convergence of several areas in linguistics (lexical semantics, computational lexicons, type theories), several models for the determination of selection have emerged that put even more compositional power in the lexicon, making explicit reference to the paradigmatic systems that allow for grammatical constructions to be partially determined by selection. Examples of this approach are generative lexicon theory (Pustejovsky 1995; Bouillon and Busa 2001) and construction grammar (Goldberg 1995; Jackendoff 1997, 2002). These developments have helped to characterize the approaches to lexical design in terms of a hierarchy of semantic expressiveness. There are at least three such classes of lexical description, defined as follows: sense enumerative lexicons, where lexical items have a single type and meaning, and ambiguity is treated by multiple listings of words; polymorphic lexicons, where lexical items are active objects, contributing to the determination of meaning in context, under well-defined constraints; and unrestricted sense lexicons, where the meanings of lexical items are determined mostly by context and conventional use. The most promising direction seems to be a careful and formal elucidation of the polymorphic lexicons, and this will form the basis of our subsequent discussion.

Lexical items can be systematically grouped according to their syntactic and semantic behavior in the language. For this reason, there have been two major traditions of word clustering, corresponding to this distinction. Broadly speaking, for those concerned mainly with grammatical behavior, the most salient aspect of a lexical item is its argument structure; for those focusing on a word's entailment properties, the most important aspect is its semantic class. In this section, we examine these two approaches and see how their concerns can be integrated into a common lexical representation.

2.1 Semantic Classes

One of the most common ways to organize lexical knowledge is by means of type or feature inheritance mechanisms (Evans and Gazdar 1990; Carpenter 1992; Copestake and Briscoe 1992; Pollard and Sag 1994). Furthermore, T. Briscoe, V. de Paiva, and A. Copestake (1993) describe a rich system of types for allowing default mechanisms into lexical type descriptions. Similarly, type structures, such as that shown in Figure 1, can express the inheritance of syntactic and semantic features, as well as the relationship between syntactic classes and alternations, among other relations (cf. Pustejovsky and Boguraev 1993).

Figure 1. A type hierarchy: Natural Entity branches into Physical and Abstract; Physical into Mass and Individuated; Individuated into inanimate (e.g., rock) and animate (e.g., human); Abstract into Mental and Experiential.

2.2 Argument Structure

Once the basic semantic types for the lexical items in the language have been specified, their subcategorization and selectional information must be encoded in some form. The argument structure for a word can be seen as the simplest specification of its semantics, indicating the number and type of parameters associated with the lexical item as a predicate. For example, the verb die can be represented as a predicate taking one argument, and kill as taking two arguments, while the verb give takes three arguments.

(5) a. die(x)
    b. kill(x,y)
    c. give(x,y,z)

What originally began as the simple listing of the parameters or arguments associated with a predicate has developed into a sophisticated view of the way arguments are mapped onto syntactic expressions. E. Williams's (1981) distinction between external (the underlined arguments for kill and give) and internal arguments and J. Grimshaw's proposal for a hierarchically structured representation (cf. Grimshaw 1990) provide us with the basic syntax for one aspect of a word's meaning. Similar remarks hold for the argument list structure in HPSG (head-driven phrase structure grammar) and LFG (lexical-functional grammar).

One influential way of encoding selectional behavior has been the theory of thematic relations (cf. thematic roles; Gruber 1976; Jackendoff 1972). Thematic relations are now generally defined as partial semantic functions of the event being denoted by the verb or noun, and behave according to a predefined calculus of role relations (e.g., Carlson 1984; Dowty 1991; Chierchia 1989). For example, semantic roles, such as agent, theme, and goal, can be used to partially determine the meaning of a predicate when they are associated with the grammatical arguments to a verb.

(6) a. put<AGENT,THEME,LOCATION>
    b. borrow<RECIPIENT,THEME,SOURCE>

Thematic roles can be ordered relative to each other in terms of an implicational hierarchy. For example, there is considerable use of a universal subject hierarchy such as shown in the following (cf. Fillmore 1968; Comrie 1981).

(7) AGENT > RECIPIENT/BENEFACTIVE > THEME/PATIENT > INSTRUMENT > LOCATION

Many linguists have questioned the general explanatory coverage of thematic roles, however, and have chosen alternative methods for capturing the generalizations they promised. Dowty (1991) suggests that theta-role generalizations are best captured by entailments associated with the predicate itself. A theta-role can then be seen as the set of predicate entailments that are properties of a particular argument to the verb. Characteristic entailments might be thought of as prototype roles, or proto-roles; this allows for degrees or shades of
meaning associated with the arguments to a predicate. Others
have opted for a more semantically neutral set of labels to
assign to the parameters of a relation, whether it is realized as
a verb, noun, or adjective. For example, the theory of argument
structure as developed by Williams (1981), Grimshaw (1990),
and others can be seen as a move toward a more minimalist description of semantic differentiation in the verb's list of
parameters.
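The subject hierarchy in (7) can be given a small operational sketch: the grammatical subject realizes the highest-ranked thematic role in the verb's argument structure, as in (6). The ranking function, and the decision to rank roles not listed in (7) (such as SOURCE) below all listed ones, are assumptions of the illustration.

```python
# A sketch of the universal subject hierarchy in (7): the grammatical
# subject realizes the highest-ranked thematic role in the verb's
# argument structure, as in (6). SOURCE does not appear in (7), so
# unlisted roles are ranked lowest here by assumption.
HIERARCHY = ["AGENT", "RECIPIENT", "BENEFACTIVE", "THEME", "PATIENT",
             "INSTRUMENT", "LOCATION"]

def rank(role):
    """Position in the hierarchy; unlisted roles sort after all listed ones."""
    return HIERARCHY.index(role) if role in HIERARCHY else len(HIERARCHY)

def subject_role(arg_structure):
    """The role that surfaces as subject: the highest on the hierarchy."""
    return min(arg_structure, key=rank)

print(subject_role(["AGENT", "THEME", "LOCATION"]))    # put: AGENT
print(subject_role(["RECIPIENT", "THEME", "SOURCE"]))  # borrow: RECIPIENT
```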
The interaction of a structured argument list and a rich
system of types, such as that presented previously, provides
a mechanism for semantic selection through inheritance.
Consider, for instance, the sentence pairs in (8).

(8) a. The man / the rock fell.
    b. The man / *the rock died.

Now consider how the selectional distinction for a feature such as animacy is modeled so as to explain the selectional constraints of predicates. For the purpose of illustration, the arguments of a verb will be identified as being typed from the system shown previously.
(9) a. λx:physical [fall(x)]
    b. λx:animate [die(x)]

In the sentences in (8), it is clear that rocks can't die and men can, but it is still not obvious how this judgment is computed, given what we would assume are the types associated with the nouns rock and man, respectively. What accomplishes this computation is a rule of subtyping, θ, that allows the type associated with the noun man (i.e., human) to also be accepted as the type animate, which is what the predicate die requires of its argument as stated in (9b) (cf. Gunter 1992; Carpenter 1992):

(10) θ[human ⊑ animate]: human → animate

The rule applies since the concept human is subtyped under animate in the type hierarchy. Parallel considerations rule out the noun rock as a legitimate argument to die since it is not subtyped under animate. Hence, the concern over how syntactic processes can systematically keep track of which selectional features are entailed and which are not is partially addressed by such lattice traversal rules as the one presented here.
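The computation sketched in (9) and (10) can be illustrated with a toy type lattice modeled on Figure 1. The dictionary of supertype links and the function names are assumptions of the illustration, not part of any particular formal proposal.

```python
# A sketch of selection via a type hierarchy (cf. Figure 1) and the
# subtyping rule in (10): an argument is acceptable if its type sits
# at or below the type the predicate selects for.
PARENT = {                      # immediate supertype links (assumed)
    "human": "animate",
    "rock": "inanimate",
    "animate": "individuated",
    "inanimate": "individuated",
    "individuated": "physical",
    "mass": "physical",
    "physical": "natural_entity",
    "abstract": "natural_entity",
}

def subtype_of(t, target):
    """Walk up the lattice: does t appear at or below target?"""
    while t is not None:
        if t == target:
            return True
        t = PARENT.get(t)
    return False

def selects(verb_requirement, noun_type):
    """Does the noun's type satisfy the verb's selectional requirement?"""
    return subtype_of(noun_type, verb_requirement)

print(selects("animate", "human"))   # 'The man died': True
print(selects("animate", "rock"))    # '*The rock died': False
print(selects("physical", "rock"))   # 'The rock fell': True
```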

2.3 Decomposition
The second approach to the aforementioned lexical specification is to define constraints internally to the predicate itself.
Traditionally, this has been known as lexical decomposition.
Since the 1960s, lexical semanticists have attempted to formally model the semantic relations between such lexical items
as the adjective dead and the verbs die and kill (cf. Lakoff [1965]
1970; McCawley 1968) in the sentences that follow.
(11) a. John killed Bill.
b. Bill died.
c. Bill is dead.

Assuming that the underlying form for a verb like kill directly
encodes the stative predicate in (11c) and the relation of causation,
generative semanticists posited representations such as (12).

(12) CAUSE(x, BECOME(NOT(ALIVE(y))))

Here, the predicate CAUSE is represented as a relation between an individual causer x and an expression involving a change of state in the argument y. R. Carter ([1976] 1988) proposes a quite similar representation, shown here for the causative verb darken:

(13) (x CAUSE ((y BE.DARK) CHANGE))

Although there is an intuition that the cause relation involves a causer and an event, neither Lakoff nor Carter makes this commitment explicitly. In fact, it has taken several decades for Davidson's (1967) observations regarding the role of events in the determination of verb meaning to find their way convincingly into the major linguistic frameworks. Recently, a new synthesis has emerged that attempts to model verb meanings as complex predicative structures with rich event structures (cf. Parsons 1990; Pustejovsky 1991b; Tenny 1992; Krifka 1992). This research has developed the idea that the meaning of a verb
can be analyzed into a structured representation of the event
that the verb designates, and has furthermore contributed to
the realization that verbs may have complex, internal event
structures. Recent work has converged on the view that complex events are structured into an inner and an outer event,
where the outer event is associated with causation and agency
and the inner event is associated with telicity (completion) and
change of state (cf. Tenny and Pustejovsky 2000; Levin and
Rappaport Hovav 2005).
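The decomposition in (12) can be rendered as nested predicates, from which the entailment chain in (11) (kill entails die entails dead) can be read off directly. This is an illustrative sketch only; the tuple encoding is invented for the example.

```python
# A sketch of the generative-semantics decomposition in (12):
# kill = CAUSE(x, BECOME(NOT(ALIVE(y)))). Representing the structure
# as nested tuples lets the entailments in (11) be read off directly:
# 'John killed Bill' entails 'Bill died' entails 'Bill is dead'.
def kill(x, y):
    return ("CAUSE", x, ("BECOME", ("NOT", ("ALIVE", y))))

def die(y):
    return ("BECOME", ("NOT", ("ALIVE", y)))

def dead(y):
    return ("NOT", ("ALIVE", y))

k = kill("john", "bill")
# The caused subevent of killing is exactly the decomposition of dying...
print(k[2] == die("bill"))             # True
# ...and the result state of dying is the decomposition of being dead.
print(die("bill")[1] == dead("bill"))  # True
```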
Jackendoff (1990) develops an extensive system of what he
calls Conceptual Representations, which parallel the syntactic
representations of sentences of natural language. These employ
a set of canonical predicates, including CAUSE, GO, TO, and ON,
and canonical elements, including Thing, Path, and Event. These
approaches represent verb meaning by decomposing the predicate into more basic predicates. This work owes an obvious debt to the innovative work within generative semantics, as illustrated by McCawley's (1968) analysis of the verb kill. Recent versions
of lexical representations inspired by generative semantics can
be seen in the Lexical Relational Structures of K. Hale and S.
J. Keyser (1993), where syntactic tree structures are employed
to capture the same elements of causation and change of state
as in the representations of Carter, Levin and T. Rapoport,
Jackendoff, and Dowty. The work of Levin and Rappaport, building on Jackendoff's Lexical Conceptual Structures, has been
influential in further articulating the internal structure of verb
meanings (see Levin and Rappaport 1995).
J. Pustejovsky (1991b) extends the decompositional approach
presented in Dowty (1979) by explicitly reifying the events and
subevents in the predicative expressions. Unlike Dowty's treatment of lexical semantics, where the decompositional calculus builds on propositional or predicative units (as discussed
earlier), a syntax of event structure makes explicit reference
to quantified events as part of the word meaning. Pustejovsky
further introduces a tree structure to represent the temporal
ordering and dominance constraints on an event and its subevents. For example, a predicate such as build is associated with
a complex event such as that shown in the following (cf. also
Moens and Steedman 1988).



(14) [transition [e1: PROCESS] [e2: STATE]]

The process consists of the building activity itself, while the State
represents the result of there being the object built. Grimshaw
(1990) adopts this theory in her work on argument structure,
where complex events such as break are given a similar representation. In such structures, the process consists of what x does to
cause the breaking, and the state is the resultant state of the broken item. The process corresponds to the outer causing event as
discussed earlier, and the state corresponds in part to the inner
change of state event. Both Pustejovsky and Grimshaw differ
from earlier authors in assuming a specific level of representation for event structure, distinct from the representation of other
lexical properties. Furthermore, they follow J. Higginbotham
(1989) in adopting an explicit reference to the event place in the
verbal semantics. Recently, Levin and Rappaport (2001, 2005)
have adopted a large component of the event structure model for
their analysis of verb meaning composition.
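The subevent structure in (14) can be sketched as a small tree of typed events, with the process temporally preceding the result state. The class and field names below are invented for the illustration.

```python
# A sketch of the event structure in (14): a transition composed of a
# process subevent temporally preceding a result state, as proposed
# for verbs like 'build' and 'break'. Class names are illustrative.
from dataclasses import dataclass

@dataclass
class Event:
    sort: str          # "process", "state", or "transition"
    label: str
    subevents: tuple = ()

def transition(process, state):
    """e1 < e2: the process temporally precedes (and yields) the state."""
    return Event("transition", "T", (process, state))

# build: the building activity followed by the state of the built object
build = transition(Event("process", "building(x, y)"),
                   Event("state", "built(y)"))

print(build.subevents[0].sort)  # process
print(build.subevents[1].sort)  # state
```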

2.4 Noun Meaning


Thus far, we have focused on the lexical semantics of verb
entries. All of the major categories, however, are encoded with
syntactic and semantic feature structures that determine their
constructional behavior and subsequent meaning at logical
form. In Generative Lexicon Theory (Pustejovsky 1995), it is
assumed that word meaning is structured on the basis of four
generative factors (qualia roles) that capture how humans
understand objects and relations in the world and provide the
minimal explanation for the linguistic behavior of lexical items
(these are inspired in large part by Moravcsik's (1975, 1990)
interpretation of Aristotelian aitia). These are: the formal
role: the basic category that distinguishes the object within
a larger domain; constitutive role: the relation between an
object and its constituent parts; the telic role: its purpose and
function; and the agentive role: factors involved in the objects
origin or coming into being.
Qualia structure is at the core of the generative properties of
the lexicon, since it provides a general strategy for creating new
types. For example, consider the properties of nouns such as
rock and chair. These nouns can be distinguished on the basis
of semantic criteria that classify them in terms of general categories, such as natural kind or artifact object. Although very
useful, this is not sufficient to discriminate semantic types in a
way that also accounts for their grammatical behavior. A crucial distinction between rock and chair concerns the properties that differentiate natural kinds from artifacts: Functionality plays a crucial role in the process of individuation of artifacts, but not of natural kinds. This is reflected in grammatical behavior, whereby a good chair or enjoy the chair are well-formed expressions reflecting the specific purpose for which an artifact is designed, but good rock or enjoy a rock are semantically ill-formed since for rock the functionality (i.e., telic) is undefined. Exceptions exist when new concepts are referred to, such as when the object is construed relative to a specific activity, for example, as in The climber enjoyed that rock; rock itself takes on a new meaning, by virtue of having telicity associated with it, and this is accomplished by integration with the semantics of the subject noun phrase (NP).

Although chair and rock are both physical objects, they differ
in their mode of coming into being (i.e., agentive): Chairs are
man-made; rocks develop in nature. Similarly, a concept such
as food or cookie has a physical manifestation or denotation,
but also a functional grounding pertaining to the relation of
eating. These apparently contradictory aspects of a category
are orthogonally represented by the qualia structure for that
concept, which provides a coherent structuring for different
dimensions of meaning.
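A qualia structure along these lines can be sketched as a simple record. The attribute values, and the rule that enjoy needs an activity supplied either by the noun's telic role or by context, are simplified assumptions of the illustration.

```python
# A sketch of qualia structure: chair has a telic (purpose) role,
# rock does not, which is one way to model why 'enjoy the chair' is
# natural while 'enjoy the rock' needs a contextually supplied
# activity, as in 'The climber enjoyed that rock'.
from dataclasses import dataclass
from typing import Optional

@dataclass
class Qualia:
    formal: str                # what kind of thing it is
    constitutive: str          # what it is made of
    telic: Optional[str]       # its purpose or function, if any
    agentive: Optional[str]    # how it comes into being

chair = Qualia(formal="physical_object", constitutive="wood",
               telic="sit_in", agentive="make")
rock = Qualia(formal="physical_object", constitutive="mineral",
              telic=None, agentive="natural_formation")

def enjoy_ok(noun_qualia, contextual_activity=None):
    """'enjoy N' needs an activity: the noun's telic role or the context's."""
    return noun_qualia.telic or contextual_activity

print(bool(enjoy_ok(chair)))             # True: enjoy the chair
print(bool(enjoy_ok(rock)))              # False: ?enjoy the rock
print(bool(enjoy_ok(rock, "climbing")))  # True: the climber's rock
```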

2.5 The Problem of Polysemy


Given the compactness of a lexicon relative to the number of
objects and relations in the world, and the concepts we have
for them, lexical ambiguity is inevitable. Add to this the cultural, historical, and linguistic blending that contributes to the
meanings of our lexical items, and ambiguity can appear arbitrary as well. Hence, homonymy, where one lexical form has many meanings, is to be expected in a language. Examples of
homonyms are illustrated in the following sentences:
(15) a. Mary walked along the bank of the river.
b. She works for the largest bank in the city.
(16) a. The judge asked the defendant to approach the bar.
b. The defendant was in the pub at the bar.

Weinreich (1964) calls such lexical distinctions contrastive ambiguity, where it is clear that the senses associated with the lexical item are unrelated. For this reason, it is generally assumed that homonyms are represented as separate lexical entries within the organization of the lexicon. This accords with a view of lexical organization that has been termed a sense enumeration lexicon (cf. Pustejovsky 1995). Words with multiple senses are simply listed separately in the lexicon, and this does not seem to compromise or complicate the compositional process of how words combine in the interpretation of a sentence.

This model becomes difficult to maintain, however, when
we consider the phenomenon known as polysemy. Polysemy is
the relationship that exists between different senses of a word
that are related in some logical manner, rather than arbitrarily,
as in the previous examples. It is illustrated in the following
sentences (cf. Apresjan 1973; Pustejovsky 1991a, 1998).
(17) a. Mary carried the book home.
     b. Mary doesn't agree with the book.

(18) a. Mary has her lunch in her backpack.
     b. Lunch was longer today than it was yesterday.

(19) a. The flight lasted three hours.
     b. The flight landed on time in Los Angeles.

Notice that in each of these pairs, the same nominal form is assuming different semantic interpretations relative to its selective context. For example, in (17a), the noun book refers to a physical object, while in (17b), it refers to the informational content. In (18a), lunch refers to the physical manifestation of the food, while in (18b), it refers to the eating event. Finally, in (19a), flight refers to the flying event, while in (19b), it refers to the plane. This phenomenon of regular (or
logical) polysemy is one of the most challenging in semantics and has stimulated much research recently (Bouillon 1997;
Bouillon and Busa 2001; Cooper 2006). The determination
of what such lexical items denote will of course have consequences for ones theory of compositionality, as we will see in
a later section.
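One common way to model such logical polysemy is to treat a noun like book as carrying several facets, with the selecting predicate choosing among them, as in (17). The facet labels and the selection table below are invented for the illustration.

```python
# A sketch of regular polysemy as facet selection: 'book' carries both
# a physical and an informational facet, and the selecting predicate
# picks the appropriate one (cf. (17a) vs. (17b)).
BOOK = {"phys": "book-as-object", "info": "book-as-content"}

SELECTS = {            # which facet each predicate requires of its object
    "carry": "phys",   # carrying targets the physical object
    "agree_with": "info",  # agreeing targets the informational content
}

def compose(verb, noun_facets):
    """Combine verb and noun by selecting the facet the verb requires."""
    facet = SELECTS[verb]
    return (verb, noun_facets[facet])

print(compose("carry", BOOK))       # ('carry', 'book-as-object')
print(compose("agree_with", BOOK))  # ('agree_with', 'book-as-content')
```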

3 BUILDING SENTENCE MEANINGS


3.1 Function Application
The principle of compositionality follows the view that syntax
is an initial guide to the interpretation process. Hence, there
would appear to be a strong relationship between the meaning
of a phrase and where it appears in a sentence, as is apparent
from grammatical function in the following sentences.
(20) a. The woman loves the child.
b. The child loves the woman.

However, this is not always a reliable association, as seen in languages that have freer word order, such as German.

(21) a. Die Frau liebt das Kind.
     The woman loves the child.
     b. Das Kind liebt die Frau.
     The child loves the woman.

In German, both word orders are ambiguous, since information about the grammatical case and gender of the two NPs is neutralized.
Although there is often a correlation between the grammatical relation associated with a phrase and the meaning
assigned to it, this is not always a reliable association. Subjects
are not always doers and objects are not always undergoers
in a sentence. For example, notice how in both (22a) and (22b),
the NP the watch is playing the same role; that is, it is undergoing a change, even though it is the subject in one sentence and
the object in the other.
(22) a. The boy broke the watch.
b. The watch broke.

To handle such verbal alternations compositionally requires either positing separate lexical entries for each syntactic construction associated with a given verb, or expressing a deeper relation between different verb forms.
For most semantic theories, the basic mechanism of compositionality is assumed to be function application of some
sort. A rule of application, apply, acts as the glue to assign (or
discharge) the argument role or position to the appropriate
candidate phrase in the syntax. Thus, for a simple transitive
sentence such as (23a), two applications derive the propositional interpretation of the sentence in (23d).
(23) a. John loves Mary.
     b. love(Arg1,Arg2)
     c. APPLY love(Arg1,Arg2) to Mary = love(Arg1,Mary)
     d. APPLY love(Arg1,Mary) to John = love(John,Mary)

One model used to define the calculus of compositional combinations is the λ-calculus (Barendregt 1984). Using the language of types, we can express the rule of APPLY as a property associated with predicates (or functions), and application as a relationship between expressions of specific types in the language.

Figure 2.
(24) Function Application:
If α is of type a, and β is of type a → b, then β(α) is of type b.

Viewed as typed expressions, the separate linguistic units in (23a) combine as function application, as illustrated in Figure 2. As one can see, the λ-calculus is an expressive mechanism for modeling the relation between verbs and their arguments interpreted as function application.
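The rule in (24) and the derivation in (23) can be sketched with an explicit type check; the dict-and-tuple encoding of typed meanings below is an assumption of the illustration, with e the type of entities and t the type of truth values.

```python
# A sketch of typed function application, the rule in (24): if alpha
# is of type a and beta is of type a -> b, then beta(alpha) is of
# type b. The encoding of meanings and types is illustrative.
LOVE = {("john", "mary")}   # the extension of 'love' in a toy model

def typed(expr, typ):
    return {"expr": expr, "type": typ}

def apply_(beta, alpha):
    """APPLY with a type check: (a -> b) applied to something of type a."""
    arg_type, result_type = beta["type"]
    assert alpha["type"] == arg_type, "type mismatch"
    return typed(beta["expr"](alpha["expr"]), result_type)

john = typed("john", "e")
mary = typed("mary", "e")
# A transitive verb has type e -> (e -> t): it consumes the object,
# then the subject, as in the derivation in (23c-d).
loves = typed(lambda obj: (lambda subj: (subj, obj) in LOVE),
              ("e", ("e", "t")))

loves_mary = apply_(loves, mary)     # type (e, t): a property of entities
sentence = apply_(loves_mary, john)  # type t: a truth value
print(sentence)  # {'expr': True, 'type': 't'}
```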
One important extension to the type language used here
provides a compositional analysis of the semantics of propositional attitude verbs, such as believe and think (Montague 1973).
The sentential complements of such verbs, as is well known,
create opaque contexts for substitutions under identity. For
example, if Lois is unaware of Superman's true identity, then
the belief statement in (25b) is false, even though (25a) is true.
(25) a. Lois believes Superman rescued the people.
b. Lois believes Clark Kent rescued the people.

On this view, verbs such as believe introduce an intensional context for the propositional argument, instead of an
extensional one. In such a context, substitution under identity is
not permitted without possibly affecting the truth value (truth
conditional semantics). This is an important contribution
to the theory of meaning, in that a property of opacity is associated with specific types within a compositional framework.
One potential challenge to a theory of function application
is the problem of ambiguity in language. Syntactic ambiguities
arise because of the ways in which phrases are bracketed in a
sentence, while lexical ambiguity arises when a word has multiple interpretations in a given context. For example, in the following sentence, the verb treat can mean one of two things:
(26) The doctor treated the patient well.

Either 1) the patient is undergoing medical care, or 2) the doctor was kind to the patient. More often than not, however, the
context of a sentence will eliminate such ambiguities, as shown
in (27).
(27) a. The doctor treated the patient with antibiotics. (Sense 1)
b. The doctor treated the patient with care. (Sense 2)

In this case, the interpretation is constructed from the appropriate meaning of the verb and how it combines with its
arguments.
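Such context-driven disambiguation, as in (27), can be sketched as selection by the semantic type of the with-phrase; the sense labels and type assignments below are invented for the illustration.

```python
# A sketch of sense selection for (27): the semantic type of the
# with-phrase complement disambiguates 'treat'. The types 'substance'
# (antibiotics) and 'manner' (care) are assumed for the illustration.
SENSES = {
    "medical_care": lambda comp_type: comp_type == "substance",
    "behave_toward": lambda comp_type: comp_type == "manner",
}

def disambiguate(with_phrase_type):
    """Return the verb senses compatible with the with-phrase's type."""
    return [sense for sense, fits in SENSES.items() if fits(with_phrase_type)]

print(disambiguate("substance"))  # ['medical_care']   (with antibiotics)
print(disambiguate("manner"))     # ['behave_toward']  (with care)
```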



3.2 Quantifiers and Scope
Another type of ambiguity, one that is not associated with
the constituent structure of the sentence or lexical senses in
any obvious way, involves quantified noun phrases (e.g., every
cookie, some cake, and most pies). When a sentence has more than one of these phrases, more than one interpretation is often possible because of the ways the quantified NPs relate to each other. This is not the case in the following sentence, however, where there is only one interpretation as to what happened with the cookie.
(28) Some student ate a cookie.

Now consider the sentences in (29), where there is a combination of a some-NP and an every-NP.
(29) a. Every student saw a movie.
     b. Every cookie was eaten by a student.

The sentence in (29a) can mean one of two things: 1) that there was one movie, for example, Star Wars, that every student saw; or 2) that everyone saw a movie, but it didn't have to be the same one. Similarly, for (29b), there could be one student who ate all the cookies, or each cookie could have been eaten by a different student. This kind of quantifier scope ambiguity has to be resolved in order to determine what kind of inferences one can make from a sentence. Syntax and semantics must interact to resolve this kind of ambiguity, and it is the theory of sentence meaning that defines this interaction (cf. quantification).

One of the roles of semantic theory is to correctly derive the entailment relations associated with a sentence's logical form, since this has an obvious impact on the valid reasoning patterns in the language. How these interpretations are computed has been an area of intense research, and one of the most influential approaches has been the theory of generalized quantifiers (cf. Barwise and Cooper 1981). On this approach, the denotation of an NP is treated as a set of sets of individuals, and a sentence structure such as [NP VP] is true if and only if the denotation of the VP is a member of the family of sets denoted by the NP. That is, the sentence in (30) is true if and only if singing (the denotation of the VP) is a member of the set of properties denoted by every woman.

(30) Every woman sang.

On this view, quantifiers such as most, every, some, and so on are actually second-order relations between predicates, and it is partly this property that allows for the compositional interpretation of quantifier scope variation seen previously. The intended interpretation of (30) is (31b), where the subject NP every woman is interpreted as a function, taking the VP as its argument.

(31) a. λP∀x[woman(x) → P(x)](sang)
     b. ∀x[woman(x) → sang(x)]

When combined with another quantified expression, as in (32a), the relational interpretation of the generalized quantifiers is crucial for being able to determine both scope interpretations shown in (32).

(32) a. Every woman sang a song.
     b. ∀x∃y[woman(x) → [song(y) & sang(x, y)]]
     c. ∃y∀x[[song(y) & woman(x)] → sang(x, y)]

An alternative treatment for handling such cases is to posit a rule of quantifier raising, where the scope ambiguity is reduced to a difference in syntactic structures associated with each interpretation (May 1985).

3.3 Semantic Modification

In constructing the meaning of expressions, a semantic theory must also account for how the attribution of properties to an entity is computed, what is known as the problem of modification. The simplest type of modification one can imagine is intersective attribution. Notice that in the phrases in (33), the object denoted correctly has both properties expressed in the NP:

(33) a. black coffee: λx[black(x) & coffee(x)]
     b. Italian singer: λx[Italian(x) & singer(x)]
     c. metal cup: λx[metal(x) & cup(x)]

There are two general solutions to computing the meaning of such expressions: a) Let adjectives be functions over common noun denotations, or b) let adjectives be normal predicates, and have a semantic rule associated with the syntax of modification.

Computing the proper inferences for relative clauses will involve a similar strategy, since they are a sort of intersective modification. That is, for the relative clause in (34), the desired logical form will include an intersection of the head noun and the relation predicated in the subordinated clause.

(34) a. writer who John knows
     b. λx[writer(x) & know(j, x)]

Unfortunately, however, most instances of adjectival modification do not work so straightforwardly, as illustrated in (35). Adjectives such as good, dangerous, and fast modify polysemously in the following sentences.

(35) a. John is a good teacher.
     b. A good meal is what we need now.
     c. Mary took a good umbrella with her into the rain.

In each of these sentences, good is a manner modifier whose interpretation is dependent on the noun it modifies; in (35a), it means to teach well; in (35b), it means a tasty meal; and in (35c), it means something keeping you dry. Similar remarks hold for the adjective dangerous.

(36) a. This is a dangerous road at night.
     b. She used a dangerous knife for the turkey.

That is, the road is dangerous in (36a) when one drives on it, and the knife is dangerous in (36b) when one cuts with it. Finally, the adjective fast in the following sentences acts as though it is an adverb, modifying an activity implicit in the noun, that is, programming in (37a) and driving in (37b).

(37) a. Mary is the fastest programmer we have on staff.
     b. The turnpike is a faster road than Main Street.

To account for such cases, it is necessary to enrich the mode of
composition beyond simple property intersection, to accommodate the context dependency of the interpretation. Analyses
taking this approach include Borschev and Partee (2001),
Bouillon (1997), and Pustejovsky (1995).
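Returning to quantification, the generalized-quantifier treatment of section 3.2 can be made concrete in a toy model: every and some denote relations between sets, and the two readings of (32) come out as different orders of quantification. The model (who sang what) is invented for the illustration.

```python
# A sketch of generalized quantifiers: 'every' and 'some' are
# relations between a restrictor set and a scope predicate, and the
# two scope readings of (32) differ in the order of quantification.
WOMAN = {"ann", "mary"}
SONG = {"s1", "s2"}
SANG = {("ann", "s1"), ("mary", "s2")}  # each woman sang a different song

def every(restrictor, scope):
    """[NP VP] is true iff the scope holds of every member of the NP set."""
    return all(scope(x) for x in restrictor)

def some(restrictor, scope):
    return any(scope(x) for x in restrictor)

# (30) Every woman sang: true in this model.
print(every(WOMAN, lambda w: any((w, s) in SANG for s in SONG)))  # True

# (32b) narrow scope: for every woman, some song she sang. True.
print(every(WOMAN, lambda w: some(SONG, lambda s: (w, s) in SANG)))

# (32c) wide scope: some single song every woman sang. False here.
print(some(SONG, lambda s: every(WOMAN, lambda w: (w, s) in SANG)))
```

Because the women sang different songs, the model verifies the narrow-scope reading while falsifying the wide-scope one, which is exactly the distinction the two logical forms in (32) are meant to capture.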

3.4 Arguments versus Adjuncts


In our discussion thus far of how predicates select arguments to
create compositionally complex expressions, we have assumed
that the matrix predicate (the main verb of the sentence) acts
as the only function over other phrases. In fact, what an argument of the verb is and what an adjunct is are questions just as
much of meaning as of syntax. In this section, we examine the
semantic issues involved.
In this overview, we have adopted the position that language reflects the workings of our deeper conceptual systems in some direct and nonidiosyncratic manner. Lexical
choice as well as specific grammatical phenomena can be
constrained by underlying conceptual bias. Well-known
examples of this transparency include count/mass noun
distinctions in the lexicon, and case marking and valence
distinctions in the syntax. For example, concepts entailing unindividuated stuff or material will systematically be
semantically typed as mass nouns in the grammar, whereas
naturally individuating (countable) substances will assume
the status of count nouns, with their respective grammatical
consequences, as illustrated in (38). (Some mass terms are
not shared by all languages, such as the concept of paper
or furniture.)
(38) a. {not much/all/lots of } gold/water/dirt/sand
b. {every/two/several} chairs/girls/beaches

Similarly, as presented in previous sections, the classification of verbs appears to reflect their underlying relational
structure in fairly obvious ways.
(39) a. Mary arrived.
b. John greeted Mary.
c. Mary gave a book to John.

That is, the argument structure of each verb encodes the semantics of the underlying concept, which in turn is reflected
in the projection to the specific syntactic constructions, that
is, as intransitive, transitive, and ditransitive constructions,
respectively. For unary, binary, and ternary predicates, there
is a visible or transparent projection to syntax from the underlying conceptual structure, as well as a predictable compositional derivation as function application.
So, the question arises as to what we do with nonselected
arguments and adjuncts within the sentence. It is well known,
for example, that arguments not selected by the predicate
appear in certain contexts (cf. Jackendoff 1992; Levin and
Rappaport Hovav 2005).
(40) a. The man laughed himself sick.
b. The girl danced her way to fame.
c. Mary nailed the window shut.

Each of the italicized phrases is an argument of something, but is it selected by the matrix predicate? Jackendoff has proposed a solution that relies on the notion of construction,
as introduced by A. E. Goldberg (1995) (cf. construction
grammars).
Another problem in compositionality emerges from the
interpretation of adjuncts. The question posed by the examples in (41) is this: Which NPs are arguments semantically and
which are merely adjuncts?
(41) a. Mary ate the soup.
b. Mary ate the soup with a spoon.
c. Mary ate the soup with a spoon in the kitchen.
d. Mary ate the soup with a spoon in the kitchen at 3:00 p.m.

For Davidson (1967), there is no semantic distinction between arguments and adjuncts in the logical form. Under his proposal, a two-place predicate such as eat contains an additional
argument, the event variable, e, which allows each event participant a specific role in the interpretation (cf. Parsons 1990;
event structure and grammar).
(42) λyλxλe[eat(e, x, y)]

Then, any additional adjunct information (such as locations, instruments, etc.) is added by conjunction to the meaning
of the main predicate, in a fashion similar to the interpretation of intersective modification over a noun. In this manner, Davidson is able to capture the appropriate entailments
between propositions involving action and event expressions
through conventional mechanisms of logical entailment. For
example, to capture the entailments between (41b–d) and (41a)
in the following, each more specifically described event entails
the one above it by virtue of conjunction elimination (already
encountered) on the expression.
(43) a. ∃e[eat(e, m, the-soup)]
b. ∃e[eat(e, m, the-soup) & with(e, a spoon)]
c. ∃e[eat(e, m, the-soup) & with(e, a spoon) & in(e, the kitchen)]
d. ∃e[eat(e, m, the-soup) & with(e, a spoon) & in(e, the kitchen) & at(e, 3:00 p.m.)]

This approach has the advantage that no special inference mechanisms are needed to derive the entailment relations
between the core propositional content in (43a) and forms
modified through adjunction. This solution, however, does not
extend to cases of verbs with argument alternations that result
in different meanings. For example, how do we determine what
the core arguments are for a verb like sweep?
(44) a. John swept.
b. John swept the floor.
c. John swept the dirt.
d. John swept the dirt off the sidewalk.
e. John swept the floor clean.
f. John swept the dirt into a pile.

The semantics of such a verb should determine what its arguments are, and how the different possible syntactic realizations
relate to each other semantically. These cases pose an interesting challenge for the theory of compositionality (cf. Jackendoff 2002).

3.5 Presupposition
In computing the meaning of a sentence, we have focused
on that semantic content that is asserted by the proposition.
This is in contrast to what is presupposed. A presupposition
is that propositional meaning that must be true for the sentence containing it to have a proper semantic value (Stalnaker
1970; Karttunen 1974; Potts 2005). (Stalnaker makes the distinction between what a speaker says and what a speaker
presupposes.)
Such knowledge can be associated with a word, a grammatical feature, or a syntactic construction (so-called presupposition triggers). For example, in (45) and (46), the complement
proposition to each verb is assumed to be true, regardless of the
polarity assigned to the matrix predicate.
(45) a. Mary realized that she was lost.
b. Mary didn't realize that she was lost.
(46) a. John knows that Mary is sick.
b. John doesn't know that Mary is sick.

There are similar presuppositions associated with aspectual predicates, such as stop and finish, as seen in (47).
(47) a. Fred stopped smoking.
b. John finished painting his house.

In these constructions, the complement proposition is assumed to have been true before the assertion of the sentence.
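The behavior of such triggers can be sketched as follows; the representation is a hypothetical simplification, not a formal account from the text. The complement of a factive or aspectual trigger is recorded as a presupposition, which stays constant whether the matrix predicate is negated or not, as in (45) and (46):

```python
# Sketch (hypothetical representation): a presupposition trigger contributes
# its complement as a presupposition that survives matrix negation.

def interpret(trigger, complement, negated=False):
    assertion = ("not " if negated else "") + trigger + "(" + complement + ")"
    return {"asserts": assertion, "presupposes": complement}

pos = interpret("realize", "Mary was lost")                # (45a)
neg = interpret("realize", "Mary was lost", negated=True)  # (45b)

assert pos["asserts"] != neg["asserts"]          # the assertions differ
assert pos["presupposes"] == neg["presupposes"]  # the presupposition is constant
```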
Such conventional presuppositions are also triggered by
interrogative contexts, as seen in (48).
(48) a. Why did you go to the store?
b. When did you see Mary?

As with all presuppositions, however, they are defeasible, as the answer to (48b) in (49) illustrates.
(49) But I didn't see Mary.

Conversational presuppositions, on the other hand, are propositions implicated by virtue of a context and discourse situation. The response in (50b) conversationally implicates that I am not hungry (Recanati 2002; conversational implicature).
(50) a. Are you hungry?
b. I've had a very large breakfast.

The meaning of such implicatures is not part of the asserted content of the proposition, but is computed within a conversational context in a discourse. We will return to this topic in a
later section.

3.6 Noncompositionality
While semantic theory seems to conform to the principles of
compositionality in most cases, there are many constructions
that do not fit into the conventional function application paradigm. A phrase is noncompositional if its meaning cannot be predicted from the meaning of its parts. We have already encountered modification constructions that do not conform to
simple intersective interpretations, for example, good teacher.
There are two other constructions that pose a problem for the principle of compositionality in semantics:
(51) a. Idioms: hear it through the grapevine, kick the bucket;
b. Coercions: begin the book, enjoy a coffee.

The meaning of an idiom such as leave well enough alone is in no transparent way composed of the meanings of its parts.
Although there are many interesting syntactic properties and
constraints on the use of idiomatic expressions in languages,
from a semantic point of view, the meaning of an idiom is clearly associated
with the entire phrase. Hence, the logical form for (52),
(52) Every person kicked the bucket.

will make reference to quantification over persons, but not over buckets (cf. [53]).
(53) ∀x[person(x) → kick.the.bucket(x)]

We confront another kind of noncompositionality in semantics when predicates seem to appear with arguments of
the wrong type. For example, in (54a), a countable individual entity is being coerced into the food associated with that
animal, namely, bits of chicken, while in (54b), the mass terms
water and beer are being packaged into unit measures (Pelletier 1975). In (55), the aspectual verbs normally select for an event,
but here are coercing entities into event denotations. Similarly,
in (56), both object NPs are being coerced into propositional
interpretations. (Cf. Pustejovsky 1995 and Jackendoff 2002 for
discussions of coercion phenomena and their treatment.)
(54) a. Theres chicken in the salad.
b. Well have a water and two beers.
(55) a. Roser finished her thesis.
b. Mary began the novel.
(56) a. Mary believes John's story.
b. Mary believes John.

These examples illustrate that semantics must accommodate specific type-shifting and coercing operations in the language
in order to remain compositional. In order to explain just such
cases, Pustejovsky (2007) presents a general theory of composition that distinguishes between four distinct modes of argument selection: a) function application, b) accommodation,
c) coercion by introduction, and d) coercion by exploitation.
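A minimal sketch of coercion might look as follows. The dictionary of default activities is a hypothetical stand-in for the richer lexical information (e.g., the qualia of Pustejovsky 1995) that an actual account would consult: an aspectual verb selects an event, and when handed an entity it shifts that entity to an associated event.

```python
# Sketch (hypothetical types and defaults, not Pustejovsky's formalism):
# an aspectual verb selects an event; an entity argument is coerced.

DEFAULT_EVENT = {
    "novel": "read(the-novel)",    # begin the novel -> begin reading it
    "coffee": "drink(a-coffee)",   # enjoy a coffee -> enjoy drinking it
}

def coerce_to_event(entity_name):
    # Shift an entity to an event conventionally associated with it.
    return DEFAULT_EVENT.get(entity_name, "do-something-with(" + entity_name + ")")

def begin(arg_type, arg_name):
    if arg_type == "event":
        return "begin(" + arg_name + ")"               # direct selection
    return "begin(" + coerce_to_event(arg_name) + ")"  # coercion

assert begin("event", "reading") == "begin(reading)"
assert begin("entity", "novel") == "begin(read(the-novel))"
```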

4 DISCOURSE STRUCTURE
Thus far we have been concentrating on the meaning of single
sentences. But no sentence is really ever uttered outside of a
context. Language is used as a means of communication and is
as much a way of acting as a means of representing (Austin 1975;
Searle 1969). In this section, we briefly survey the major areas
of research in discourse semantics. We begin by examining the semantic models that have emerged to account for
dynamic phenomena in discourse, such as intersentential anaphora. We then look at how discourse relations can be used
to model larger units of meaning.
From our previous discussion, we have assumed the sentence as the unit for semantic interpretation, including the
level for the interpretation of quantifier scope and anaphoric
binding, as in (57).
(57) a. Every actress said she was happy.
b. Every actress came in and said hello.

Notice that the anaphoric link between the quantifier and the
pronoun in (57a) is acceptable, while such a binding is not possible within a larger discourse setting, as in (58) and (59).
(58) a. Every actress came in.
b. *She said she was happy.
(59) a. Every actress came in.
b. *She said hello.

So, in a larger unit of semantic analysis, a bound variable interpretation of the pronoun does not seem to be permitted.
Now notice that indefinites do in fact allow binding across the level of the sentence.
(60) a. An actress came in.
b. She said hello.

The desired interpretation, however, is one that the semantic model we have sketched out is unable to provide.
(61) a. ∃x[actress(x) & come.in(x)]
b. [& say.hello(x)]

What this example points out is that the view of meaning we have been working with so far is too static to account for phenomena that are inherently dynamic in nature (Chierchia 1995;
Groenendijk and Stokhof 1991; Karttunen 1976). In this example, the indefinite NP an actress is being used as a discourse
referent, and is available for subsequent reference as the story
unfolds in the discourse.
Following Kamp and Reyle's (1993) view, an indefinite NP introduces a novel discourse referent, while a pronoun or definite description says something about an existing discourse referent. Using the two notions of novelty and familiarity, we can explain why she in (60b) is able to bind to the indefinite; namely, she looks for an accessible discourse referent, the indefinite. The reason that (58) and (59) are not good discourses
is due to the universally quantified NP every actress, which is
inaccessible as an antecedent to the pronoun.
One influential formalization of this approach is Dynamic
Predicate Logic (Groenendijk and Stokhof 1991), which combines conventional interpretations of indefinites as existentials with the insight from incremental interpretations, mentioned previously. On this view, the interpretation of a sentence is a function of an ordered pair of assignments, rather than a static single assignment. The output condition for a sentence with an indefinite NP, such as (60a), specifies that a subsequent sentence with a pronoun can share that variable assignment: "The meaning of a sentence lies in the way it changes the representation of the information of the interpreter" (ibid.). That is, when
a quantified expression is used in discourse, something new is added to the listener's interpretation state so that the listener can use the quantifier to help understand future utterances. In this way, the meaning of a sentence is interpreted
dynamically.
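The novelty/familiarity bookkeeping behind these contrasts can be sketched in a few lines. The class below is a toy simplification (an assumption of the sketch, not the DRT or DPL formalism itself): indefinites add an accessible discourse referent, universals do not, and a pronoun resolves only to an accessible referent.

```python
# Toy discourse state: indefinites introduce accessible referents; a
# universally quantified NP does not leave one behind; pronouns resolve
# only to accessible referents.

class Discourse:
    def __init__(self):
        self.referents = []          # accessible referents, most recent last

    def indefinite(self, noun):
        self.referents.append(noun)  # (60a) "An actress came in."

    def universal(self, noun):
        pass                         # (58a) "Every actress came in." -- no referent escapes

    def pronoun(self):
        if not self.referents:
            raise ValueError("no accessible antecedent")
        return self.referents[-1]

d1 = Discourse()
d1.indefinite("actress")
assert d1.pronoun() == "actress"     # (60b) "She said hello." -- binds

d2 = Discourse()
d2.universal("actress")
try:
    d2.pronoun()                     # (58b) "*She said hello."
    bound = True
except ValueError:
    bound = False
assert not bound
```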
The dynamics of discourse, of course, involve more than the
binding of anaphors to antecedents across adjacent sentences.
Every utterance is made in the context of a common ground
of shared knowledge (presuppositions), with a communicative intent, and in a particular time and place (cf. discourse
analysis, communicative intention). Just as sentences
have internal structure, with both syntactic and semantic
dependencies, discourse can also be viewed as a sequence of
structured segments, with named dependencies between them.
For example, the sentences in (62) form a discourse structured
by a relation of narration, implying temporal sequence (Dowty 1986).
(62) a. John entered the room.
b. He sat down.

In (63), on the other hand, the two sentences are related by the
dependency of explanation, where (63b) temporally precedes
and explains (63a).
(63) a. Max fell.
b. John pushed him.

Theories of discourse relations, such as rhetorical structure theory (Mann and Thompson 1986), segmented discourse representation theory (SDRT) (Asher and Lascarides 2003),
and that of Hobbs (1985) attempt to model the rhetorical functions of the utterances in the discourse (hence, they are more
expressive of discourse structure and speaker intent than discourse representation theory [DRT], which does not model
such parameters). For the simple discourses above, SDRT, for example, extends the approach from dynamic semantics with rhetorical relations and their semantic values, while providing a more complex process of discourse updates. Rhetorical relations, as used in SDRT, carry specific types of illocutionary force (cf. Austin 1975; Searle 1969, 1979), namely, explanation, elaboration, giving backgrounds, and describing results.
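The effect of the two relations on temporal order can be sketched with a toy classifier. The causal_cue flag is a hypothetical stand-in for the lexical and world knowledge that SDRT actually brings to bear; the point is only that the inferred relation fixes the temporal order, as in (62) and (63).

```python
# Toy sketch (not SDRT's actual inference): the rhetorical relation between
# two clauses determines the temporal order of the events they describe.

def relation(s1, s2, causal_cue=False):
    if causal_cue:
        # Explanation: s2 gives the cause, so it temporally precedes s1 (63).
        return {"relation": "Explanation", "order": (s2, s1)}
    # Narration: events are narrated in order, so s1 precedes s2 (62).
    return {"relation": "Narration", "order": (s1, s2)}

r1 = relation("John entered the room", "He sat down")
assert r1["relation"] == "Narration"
assert r1["order"] == ("John entered the room", "He sat down")

r2 = relation("Max fell", "John pushed him", causal_cue=True)
assert r2["relation"] == "Explanation"
assert r2["order"] == ("John pushed him", "Max fell")
```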

5 CONCLUSION
In this essay, I have attempted to outline the basic components
for a theory of linguistic meaning. Many areas of semantics were
not touched on in this overview, such as issues relating to the
philosophy of language and mind and the psychological consequences of various semantic positions. Many of the accompanying entries herein, however, address these issues directly.
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Apresjan, J. D. 1973. Synonymy and synonyms. In Trends in Soviet Theoretical Linguistics, ed. F. Kiefer, 173–99. Dordrecht, the Netherlands: Reidel.
Asher, Nicholas, and Alex Lascarides. 2003. Logics of Conversation.
Cambridge: Cambridge University Press.
Austin, J. L. 1975. How to Do Things with Words. Cambridge: Harvard
University Press.

The Structure of Meaning


Barendregt, Henk. 1984. The Lambda Calculus, Its Syntax and
Semantics. Amsterdam: North-Holland.
Barwise, Jon, and Robin Cooper. 1981. Generalized quantifiers and natural language. Linguistics and Philosophy 4.1: 159–219.
Borschev, Vladimir, and Barbara H. Partee. 2001. Genitive modifiers, sorts, and metonymy. Nordic Journal of Linguistics 24.2: 140–60.
Bouillon, P. 1997. Polymorphie et sémantique lexicale: Le cas des adjectifs. Lille: Presses Universitaires du Septentrion.
Bouillon, P., and F. Busa, eds. 2001. The Language of Word Meaning.
Cambridge: Cambridge University Press.
Bresnan, J., ed. 1982. The Mental Representation of Grammatical
Relations. Cambridge, MA : MIT Press.
Briscoe, T., V. de Paiva, and A. Copestake, eds. 1993. Inheritance,
Defaults, and the Lexicon. Cambridge: Cambridge University Press.
Busa, F. 1996. Compositionality and the semantics of nominals. Ph.D.
diss., Brandeis University.
Carlson, G. 1984. Thematic roles and their role in semantic interpretation. Linguistics 22: 259–79.
Carpenter, B. 1992. The Logic of Typed Feature Structures.
Cambridge: Cambridge University Press.
Carter, R. [1976] 1988. On Linking: Papers by Richard Carter. Ed. B.
Levin and C. Tenny. MIT Lexicon Project Working Papers 25, Center
for Cognitive Science. Cambridge, MA : MIT Press.
Chierchia, G. 1989. Structured meanings, thematic roles, and control. In Properties, Types, and Meaning, ed. G. Chierchia, B. Partee, and R. Turner, II: 131–66. Dordrecht, the Netherlands: Kluwer Academic Publishers.
. 1995. The Dynamics of Meaning. Chicago: University of Chicago
Press.
Chomsky, N. [1955] 1975. The Logical Structure of Linguistic Theory.
Chicago: University of Chicago Press.
. 1965. Aspects of the Theory of Syntax . Cambridge, MA : MIT
Press.
Comrie, Bernard. 1981. Language Universals and Linguistic
Typology: Syntax and Morphology. Chicago: University of Chicago
Press.
Cooper, Robin. 2006. A record type theoretic account of copredication and dynamic generalized quantification. In Kvantifikator for en Dag, Essays Dedicated to Dag Westerstahl on His Sixtieth Birthday. Available online at: http://www.phil.gu.se/posters/festskrift3/.
Copestake, A. 1992. The Representation of Lexical Semantic Information .
Cognitive Research Paper CSRP 280, School of Cognitive and
Computing Science, University of Sussex , Brighton, England.
. 1993. Defaults in Lexical Representation. In Inheritance,
Defaults, and the Lexicon, ed. Ted Briscoe, Valeria de Paiva, and Ann
Copestake, 223–45. Cambridge: Cambridge University Press.
Copestake, A., and E. Briscoe. 1992. Lexical operations in a unification-based framework. In Lexical Semantics and Knowledge
Representation, ed. J. Pustejovsky and S. Bergler, 101–19. New
York : Springer Verlag.
Copestake, A., and T. Briscoe. 1995. Semi-productive polysemy and
sense extension. Journal of Semantics 12: 15–67.
Davidson, D. 1967. The logical form of action sentences. In The Logic
of Decision and Action, ed. N. Rescher, 81–95. Pittsburgh: Pittsburgh
University Press.
Davidson, D., and G. Harman, eds. 1972. Semantics of Natural
Language. Dordrecht, the Netherlands: Reidel.
Davis, Steven, and Brendan Gillon. 2004. Semantics: A Reader.
Oxford: Oxford University Press.
Dölling, J. 1992. Flexible Interpretationen durch Sortenverschiebung. In Fügungspotenzen, ed. Ilse Zimmermann and Anatoli Strigin, 23–62. Berlin: Akademie Verlag.

Dowty, D. R. 1979. Word Meaning and Montague Grammar. Dordrecht, the Netherlands: D. Reidel.
. 1986. The effects of aspectual class on the temporal structure of discourse: Semantics or pragmatics. Linguistics and Philosophy 9.1: 37–61.
. 1991. Thematic proto-roles and argument selection. Language 67: 547–619.
Evans, R., and G. Gazdar. 1990. The DATR papers: February 1990.
Cognitive Science Research Paper CSRP 139, School of Cognitive
and Computing Science, University of Sussex, Brighton, England.
Fillmore, C. 1968. The case for case. In Universals in Linguistic
Theory, ed. E. Bach and R. Harms, 1–88. New York: Holt, Rinehart,
and Winston.
Gazdar, G., E. Klein, G. Pullum, and I. Sag. 1985. Generalized Phrase
Structure Grammar. Cambridge: Harvard University Press.
Goldberg , A. E. 1995. Constructions: A Construction Grammar Approach
to Argument Structure. Chicago: University of Chicago Press.
Grice, H. P. 1969. Utterer's meaning and intentions. Philosophical Review 78: 147–77.
. 1989. Studies in the Way of Words. Cambridge: Harvard University
Press.
Grimshaw, J. 1979. Complement selection and the lexicon. Linguistic Inquiry 10: 279–326.
. 1990. Argument Structure. Cambridge, MA : MIT Press.
Groenendijk , Jeroen, and Martin Stokhof. 1991. Dynamic predicate
logic. Linguistics and Philosophy 14.1: 39–100.
Gruber, J. S. 1976. Lexical Structures in Syntax and Semantics.
Amsterdam: North-Holland.
Gunter, C. 1992. Semantics of Programming Languages. Cambridge,
MA : MIT Press.
Hale, K. and S. J. Keyser. 1993. On argument structure and the lexical expression of syntactic relations. In The View from Building
20: Essays in Honor of Sylvain Bromberger, ed. K. Hale and S. J. Keyser,
53–109. Cambridge, MA: MIT Press.
Heim, Irene. 1982. The semantics of definite and indefinite noun phrases.
Ph.D. thesis, University of Massachusetts, Amherst.
Higginbotham, J. 1989. Elucidations of meaning. Linguistics and
Philosophy 12: 465–517.
Hobbs, Jerry R. 1985. On the coherence and structure of discourse.
Report No. CSLI-85-37, Center for the Study of Language and
Information, Stanford University.
Jackendoff, R. 1972. Semantic Interpretation in Generative Grammar.
Cambridge, MA : MIT Press.
. 1983. Semantics and Cognition. Cambridge, MA : MIT Press.
. 1990. Semantic Structures. Cambridge, MA: MIT Press.
. 1992. Babe Ruth homered his way into the hearts of America.
In Syntax and the Lexicon, ed. T. Stowell and E. Wehrli, 155–78. San
Diego: Academic Press.
. 1997. The Architecture of the Language Faculty. Cambridge, MA:
MIT Press.
. 2002. Foundations of Language. Oxford: Oxford University
Press.
Kamp, H. 1981. A theory of truth and semantic representation. In
Formal Methods in the Study of Language, ed. J. A. G. Groenendijk ,
T. M. V. Janssen, and M. B. J. Stokhof. Mathematical Centre Tracts
135, 277–322. Amsterdam: Mathematical Centre.
Kamp, H., and U. Reyle. 1993. From Discourse to Logic. Dordrecht, the
Netherlands: Kluwer Academic Publishers.
Karttunen, L. 1974. Presupposition and linguistic context. Theoretical
Linguistics 1: 181–93.
. 1976. Discourse referents. In Syntax and Semantics, ed. James
D. McCawley. Vol. 7: Notes from the Linguistic Underground, 363–85.
New York : Academic Press.



Katz, J., and J. Fodor. 1963. The structure of a semantic theory.
Language 39: 170–210.
Krifka, M. 1992. Thematic relations as links between nominal reference
and temporal constitution. In Lexical Matters, CSLI Lecture Notes, ed.
I. Sag and A. Szabolcsi, 29–53. Chicago: University of Chicago Press.
Kripke, Saul. [1972] 1980. Naming and necessity. In Semantics
of Natural Language, ed. D. Davidson and G. Harman, 253–355.
Dordrecht and Boston: Reidel.
Lakoff, G. [1965] 1970. Irregularity in Syntax . New York : Holt, Rinehart,
and Winston.
Levin, B., and M. Rappaport Hovav. 1995. Unaccusativity: At the Syntax–Semantics Interface. Cambridge, MA: MIT Press.
. 2005. Argument Realization. Cambridge: Cambridge University Press.
Lewis, David. 1976. General semantics. In Montague Grammar, ed.
Barbara H. Partee, 1–50. New York: Academic Press.
Lyons, John. 1968. Introduction to Theoretical Linguistics.
Cambridge: Cambridge University Press.
Mann, William C., and Sandra A. Thompson. 1986. Rhetorical Structure Theory: Description and Construction of Text Structures. ISI/RS-86-174, 1–15. Nijmegen, the Netherlands: Information Sciences Institute.
May, Robert. 1985. Logical Form: Its Structure and Derivation.
Cambridge, MA : MIT Press.
McCawley, J. D. 1968. The role of semantics in a grammar. In
Universals in Linguistic Theory, ed. E. Bach and R. T. Harms, 124–69.
New York : Holt, Rinehart, and Winston.
Moens, M., and M. Steedman. 1988. Temporal ontology and temporal
reference. Computational Linguistics 14: 15–28.
Montague, Richard. 1973. The proper treatment of quantification in ordinary English. In Approaches to Natural Language, ed. Jaakko Hintikka, Julius Moravcsik, and Patrick Suppes, 221–42. Dordrecht, the Netherlands: D. Reidel. Repr. in Montague 1974, 247–70.
. 1974. Formal Philosophy: Selected Papers of Richard Montague.
New Haven, CT: Yale University Press.
Moravcsik, J. M. 1975. Aitia as generative factor in Aristotle's philosophy. Dialogue 14: 622–36.
. 1990. Thought and Language. London: Routledge.
Nunberg, G. 1979. The non-uniqueness of semantic solutions: Polysemy. Linguistics and Philosophy 3: 143–84.
Parsons, T. 1990. Events in the Semantics of English. Cambridge, MA:
MIT Press.
Partee, Barbara. 1984. Compositionality. In Varieties of Formal
Semantics, ed. Fred Landman and Frank Veltman, 281–311.
Dordrecht, the Netherlands: Foris.

Partee, Barbara, ed. 1976. Montague Grammar. New York: Academic Press.
Partee, B., and M. Rooth. 1983. Generalized conjunction and type
ambiguity. In Meaning, Use, and Interpretation of Language, ed.
Rainer Bäuerle, Christoph Schwarze, and Arnim von Stechow, 361–83. Berlin: Walter de Gruyter.
Pelletier, F. J. 1975. Non-singular reference: Some preliminaries. In Mass Terms: Some Philosophical Problems, ed. F. J. Pelletier, 1–14.
Dordrecht, the Netherlands: Reidel.
Pollard, C., and I. Sag. 1994. Head-Driven Phrase Structure Grammar.
Chicago: University of Chicago Press and Stanford CSLI.
Potts, Christopher. 2005. The Logic of Conventional Implicatures.
Oxford: Oxford University Press.
Pustejovsky, J. 1991a. The generative lexicon. Computational
Linguistics 17: 409–41.
. 1991b. The syntax of event structure. Cognition 41: 47–81.
. 1995. The Generative Lexicon. Cambridge, MA: MIT Press.
. 1998. The semantics of lexical underspecification. Folia
Linguistica 32: 323–47.
. 2007. The Multiplicity of Meaning. Cambridge, MA : MIT Press.
Pustejovsky, J., and B. Boguraev. 1993. Lexical knowledge representation and natural language processing. Artificial Intelligence
63: 193–223.
Recanati, Francois. 2002. Unarticulated constituents. Linguistics
and Philosophy 25: 299–345.
Russell, Bertrand. 1905. On denoting. Mind 14: 479–93.
Saussure, Ferdinand de. [1916] 1983. Course in General Linguistics.
Trans. R. Harris. London: Duckworth.
Searle, John. 1969. Speech Acts: An Essay in the Philosophy of Language.
Cambridge: Cambridge University Press.
. 1979. Expression and Meaning. Cambridge: Cambridge
University Press.
Stalnaker, Robert. 1970. Pragmatics. Synthese 22.1/2: 272–78.
Strawson, P. F. 1971. Logico-Linguistic Papers. London: Methuen.
Tenny, C. 1992. The Aspectual Interface Hypothesis. In Lexical
Matters, CSLI Lecture Notes, ed. I. Sag and A. Szabolcsi, 1–27.
Chicago: University of Chicago Press.
Tenny, C., and J. Pustejovsky. 2000. Events as Grammatical Objects.
Stanford, CA: CSLI Publications. Chicago: University of Chicago
Press.
Weinreich, U. 1964. Webster's Third: A critique of its semantics. International Journal of American Linguistics 30: 405–9.
. 1972. Explorations in Semantic Theory. The Hague: Mouton.
Williams, E. 1981. Argument structure and morphology. Linguistic
Review 1: 81–114.

4
SOCIAL PRACTICES OF SPEECH AND WRITING
Florian Coulmas

INTRODUCTION
Language is constitutive for human society. As a social fact it cannot be thought of in the abstract, for the medium of communication is what allows it to serve social functions. The nature of the social relationship that exists by virtue of language partially depends on the externalization of language, that is, on how it is transmitted from one actor to another as speech, writing, sign, or Braille. The anatomy of speech organs (cf. Liberman and Blumstein 1988) provides the biological foundation of human society in the most general sense, which is why oral speech is considered fundamental for socialization both in the phylogenetic and ontogenetic sense. But unless we study human society like that of other primates from the point of view of physical anthropology, other forms of language externalization must also be taken into account as communication potential from the beginning. There are two reasons for this. One is that the invention of writing (sign language, Braille) cannot be undone. The other, which follows therefrom, is that writing has brought about basic changes in the nature of human communication. It brought in its wake a literate mindset that cannot be reversed. Research about language in literate societies is carried out by researchers who, growing up, were socialized into a literate world organized by and large on the basis of literate principles. It is not fortuitous, therefore, that social practices of speech and writing are dealt with here under one heading.
The scientific enterprise in general, linguistics in particular, is a social practice involving speech and writing. Even the investigation of unwritten languages happens against the background of literate society and by means of the tools developed for what Goody (1977, 151) felicitously called the "technology of the intellect." For the language sciences, it is important to keep in mind that it is not just the technicians who use a tool to do what they need to do and want to do, but that the tool restricts what can be done. This holds true for the hardware, that is, the writing implements, as well as for the software, the code or the writing systems.
The social aspects of speech and writing encompass a wide range of topics, many of which are dealt with in other entries of this encyclopedia (sociolinguistics, discourse analysis, diglossia, culture and language, digital media, literacy). This essay is focused on the issue of the media of communication and the social conditions and consequences of their evolution. The reason is that the social practices of speech and
writing both depend on the available technology and lead to technological and social innovation. As has been argued by Marshall
McLuhan (1964), Elizabeth L. Eisenstein (1979), Jan Assmann
(1991), David R. Olson (1994), and Nicholas Negroponte (1995)
among others, civilizations are characterized by the media they
predominantly use and which shape the way they exchange,
store, and administer information, thus exercising a profound
influence on social practice.
The nexus between speech and writing is variable and more
obvious in some cases than in others. For instance, when the
lyrics of a song are read on the monitor and sung in a karaoke
bar, speech and writing are joined together in one activity. On
the other hand, the songs of bards handed down by word of
mouth from one generation to another are part of the culture of
spoken, as opposed to written, language (Ong 1982; Olson 1991;
oral culture, oral composition). However, the very idea
of orality is predicated on literacy and would not have become
an object of study without it. Just as there is no silence without
sound, illiteracy exists but in a literate society. On the face of it,
many kinds of verbal behavior, such as speech contests, bidding
at an auction, and election campaign addresses, do not involve
writing. The institutional frameworks in which they take place in modern society (school, trade, and government), though, rely to a very large extent on written texts. To analyze social practices of
speech and writing, then, it is necessary to consider technological aspects of writing and institutional aspects of literacy.

TECHNOLOGICAL ASPECTS OF WRITING


Many social practices and ligatures of contemporary society
would be impossible without writing. This does not imply that the
externalization of language by technological means is the only
force that shaped modern society. The assumption of an unmediated cause-and-effect relationship between writing and social
organization, of a watershed between primitive oral life and complex literate civilization, is a simplification that fails to do justice
to the complexity of the interaction. It is surely tempting to argue
that what all great early civilizations had in common was writing
and that it was hence writing that caused complex societies to
come into existence. However, if we look at the uses of writing
in early civilizations, many differences are apparent. For example, economic administration was preeminent in Mesopotamia
(Nissen, Damerow, and Englund 1990), whereas cult stood out
in Egypt (Assmann 1991). In both cases, it is untenable to argue
that accounting and the cult of the dead, respectively, were an
outflow of the invention of writing. Yet the opposite proposition,
claiming that the demands of bookkeepers and priests led to the
creation of writing, is no less simplistic.
Similarly, the invention of the printing press and movable type has often been seen as a technological breakthrough
with vast social consequences (Febvre and Martin [1958] 1999;
Eisenstein 1979). In our day, the digital turn (Fischer 2006),
described as the third media revolution after chirographic
culture (Schmandt-Besserat 1992) and print culture (Olson 1994), is regarded as a driving force of globalization
(Kressel 2007). Both of these propositions are defensible, but
not in a unidirectional, monocausal sense. Equally true are the
opposite propositions, that socioeconomic developments led to
the emergence of a larger reading public, thus paving the way for
a simpler and more efficient reproduction technology than the
copying of manuscripts, and that modern industrial society with
mass participation generated pressure for the development of a
technology of mass dissemination of information. The invention
of writing facilitated complex social organization, and the printing press was conducive to the spread of education. However,
writing has been a common possession of humanity for more
than 5,000 years and the printing press for half a millennium, if
we disregard the use of cast-metal movable type in Korea in the
early thirteenth century.
Yet we are living in a world with hundreds of millions of adult
illiterates, even, or rather particularly, where writing first emerged,
that is, in Mesopotamia, in Egypt, in China, and in Mesoamerica.
According to UNESCO (2006), there were 781 million adult illiterates worldwide and 100 million school-age children not attending school in 2005. In spite of the uneven distribution of illiterates
in the world, these figures suffice to discredit the notion that a
new technology of communication of and by itself brings about
social change. Economic development, social structure, ethnic
and linguistic composition, fecundity, ideology, and tradition
are intervening variables that determine how a society makes use
of and adjusts to a new technology. It is necessary, therefore, to
reckon with the contemporaneity of different levels of development, different technologies, and different literacies. Assuming
a dialectic relationship of mutual influence between writing and
social change is a more promising approach for understanding
the transition from oral to literate society.
New technologies both respond to practical needs and create
new practices. Any technology is an artifact, but to conclude that
its creators rule over it is a fallacy, for the applications of technological innovations are often recognized not in advance but after
the fact, when they have been used for some time. Like the genie
let out of the bottle, they may have unplanned and sometimes
unwelcome consequences. The material and functional properties of writing technologies determine their potential uses, which,
however, are not necessarily evident at the outset.
The locus of writing is the city. Even a superficial look at
present-day urban environments reveals that city dwellers are
surrounded by written messages wherever they go. Of late, this
has given rise to a new branch of scholarship known as linguistic landscape research (Landry and Bourhis 1997; Backhaus
2007), as it were, a social epigraphy for posterity. The variety
of writing surfaces on which the literate culture of modern
cityscapes manifests itself is striking. It testifies to the traces of
history in the present and to the contemporaneity of different
stages of development, for it includes some of the oldest materials used for writing side by side with the most recent devices.
This contemporaneity is one of the foremost characteristics
of writing. For writing arrests change and enables accumulation of information. Some genuine monuments from antiquity
speak to us today, such as the Egyptian obelisk of Ramses II of
the 19th Dynasty, 1304–1237 b.c.e., re-erected on the Place de la
Concorde in the center of Paris. Around the corner, the passerby can read the latest stock quotes off a scrolling light-emitting diode (LED) display. Brand-new buildings are adorned with the
old technique of cutting messages in stone. Stelae with commemorative inscriptions, gravestones, and buildings bearing
the names of their owners or occupants are still being put up,
much as in ancient times. There are hardly any material objects
to which writing cannot be affixed. Since the earliest times of
literate culture, few have been discarded and many added. The
choice continues to expand. Hard surfaces made for endurance
are stone, marble, metal, ceramics, wood, and, today, plastics.
Inscriptions are incised, engraved, etched, carved, and chiseled
into them as they were in the past, and malleable surfaces such
as moist clay and molten metal are impressed or molded into
shape.
In addition to monumental inscriptions, writing is found on
various other surfaces, such as whitewashed walls, street signs,
posters, billboards, handbills, notice boards, memorial plaques,
cloth, clothing, commercials carried around by sandwichmen
and mounted on trucks, advertising pillars, buses and other
vehicles covered with commercials, shop windows, and digital
display panels. These and some other surfaces, such as palm
leaves, papyrus, parchment, and wax tablets that have gone
out of fashion, are variously suitable for realizing the functional
potential of writing.
Two fundamental functions of writing are memory support
and communication. They are not mutually exclusive, but different surfaces lend themselves better to one than to the other.
Hard surfaces answer the requirement of durability. They are
inscribed only once, but with a lasting trace that can be recovered after years, decades, even millennia. Baked clay tablets,
the hallmark of cuneiform civilization, and mural inscriptions
on Egyptian monuments embody this type. Memory is in time
turned into history, the recording of the past and the collection
of knowledge, which are the cardinal functional characteristics of this technology. Inscriptions on hard surfaces are, of
course, also communicative but stationary. Clay tablets can be
transported in limited numbers only, and monumental inscriptions have to be visited to be read. In order to allow written
signs to travel and thus to realize a potential that fundamentally distinguishes writing from speech, freeing the message
from the copresence of sender and receiver, lighter materials
are needed. In antiquity, three main writing surfaces met this
requirement: papyrus, parchment, and paper.
For millennia, Egypt was practically the only producer of
papyrus because the reed of which it is made grows in abundance along the banks of the Nile. The papyrus scroll hieroglyph
is attested in the most ancient known Egyptian inscriptions, and
the oldest papyrus fragments covered with writing date from the
third millennium b.c.e. Papyrus came to be commonly used for
documentary and literary purposes throughout Greece, Asia
Minor, and the Roman Empire. As of the fourth century c.e.,
parchment (processed animal hide), a more durable writing
material than the brittle papyrus, began to be more widely used in
Europe, where the scroll was gradually edged out by the book in
codex form (Roberts and Skeat 1983). The word paper is derived
from papyrus, but paper making is quite different from papyrus
making. It was invented by the Chinese some 1,900 years ago
(Twitchett 1983). The earliest Chinese documents on paper date

Social Practices of Speech and Writing


from the second century c.e. In the wake of the Islamic expansion to Central Asia, the Arabs acquired the paper-making technology in the eighth century c.e., which they in turn introduced
to Europe in the eleventh century. Relatively cheap, flexible, and
convenient to carry, paper replaced parchment as the principal
writing surface in Europe and in other parts of the world.
Since its invention in China, paper, which Pierre-Marc De
Biasi (1999) called "the greatest invention of all time," gave a
boost to the production of written text wherever it was introduced. In China, it was used for block printing as of the seventh
century. In the tenth century, the entire Buddhist scripture was
printed using 130,000 printing blocks (Taylor and Taylor 1995,
156). Paper was the first writing material that spread around the
world. In the West, Johannes Gutenberg's invention of printing with movable type would hardly have had the same impact
without it. Of the 180 copies of the Bible he printed, 150 were on
paper and only 30 on parchment, one indication of the importance of paper for the dissemination of written text. Its position
in this regard is undiminished. The paperless office is far from a
reality even in the most advanced countries; rather, many ancillary devices that presuppose writing on paper form the mainstay
of the thriving office machines industry: printers, scanners, copiers, and fax machines. Although nowadays paper holds only a
tiny fraction of all new information released, it is still the unchallenged surface for the formal publication of information. World
paper consumption for information storage and distribution is at
an all-time high. Notwithstanding the shift of many periodicals
and scholarly journals to online publication, most continue to
be printed on paper for archival purposes, for paper has a much
longer duration of life than can be guaranteed for any digital
storage medium.
This brings to light a more general trade-off of information
processing. Weight and storage capacity are inversely related. A
tablet measuring about 10 × 10 cm is the ideal size for writing on wet clay. It holds about 300 characters. Depending on the thickness of the tablet, this yields an information/weight ratio of 0.2 kg to 1 kg per tablet, that is, per 300 characters. A text of 300,000 characters would thus weigh between 200 kg and 1,000 kg. In modern terms, that would
be a short book of fewer than 190 pages, assuming an information density of 1,600 characters per page. Give it a solid cover
and it comes to a total of 250 g. With respect to the information/
weight ratio, paper thus outperforms clay by a factor of 4,000.
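The arithmetic behind these figures can be replayed in a few lines. The sketch below reads the 0.2 kg to 1 kg figure as the weight of a single 300-character tablet, the reading under which the 200 kg to 1,000 kg total and the factor of 4,000 both come out right; all other numbers are the ones given above.

```python
# Back-of-envelope check of the clay-versus-paper information/weight
# comparison (all weights in grams so the arithmetic stays exact).

CHARS_PER_TABLET = 300          # characters on a 10 x 10 cm clay tablet
TABLET_WEIGHT_G = (200, 1000)   # thin vs. thick tablet, in grams
TEXT_CHARS = 300_000            # length of the sample text

tablets = TEXT_CHARS // CHARS_PER_TABLET        # tablets needed
clay_g = tuple(w * tablets for w in TABLET_WEIGHT_G)

CHARS_PER_PAGE = 1_600          # modern information density per page
pages = TEXT_CHARS / CHARS_PER_PAGE             # fewer than 190 pages
BOOK_WEIGHT_G = 250             # the same text as a bound book

ratio = clay_g[1] / BOOK_WEIGHT_G               # thick-tablet case
print(f"{tablets} tablets, {clay_g[0] / 1000}-{clay_g[1] / 1000} kg of clay, "
      f"{pages:.1f} pages, paper wins by a factor of {ratio:.0f}")
```

Running it confirms the chain of figures in the text: 1,000 tablets weighing 200 kg to 1,000 kg against a 250 g book of 187.5 pages, a factor of 4,000 in the heavy-tablet case.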
Such a rough-and-ready calculation may suffice to illustrate
the point. Papyrus was similarly superior to clay with regard to
storage capacity and transportability; however, many more clay
tablets than papyrus documents have come down to us through
the millennia. How many papyrus rolls were lost in the legendary
blaze of the library of Alexandria is not known, but when in 2004
a fire broke out in the Anna Amalia Library in Weimar, 50,000
books were destroyed, many of them unique or rare. Another
65,000 volumes were severely damaged by fire and water. Baked
clay tablets would have withstood the flames and the water used
to put them out.
This line of thought can be extended into the digital age
by another calculation. Computer technology has exponentially increased information storage density. The 50,000 burnt
books of the Anna Amalia Library took up some 1,660 meters of
shelf space. Assuming an average of 300 pages per book, their

digitized contents would require some 750 gigabytes (GB) of storage space, which easily fits on an external hard disk the size of
a small book. As compared to print, digital information storage
thus reduces the necessary physical space by a factor of 50,000.
Again, this is a coarse measure only. There is considerable
variation in the bytes per book page, both in actual fact and
in model calculations, but the correlation between information amount and storage space for print and digital media emerges from it nonetheless.
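The storage side of the comparison can be sketched the same way. The per-page byte rate below is not stated in the text; it is an assumption inferred from the 750 GB total, and is consistent with scanned pages rather than plain text (plain text at 1,600 characters per page would need far less).

```python
# Rough reconstruction of the digitization figures for the 50,000 burnt
# books of the Anna Amalia Library. BYTES_PER_PAGE is an inferred
# assumption, not a figure the text states.

BOOKS = 50_000
PAGES_PER_BOOK = 300            # average assumed in the text
BYTES_PER_PAGE = 50_000         # ~50 KB per scanned page (inferred)

total_gb = BOOKS * PAGES_PER_BOOK * BYTES_PER_PAGE / 1e9

SHELF_M = 1_660                 # shelf space the books occupied
REDUCTION = 50_000              # space-reduction factor claimed in the text
disk_cm = SHELF_M / REDUCTION * 100   # linear extent left for the disk

print(f"{total_gb:.0f} GB; shelf shrinks from {SHELF_M} m to {disk_cm:.2f} cm")
```

The implied result, 750 GB occupying a few centimeters instead of 1,660 meters of shelving, matches the hard-disk-sized device and the factor of 50,000 mentioned above.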
In sum, as clay was followed by paper and paper by digital storage media, information density per square centimeter
increased exponentially, while the weight and size of written
records decreased. It became, accordingly, ever easier to store
and to transmit written text with many consequences for reading and writing behavior, for reproducing text, and for the role
texts play in everyday life. What the history of writing shows is
that new technologies do not always replace old ones. Rather,
the new supplements the old and often transforms its use. For
instance, parchment was marginalized by paper but for centuries never completely driven out; print never replaced handwriting and has not become obsolete by texts typed on a cell phone
keypad. Advances in writing technology have greatly expanded
the repertoire of tools that humanity has acquired for handling
information in the form of written language. The fact that old
technologies continue to be used side by side with new ones
testifies not just to inertia and path dependency but also to the
different properties of the materials used. For centuries after
the introduction of paper, it was considered too feeble a material to hold contracts and other important documents, which
were preferably executed on parchment. Similarly, although it
is technically possible to keep birth registers as electronic files
only, birth certificates on paper continue to be issued. One of the
reasons is that digital information storage is subject to physical
decay and technical obsolescence not less but much more than predecessor media.
Writing has made record keeping for posterity and accumulation of knowledge possible. However, with the introduction of
every new writing material, the storage problem that it seemed to
solve became more acute. An archive of records on paper takes
up much less space than one for clay tablets, but it is beset by
dangers that pose no threat to baked clay: dust, humidity, insects,
fire, and water. Theft, too, is a greater threat to libraries than to
clay tablet archives and a greater threat to computers than to
libraries. Keeping books in a usable physical state requires more
work than keeping clay tablets. The same kind of relationship
holds between libraries and digital data archives. Much more
can be stored, but preservation for future use becomes ever more
difficult as time intervals of technical innovation shrink. Only a
few specialists are able to handle data stored with software that
ceased to be produced 20 years ago, whereas a book would last
for centuries. The problem of preserving and organizing the ever-swelling flood of information remains unsolved, and at the same
time many traditional libraries and collections of documents fall
into decay. Technology has hugely expanded human memory,
but it has not yet eliminated the risk that many parts of the heritage committed to writing will disappear forever.
To guard against collective memory loss, the United Nations
Educational, Scientific and Cultural Organization (UNESCO) has launched the Memory of the World Programme to assist in the
preservation of archive holdings and library collections all over
the world. For the time being, this is the endpoint of a development begun with the advent of literacy in ancient civilizations: the
institutionalization of writing and the bureaucratization of society. The more serviceable writing became to human society, the
more it penetrated social relations and the more attention it
came to require on the part of society.

INSTITUTIONAL ASPECTS OF LITERACY


From its inception, writing has been an instrument of power. In
ancient civilizations of restricted literacy, its mastery was jealously guarded by the elite. It was indispensable for the workings of the temple-centered economies of ancient Near Eastern
city states (Nissen 1988), symbolized the rule of the pharaohs in
Egypt (Posener 1956), became the bedrock of China's Confucian
bureaucratic state (Lewis 1999), and was a sine qua non of
Athenian democracy (Thomas 1992). Certainly, literacy levels
varied widely as did the uses of literacy, but the general tendency of the extension of the human mind by means of writing to
engender and necessitate institutions is unmistakable. The most
important institutions produced by literate culture have to do
with government, cult, schooling, and economic organization.

Government
Writing was used early on to extend the reach of authority and to
mediate the relationship between ruler and ruled. Monuments
that embody power, such as the Rosetta Stone inscribed with a
decree to commemorate the reception of rulership by Pharaoh
Ptolemy V on March 27, 197 b.c.e. (Parkinson 1999, 29), as well
as stelae appealing with regulations to the literate public, were
erected throughout the Ancient Near East. Their inscriptions
were drafted by scribes who created the first bureaucratic states.
The Egyptian vizier was responsible for the collection of taxes,
the maintenance of archives, and the appointment of officials.
The skills of the scribe afforded privilege in Egyptian society. As
one of them put it, "the scribe is released from manual tasks; it is he who commands" (Goody and Watt 1968, 37). A thousand
years later, Mencius (372–289 b.c.e.) made the same point in China: "Some labour with their hearts and minds; some labour with their strength. Those who labour with their hearts and minds govern others. Those who labour with their strength are governed by others" (Book of Mencius, quoted from Lloyd and
Sivin 2002, 16). These observations bring to light the connection
between literacy and social hierarchy that persists to this day.
Wherever societies obtained the art of writing, the literati
were close to the powers that be, but the Chinese institutionalized literacy like no other culture. In Confucius's day, literacy was
already the preeminent mark of the gentleman, and as of the second century b.c.e., the study of the Confucian classics gradually
became institutionalized as the key to attaining public office and
influence. The civil service examination system was employed
with few adjustments from the Han Dynasty (206 b.c.e.–220 c.e.)
until the closing stages of the Qing Dynasty at the beginning of
the twentieth century. It was based entirely on the study of
texts. To prepare for the highest degree, students spent up to 20 years memorizing the Confucian classics and commentaries.
They were then able to recite, understand, and interpret every
clause of the five canonical works (Book of Changes, Book of Documents, Book of Poetry, Records of Rites, and Spring and Autumn Annals), said to have been redacted by Confucius
himself, as well as a collection of commentaries by subsequent
scholars.
That such an education was an adequate preparation for
bureaucrats charged with administering the country was rarely
called into question. It was firmly rooted in the past, and the classics were thought to hold a solution to any problem that might
arise. The authority of writing and the conservatism of literature
were never more effective. The strength of the system lay in the
fact that it encouraged respect for learning and provided the
emperor with a bureaucracy educated in one standard curriculum. Its weakness was its emphasis on commentary that stifled
inquisitiveness and deviation from the trodden path. The civil
service exam system institutionalized the transmission of texts
down a lineage and was, thus, inherently averse to change. In its
early days, it helped to loosen the hereditary aristocracy's grip on power by rewarding merit rather than birth in recruiting
bureaucrats. In actual fact, however, learning remained largely
a prerogative of aristocratic families out of reach for most commoners. Women were not permitted to sit for the exams. In the
end, the civil service examinations served as a system to perpetuate the power of the thin elite of literati bureaucrats.
Controlling literacy has always been the other side of its relation to authority. The powerful have lived in fear of the power
of the pen and have had little interest in promoting the art of
writing among the masses. Illiterates are powerless, unable to
challenge the letter of the law or to invoke the laws on the books
to their own advantage. That which is written down acquires
authority in its own right as a reference source independent of
the ruler. While helping to project power far beyond earshot, it
gains a measure of objectivity, thereby reducing the arbitrariness of rule. But only the literate can hold the ruler accountable
to his own decrees. The institutionalization of writing to this end
occurred in fifth-century Greece, where government was established in the polis through written laws. These were not God-given but man-made laws, aiding the development of a division
between cosmic order and human society (Stratton 1980), so
characteristic of the Greek Weltanschauung and so different
from the Chinese.
From the objectification of language in writing follows
another function with important implications for the exercising and curbing of power. Writing detaches the author from
his message, which makes it easier and less risky to express the
unspeakable. Two examples suffice to illustrate. In Athens, ostracism was institutionalized as a procedure to protect democratic
rule. In the event that a charismatic politician (a demagogue)
became too influential or was otherwise disruptive, the demos
(people) were entitled to exile him from the polis. To this end,
every citizen was given a potsherd (ostrakon) on which to write
the name of the man to be sent into exile. The degree of literacy in
Athens that can be inferred from this practice is a question that
has given rise to much research and speculation (Burns 1981;
Harvey 1966; Havelock 1982; W. Harris 1989; Thomas 1992). It is
unlikely that even approximately correct figures can ever be quoted, but what we know of the literate culture of the
Greek city-states is that, unlike in China, there was no scribal
class. Minimal competence such as was necessary to scratch a
name on a potsherd was relatively widespread. Both written law
as the basis of social order and ostracism as a check on power
exemplify the institutionalization of writing as a precondition
for political participation and as a seed of the public sphere. It
enabled the people to express discontent with a leader without
raising their hand individually or speaking out. Anonymity was
a protection from reprisals.
The other example is from Babylon, as reported in the
Old Testament. Writing means empowerment and, therefore, has to be controlled. The proverbial writing on the wall,
mene-tekel-parsin, an Aramaic phrase that, according to Daniel
5:25, mysteriously appeared on the wall of King Belshazzar's palace, cautioning him that his days were numbered and his empire
was doomed, exemplifies the potential of criticism. A message
by an unknown author exposing abuse of power to the public is
a direct challenge to authority. While it would be problematic to
voice disapproval in the presence of others, writing affords the
originator the protection of anonymity.
The dialectics of technological innovation come to bear
here. While the mighty use writing to establish, exercise, and
consolidate their power, it also lends itself to embarrassing and
undermining them. This was clearly understood in antiquity.
Power holders always tried to curtail the power of the written
word. Confucius's Analects were burned in 200 b.c.e. on order of Emperor Qin Shi Huang Di. Plato was convinced that
the state had to control the content of what pupils read. Tacitus
refers to banned books (Annales 4.34), and his own books were
later banned in return. Censorship is ubiquitous throughout
history (Jones 2001). Many of the finest literary works were
at one time or another put on a list of forbidden books, such
as the papal Index Auctorum et Librorum Prohibitorum, first
published in 1559. It took almost another 400 years for censorship to
be universally censured as an illegitimate means of exercising
power. Article 19 of the Universal Declaration of Human Rights
adopted by the United Nations in 1948 states that everyone has
the right to freedom of expression and information, regardless
of frontiers.
The struggle against censorship worldwide is far from over.
With every new writing technology it is rekindled, as the debate
about controlling the Internet illustrates. The power elites try to
control the new media, which they perceive as a threat, although,
ironically, they are often involved in their development. The battlefield shifts with technological advance. What were scriptoria and
publishing houses in the past are servers and the flow of data
through cyberspace today. In the long run, attempts on the part
of governments to defend their lead and keep new information
technology out of reach of their adversaries have failed because
it is the very nature of these technologies that they can be utilized
to uphold and to counter the claim to power.

Cult
In nonliterate societies, religious order and social order are
merged without clear division. Writing tends to introduce a
fracture between spheres, although the differentiation of the

sacred and the profane may take a long time to complete. The
Ten Commandments, written by God and given to Moses, were
a code of conduct regulating the relations between God and the
people, as well as the people among themselves. The spheres of
spiritual and worldly power were only beginning to be separated
in antiquity, a process to which writing and the ensuing interpretation of texts contributed a great deal. Of Moses' legendary
Ten Commandments stone tablets no archaeological evidence
remains, but the Aśoka edicts are a tangible example of the
closeness of cult and social order. Engraved in rocks and stone
pillars that have been discovered in dozens of sites throughout
northern India, Nepal, Pakistan, and Afghanistan, these edicts
served to inform the subjects of King Aśoka's (304–232 b.c.e.)
reforms and make them lead a moral life according to the truth
of Buddhism. Moral precepts hewn in stone on public display
once again raise the question of the degree of literacy. It must
have been high enough to disseminate the message throughout the vast realm. In the rich spiritual world of the Indian subcontinent, much emphasis was always given to the importance
of oral tradition. Competing with other religious movements,
Buddhism established itself by rigidly regulating monastic life
and assembling a canon of scriptures. At the time of Aśoka,
Buddhism was an institutionalized religion that spread from
northern India to other parts of the subcontinent, first and subsequently to South and Southeast Asia, as well as Central Asia
and China.
The history of Buddhism and its emergence as a major world
religion is a history of translation. Translation means writing, turning one text into another and all it involves: exegesis,
doctrine, scholasticism, and schism. As Ivan Illich and Barry
Sanders (1988, 52) have argued, there is no translation in orality, but only the collaborative endeavor to understand one's
partner in discourse. Writing eternalizes spiritual enlightenment (e.g., the word of God), which must be preserved with
the greatest care and does not allow alteration at will. Of all
religions, Buddhism has produced by far the largest corpus
of sacred texts. Significantly, different canons resulted from
translation, the Pali canon, the Tibetan canon, and the Chinese
canon. With translation came schism, and with that the delimitation of religious sects and districts. In the evolution of other
book religions, translation had similar consequences. These
other religions share with Buddhism the vital importance they
attach to scriptures. The holy book is what distinguishes world
religions from other cults. Their legitimacy is derived from the
revelation embodied in the scriptures, which are considered the
true source of enlightenment. The major world religions vary in
the role they assign sacred texts, in how they make use of them
for purposes of propaganda and the regulation of life, but the
reverence accorded to writing is a feature they share, as is the
question of translation.
Translation is a problem for two reasons that can be labeled
authenticity and correspondence. First, the idea of authenticity of the word of God does not agree with its transposition into
another form. Some book religions are very strict in this regard.
According to believers, God himself chose the Arabic language
for his final testament, the Quran, and created every one of the
letters of the Arabic alphabet. Consequently, the only true version of the holy book of Islam is the Quran in Classical Arabic.

The Christian tradition until the Reformation movement knew
similar limitations, recognizing only the three sacred tongues of
Hebrew, Greek, and Latin as legitimate languages of the Bible.
In many cases, other languages were considered unfit for the
expression of divine revelations, if only because they lacked the
fixity that comes with being reduced to writing. Indeed, translations of religious scriptures, when they were eventually produced, were for many languages their first literary texts, serving
as an anchor in the fluidity of oral discourse and as a starting
point of literacy and textual transmission.
The other reason for the problematic nature of translation
is that through it, a stable correspondence is to be established
between two codes that, though obviously different in lexical,
grammatical, and phonetic makeup, must express the same
contents. In order to translate, the meaning of the text has to
be established unequivocally, the assumption being that this
is possible. The text is thus elevated to the source of meaning.
Authority that formerly accrued to the sage, the reciter, and the
soothsayer was relocated in text. This transition from utterance
to text (Olson 1977) implies that in order to understand a message, what is at issue is no longer what the speaker means but what the
text contains. This is what translation is all about. Language itself
in objectified form thus becomes a force in its own right with far-reaching implications in the domains of knowledge production
and social regulation. Preservation of the word of God in text
has provided an objectified reference plane incorporating the
true meaning waiting to be extracted from it. Olson's notion of autonomous text that results from the transition from utterance to text, as he conceptualizes it, has been criticized because
it ignores the reader's involvement in constructing meaning
(Nystrand 1986) and because it underestimates the oral elements in literate culture (Wiley 1996). The soundness of these
criticisms in detail, however, does not invalidate the general idea
that writing gives language a measure of stability that it does not
have in speech, and brings with it a shift from the intentional to
the conventional aspects of linguistic meaning, a shift from "I mean" to "the word means." The high prestige that the written
word acquired through its association with and instrumentalization by organized religion has greatly contributed to the coming
into existence of autonomous text.
The reverence for holy books had various consequences for
language attitudes and practices, two of which can be mentioned here: code choice and script choice (Coulmas 2005).
Writing introduces functional domain differentiation into a
community's linguistic ecology. That the language of cult differs
from that of everyday pursuits has always been the rule rather
than the exception, but with writing the distinction becomes
more pronounced. The important position of religious texts in
combination with restricted literacy encouraged the perpetuation of the split between spoken and written language. While
the codification of the latter was aided by the desire to transmit
the sacred texts inviolately to later generations, the former was
subject to perpetual change. The result was a situation of coexisting codes, called diglossia in modern scholarship (Ferguson
1959; Krishnamurti 1986; Schiffman 1996). Although every case
of diglossia is different, the defining characteristic is a domain-specific usage of varieties that coincides by and large with the
spoken/written language divide. These varieties can be different languages or varieties of the same stock. In multilingual environments like medieval Europe, where Latin was the only written language, or in present-day India, cultivated languages are
often reserved for writing and formal communication, while vernacular varieties are used in informal settings. A similar division
is found between linguistic varieties of the same stock, where
one is defined by reference to a corpus of classical texts, such as
Classical Arabic, whereas the others fluctuate without artificial
restriction.
Writing introduces an element of art and artificiality into the
history of language. Every language is the collective product of
its speakers, but a written language is more clearly an artefact
than a vernacular, and the script that it uses more clearly yet.
Historically, the diffusion of scripts coincided in large measure
with that of religions, a connection that is still visible today.
Chinese characters arrived in Japan together with Buddhism.
The spread of the Roman alphabet traces the expansion of both the Roman
Empire and the Catholic Church, while Orthodox Christianity
uses Cyrillic. Armenian Christians have their own alphabet
designed in the fifth century by St. Mesrob. Estrangela is the script
of the Syrian Church. The Hebrew square script is the script of
Judaism, the Arabic alphabet that of Islam. Many other examples
could be added; clerks were churchmen (Coulmas 1996, 435 f.).
The historical interconnectedness of writing and religion is one
of the reasons that scripts tend to function as symbols of identity, but ethnic, national, and political identity are also readily
expressed by means of a distinct script or even slightly different
orthographic conventions. As Peter Unseth (2005) has pointed
out, there are clear sociolinguistic parallels between choosing scripts and languages. Because of the visibility and the artificial nature of writing, however, the choice of scripts is generally a
more deliberate departure from tradition in that it involves conscious planning.

Schooling
Language is a natural faculty, writing an artifact. That is the reason why children acquire language, but not writing, without
guidance. The difficult art of writing requires skills that must be
taught, memorized, and laboriously practiced. The place to do it
is school. For writing to be useful to the community, conventions
have to be established, individual variation curtailed, norms set.
Collective instruction following a curriculum is a more efficient
way to achieve this than is private tutoring. Already in antiquity,
school became, and still is, the institution that most explicitly
exercises authority over the written language by controlling its
transmission from one generation to the next.
With schooling came the regimentation and the decontextualization of language. Because in writing the continuous flow
of speech has to be broken down into discrete units, analytic
reflection about the units of language was fostered. As writing,
language became an object of investigation and normalization.
Both grammar and lexicon are products of writing. This is not to
deny the grammaticality of oral speech or that oral people have
a mental lexicon. It just means that the notions of grammar and
lexicon as we know them are entirely dependent upon writing.
At school, units of writing had to be practiced with the stylus, the
brush, or the pen, mechanically through repetition without any

Social Practices of Speech and Writing


communicative intent. These units could be given a phonetic
interpretation; they could be pronounced and thus acquired,
as words, an existence as units of language. In time, the image
became the model. Since the correct form of the written sign was
a matter to which the scribe and his pupils had to pay attention,
standards of correctness first developed with reference to writing
and written language.
Only much later, and as an effect of schooling, did these
notions come to be applied to speech. The twin questions of
what the units of writing were and how they were to be conjoined
led to the more general and profound question "What is a language?" Right up to the present, answers to this question exhibit
what Roy Harris (1980) has called a "scriptist" bias. Only trained
linguists readily recognize unwritten vernaculars as languages,
and even they have to admit that while it is easy to count written
languages, the question of how many speech forms on this planet
qualify as distinct languages is impossible to answer without laying down analytic criteria that are no less arbitrary than decisions
as to what dialects, varieties, idioms, and speech forms should
be reduced to writing. Languages, as well as the units into which
they are analyzed, are a product of writing, for only in writing can
the flow of speech be arrested and broken down into independent stable components with a presumed inherent, rather than
vague and contextually determined meaning.
Among the first results of the school's mastering of language
in the Ancient Near East were word lists, the paradigm case of
decontextualized language. These lists were the foundation of
lexicography (Civil 1995), the science of words. In China, lexicography began with lists of characters, and characters are still
the basic units of dictionaries. Dictionaries provide entries for
lexical items. A lexical item is a word contained in a dictionary.
More refined and less circular definitions of the orthographic
word, as distinct from the phonological word and the abstract
lexeme, have been proposed in great number, but it remains
difficult if not impossible to define "word" without reference to
writing. The word stands at the beginning of grammatical scholarship, which was, as the word "grammatical" itself indicates,
exclusively concerned with written language. Grammatike,
combining the Greek words grammata ("letters") and techne
("art"), was the art of knowing letters. These beginnings of the
systematic study of language left a lasting imprint which, as
Per Linell (2005) has convincingly shown, still informs modern
linguistics. The word, the sentence, and even the phoneme
are analytic concepts derived from the discrete segments of
writing, not vice versa. The conceptualization of writing as
a representation of speech is therefore highly problematic
(R. Harris 1980, 2000). To sum up this section, the institutionalization of writing in school resulted in a changed attitude to
language. It became an object of study and regulation. Neither of these concepts was in the first instance developed for, or applied to, speech.
Under conditions of restricted literacy and diglossia, a wide
divide between spoken and written language was taken for
granted. Speech and writing were two modes of communication
involving varieties or languages that were both grammatically
and stylistically quite separate from each other. It was only when
literacy became accessible to wider sections of the population
that the relationship between speech and writing became an
issue. In medieval Europe, the ad litteras reform during the reign
of Charlemagne aimed at unifying spoken and written language,
as the widening gap between both was perceived as a problem.
It was eventually reduced, not so much by enforcing a uniform
standard for pronouncing Latin as by dethroning it as the only
written language and transforming lingua illiteratae (Blanche-Benveniste 1994), that is, Romance, Germanic, and Slavonic
vernaculars, into written languages in their own right. Literacy
in these emerging national languages was bolstered by the
Reformation movement that wrested the interpretation monopoly of Christian scriptures from the Catholic clergy. "Write as you
speak," a maxim that can be traced back to antiquity, became
an increasingly important principle for teaching writing
(Müller 1990). Although it unrealistically denies the speech/
writing distinction, generations of teachers have repeated it to
their pupils. It never meant that their writing should be as elliptical, situation-bound, and variable as their speech. The implication is that if you cannot write as you speak, something must
be wrong with your speech. Universal education on the basis of
this maxim resulted in a conceptual reduction of the distance
between speech and writing, with some notable consequences.
Mass literacy through schooling led to the disappearance of
diglossia from most European speech communities, although
the dichotomy of speech and writing continued to be expressed
in stylistic differences.
In other parts of the world, where universal education was
realized later or is still only a desideratum, the split between
spoken and written language remained. In today's world, the
1953 UNESCO declaration recommending mother tongue literacy notwithstanding, literacy education in the language of
the nation-state often means learning to read and write in a
second language. The extent to which progress in the promotion of literacy depends on the language of instruction is still
a matter of controversy, as is whether the writing system is a
significant variable. To some extent, this can be explained by
the fact that definitions of literacy are shifting with changing
socioeconomic needs and technical innovations, and because
the range of what are considered varieties of a given language
is variable (as is evident, for example, in the context of decreolization and discussions about elevating certain varieties,
such as black English in the United States, to language status).
There is, however, wide agreement that the crucial variable
is the effectiveness of the educational system. Mastering the
written language is a difficult task, which is best executed by
the institution that at the same time administers the written
language: school.
Since the time of the French Revolution, schools have been
charged with establishing the national language and, by way of
spreading the national language ideology, a link between language and nationalism. As a result, the demands of multilingual education are often at variance with the state-sponsored
educational system. Because of the nationalization of languages
in the modern nation-state and their privileged position in the
school system, however, the language of literacy training became
a political issue. Minority speech communities in many industrialized countries aspired to the prestige for their language that
comes with a written standard and started to lobby for the inclusion of their language in the school curriculum. Fueled by the
The Cambridge Encyclopedia of the Language Sciences


growing awareness of the importance of minority protection,
such movements have met with a measure of success, leading to
a highly complex situation of multiple and multilingual literacies
in modern societies, which of late has attracted scholarly attention (Martin-Jones and Jones 2000; Daswani 2001; Cook and
Bassetti 2005).
The prevailing view sees the establishment of a single national
language with a unified written standard as facilitating universal
literacy. International comparisons of literacy rates are notoriously difficult (Guérin-Pace and Blum 1999), but there is little
doubt that Europe, where national language academies first
implemented the national language ideology, led the way. Today,
however, the monolingual model of literacy is called into question by developments that, on the one hand, favor English as a
supplementary universal written language in many non-English-speaking countries, and on the other, allow minority languages
to make inroads into the domains of writing. The question of
whether the diversification of literacy will help achieve the goal
of eradicating illiteracy or whether it will compromise the alleged
economic advantage of having one written standard language
continues to be discussed by academics and politicians, while
the complementary developments of globalization of markets
and (re)localization of cultures unfold.

Economic Organization
The critical importance of writing for economic processes
in early civilizations is best documented for Mesopotamia.
It is widely agreed now that in Sumer, number concepts and
numerical representation stood at the beginning of writing
that evolved into cuneiform (Schmandt-Besserat 1992). The
overwhelming majority of archaeological finds from Ur, Uruk,
Nineveh, and Babylon are records of economic transactions
kept in clay tablet archives by accountants of the palace administration (Nissen, Damerow, and Englund 1990). The Sumerians
and their Akkadian successors were the first to develop bookkeeping into a sophisticated technique of balancing income and
expenditure. Hammurabi, King of Babylon (1728–1686 B.C.E.),
created the first commercial code, 282 articles written on a large
stone monument, which was erected in a public place for all to
observe. Every business transaction was to be in writing and
signed, usually with a seal, by the contracting parties. At the
time, the public-sector economy was far too highly developed
and too complex to function without writing. Tribute quota lists
had to be kept, rations for laborers involved in public works calculated, inventories recorded. Deeds were issued in duplicate
and receipts stored for future reference. Large-scale trading,
often involving credit and futures, had to be regulated and
overseen by the bureaucracy, consisting of a huge scribal class
charged with creating and servicing these documents.
Ancient Mesopotamia is the paradigm case of the close interconnectedness of economy and the institutionalization of writing, but if economic behavior is understood in the wide sense of
human adaptations to the needs and aspirations of society at a
given moment in history, this interconnectedness can be identified in every literate society. Complex socioeconomic systems of
managed production, distribution and trade, taxation, and credit
did not evolve in cultures that had no writing. Yet the nature of
the relationship between writing and economy is not a simple
one. For one thing, the degree of literacy necessary for economic
growth is a matter of controversy and depends on how economic
growth is measured. Illiteracy is considered a strong indicator of
economic underdevelopment, correlating as it does with infant
mortality, low life expectancy, and poverty; but the question of
whether high literacy drives economic growth or is a function
thereof remains unresolved. Functional illiteracy rates of up to
20 percent in Western countries, notably in the United States,
suggest that at least in the developed world, literacy rates are
more indicative of economic disparity than of overall national
economic development. Similarly, in developing countries,
illiterates are largely rural and without access to public services
(Varghese 2000). The distribution of wealth and the division of
labor in today's national economies are such that they allow for,
or perhaps even sustain, substantial residues of illiteracy. Both in
the developed and the developing world, people who partake of
the practice of writing live side by side with others who continue
to conduct their life in the oral mode.
It is fair to say that the evolution of writing in antiquity both
happened in response to and propelled economic development. Yet although writing technology has been available for
economic pursuits for more than five millennia, fully literate societies remain an ideal. This shows that the relationship
between economy and institutionalized writing is subject to
interference by other variables, notably those discussed in the
previous sections, that is, government, religion, culture, and
education. In antiquity, these spheres and the economy were
not separate. It was a long process that led to their differentiation in modern times. As the chief instrument of knowledge
management and growth, writing was indispensable for rationalization and what Max Weber (1922) called the "disenchantment of the world," which, as the religious associations and
emotional values attached to various written languages testify,
is still incomplete.
The interaction of writing with the economy has been studied
from various points of view. Writing can accordingly be understood as
- a tool,
- social choice,
- a common good,
- human capital,
- transaction cost.
Writing is a tool that enables individuals to expand their communication range and communities to increase social integration and differentiation. It is useful and valuable. Since writing
systems, as discussed previously, are artifacts, this raises
the question of how this tool evolves so that it is functional.
Students of writing, notably I. J. Gelb (1963), have invoked
the principle of least effort, which predicts that in time, writing systems become simpler and easier to use. The relative
simplicity of the Greek alphabet, as compared, for example,
with Egyptian hieroglyphs and early cuneiform, lends support
to this hypothesis. However, intuitive though it is, there are
problems. Obvious counterexamples are found outside the
ancient Near Eastern civilizations mainly studied by Gelb. In
its long history, the Chinese writing system and its adaptation
in Japan increased rather than decreased in complexity. On
the other hand, it took 500 years for the much simpler Korean
Hangul to be accepted as an alternative to Chinese. If simplicity and adaptability were the forces driving the spread
of writing systems, Hangul should have supplanted Chinese
characters, not just in Korea but in China and elsewhere long
ago. Looking at the evolution of individual systems, certain
letter forms in the Tibetan script were simplified to the extent
of becoming indistinguishable, rendering the system almost
dysfunctional. One has to conclude that if the principle of
least effort is at work in the evolution of the technology of the
intellect, then it is often compromised by other principles,
such as path dependency, that is, the fact that change in society depends on established ways, identity affirmation (see
ethnolinguistic identity), and cultural exclusivism.
Considering writing and written language from the point
of view of social choice leads to a similar conclusion. A written language can be understood as an institution with certain
properties that is shaped by the agents involved in its operation. However, a single criterion of optimization of reasonable
target functions cannot explain the diversity of systems that
evolved, the multiplicity of uses, or the actual shape of individual systems, such as, for example, English spelling or the
Japanese usage of Chinese characters (Coulmas 1994). Clearly,
established writing conventions are the result of public choice.
No individual can introduce a better standard, even though its
superior qualities may be indubitable, for conformity with the
existing norm is crucial for the functionality of the institution.
Penalties on norm violations are therefore high, and proposals
for changing established practice invariably meet with strong
resistance. Change is not impossible, but it comes at a cost and
rarely uncompromisingly follows functional optimization.
A written language, if used by a community, has properties of
a common good. Like a public transport system, it is beneficial
for most members of that community and therefore deserving of
the attention of the state. This is the rationale underlying the legal
protection enjoyed by national languages, explicitly or implicitly,
in modern nation-states. A written language not used by everyone is not a common good and is treated accordingly. Not serving
anyone's interest, dead languages are of interest to the historian
at best. For the same reason, it has always been difficult to elevate a vernacular to the status of written language; not providing
access to information and not being used by many members of
the community in the beginning, it does not count as a common
good. Its claim to written language status is supported not by its
instrumental value but only by the symbolic value for its community. To reconcile the idea of a dominant language as a common
good with the recognition of minority rights is therefore problematic. With Pierre Bourdieu (1982), it can be conceptualized
as a struggle over legitimacy, competence, and access (market,
linguistic; habitus, linguistic). Only if social harmony is
made part of the equation will accommodation be possible.
A written language, the cognitive abilities its mastery implies,
and the information to which it provides access are a resource.
Partaking in it adds to an individual's human capital and hence
to his or her marketability. In the information economy of
the knowledge society, this view on written language is more

pertinent than ever (Levine 1994) and can explain processes such
as the accelerated spread of English. The globalization of markets and the information technology revolution offer ever more
people the opportunity to enhance their human capital and, at
the same time, compel them to do so. However, the commodification of written language (Heller 2003) and the new forms and
uses it takes on in the new media have consequences, some of
which become apparent only as technology spreads.
Conceptualizing written language as transaction cost brings
market forces into view. Reducing transaction costs is considered a quasi-natural precondition of economic growth, which
partly explains the spread of trade languages. Once in place and
controlled by relevant agents, their use is less costly than translation. In today's world, this principle seems strongly to favor the
further spread of English. However, the effects of technological
innovation are, as always, hard to foresee and even to assess
as they unfold in front of our eyes. When the Internet ceased
to be a communication domain reserved to the U.S. military,
partly due to the available software at the time, it seemed to be
poised to become an English-only medium. But it turned out
that as the technology caught on, the share of English in cyberspace communication rapidly declined. The new information
technology made it much easier for speech communities big and
small around the world to communicate in writing, a possibility
eagerly exploited wherever the hardware became available. For
some communities this meant using their language in writing for
the first time. In others it led to the suspension of norms, blurring, in many ways yet to be explored, the traditional distinctions
between spoken and written language.
David Crystal (2001) suggests that "Netspeak," or computer-mediated language, is a new medium, neither spoken language
nor written language. The implications of Internet communication for literacy in the electronic age are only beginning to be
explored (Richards 2000). New multilayered literacies are evolving, responding as they do to the complementary and sometimes
conflicting demands of economic rationality, social reproduction through education, ideology, and the technical properties of
the medium. These developments open up a huge new area of
research into how, since the invention of writing, the range of linguistic communication options has been constantly expanding.

CONCLUSION
Writing is a technology that interacts with social practices in
complex ways, exercising a profound influence on the way we
think, communicate, and conceptualize language. Since it is an
artifact, it can be adjusted deliberately to the evolving needs
of society, but it also follows its own inherent rules that derive
from the properties of the medium. This essay has analyzed the
tension between the properties of the medium and the designs
of its users from two points of view, technological advance and
institutionalization. Harnessed by institutions, the technology
of writing is made serviceable to the intellectual advance of
society and modified in the process, sometimes gradually and
sometimes in revolutionary steps. Three consequences of writing remain constant: segmentation, linearization, and accumulation. The linear and discrete-segmental structure that all
writing systems both derive from and superimpose on language
forever informs the perception of language in literate society.
And by making language visible and permanent, it enables and
compels its users to accumulate information far beyond the
capacity of human memory, engendering ever new challenges
for storage and organization.
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Assmann, Jan. 1991. Stein und Zeit: Mensch und Gesellschaft im alten
Ägypten. Munich: Wilhelm Fink Verlag.
Backhaus, Peter. 2007. Linguistic Landscapes: A Comparative Study of
Urban Multilingualism in Tokyo. Clevedon, UK: Multilingual Matters.
Blanche-Benveniste, Claire. 1994. The construct of oral and written
language. In Functional Literacy: Theoretical Issues and Educational
Applications, ed. Ludo Verhoeven, 61–74. Amsterdam and
Philadelphia: John Benjamins.
Bourdieu, Pierre. 1982. Ce que parler veut dire. Paris: Fayard.
Burns, Alfred. 1981. Athenian literacy in the fifth century B.C. Journal of
the History of Ideas 42.3: 371–87.
Civil, Miguel. 1995. Ancient Mesopotamian lexicography. In
Civilizations of the Ancient Near East, ed. Jack M. Sasson, 2305–14. New
York: Charles Scribner's Sons.
Cook, Vivian, and Benedetta Bassetti, eds. 2005. Second Language Writing
Systems. Clevedon, UK: Multilingual Matters.
Coulmas, Florian. 1994. Writing systems and literacy: The alphabetic myth revisited. In Functional Literacy: Theoretical Issues and
Educational Implications, ed. Ludo Verhoeven, 305–20. Amsterdam
and Philadelphia: John Benjamins.
. 1996. The Blackwell Encyclopaedia of Writing Systems.
Oxford: Blackwell.
. 2003. Writing Systems: An Introduction to Their Linguistic Analysis.
Cambridge: Cambridge University Press.
. 2005. Sociolinguistics: The Study of Speakers' Choices.
Cambridge: Cambridge University Press.
Crystal, David. 2001. Language and the Internet. Cambridge: Cambridge
University Press.
Daswani, C. J., ed. 2001. Language Education in Multilingual India. New
Delhi: UNESCO.
De Biasi, Pierre-Marc. 1999. Le papier: Une aventure au quotidien.
Paris: Gallimard.
Eisenstein, Elizabeth L. 1979. The Printing Press as an Agent of Change.
Cambridge: Cambridge University Press.
Febvre, Lucien, and Henri-Jean Martin. [1958] 1999. L'apparition du
livre. Paris: Albin Michel (Bibliothèque Évolution Humanité).
Ferguson, Charles A. 1959. Diglossia. Word 15: 325–40.
Fischer, Hervé. 2006. Digital Shock: Confronting the New Reality.
Montreal: McGill-Queen's University Press.
Gelb, I. J. 1963. A Study of Writing. Chicago and London: University of
Chicago Press.
Goody, Jack. 1977. The Domestication of the Savage Mind. Cambridge:
Cambridge University Press.
Goody, Jack, and Ian Watt. 1968. The consequences of literacy. In
Literacy in Traditional Societies, ed. Jack Goody, 2768. Cambridge:
Cambridge University Press.
Guérin-Pace, F., and A. Blum. 1999. L'illusion comparative: Les logiques
d'élaboration et d'utilisation d'une enquête internationale sur
l'illettrisme. Population 54: 271–302.
Harris, Roy. 1980. The Language Makers. Ithaca, NY: Cornell University
Press.
. 2000. Rethinking Writing. London: Athlone Press.
Harris, William V. 1989. Ancient Literacy. Cambridge: Harvard University
Press.
Harvey, F. D. 1966. Literacy in Athenian democracy. Revue des Études
Grecques 79: 585–635.
Havelock, Eric A. 1982. The Literate Revolution in Greece and Its Cultural
Consequences. Princeton, NJ: Princeton University Press.
Heller, Monica. 2003. Globalization, the new economy, and the commodification of language and identity. Journal of Sociolinguistics
7.4: 473–92.
Illich, Ivan, and Barry Sanders. 1988. The Alphabetization of the Popular
Mind. San Francisco: North Point.
Jones, Derek, ed. 2001. Censorship: A World Encyclopedia. Vols. 1–4.
London: Fitzroy Dearborn.
Kapitzke, Cushla. 1995. Literacy and Religion: The Textual Politics and
Practice of Seventh-Day Adventism. Amsterdam and Philadelphia: John
Benjamins.
Kressel, Henry, with Thomas V. Lento. 2007. Competing for the Future:
How Digital Innovations Are Changing the World. Cambridge:
Cambridge University Press.
Krishnamurti, Bh., ed. 1986. South Asian Languages: Structure,
Convergence and Diglossia. Delhi: Motilal Banarsidass.
Landry, Rodrigue, and Richard Y. Bourhis. 1997. Linguistic landscape
and ethnolinguistic vitality. Journal of Language and Social Psychology
16.1: 23–49.
Levine, Kenneth. 1994. Functional literacy in a changing world. In
Functional Literacy: Theoretical Issues and Educational Applications,
ed. Ludo Verhoeven, 113–31. Amsterdam and Philadelphia: John
Benjamins.
Lewis, Mark Edward. 1999. Writing and Authority in Early China. SUNY
Series in Chinese Philosophy and Culture. Albany: State University of
New York Press.
Lieberman, Philip, and Sheila Blumstein. 1988. Speech Physiology, Speech
Perception, and Acoustic Phonetics. New York: Cambridge University
Press.
Linell, Per. 2005. The Written Language Bias in Linguistics: Its Nature,
Origin and Transformation. London: Routledge.
Lloyd, Geoffrey, and Nathan Sivin. 2002. The Way and the Word: Science
and Medicine in Early China and Greece. New Haven and London: Yale
University Press.
Martin-Jones, Marilyn, and Kathryn Jones. 2000. Multilingual
Literacies: Reading and Writing Different Worlds. Amsterdam and
Philadelphia: John Benjamins.
McLuhan, Marshall. 1964. Understanding Media: The Extension of Man.
New York: McGraw-Hill.
Müller, Karin. 1990. Schreibe, wie du sprichst!: Eine Maxime im
Spannungsfeld von Mündlichkeit und Schriftlichkeit. Frankfurt am
Main and Bern: Peter Lang.
Negroponte, Nicholas. 1995. Being Digital. New York: Alfred A. Knopf.
A cyberspace extension is available online at: http://archives.obs-us.
com/obs/english/books/nn/bdintro.htm.
Nissen, Hans J. 1988. The Early History of the Ancient Near East. Chicago
and London: The University of Chicago Press.
Nissen, Hans J., Peter Damerow, and Robert K. Englund. 1990. Frühe
Schrift und Techniken der Wirtschaftsverwaltung im alten Vorderen
Orient: Informationsspeicherung und -verarbeitung vor 5000 Jahren.
N.p.: Verlag Franzbecker.
Nystrand, Martin. 1986. The Structure of Written Communication.
Orlando, FL: Academic Press.
Olson, David R. 1977. From utterance to text: The bias of language in
speech and writing. Harvard Educational Review 47.3: 257–86.
. 1991. Literacy and Orality. Cambridge: Cambridge University
Press.
. 1994. The World on Paper. Cambridge: Cambridge University
Press.
Ong, Walter J. 1982. Orality and Literacy: The Technologizing of the Word.
London: Methuen.
Parkinson, Richard. 1999. Cracking Codes: The Rosetta Stone and
Decipherment. London: British Museum Press.
Posener, Georges. 1956. Littérature et politique dans l'Égypte de la XIIe
dynastie. Paris: Honoré Champion (Bibliothèque de l'École des hautes
études. 3007.)
Richards, Cameron. 2000. Hypermedia, Internet communication, and
the challenge of redefining literacy in the electronic age. Language
Learning and Technology 4.2: 59–77.
Roberts, Colin H., and T. C. Skeat. 1983. The Birth of the Codex.
Oxford: Oxford University Press.
Sassoon, Rosemary. 1995. The Acquisition of a Second Writing System.
Oxford: Intellect.
Schiffman, Harold. 1996. Linguistic Culture and Language Policy. London
and New York: Routledge.
Schmandt-Besserat, Denise. 1992. Before Writing. Austin: The University
of Texas Press.
Stratton, Jon. 1980. Writing and the concept of law in Ancient Greece.
Visible Language 14.2: 99–121.

Taylor, Insup, and M. Martin Taylor. 1995. Writing and Literacy in
Chinese, Korean and Japanese. Amsterdam and Philadelphia: John
Benjamins.
Thomas, Rosalind. 1992. Literacy and Orality in Ancient Greece.
Cambridge: Cambridge University Press.
Twitchett, Denis C. 1983. Printing and Publishing in Medieval China.
London: The Wynkyn de Worde Society.
UNESCO Institute for Statistics. 2006. Available online at: http://portal.
unesco.org/education/en/ev.php-URL_ID=40338&URL_DO=DO_
TOPIC&URL_SECTION=201.html.
Unseth, Peter. 2005. Sociolinguistic parallels between choosing scripts
and languages. Written Language and Literacy 8.1: 19–42.
Varghese, N. V. 2000. Costing of total literacy campaigns in India. In
Adult Education in India, ed. C. J. Daswani and S. Y. Shah, 227–50. New
Delhi: UNESCO.
Weber, Max. 1922. Wirtschaft und Gesellschaft: Grundriss der verstehenden Soziologie. Tübingen: J. C. B. Mohr.
Wiley, T. G. 1996. Literacy and Language Diversity in the United
States. Washington, DC: Center of Applied Linguistics and Delta
Systems.

45

5
EXPLAINING LANGUAGE: NEUROSCIENCE,
GENETICS, AND EVOLUTION
Lyle Jenkins

Before undertaking a discussion of the explanation of language, we should point out that we are using this word in a special sense.
As Noam Chomsky has noted (see Jenkins 2000), while we cannot ask serious questions about general notions like vision or
language, we can ask them of specific systems like insect vision
or human language. In what follows, our focus is on the biological foundations of language. As a result, certain areas commonly
referred to as languages are excluded from consideration, for
example, some invented logical systems, computer languages
such as Java, encrypted languages, the language of DNA, and
so on. These are all important and interesting areas of investigation. In fact, significant insights into human language may
be gained from the study of some of these fields. For example,
it has been argued that particular systems of logical form
may shed light on the semantics of human language. Both biological factors and nonbiological factors interact in such areas as
pragmatics and sociolinguistics. In addition, the study of
formal languages (e.g., the Chomsky hierarchy) has also led
to some important contributions. However, these areas cannot
be completely accounted for by a consideration of the biology of
human language. It is important to keep in mind that cognitive systems like human language (as well as systems of animal communication) often exhibit a significant degree of modularity. In this view, biological factors interact with other factors to provide a unified explanation. Sometimes the term I-language is used to distinguish the biological object in the mind-brain that biolinguists study from other uses of the word language.
The same is true of other areas of study that are even more
closely related to human language, such as the role of language
in poems and novels, or the influence of economic factors and
conquests and invasions on language change. Historical linguistics may involve factors both within and outside of the
scope of human language biology. Again, although analysis of
the biology of human language may shed some light on these
areas (e.g., the study of phonetics and phonology may be
useful in the analysis of poetry), in general it will not provide an
exhaustive account of these areas. Similarly, there has been great
interest in language as a system of communication. For the
reasons discussed here, there is not much to say about arbitrary systems of communication (semaphores, bee language, etc.),
but the study of the biology of language might shed some light
on the case of human language communication. In this essay, I
consider some diverging ideas about the role of communication
in driving language evolution.
The study of the biology of language (see biolinguistics) is
traditionally divided into the investigation of three questions: 1)
What is knowledge of language? 2) How does language develop in
the child? 3) And how does language evolve in the species? (See
Chomsky 1980, 2006; Jenkins 2000, 2004.) Note that the study of
the neuroscience of language cross-cuts all three questions.
That is, we can ask: 1) What are the neurological underpinnings
of the faculty of language? 2) How does the language faculty
develop in the nervous system of the individual? 3) How did the
language faculty evolve in the species? (See Chomsky and Lasnik
1995.)
These three questions are sometimes referred to as the what
and how questions of biolinguistics. There is another question,
the why question, which is more difficult to answer. This is the
question of why the principles of language are what they are
(Chomsky 2004). Investigation into these why questions has in
recent years been termed the minimalist program (or minimalism), but interest in and discussion of these questions go back to
the earliest days of biolinguistics (Chomsky 1995; Boeckx 2006).
Properties of the attained language derive from three factors
(Chomsky 2005b): 1) genetic endowment for language, 2) experience, and 3) principles not specific to the faculty of language.
Principles in 3) might even be non-domain-specific or non-organism-specific principles. Examples of such principles are
principles of efficient computation. Note that similar questions
can be posed about any biological system: viruses, protein folding, bacterial cell division, sunflowers, bumblebees, falling cats,
nervous systems, and so on.
Furthermore, similar kinds of questions arise in every linguistics research program (Chomsky 2005b). To make the discussion
manageable, in what follows I draw examples and discussion
from minimalism. However, the issues and problems carry over
to all other research programs concerned with providing explanations for properties of human language and accounting for
them in terms of neurobiology, genetics, and evolution. For
example, any theory of the language faculty will generate infinitely many expressions that provide instructions to the sensorimotor and semantic interfaces. All such generative systems
will have an operation that combines structures (e.g., in minimalism, merge), such as the phrase the boy with the phrase saw
the cat, and so on, formed from the lexical items of the language.
Applying this operation over and over (unbounded Merge), we get
a discrete infinity of expressions, part of property 1), our genetic
endowment, in this particular case of the genetic component of
the language faculty.
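The recursive character of unbounded Merge can be sketched in a few lines of code. This is only an illustrative toy (the function name and the tuple encoding are my own assumptions), not a rendering of any particular minimalist formalism:

```python
# Toy sketch of binary Merge: each application combines two syntactic
# objects into a new, set-like object. The tuple encoding is an
# illustrative assumption, not a claim about minimalist theory.

def merge(a, b):
    """Combine two syntactic objects into a new syntactic object."""
    return (a, b)

# A tiny lexicon.
the, boy, saw, cat = "the", "boy", "saw", "cat"

# Build "the boy saw the cat" by successive applications of Merge.
subject = merge(the, boy)        # ('the', 'boy')
obj = merge(the, cat)            # ('the', 'cat')
vp = merge(saw, obj)             # ('saw', ('the', 'cat'))
sentence = merge(subject, vp)    # (('the', 'boy'), ('saw', ('the', 'cat')))

# Unbounded Merge: nothing limits further applications, so a finite
# lexicon yields a discrete infinity of ever-deeper expressions.
deeper = sentence
for _ in range(3):
    deeper = merge(the, deeper)
```

The relevant point is simply that nothing bounds reapplication of the operation, which is what yields a discrete infinity of expressions from a finite lexicon.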
Many well-known accounts of the evolution of language propose communication as the primary selective force involved
in the origin of language (see Works Cited). Here, for purposes
of comparison, we present a noncommunicative account of
the origins of language, suggested by Chomsky and based on
work in minimalism. However, we stress again that this kind of
account is also compatible with approaches based on other linguistic research programs. We then discuss another viewpoint
based on the idea that language evolved from gestures, an idea
that is represented by a number of approaches to the evolution
of language. After that, we present an example of the comparative approach to evolution in biolinguistics and discuss language
and the neurosciences, focusing on left–right asymmetries of the
language areas to illustrate this research. Finally, we discuss the
genetics of language, using the studies of the FOXP2 gene system
to show how studies of language phenotype, neurobiology, and
molecular biology, as well as comparative and evolutionary studies with other animal systems, are being carried out.
Work on the principles of efficient computation governing the
application of the operation of Merge seems to suggest an asymmetry, namely, that the computational principles are optimized
for the semantic interface, not the sensorimotor interface. This is
because conditions of computational efficiency and ease of communication conflict, as is familiar from the theory of language
parsing. This in turn has led Chomsky to suggest that externalization of language, and hence communication, was a secondary
adaptation of language. This would mean that language arose
primarily as an internal language of thought. Supporting
this idea is the existence of sign languages, which develop in
a different modality but in other respects are very similar to spoken language (Kegl 2004; Petitto 2005).
These design features of language have led Chomsky to propose the following scenario for the evolution of language. Several
decades ago Chomsky suggested, on the basis of results from nonhuman primate studies, that the higher apes might well have a
conceptual system with a system of object reference and notions
such as agent, goal, instrument, and so on. What they lack, however, is the central design feature of human language, namely,
the capacity to deal with discrete infinities through recursive
rules (Chomsky 2004, 47). He proposed that when you link that
capacity to the conceptual system of the other primates, you get
human language, which provides the capacity for thought, planning, evaluation and so on, over an unbounded range, and then
you have a totally new organism (Chomsky 2004, 48).
Thus, let us assume that some individual in the lineage to modern humans underwent a genetic change such that some neural
circuit(s) were reorganized to support the capacity for recursion.
This in turn provided the capacity for thought and so on over an
unbounded range. This in itself provided that individual and any
offspring with a selective advantage that then spread through the
group. Thus, the earliest stage of language would have been just
that: a language of thought, used internally (Chomsky 2005a, 6).
At some later stage, there was an advantage to externalization, so
that the capacity would be linked as a secondary process to the
sensorimotor system for externalization and interaction, including communication (7).
The evolutionary scenario just outlined is derived from design
principles suggested by work on human languages. Many other
evolutionary scenarios have been proposed that assume that
communication or other social factors played a more primary
role. These include accounts involving manual and facial gestures (Corballis 2002), protolanguage (Bickerton 1996), grooming (Dunbar 1998), and action and motor control (Rizzolatti
and Arbib 1998). However, the two kinds of accounts are not
incompatible and may represent different aspects or stages in
the evolution of language. We cannot review all of these research directions here (but see the discussion of population dynamics in a later section; Nowak and Komarova 2001).
A number of these accounts attempt to interconnect the evolution of manual gestures, sign language, spoken language, and
motor control in various ways. Some of this work is based on the
discovery of a system of mirror neurons (mirror systems,
imitation, and language) (Gallese, Fadiga, et al. 1996). This
and later work demonstrated the existence of neurons in area F5
of the premotor cortex of the monkey, which are activated when
the monkey executes an action, for example, grasping an object,
and also when the monkey observes and recognizes the action
carried out by another monkey, or even the experimenter. In
addition, a subset of audiovisual mirror neurons was discovered that is activated when the sound of an action is perceived,
for example, the tearing of paper (Kohler et al. 2002). In addition to hand mirror neurons, communicative mouth mirror
neurons were discovered that were activated both for ingestive
actions and for mouth actions with communicative content,
such as lip smacking in the monkey (Ferrari et al. 2003).
Since it had been suggested on the basis of cytoarchitectonic
studies that there was a homology between area F5 of the monkey and area 44 (in Broca's area) of the human brain (Petrides
and Pandya 1994), researchers looked for and found mirror
neuron systems in humans, using fMRI (Iacoboni et al. 1999)
and event-related magnetoencephalography (MEG) (Nishitani
and Hari 2000) in place of single neuron studies. Mirror neurons discharge whether the action is executed, observed, or
heard. Moreover, they even discharge in the human system
when subjects are exposed to syntactic structures that describe
goal-directed actions (Tettamanti et al. 2005). In an fMRI study,
17 Italian speakers were asked to listen to sentences describing
actions performed with the mouth (I bite an apple), with the
hand (I grasp a knife), and with the leg (I kick the ball). In
addition, they were presented a control sentence with abstract
content (I appreciate sincerity). In the case of the action-related words, the left-hemispheric fronto-parieto-temporal
network containing the pars opercularis of the inferior frontal
gyrus (Broca's area) was activated. Other areas were differentially activated, depending on the body part. They conclude that
the experiment showed that the role of Broca's area was in the
access to abstract action representations, rather than in syntactic
processing per se (277).
On the basis of these findings, it has been suggested that speech may have evolved from gesture rather than
from vocal communication by utilizing the mirror neuron system (Rizzolatti and Arbib 1998; Gentilucci and Corballis 2006;
Fogassi and Ferrari 2007). Leonardo Fogassi and Pier Francesco
Ferrari note that the motor theory of speech perception fits well
with an account in terms of mirror neurons in that the objects
of speech perception are the intended phonetic gestures of the
speaker, represented in the brain as invariant motor commands
(Liberman and Mattingly 1985, 2). However, Fogassi and Ferrari
note that, even if the mirror neuron system is involved in speech,
the currently available evidence does not appear to support a
dedicated mirror-neuron system for language in humans.
Additional accounts of the evolution of language may also
be found in Maynard Smith and Szathmáry (1998), Lieberman
(2006), and Christiansen and Kirby (2003). See also adaptation;
grooming, gossip, and language; mirror systems, imitation, and language; evolutionary psychology;
morphology, evolution and; origins of language;
phonology, evolution of; pragmatics, evolution and;
primate vocalizations; semantics, evolution and;
speech anatomy, evolution of; syntax, evolution of;
verbal art, evolution and.

THE COMPARATIVE APPROACH IN BIOLINGUISTICS


Throughout the modern era of biolinguistics, a question that has
been much debated is to what degree the faculty of language
is uniquely human. Marc D. Hauser and colleagues (2002) have
stressed the importance of the comparative approach in the
study of this question. Early on, Chomsky, in making a case for
the genetic determination of language, used arguments from
animal behavior (ethology) to note the similarities in learning between birdsong and human language, in particular, rapidity of learning, underdetermination of data, and so on (Chomsky 1959).
Hauser, Chomsky, and their colleagues have emphasized a
number of methodological points concerning the comparative
approach, using a distinction between the faculty of language
in the broad sense (FLB) and the faculty of language in the narrow sense (FLN). The basic idea is that before concluding that
some property of language is uniquely human, one should study
a wide variety of species with a wide variety of methods. And
before concluding that some property of language is unique to
language, one should consider the possibility that the property
is present in some other (cognitive) domain, for example, music
or mathematics. They tentatively conclude that recursion may
be a property of language that is unique to language and, if so,
belongs to the faculty of language in the narrow sense.
An example of the application of the comparative method is
the investigation of the computational abilities of nonhuman primates by W. Tecumseh Fitch and Hauser (2004), who tested the
ability of cotton-top tamarins, a New World primate species, as
well as human controls, to process different kinds of grammars.
Using a familiarization/discrimination paradigm, they found
that the tamarins were able to spontaneously process finite-state
grammars, which generate strings of syllables of the form (AB)^n,
such as ABAB, ABABAB. However, they were unable to process
context-free grammars, which generate strings of syllables of the
form A^nB^n, such as AABB, AAABBB. It is known from formal language theory (the Chomsky hierarchy) that context-free grammars are more powerful than finite-state grammars. Moreover,
the humans tested were able to rapidly learn either grammar.
The authors conclude that the acquisition of hierarchical processing ability, that is, the ability to learn context-free grammars, may have represented a critical juncture in the evolution
of the human language faculty (380).
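The formal contrast between the two grammar types can be made concrete with a small sketch. This is illustrative code only (the actual experimental stimuli were spoken syllable sequences, not letter strings): a regular expression, that is, a finite-state device, suffices for (AB)^n, whereas recognizing A^nB^n requires matching the number of As against the number of Bs, which exceeds finite-state power:

```python
import re

def is_ab_n(s):
    """Recognize (AB)^n strings such as ABAB; a finite-state grammar
    (here, a regular expression) is sufficient."""
    return re.fullmatch(r"(AB)+", s) is not None

def is_a_n_b_n(s):
    """Recognize A^nB^n strings such as AAABBB; this requires counting
    the As and matching them against the Bs."""
    n = len(s) // 2
    return len(s) % 2 == 0 and n > 0 and s == "A" * n + "B" * n

print(is_ab_n("ABABAB"), is_a_n_b_n("ABABAB"))  # True False
print(is_ab_n("AAABBB"), is_a_n_b_n("AAABBB"))  # False True
```

The counting step in the second recognizer is precisely what no finite-state device can perform for unbounded n.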
In a later study, Timothy Q. Gentner and colleagues (2006)
showed that European starlings, in contrast to tamarins, can
recognize acoustic patterns generated by context-free grammars. Using an operant conditioning paradigm, they found that
9 of 11 starlings were able to learn both finite-state grammar and
context-free grammar sequences accurately. The (AB)^n and A^nB^n
sequences in this case were made up of acoustic units (motifs)
from the song of the starlings. In this case, A corresponded to a rattle and B to a warble. As for possible reasons why the
starlings succeeded with the context-free task while the tamarins
failed, and for alternative explanations of the learning results, see
additional discussion in Fitch and Hauser (2004), M. Liberman
(2004), Gentner et al. (2006), and Hauser et al. (2007).

LANGUAGE AND THE NEUROSCIENCES


Turning to the neurosciences (see questions 1–3), with the advent of more tools, both experimental (such as imaging and the methods of comparative genomics) and theoretical (such as computational theories of linguistics), we can look forward to more informative models of language and the brain. David
Poeppel and Gregory Hickok (2004) note three problems with
classical models of language (involving such concepts as Broca's area and Wernicke's area): 1) They are inadequate for aphasic
symptomatology, 2) they are based on an impoverished linguistic model, and 3) there are anatomical problems.
As for aphasic symptomatology, the classical models do not
explain certain subtypes of aphasia, like anomic aphasia. Also,
clusters of aphasic symptoms are highly variable and dissociable, indicating that there is a more complex architecture underlying the syndromes.
As for the second problem, there has been a tendency of some
implementations of classical models to incorporate a monolithic
picture of linguistic components, for example, production versus
comprehension, or semantics versus syntax, without regard for
finer computational subdivisions.
And finally, certain anatomical problems came into view. For
example, Broca's aphasia and Wernicke's aphasia did not always
correspond to damage in the areas that bore their names. In addition, these classical areas were not always anatomically or functionally homogeneous. Moreover, areas outside these regions
were found to be implicated in language processing, including,
for example, the anterior superior temporal lobe, the middle
temporal gyrus (MTG), the temporo-parietal junction, the
basal ganglia, and many right hemisphere homologs
(Poeppel and Hickok 2004, 5). As noted, in the last few decades
many new approaches and tools have been developed in neurolinguistics and neurology, genetics and molecular neurobiology
(examples follow), and these have helped to overcome the kinds
of issues pointed out by Poeppel and Hickok.
For a review of some of the attempts to develop a new functional anatomy of language, see the essays published along
with Poeppel and Hickok (2004) in Cognition 92 (2004). For
new approaches to the study of Brocas area, see Grodzinsky and
Amunts (2006).

LEFT–RIGHT ASYMMETRIES OF THE LANGUAGE AREAS


Ever since Paul Pierre Broca's and Carl Wernicke's seminal discoveries of areas involved in language processing, questions about asymmetries in the brain have been a lively area of research. As early as 1892, Daniel J. Cunningham reported that he found the left Sylvian fissure longer in the chimpanzee and macaque.
Cunningham, in turn, cites earlier work by Oskar Eberstaller on
the Sylvian fissure in humans, who had concluded that it was
longer in the left hemisphere than in the right (on average). He
postulated that this region held the key to what he called the sensible Sprachcentrum (sensible/cognizant language center).
Claudio Cantalupo and William D. Hopkins (2001) report
finding an anatomical asymmetry in Broca's area in three great
ape species. They obtained magnetic resonance images (MRI)
(neuroimaging) from 20 chimpanzees (P. troglodytes), 5
bonobos (P. paniscus), and 2 gorillas (G. gorilla). In humans,
Brodmann's area 44 corresponds to part of Broca's area within
the inferior frontal gyrus (IFG). This area is larger in the left
hemisphere than in the right. Furthermore, it is known that
Broca's area is vital for speech production (with the qualifications discussed earlier).
Although the great apes were known to have a homolog of
area 44 on the basis of cytoarchitectonic and electrical stimulation studies, no left–right anatomical asymmetry had been
shown. Cantalupo and Hopkins found a pattern of morphological asymmetry similar to that found in the homologous area in
humans. This would date the origin of the asymmetry in the anatomical substrate for speech production to at least five million years ago.
Since the great apes exhibit only primitive vocalizations, these
authors speculate that this area might have subserved a gestural
system (see earlier discussion). They note the presence in monkeys of mirror neurons in area 44 that subserve the imitation of
hand grasping and manipulation (Rizzolatti and Arbib 1998).
They also observe that captive great apes have a greater right-hand bias when gesturing is accompanied by vocalization. Hence
in the great apes, the asymmetry may have subserved the production of gestures accompanied by vocalizations, whereas, for
humans, this ability was selected for the development of speech
systems, accompanied by the expansion of Brodmann's area 45
(which, along with Brodmann's area 44, makes up Broca's area)
(Cantalupo and Hopkins 2001, 505).
However, additional studies of Brodmann's area 44 in African great apes (P. troglodytes and G. gorilla) call into question whether the techniques used in Cantalupo and Hopkins's study were sufficient to demonstrate the left–right asymmetry.
Chet C. Sherwood and colleagues (2003) found considerable
variation in the distribution of the inferior frontal sulci among
great ape brains. They also constructed cytoarchitectural maps of
Brodmann's area 44, examining myeloarchitecture and immunohistochemical staining patterns. When they studied the IFG
of great ape brains, they found a poor correspondence between
the borders observed in the cytoarchitectural maps and the borders in the surface anatomy (e.g., sulcal landmarks). There were
similar findings for human brains in an earlier study (Amunts et
al. 1999). Sherwood and colleagues conclude that in the study
by Cantalupo and Hopkins, it is unlikely that the sulci used to
define the pars opercularis coincided with the borders of cytoarchitectural area 44 (2003, 284). In general, then, macrostructure is a poor predictor of microstructure.
Sherwood and colleagues also point out that even if humanlike asymmetries of the inferior frontal gyrus and of the planum
temporale are confirmed, these gross asymmetries will not suffice to explain the unique neural wiring that supports human
language (284). To that end, comparative studies of microstructure in humans and great apes are needed. For example, a computerized imaging program was used to examine minicolumns

in a region of the planum temporale in human, chimpanzee,


and rhesus monkey brains. It was found that only human brains
exhibited asymmetries in minicolumn morphology, in particular, wider columns and more neuropil space (Buxhoeveden et al.
2001).
It is possible that circuits could be reorganized within a language region without a significant volumetric change so that a
novel function in language could evolve. Sherwood and colleagues conclude: "Therefore, it is likely that Brodmann's area 44 homolog in great apes, while similar in basic structure to that in humans, differs in subtle aspects of connectivity and lacks homologous function" (284).
Allen Braun (2003) notes that MRI could still turn out to be
useful for the study of microstructure at higher field strengths,
with the addition of MR contrast agents, and with the use of
diffusion-weighted MR methods. He also notes that the pars
orbitalis has often been arbitrarily excluded from the definition
of Broca's area, and might be important in the search for antecedents of language in nonhuman primates. In particular, some
studies suggest that the pars orbitalis is selectively activated by
semantic processes (as opposed to phonological or syntactic
processes) (Bookheimer 2002).
It is known that nonhuman primates have structures homologous to the perisylvian areas involved in human language, that is, areas that support both expressive and receptive language (Galaburda
and Pandya 1983; Deacon 1989). Ricardo Gil-da-Costa and colleagues (2006) presented species-specific vocalizations in rhesus
macaques and found that the vocalizations produced distinct
patterns of brain activity in areas homologous to the perisylvian language areas in humans using H2(15)O positron emission
tomography (PET). Two classes of auditory stimuli were presented to the monkeys. One was species-specific macaque
vocalizations (coos and screams). As a control, nonbiological sounds were presented that matched the species-specific
vocalizations in frequency, rate, scale, and duration. They found,
for example, a greater response to species-specific calls than to
nonbiological sounds in the perisylvian system with homologs
in humans, for example, to the area Tpt of the temporal planum
and to the anterior perisylvian cortex, roughly corresponding to
the areas studied by Wernicke and Broca in humans. However,
they did not find any clear lateralization effects in the macaque
brain comparable to the anatomical and functional asymmetries
documented in humans and anatomical asymmetries in apes
(Gannon et al. 1998).
Gil-da-Costa and colleagues (2006) note that the perisylvian
regions are not performing linguistic computations in the
macaque, but could be performing a prelinguistic function in
associating the sound and meaning of species-specific vocalizations. Furthermore, this would position the perisylvian system to be recruited for use during the evolution of language.
More specifically, it may have been exapted during the emergence of more complex neural mechanisms that couple sound
and meaning in human language (2006, 1070). Although I have
focused here on the perisylvian system, it should be emphasized
that areas outside this system have also been demonstrated to be
involved in language.
K. A. Shapiro and colleagues (2006) provide another example
of the application of imaging studies to investigate how linguistic
categories like nouns, verbs, and adjectives are organized in the
brain. An event-related functional MRI study has found
specific brain sites that are activated by either nouns or verbs,
but not both. In a series of experiments, subjects were asked to
produce nouns and verbs in short phrases as real words (the
ducks, he plays), as well as pseudowords (the wugs, he zibs), both
with regular inflections and irregular inflections (geese, wrote),
including both concrete and abstract words. Specific brain
areas were selectively activated for either verb production (left
prefrontal cortex and left superior parietal lobule) or for noun
production (left anterior fusiform gyrus) across the entire battery of tests. Moreover, the areas were nonoverlapping, leading
the authors to conclude that these regions are involved in representing core conceptual properties of nouns and verbs (2006,
1644).
In recent years, it has become possible to study asymmetries
on a molecular level as well (Sun and Walsh 2006). As discussed
earlier, there are functional, anatomical, and cytoarchitectonic
differences between the left and right cerebral hemispheres in
humans. To determine what the molecular basis for these asymmetries might be, Tao Sun and colleagues (2005) compared left
and right embryonic cerebral hemispheres for left–right differences in gene expression, using serial analysis of gene expression (SAGE). They discovered 27 genes whose transcripts
were differentially expressed on the left and right sides. In particular, the gene LMO4, which encodes the LIM Domain Only 4 transcription factor, is more highly expressed in
the perisylvian regions of the right hemisphere than in the left
at 12 weeks and 14 weeks. Further studies are needed to determine how LMO4 expression is regulated by factors still earlier in
development.
Mouse cortex was also examined, and it was found that
although Lmo4 expression was moderately asymmetrical in every
individual mouse brain, the expression was not consistently lateralized to either the left or the right. This may be related to the
fact that asymmetries like paw preference are seen in individual
mice but are not biased in the population as a whole, as hand
preference is in humans.
The results of this study are also consistent with the observation that the genes involved in visceral asymmetries (e.g., of the
heart) are not measurably implicated in cerebral asymmetries. It
had been noted earlier that situs inversus mutations in humans
do not appear to interfere with the left-hemisphere localization of
language and handedness (Kennedy et al. 1999). In these earlier
studies, it had been found that the pattern of language lateralization in patients with situs inversus was identical to that found
in 95 percent of right-handed individuals with normal situs. It
was concluded that the pathway affecting language dominance
and handedness was most likely distinct from that affecting the
asymmetry of the visceral organs.

GENETICS AND SPEECH DISORDERS


Biolinguists would like to understand the wiring of networks
underlying language function at the level of genes. We have seen
that one way to study this question is to use such differential gene
expression methods as SAGE. Another key way of investigating
the genetics of language is by studying language disorders (see
dyslexia, specific language impairment, and autism
and language). By understanding how genetic changes can
cause the operation of language to break down, we get an idea of
the genes that are important for language acquisition, processing, and, ultimately, language evolution.
An autosomal-dominant and monogenic disorder of speech
and language was found in the KE family with a 3-generation
pedigree. A monogenic disorder involves a mutation in a single
gene, and here the individuals have one copy of the mutant gene
and one normal gene on two autosomal (non-sex) chromosomes.
The disorder was mapped to the region 7q31 on chromosome 7,
and it was shown that the gene (called FOXP2) had a mutation
in the forkhead domain of the protein it encoded in the affected
family (Lai et al. 2001).
The individuals were diagnosed as having developmental verbal dyspraxia. The phenotype was found to be quite complex,
affecting orofacial sequencing, articulation, grammar, and cognition, and is still incompletely understood and under investigation (see also genes and language; Jenkins 2000).
The FOXP2 gene was found to code for a transcription factor,
that is, a protein that regulates gene expression by turning a gene
on or off or otherwise modulating its activity. It is natural to ask
what other genes FOXP2 may regulate and how it regulates these
genes (turning them on or off, for example), as well as to determine whether any of these genes downstream are involved in
speech and language in a more direct way.
To find the gene targets of FOXP2 in the brain and to determine the effects on those genes, methods are being developed to
identify these neural targets both in vitro and in vivo (Geschwind
and Miller 2001). The laboratory of D. H. Geschwind has developed a genomic screening approach that combines 1) chromatin
immunoprecipitation and 2) microarray analysis (ChIP-Chip).
In chromatin immunoprecipitation, an antibody that recognizes
the protein of interest (e.g., FOXP2) is used to fish out a protein–DNA complex. The DNA is then hybridized to arrays with DNA
from thousands of human genes. This allows the identification of
binding sites for transcription factors (in this case, FOXP2). The
goal is to discover potential gene candidates involved in the development of neural circuits supporting speech and language.
The homologue to the human FOXP2 gene has been discovered in a number of different species, including mice (Foxp2) and songbirds, such as the zebra finch (FoxP2). Whatever one's
views on the relationship between human language and other
animal communication systems, it is important to study the evolutionary origin of genes that affect language, such as FOXP2, for
one can learn about the neural pathways constructed by these
genes, which might not otherwise be possible in experiments
with humans.
It has been found that the zebra finch and human protein sequences are 100 percent identical within the DNA-binding
domain, suggesting a possible shared function (White et al. 2006;
Haesler et al. 2004). In addition, Constance Scharff and Sebastian
Haesler (2005) report that the FoxP2 pattern of expression in the
brain of birds that learn songs by imitation resembles that found
in rodents and humans. In particular, FoxP2 is expressed in the
same cell types, for example, striatal medium spiny neurons.
Moreover, FoxP2 is expressed both in the embryo and in the
adult. To find out whether FoxP2 is required for song behavior in

Explaining Language
the zebra finch, the Scharff laboratory is using RNA interference to downregulate FoxP2 in Area X, a striatal region important for song learning
(Scharff and Nottebohm 1991). It is known that young male zebra
finches express more FoxP2 bilaterally in Area X when learning
to sing (Haesler et al. 2004). They will then be able to determine
whether the bird is still able to sing normal song as well as copy
the song of an adult male tutor.
Weiguo Shu and colleagues (2005) found that disrupting the Foxp2 gene in mice impaired their ultrasonic vocalizations. In addition to their more familiar sonic vocalizations,
mice also make ultrasonic sounds, for example, when they are
separated from their mothers.
In order to study the effect of disrupting the Foxp2 gene on
vocalization, these researchers constructed two versions of
knockout mice. One version had two copies of the defective
Foxp2 gene (the homozygous mice) and the other version had
one copy of the defective Foxp2 gene, as well as one gene that
functioned normally (the heterozygous mice). The homozygous
mice (double knockout) suffered severe motor impairment,
premature death, and an absence of ultrasonic vocalizations
that are normally produced when they are separated from their
mother. On the other hand, the heterozygous mice, with a single working copy of the gene, exhibited modest developmental
delay and produced fewer ultrasonic vocalizations than normal.
In addition, it was found that the Purkinje cells in the cerebellum, responsible for fine motor control, were abnormal. The authors conclude that the findings support a role for Foxp2 in cerebellar development and in a developmental process that subsumes social communication functions (Shu et al. 2005, 9643).
Timothy E. Holy and Zhongsheng Guo (2005) studied the
ultrasonic vocalizations that male mice emit when they encounter female mice or their pheromones. They discovered that the vocalizations, which have frequencies ranging from 30 to 100 kHz, have some of the characteristics of song, for example, birdsong.
In particular, they were able to classify different syllable types
and found a temporal sequencing structure in the vocalizations.
In addition, individual males, though genetically identical, produced songs with characteristic syllabic and temporal structure.
These traits reliably distinguish them from other males. Holy
notes that these discoveries increase the attractiveness of mice
as model systems for study of vocalizations (White et al. 2006,
10378).
We have focused on the FOXP2 gene here because there is no
other gene affecting speech and language about which so much
information is available that bears on questions of neurology and
evolution of language. However, the search is underway for additional genes.
For example, genetics researchers have also discovered a
DNA duplication in a nine-year-old boy with expressive language
delay (Patient 1) (Fisher 2005; Somerville et al. 2005). Although
his comprehension of language was at the level of a seven-year-old child, his expression of language was comparable to that of only a two-and-a-half-year-old.
The region of DNA duplication in Patient 1 was found to be on chromosome 7 and, interestingly, to be identical to the region that is deleted in Williams-Beuren syndrome (WBS).
Patients with WBS have relatively good expressive language but
are impaired in the area of spatial construction. Lucy Osborne, one of the researchers on the study, noted that, in contrast, Patient 1 had normal spatial ability but could form next to no complete words. When asked what animal has long ears and eats carrots, he could only pronounce the "r" in the word "rabbit" but was able to draw the letter on the blackboard and add features such as whiskers (McMaster 2005).
The duplicated region on chromosome 7 contains around 27
genes, but it is not yet known which of the duplicate gene copies
are involved in the expressive language delay, although certain
genes have been ruled out.
A gene (SRPX2) responsible for rolandic seizures that are
associated with oral and speech dyspraxia and mental retardation has been identified (Roll et al. 2006). It is located on Xq22.
The SRPX2 protein is expressed and secreted from neurons in the human adult brain, including the rolandic area. This study characterizes two different mutations.
The first mutation was found in a patient with oro-facial dyspraxia and severe speech delay. The second mutation was found
in a male patient with rolandic seizures and bilateral perisylvian
polymicrogyria. The authors also note that Srpx2 is not expressed
during murine embryogenesis, suggesting that SPRX2 might play
a specific role in human brain development, particularly of the
rolandic and sylvian areas.

RECOVERING ANCIENT DNA


In addition to classical studies of fossils, there is currently
renewed interest in work on ancient DNA. The Neanderthals
(Homo neanderthalensis) were an extinct group of hominids, the closest relatives of modern humans (Homo sapiens). Up
to now, information about Neanderthals has been limited to
archaeological data and a few hominid remains. Comparing the
genetic sequences of Neanderthals and currently living humans
would allow one to pinpoint genetic changes that have occurred
during the last few hundred thousand years. In particular, one
would be able to examine and compare differences in such
genes as FOXP2 in living humans, Neanderthals, and nonhuman
primates.
Partial DNA sequences of the Neanderthal have now been
published by two groups led by Svante Pääbo and Edward Rubin (Green et al. 2006; Noonan et al. 2006). Pääbo's group identified a 38,000-year-old Neanderthal fossil from the Vindija cave
(the Neanderthals became extinct around 30,000 years ago). The
fossil was free enough from contamination to permit DNA to be
extracted and subjected to large-scale parallel 454 sequencing, a
newer and faster system for sequencing DNA.
Pääbo's group was able to sequence and analyze about one
million base pairs of Neanderthal DNA. Note that the genomes of
modern humans and Neanderthals each have about three billion
base pairs (3 gigabases). Among the conclusions they reached
on the basis of the comparison to human DNA was that modern
human and Neanderthal DNA sequences diverged on average
about 500,000 years ago. They also expected to produce a draft of
the Neanderthal genome within two years. (For their preliminary
results, see Pennisi 2009.)
Rubins group obtained around 65,000 base pairs of
Neanderthal sequence. They used a combination of the Sanger
method of DNA sequencing and the (faster) pyrosequencing


The Cambridge Encyclopedia of the Language Sciences


method. Although the Sanger method yields smaller amounts of
sequence than pyrosequencing, the error rate is lower. From the
sequence data they estimated that Neanderthals and humans
shared a most recent common ancestor around 706,000 years
ago. There has been interest in whether Neanderthals might
have contributed to the European gene pool. For example, one
study suggests that humans may have acquired the microcephalin gene, which regulates brain size during development, by
interbreeding with another species, possibly the Neanderthals.
However, the data here do not support this possibility, although
more sequence data will be needed to answer the admixture
question definitively (Evans et al. 2006). However, they did establish the validity of their sequencing approach, which allows for "the rapid recovery of Neanderthal sequences of interest from multiple independent specimens, without the need for whole-genome resequencing" (Noonan et al. 2006, 1118).

GENETICS AND EVOLUTION


The work on comparative animal studies and comparative neuroanatomy discussed earlier is being increasingly informed by the rapidly emerging field of comparative genomics. Work in
the field of evolutionary development (evo-devo) has provided
substantial support for the idea that gene regulation is key to
understanding evolution. Among the methods now available to
us to study the evolution of anatomy (including neural circuits)
and behavior at the genome level are comparative genomics,
gene-expression profiling, and population genetics analysis. We
have already seen an example of the profiling of gene expression
in the left and right cerebral hemispheres using serial analysis
of gene expression. Such methods have also been applied to
FOXP2.
As we noted earlier, the FOXP2 gene was discovered to code
for a transcription factor and is therefore involved in regulating
the expression of other genes. In general, transcription factors
are highly conserved. In this case, the human FOXP2 protein differs from the chimpanzee and gorilla sequences at two amino acids, and from the orangutan sequence at three amino acids. The human and mouse protein sequences differ at three amino acids.
The question arises whether these amino acid replacements are
of functional significance, that is, whether they played a crucial
role in the evolution of language.
A population genetic study of the FOXP2 locus concluded that it had been subjected to a selective sweep during the past 200,000 years, correlating closely with the estimated age of Homo
sapiens. However, Sean B. Carroll (2005) notes that one cannot
immediately conclude from the fact that the FOXP2 gene has
been a target of selection during human evolution that it is the
amino acid replacements just discussed that were the functionally important targets. Since the FOXP2 gene is 267 kilobases
in size, we should find more than 2,000 differences in DNA
sequence between chimpanzees and humans (assuming an
average base pair divergence of 1.2%). This means that there are
many more possibilities for functionally significant mutations
in noncoding regulatory areas than in coding regions (the
FOXP2 protein is made from coding regions, while the noncoding regions contain the regulatory information). It is, of course, difficult to discover the significance of nucleotide changes in noncoding regions, since one cannot determine their functional
significance by visual inspection. Nonetheless, until this information is available, there is no reason to favor the idea that the
two changes in the FOXP2 protein are functional. Carroll notes
that while it may be tempting to reach for the low-hanging fruit
of coding sequence changes, the task of unravelling the regulatory puzzle is yet to come (1165).
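Carroll's back-of-the-envelope figure can be checked directly; the sketch below simply multiplies the two numbers quoted above (locus size and average divergence) and is not taken from Carroll's paper.

```python
# Rough check of the estimate quoted above: a 267-kilobase locus,
# at an average human-chimpanzee base-pair divergence of 1.2%,
# should harbor on the order of 3,000 nucleotide differences --
# i.e., "more than 2,000".
locus_size_bp = 267_000   # FOXP2 locus, ~267 kb
divergence = 0.012        # average base-pair divergence, 1.2%

expected_differences = round(locus_size_bp * divergence)
print(expected_differences)  # → 3204
```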
In fact, some data favor the regulatory sequence hypothesis.
Carroll notes that in evaluating whether FOXP2 is involved in the
evolution of the neural circuits of language, one must ask several
questions. The first is: "Is the gene product used in multiple tissues?" (1164).
It is known that FOXP2 appears to act as a repressor in lung
tissue (Shu et al. 2001). Moreover, studies in mice have revealed
that in addition to Foxp2, two other transcription factors, Foxp1 and Foxp4, are expressed in pulmonary and gut tissues (Lu et al. 2002). Foxp4 is also expressed in neural tissues during embryonic development. It is not at all surprising to find that a gene
that may be involved in language development is also active in
nonlanguage areas. The reason for this is that transcription factors like FOXP2 often act in a combinatorial fashion in conjunction with other transcription factors in different ways in different
tissues.
As another example, suppose that a cell has three regulatory proteins and that with each cell division a new regulatory protein becomes expressed in one of the daughter cells but not the other. Since each protein can then be either present or absent in a given cell, just three regulatory proteins acting in combination suffice to produce eight (2 × 2 × 2) different cell types.
Combinatorial control is thought to be widespread as a means
of eukaryotic gene regulation. For those familiar with linguistic
models of language acquisition, it may help to think of parameter settings, whereby three binary parameters can specify eight
different language types with respect to some structural feature.
The idea, then, is that FOXP2 can work in conjunction with certain factors in the lung to repress gene activity in the epithelium,
whereas it might work together with other factors in the brain to
regulate genes there directly (or indirectly) involved in speech
and language.
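The combinatorial arithmetic in this example (and in the parameter-setting analogy) can be sketched in a few lines; this is a toy enumeration with placeholder regulator names, not a biological model.

```python
from itertools import product

# Three regulatory proteins (placeholder names), each either
# absent (0) or present (1) in a cell, jointly define a
# combinatorial code: 2 * 2 * 2 = 8 distinct cell types.
regulators = ["regA", "regB", "regC"]
cell_types = list(product([0, 1], repeat=len(regulators)))

print(len(cell_types))  # → 8
# Likewise, three binary parameters specify eight language types.
```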
The second question to ask is: "Are mutations in the coding sequence known or likely to be pleiotropic [i.e., causing multiple effects]?" (1164). It is well known that patients with the FOXP2 mutation have multiple defects in speech, orofacial praxis, language, and cognition. The third question to ask is: "Does the locus contain multiple cis-regulatory elements?" (ibid.). (Cis-elements
are regulatory elements located on the same nucleic acid strand
as the gene they regulate.) Again, since FOXP2 is expressed in
multiple areas in the brain and in other organs, this is a clear
indication that it does. Carroll concludes on this basis that regulatory sequence evolution is the more likely mode of evolution
than coding sequence evolution (ibid.).
Finally, Carroll notes that some of the data from experiments with birdsong learners and nonlearners also support the
idea of regulatory evolution. When FoxP2 mRNA and protein
levels during development were studied, a significant increase
of FoxP2 expression was found in Area X, a center required for
vocal learning. The increase occurred at a time when vocal learning in zebra finches was underway. Moreover, FoxP2 expression

levels in adult canaries varied with the season and correlated
with changes in birdsong. These facts suggest regulatory control
in development and in the adult brain.

EVOLUTION AND DYNAMICAL SYSTEMS


Another tool used in the study of language evolution is the theory
of dynamical systems (see also self-organizing systems)
(Nowak 2006). However, applications of dynamical systems to
language include studies not only of evolution of language but
of language change and language acquisition as well (Niyogi
2004, 2006). Martin A. Nowak and Natalia Komarova (2001,
292) assume for the case of language acquisition that Universal
Grammar (UG) contains two parts: 1) a rule system that generates a search space of candidate grammars and 2) a [learning] mechanism to evaluate input sentences and to choose one of the candidate grammars that are contained in his [the learner's] search space. One of the main questions to be determined is what the maximum size of the search space is such that a specific learning mechanism will converge (after a number of input sentences, with a certain probability) to the target grammar.
The question for language evolution, then, is what makes a population of speakers converge to a coherent grammatical system. A homogeneous population (all individuals have the same UG) is assumed. Nowak and Komarova derive a set of equations,
which they call the "language dynamics equations" (293), which
give the population dynamics of grammar evolution.
The equations represent the average payoff (for mutual understanding) for all those individuals who use a particular grammar, which contributes to biological fitness (the number of offspring they leave), and they include a quantity that measures how accurately offspring acquire the grammar of their parents. Another variable denotes the relative abundance of individuals who use a particular grammar. Still another denotes the average fitness, or grammatical coherence, of the population, "the measure for successful communication in a population" (293).
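In the notation of Nowak and Komarova (2001), these quantities fit together as follows; this is a sketch of the standard form of their language dynamics equations, with x_i the relative abundance of speakers of grammar G_i, F_ij the payoff for a speaker of G_i communicating with a speaker of G_j, Q_ij the probability that a child learning from a G_i speaker acquires G_j, and φ the average fitness (grammatical coherence):

```latex
% Language dynamics equations (after Nowak and Komarova 2001).
\begin{align*}
  f_i       &= \sum_{j=1}^{n} F_{ij}\, x_j
              && \text{(fitness of grammar } G_i\text{)} \\
  \phi      &= \sum_{i=1}^{n} f_i\, x_i
              && \text{(average fitness, i.e., grammatical coherence)} \\
  \dot{x}_j &= \sum_{i=1}^{n} f_i\, x_i\, Q_{ij} \;-\; \phi\, x_j,
  \qquad j = 1, \ldots, n
              && \text{(language dynamics equation)}
\end{align*}
```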
Nowak and Komarova use the language dynamics equations
to study the conditions under which UG will result in grammatical coherence. A number of factors can be varied in order to run
computer simulations: population size, assumptions about UG's search space, and assumptions about the learning mechanism
(e.g., memoryless or batch learning).
A similar kind of dynamical systems analysis has been proposed by Partha Niyogi for language change (2004, 58), for what he calls the emerging field of "population linguistics." Such concepts as symmetry and stability (stable and unstable equilibria)
are used in the study of the language dynamics equations. Niyogi
uses symmetric and asymmetric nonlinear dynamical models to
study lexical and syntactic change. Nowak and Komarova note
that dynamical systems analysis is compatible with a wide range
of different kinds of linguistic analysis and learning theories.
There are a number of other approaches to the study of language
evolution with dynamical systems and simulation, some of which
may be found in the suggestions for further reading (Christiansen
and Kirby 2003; Cangelosi and Parisi 2001; Lyon et al. 2006).
More than 25 years ago, Chomsky observed that "the study of the biological basis for human language capacities may prove to be one of the most exciting frontiers of science in coming years" (1980, 216). Not only has that proven to be the case, but with the explosion of knowledge in many areas, including (comparative) linguistics, comparative neuroanatomy, evolutionary development, and comparative genomics, to take just a few examples, biolinguistics promises to be a fascinating field for decades to come.
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Amunts, K., A. Schleicher, et al. 1999. Broca's region revisited: Cytoarchitecture and intersubject variability. Journal of Comparative Neurology 412: 319–41.
Bickerton, Derek. 1996. Language and Human Behavior. Seattle: University of Washington Press.
Boeckx, Cedric. 2006. Linguistic Minimalism. Oxford: Oxford University Press.
Bookheimer, Susan. 2002. Functional MRI of language: New approaches to understanding the cortical organization of semantic processing. Annual Review of Neuroscience 25 (March): 151–88.
Braun, Allen. 2003. New findings on cortical anatomy and implications for investigating the evolution of language. The Anatomical Record Part A: Discoveries in Molecular, Cellular, and Evolutionary Biology 271A (March): 273–75.
Buxhoeveden, D. P., A. E. Switala, et al. 2001. Lateralization of minicolumns in human planum temporale is absent in nonhuman primate cortex. Brain, Behavior and Evolution 57 (June): 349–58.
Cangelosi, Angelo, and Domenico Parisi, eds. 2001. Simulating the Evolution of Language. New York: Springer.
Cantalupo, Claudio, and William D. Hopkins. 2001. Asymmetric Broca's area in great apes. Nature 414 (November): 505.
Carroll, Sean B. 2005. Evolution at two levels: On genes and form. PLoS Biology 3 (July): 1159–66. Available online at: www.plosbiology.org.
Chomsky, Noam. 1959. A review of B. F. Skinner's Verbal Behavior. Language 35.1: 26–58.
———. 1980. On the biological basis of language capacities. In Rules and Representations, 185–216. New York: Columbia University Press.
———. 1995. The Minimalist Program. Cambridge, MA: MIT Press.
———. 2004. The Generative Enterprise Revisited: Discussions with Riny Huybregts, Henk Van Riemsdijk, Naoki Fukui and Mihoko Zushi. Berlin: Mouton de Gruyter.
———. 2005a. Some simple evo-devo theses: How true might they be for language? Paper presented at the Morris Symposium, SUNY at Stony Brook.
———. 2005b. Three factors in language design. Linguistic Inquiry 36.1: 1–22.
———. 2006. Language and Mind. Cambridge: Cambridge University Press.
Chomsky, N., and H. Lasnik. 1995. The theory of principles and parameters. In The Minimalist Program, N. Chomsky, 13–127. Cambridge, MA: MIT Press.
Christiansen, Morten H., and Simon Kirby, eds. 2003. Language Evolution.
New York: Oxford University Press.
Corballis, Michael C. 2002. From Hand to Mouth: The Origins of Language.
Princeton, NJ: Princeton University Press.
Cunningham, Daniel J. 1892. Contribution to the Surface Anatomy of the
Cerebral Hemispheres [Cunningham Memoirs]. Dublin: Royal Irish
Academy of Science.
Deacon, T. W. 1989. The neural circuitry underlying primate calls and
human language. Human Evolution 4 (October): 367–401.
Dunbar, Robin. 1998. Grooming, Gossip and the Evolution of Language.
Cambridge: Harvard University Press.
Evans, Patrick D., Nitzan Mekel-Bobrov, et al. 2006. Evidence that the
adaptive allele of the brain size gene microcephalin introgressed

into Homo sapiens from an archaic Homo lineage. PNAS 103 (November): 18178–83.
Ferrari, Pier Francesco, Vittorio Gallese, et al. 2003. Mirror neurons responding to the observation of ingestive and communicative mouth actions in the monkey ventral premotor cortex. European Journal of Neuroscience 17: 1703–14.
Fisher, S. E. 2005. On genes, speech, and language. New England Journal of Medicine 353.16: 1655–7.
Fitch, W. Tecumseh, and Marc D. Hauser. 2004. Computational constraints on syntactic processing in a nonhuman primate. Science 303 (January): 377–80.
Fogassi, Leonardo, and Pier Francesco Ferrari. 2007. Mirror neurons and the evolution of embodied language. Current Directions in Psychological Science 16: 136–41.
Galaburda, Albert M., and Deepak N. Pandya. 1983. The intrinsic architectonic and connectional organization of the superior temporal region of the rhesus monkey. Journal of Comparative Neurology 221 (December): 169–84.
Gallese, Vittorio, Luciano Fadiga, et al. 1996. Action recognition in the premotor cortex. Brain 119: 593–609.
Gannon, Patrick J., Ralph L. Holloway, et al. 1998. Asymmetry of chimpanzee planum temporale: Humanlike pattern of Wernicke's brain language area homolog. Science 279 (January): 220–2.
Gentilucci, Maurizio, and Michael C. Corballis. 2006. From manual gesture to speech: A gradual transition. Neuroscience and Biobehavioral Reviews 30: 949–60.
Gentner, Timothy Q., Kimberly M. Fenn, et al. 2006. Recursive syntactic pattern learning by songbirds. Nature 440 (April): 1204–7.
Geschwind, D. H., and B. L. Miller. 2001. Molecular approaches to cerebral laterality: Development and neurodegeneration. American Journal of Medical Genetics 101: 370–81.
Gil-da-Costa, Ricardo, Alex Martin, et al. 2006. Species-specific calls activate homologs of Broca's and Wernicke's areas in the macaque. Nature Neuroscience 9 (July): 1064–70.
Green, Richard E., Johannes Krause, et al. 2006. Analysis of one million base pairs of Neanderthal DNA. Nature 444 (November): 330–6.
Grodzinsky, Yosef, and Katrin Amunts, eds. 2006. Broca's Region. Oxford: Oxford University Press.
Haesler, Sebastian, Kazuhiro Wada, et al. 2004. FoxP2 expression in avian vocal learners and non-learners. Journal of Neuroscience 24 (March): 3164–75.
Hauser, Marc D., David Barner, et al. 2007. Evolutionary linguistics: A new look at an old landscape. Language Learning and Development 3.2: 101–32.
Hauser, Marc D., Noam Chomsky, and W. Tecumseh Fitch. 2002. The faculty of language: What is it, who has it, and how did it evolve? Science 298 (November): 1569–79.
Holy, Timothy E., and Zhongsheng Guo. 2005. Ultrasonic songs of male mice. PLoS Biology 3 (December): 2177–86. Available online at: www.plosbiology.org.
Iacoboni, Marco, Roger Woods, et al. 1999. Cortical mechanisms of human imitation. Science 286: 2526–8.
Jenkins, Lyle. 2000. Biolinguistics: Exploring the Biology of Language.
Cambridge: Cambridge University Press.
Jenkins, Lyle, ed. 2004. Variation and Universals in Biolinguistics.
Amsterdam: Elsevier.
Kegl, Judy. 2004. Language emergence in a language-ready
brain: Acquisition. In Variation and Universals in Biolinguistics, ed.
L. Jenkins, 195–236. Amsterdam: Elsevier.
Kennedy, D. N., K. M. O'Craven, et al. 1999. Structural and functional brain asymmetries in human situs inversus totalis. Neurology 53 (October): 1260–5.


Kohler, Evelyne, Christian Keysers, et al. 2002. Hearing sounds, understanding actions: Action representation in mirror neurons. Science 297: 846–8.
Lai, Cecilia S. L., Simon E. Fisher, et al. 2001. A forkhead-domain gene is mutated in a severe speech and language disorder. Nature 413 (October): 519–23.
Liberman, Alvin M., and Ignatius G. Mattingly. 1985. The motor theory of speech perception revised. Cognition 21: 1–36.
Liberman, Mark. 2004. Hi lo hi lo, It's off to formal language theory we go. Language Log (January 17). Available online at: http://itre.cis.upenn.edu/~myl/languagelog/archives/000355.html.
Lieberman, Philip. 2006. Toward an Evolutionary Biology of Language. Cambridge: Belknap Press of Harvard University Press.
Lu, Min Min, Shanru Li, et al. 2002. Foxp4: A novel member of the Foxp subfamily of winged-helix genes co-expressed with Foxp1 and Foxp2 in pulmonary and gut tissues. Mechanisms of Development 119, Supplement 1 (December): S197–S202.
Lyon, Caroline, Chrystopher L. Nehaniv, et al., eds. 2006. Emergence of Communication and Language. New York: Springer.
Maynard Smith, John, and Eörs Szathmáry. 1998. The Major Transitions in Evolution. New York: Oxford University Press.
McMaster, Geoff. 2005. Researchers discover cause of speech defect. University of Alberta ExpressNews (October 25). Available online at: http://www.expressnews.ualberta.ca.
Nishitani, N., and R. Hari. 2000. Temporal dynamics of cortical representation for action. Proceedings of the National Academy of Sciences USA 97: 9131–8.
Niyogi, Partha. 2004. Phase transitions in language evolution. In Variation and Universals in Biolinguistics, ed. L. Jenkins, 57–74. Amsterdam: Elsevier.
———. 2006. The Computational Nature of Language Learning and Evolution. Cambridge, MA: MIT Press.
Noonan, James P., Graham Coop, et al. 2006. Sequencing and analysis of Neanderthal genomic DNA. Science 314 (November): 1113–18.
Nowak, Martin A. 2006. Evolutionary Dynamics: Exploring the Equations of Life. Cambridge: Harvard University Press.
Nowak, Martin A., and Natalia L. Komarova. 2001. Towards an evolutionary theory of language. Trends in Cognitive Sciences 5 (July): 288–95.
Pennisi, Elizabeth. 2009. Neanderthal Genomics: Tales of a Prehistoric Human Genome. Science 323.5916: 866–71.
Petitto, Laura-Ann. 2005. How the brain begets language. In Cambridge Companion to Chomsky, ed. J. McGilvray, 84–101. Cambridge: Cambridge University Press.
Petrides, M., and D. N. Pandya. 1994. Comparative architectonic analysis of the human and the macaque frontal cortex. In Handbook of Neuropsychology, ed. F. Boller and J. Grafman, 17–58. Amsterdam: Elsevier.
Poeppel, David, and Gregory Hickok. 2004. Towards a new functional anatomy of language. Cognition 92 (May/June): 1–12.
Rizzolatti, Giacomo, and Michael A. Arbib. 1998. Language within our grasp. Trends in Neurosciences 21 (May): 188–94.
Roll, Patrice, Gabrielle Rudolf, et al. 2006. SRPX2 mutations in disorders of language cortex and cognition. Human Molecular Genetics 15 (April): 1195–1207.
Scharff, Constance, and Sebastian Haesler. 2005. An evolutionary perspective on FoxP2: Strictly for the birds? Current Opinion in Neurobiology 15 (December): 694–703.
Scharff, C., and F. Nottebohm. 1991. A comparative study of the behavioral deficits following lesions of various parts of the zebra finch song system: Implications for vocal learning. Journal of Neuroscience 11 (September): 2896–913.

Shapiro, K. A., L. R. Moo, et al. 2006. Cortical signatures of noun and verb production. Proceedings of the National Academy of Sciences USA 103 (January): 1644–49.
Sherwood, Chet C., Douglas C. Broadfield, et al. 2003. Variability of Broca's area homologue in African great apes: Implications for language evolution. The Anatomical Record Part A: Discoveries in Molecular, Cellular, and Evolutionary Biology 271A (March): 276–85.
Shu, Weiguo, Julie Y. Cho, et al. 2005. Altered ultrasonic vocalization in mice with a disruption in the Foxp2 gene. Proceedings of the National Academy of Sciences USA 102 (July 5): 9643–8.
Shu, W., H. Yang, et al. 2001. Characterization of a new subfamily of winged-helix/forkhead (fox) genes that are expressed in the lung and act as transcriptional repressors. Journal of Biological Chemistry 276.29: 27488–97.

Somerville, M. J., C. B. Mervis, et al. 2005. Severe expressive-language delay related to duplication of the Williams-Beuren locus. New England Journal of Medicine 353.16: 1694–701.
Sun, Tao, Christina Patoine, et al. 2005. Early asymmetry of gene transcription in embryonic human left and right cerebral cortex. Science 308 (June): 1794–8.
Sun, Tao, and Christopher A. Walsh. 2006. Molecular approaches to brain asymmetry and handedness. Nature Reviews Neuroscience 7 (August): 655–62.
Tettamanti, Marco, Giovanni Buccino, et al. 2005. Listening to action-related sentences activates fronto-parietal motor circuits. Journal of Cognitive Neuroscience 17: 273–81.
White, Stephanie A., Simon E. Fisher, et al. 2006. Singing mice, songbirds, and more: Models for Foxp2 function and dysfunction in human speech and language. Journal of Neuroscience 26 (October): 10376–9.


6
ACQUISITION OF LANGUAGE
Barbara Lust

Yet normally, by about three years of age, a child will have


attained the fundamental knowledge of an infinite combinatorial multileveled system as well as essential constraints on this
infinite system, no matter what the language, no matter what
the country or culture the child is born into, no matter what the
limitless contextual, environmental variations across children,
cultures, and countries. This mystery, and the paradox of constrained infinity, drives the field of language acquisition. How
is this fundamental knowledge acquired? How is it represented
and effected in the mind and brain?

THE FIELD OF LANGUAGE ACQUISITION

INTRODUCTION
How does the child, within the first few years of life, come to
understand and to produce Hop on Pop or Cat in the Hat
(Dr. Seuss)?
(1) We all play ball
Up on a wall
Fall off the Wall (Dr. Seuss [1963] 199l, 1013)
(2) We looked!
And we saw him!
The Cat in the Hat!
And he said to us (Dr. Seuss [1957] 1985, 6)

How does the child come to know the precise sound variations that distinguish hop and pop or wall and ball, the
multiple meanings of words like play or in, the reference
of pronouns like we or him, the meaning of quantifiers (see
quantification) like all, the functional categories determining
noun phrases like the, the idioms like play ball, the systematic order and structure necessary for even simple sentences,
the infinite recursion of sentences allowed by the coordinate
connective and, the infinite possibilities for propositions based
on the manipulation of even a few sound units and word units,
and the infinite possibilities for meaning and truth, which
Dr. Seuss continually demonstrated?
Even more stunning, how does the child both attain the infinite
set of possibilities and, at the same time, know the infinite set of
what is impossible, given this infinitely creative combinatorial system? Just as there is no limit to what is possible, there is no limit to
what is impossible in language, for example, (3), or an infinite set
of other impossible word and structure combinations, for example,
(4). This provides an essential paradox for language learners: How
do they both acquire the infinite possibilities and, at the same time,
the constraints on the infinite set of impossible expressions?
(3) *Ylay
*lfla
(4) *Play all we ball
*in cat the hat

In many ways, the study of language acquisition stands at the center of the language sciences, subsumes all of its areas, and
thus, perhaps, supersedes all in its complexity. It entails the
study of linguistic theory in order that the systems and complexities of the end state of language knowledge be understood,
so that the status of their acquisition can be evaluated. This
involves all areas of linguistics, in addition to phonetics: phonology, morphology, syntax, semantics, and pragmatics (cf. Chierchia 1999). It entails language typology
because acquisition of any specific language is possible, and
because this acquisition reflects any possible typological variation. Barbara Grimes (2000) proposes that between 6,000 and
7,000 languages now exist in the end state. The study of language
acquisition also entails the study of language typology because
language acquisition at any state of development represents a
form of manifestation of language knowledge, itself subject to
typological variation. It entails language change, because
the course of acquisition of language over time reveals changing
manifestations of language; language change must be related to
language acquisition, in ways still not understood (although see
Baron 1977, DeGraff 1999, and Crain, Goro, and Thornton 2006 for initial attempts to relate these areas; see also age groups).
It involves psycholinguistics because any particular manifestation of language during language acquisition is revealed
through the cognitive and psychological infrastructure of the
mind and brain during hearing and speaking in real time,
potentially within a visual modality as in sign languages,
thus always involving language processing integrated with language knowledge. It involves neurolinguistics, not only because
the knowledge and use of language must in some way relate
to its biological and neural instantiation in the brain, but also
because explanations of the gradual course of language acquisition over time must critically evaluate the role of biological
change. (See brain and language and biolinguistics.)
At the same time, the study of language acquisition also stands
at the center of Cognitive Science. It cannot be divorced from other
closely related disciplines within Cognitive Science. The relation
between language and thought is never dissociated in natural
language, thus crucially calling for an understanding of cognitive
psychology and philosophy; and the formal computation involved
in grammatical systems is not fruitfully divorced from computer
science. Any use of language involves possibilities for vagueness
and ambiguities in interpretation, given the particular pragmatic context of any language use. For example, how would one
fully understand the meaning of an ostensibly simple sentence like "She's going to leave" without understanding the particular context in which it is used? And there is an infinite set of possible contexts. The challenge of developing an infinitely
productive but constrained language system embedded in the
infinite possibilities for pragmatic determination of interpretation
remains a central challenge distinguishing machine language and
the language of the human species. All areas of cognitive science
must be invoked to study this area. Finally, especially with regard to
language acquisition in the child, developmental psychology must
be consulted critically. The child is a biologically and cognitively
changing organism. Indeed, an understanding of commonalities
and/or differences between language acquisition in child and adult
requires expertise in the field of developmental psychology.
In addition, the fact that the human species, either child or
adult, is capable not only of single language acquisition but also
of multilanguage acquisition, either simultaneously or sequentially, exponentially increases the complexity of the area (see
bilingualism and multilingualism). The fact that these
languages may vary in their parametric values (see parameters) and be either oral or visual further complicates the area.

THEORETICAL FOUNDATIONS
Not surprisingly, then, the field of research involving the study of
language acquisition is characterized by all the complexity and
variation in theoretical positions that characterize the field of linguistics and the language sciences in general. The area of language
acquisition research reflects varying approaches to the study of
grammar, that is, variations regarding viewpoints on what constitutes the end state of language knowledge that must be acquired,
for example, generative or functionalist approaches. At
the same time, it is characterized by variation in approach to the
study of language acquisition, in particular, ranging from various forms of logical to empirical analyses, and including disputes
regarding methodological foundations and varying attempts
at explanatory theories (ranging from rationalist to empiricist
types). (Lust 2006, Chapter 4, provides a review.)
The field is guided by a strong account of what is logically necessary for an explanatory theory of language acquisition, in the form of Noam Chomsky's early proposal for a language acquisition device (LAD). This model not only involved
explication of what needed to be explained but also spawned
decades of research either for or against various proposals bearing on components of this model. Yet, today, no comprehensive
theory of language acquisition exists, that is, no theory that would
fully account for all logically necessary aspects of a LAD.
At the same time, decades of research have now produced an
explosion of new discoveries regarding the nature of language
acquisition. These empirical discoveries, combined with theoretical advances in linguistic theory and with the development
of the interdisciplinary field of cognitive science, bring us today
to the verge of such a comprehensive theory, one with firm scientific foundations.

STRUCTURE OF THIS OVERVIEW


In this overview, I first briefly characterize and distill focal tensions in the field of language acquisition, briefly survey varying positions in the field on these focal issues, and then exemplify a range of research results, displaying crucial new discoveries in the field. I'll conclude by formulating leading questions for
the future. For those interested in pursuing these topics further, I
close with selected references for future inquiry in the field.

Focal Tensions
Classically, approaches to the study of language acquisition have
been categorized as nativist or empiricist (see innateness and
innatism). These approaches, which correspond generally to
rationalist or empiricist approaches to epistemology (see Lust
2006 for a review), have been typically associated with claims
that essential faculties responsible for language acquisition in
the human species involve either innate capabilities for language
or not. However, now the debates have become more restricted,
allowing more precise scientific inquiry. No current proposals, to our knowledge, suggest that nothing at all is innate (see Elman et al. 1996, for example). No one proposes that every aspect of
language knowledge is innate. For example, in the now-classic
Piaget-Chomsky Debate (see Piattelli-Palmarini 1980), Chomsky
did not deny that experience was necessary for a comprehensive
theory of language acquisition. On the other hand, Jean Piaget
proposed an essentially biological model of cognitive development and an antiempiricist theory, coherent with Chomsky's position in this way.
Rather, the current issue that is most central to the field of
language acquisition now is whether what is innate regarding
language acquisition involves components specific to linguistic
knowledge, for example, specific linguistic principles and
parameters, or whether more general cognitive knowledge or
learning principles, themselves potentially innate, can account
for this knowledge. This issue corresponds to the question of
whether organization of the mental representation of language
knowledge is modular or not. Proponents of the view that specifically linguistic factors are critical to an explanation of human
language acquisition generally work within a linguistic theory of a
modular language faculty, that is, a theory of universal grammar (UG), which is proposed to characterize the initial state
of the human organism, that is, the state of the human organism before experience. This current descendant of Chomsky's
LAD provides specific hypotheses (working within a generative grammar framework) regarding the identity of linguistic
principles and parameters that may be innately, or biologically,
determined in the human species (e.g., Anderson and Lightfoot
2002). Proponents of the alternative view generally assume a
functionalist theory and a model of cultural learning, that is,
a model where culture in some ways provides the structure of
language, potentially without specific linguistic principles, with
only general learning principles (e.g., Van Valin 1991; Tomasello
2003; Tomasello, Kruger, and Ratner 1993).

Positions in the Field


Debates in the field of language acquisition continue today
around issues of innateness and modularity (e.g., Pinker, 1994;
Tomasello 1995; Edelman 2008). Research in the field is often
polarized according to one paradigm or the other. Several specific issues focalize these debates. Two of the most pressing
currently concern the nature of development, and the nature of
the child's use of language input.
Critically, all approaches must confront the empirical fact
that children's language production and comprehension develop over time. Language acquisition is not instantaneous. The question is whether children's language knowledge also develops over time; if it does, does it change qualitatively, and if so, what causes the change? This question of
development is parallel to questions regarding cognitive development in general, where stage theories are debated against
continuity theories. Positions in the field today vary in several
ways regarding their understanding of the nature of development in language acquisition. Even among researchers working
within a rationalist paradigm that pursues universal grammar as
a specifically linguistic theory of a human language faculty, there
is disagreement. The theory of UG does not a priori include an
obvious developmental component. Positions in the field can be
summarized as in (i)–(iv).
(i) Some propose that essential language knowledge is fundamentally innate, and do not centrally address the issue of
what change in language knowledge may occur during language
development (e.g., Crain 1991). In this view, apparent cases of
language knowledge change in a young child's production or comprehension are often mainly attributed to methodological failures, for example, involving the researcher's choice of specific tasks for testing children's knowledge. The view of language
development is characterized by the following: "On the account we envision, children's linguistic experience drives children through an innately specified space of grammars, until they hit upon one that is sufficiently like those of adult speakers around them, with the result that further data no longer prompts further language change" (Crain and Pietroski 2002, 182). (Compare
LAD and a recent proposal by Yang 2006.) On this view, it would
appear that grammars are predetermined and available for
choice. Presumably, specific lexicons, as well as other peripheral aspects of language knowledge, would be exempt from this
claim of predetermined knowledge. The range and nature of the "innately specified space of grammars" would have to be explicated as this framework develops.
(ii) Some propose that UG itself develops over time in a manner that is biologically driven, that is, through maturation. Major
aspects of language development, for example late acquisition of
the passive construction (such as The ball was thrown by Mary
in English), are attributed to biologically determined changes in
UG (e.g., Wexler 1999; Radford 1990). This maturation is proposed
to be the major explanation of language development. Specific
language grammars (SLGs) are not, for the most part, treated as
essential independent components of the language acquisition
challenge, but often as simply triggered by UG parameters. Any
learning that would be involved for the child in language development would involve merely peripheral aspects of language,
with the nature of the periphery yet to be specified.
(iii) Some propose that UG is continuous, but UG is interpreted as involving just those principles and parameters that constrain and guide the language acquisition process. SLG develops
over time through experience with data from a specific language
and through the child's active construction of it. (See examples of a proposal for grammatical mapping in Lust 1999, 2006 and Santelmann et al. 2002.) The task of language acquisition goes
beyond the periphery of grammatical knowledge and beyond
mere triggering of parameters; it lies in constructive synthesis of
a specific language grammatical system, which is constrained by
UG but not fully determined by it.
(iv) Some propose that UG is irrelevant to language acquisition; the process can be described by an alternative mechanism,
critically involving some form of usage-based learning (e.g.,
Elman et al. 1996; Tomasello 2003, 2005): "In general, the only fully adequate accounts of language acquisition are those that give a prominent role to children's comprehension of communicative function in everything from words to grammatical morphemes to complex syntactic constructions" (Tomasello 2005, 183).
Within the proposals of (i)–(iii), which all work within a generative UG framework, (i) and (iii) propose a strong continuity hypothesis
of language development, although they differ crucially in what
is claimed to be continuous. In (iii), it is only the essential set of
principles and parameters of UG constituting the initial state
that is proposed to be biologically programmed and to remain
constant over development, while in (i), it is comprehensive
grammatical knowledge, with no distinction made between UG
and SLG. While a maturational approach such as in (ii) would
appear to maintain the premise of biological programming of
language knowledge, and thus be consistent with a theory of
UG, it raises several critical theoretical and empirical issues that
are still unresolved. For example, theoretically, the question
arises: What explains the change from one UG state to another, if
this determination is not programmed in UG itself? Empirically,
in each area where a maturational account has been proposed,
further advanced research has often revealed grammatical competence that was thought to be missing in early development,
for example, early knowledge of functional categories. (See Lust
1999 and Wexler 1999 for debate.)
All proposals must now confront the fundamental developmental issues: what actually changes during language development and why. Those within the generative paradigm must
sharpen their vision of what in fact constitutes UG, and those
outside of it must sharpen their view of how infinitely creative but
infinitely constrained grammatical knowledge can be attained
on the basis of communicative function alone. All proposals
must be accountable to the wide and cumulative array of empirical data now available. (See, for example, "Discoveries" in the next section.)
Researchers are also opposed in their view of how the child
uses input data. (See Valian 1999 for a review.) All approaches
must confront this fundamental area, whether the UG framework is involved (e.g., in explaining how the environment triggers parameter setting) or whether an empiricist framework is
involved (where the mechanism of imitation of individual words
and/or utterances may be viewed as an essential mechanism of
data processing and grammar development for the child, e.g.,
Tomasello 1995, 2002, 2005). Indeed, the very nature of imitation
of language stimuli is under investigation (e.g., Lust, Flynn, and
Foley 1996). In each case, the mechanism proposed must reliably account for the child's mapping from these input data to the specific knowledge of the adult state, thus solving what has been called the "logical problem of language acquisition," and it must also be empirically verified, that is, be a veridical account of how
the child actually does use surrounding data. Those approaching the logical problem of language acquisition are formally
diagnosing the properties of syntactic and semantic knowledge
that must be acquired, and assessing varied formal learnability
theories that may possibly account for the child's mapping from actual input data to this knowledge (e.g., Lightfoot 1989). (See
also projection principle.)
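The triggering idea mentioned above can be made concrete with a toy. The following is a hypothetical Python sketch in the spirit of trigger-based parameter-setting accounts (all names, the one-parameter grammar, and the flip-on-failure rule are my illustrative assumptions, not a claim about any proposal in this article): a learner with a single binary head-direction parameter flips its value whenever the current grammar cannot generate an input sentence.

```python
# Toy trigger-style parameter learner (hypothetical sketch, not from the source).
# One binary parameter (head direction) defines the grammar; the learner flips
# it whenever the current grammar fails on an input "sentence" (a V/O word list).

def generates(head_initial, sentence):
    # Head-initial grammar: verb precedes object; head-final: object precedes verb.
    verb_first = sentence.index("V") < sentence.index("O")
    return head_initial == verb_first

def learn(input_sentences, head_initial=False):
    for s in input_sentences:
        if not generates(head_initial, s):
            head_initial = not head_initial   # triggered parameter flip
    return head_initial

# English-like (VO) input drives the learner to the head-initial setting:
print(learn([["V", "O"], ["V", "O"], ["V", "O"]]))  # True
```

Even this toy makes the independence of the logical and empirical questions visible: the flip rule converges on these idealized data, but whether children actually use input this way is a separate, empirical matter.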

Discoveries
Fortunately, not only have the last decades seen continuous
theoretical developments in areas of linguistics regarding the
nature of language knowledge, and continuous sharpening
of the debate on the nature of language development, but the
field of language acquisition has also produced a wealth of new
empirical discoveries, all of which promise to inform the crucial
questions and debates in the field, and to eventuate in a more
comprehensive theory of language acquisition than has yet been
available. I'll exemplify only some of these here, all of which are
foci of intense current research. (See Lust 2006 for a more comprehensive review.)
THE FIRST 12 MONTHS. Informed by developed methodologies
for investigating infant perception of language, we now know
that neonates show early sensitivities to language variations
and categorize these variations (e.g., Ramus et al. 2000; Mehler
et al. 1996, 1988). For example, given varying sound stimuli from
Japanese and varying sound stimuli from English speech, newborn infants distinguish variation from Japanese to English significantly more than variation within either English or Japanese,
including speaker variation. In this sense, the infant is seen to
be categorizing variation across languages. (See infantile
responses to language.) More specifically, we also know
that formal aspects of language (phonetic, phonological, and
syntactic) begin to develop at birth, even before language comprehension, and provide the foundations for the appearance of
overt language production and comprehension in the child at
about 12 months of age. Now we also know something of how
that development proceeds even during the first 12 months of
life.
In the area of acquisition of phonology, we have discovered
the very fine, precise, and extensive phonetic sensitivities of the
newborn, sensitivities that appear to be exactly the right ones
to underlie all potential cross-linguistic phonological contrasts,
for example, contrasts in voicing or place and manner features
that characterize speech sounds. Once again, these sensitivities
are categorical (as in categorical perception; see speech perception in infants). We know that by about 6 months, these
sensitivities reveal appropriate cross-linguistic selection and
modulation, and by 12 months, this process is nicely attuned to
the child's first productive words of their specific language (e.g.,
Werker 1994; see Jusczyk 1997 and de Boysson-Bardies 1999 for
reviews). For example, while the 12-month-old infant acquiring Hindi will have maintained sensitivity to Hindi contrasts in
aspiration, the infant acquiring English will show diminished
response to such distinctions, which are not linguistically contrastive in English. Although labials appear in first words across languages, the proportion of labials in both late babbling and first words will reflect the input of labials in the adult language
being acquired (de Boysson-Bardies, Vihman, and Vihman
1991).
In fact, children are picking out words as formal elements from
the speech stream even before they understand them. For example, 8-month-olds who were exposed to stories read to them for
10 days (30 minutes of prerecorded speech, including three short
stories for children) during a two-week period, two weeks later
distinguished lists of words that had appeared in these stories,
for example, (5a), from those which had not, for example, (5b)
(Jusczyk and Hohne 1997).
(5) a. sneeze
elephant
python
peccaries
b. aches
apricot
sloth
burp

Given the age of the children and the nature of the words
tested, it is clear that children are engaged both in word segmentation and in long-term storage of these formal elements, even
without their semantic content. Moreover, like the acquisition of
phonology, children's sensitivities to the language-specific structure of words begin to show refinement after the sixth month.
American 9-month-olds, though not 6-month-olds, listened
longer to English words than to Dutch words, while Dutch
9-month-old infants showed the reverse preference (Jusczyk
et al. 1993).
Simultaneously, and in parallel, in the area of syntax, infants
have also been found to be laying the formal foundations for
language knowledge even within the first 12 months. They are
carving out the formal elements that will form the basis for the
syntax of the language they are acquiring. Precise sensitivities
to linear order, as well as to constituent structure, have
been discovered in these first few months of life. For example,
infants as young as 4 months of age have been found to distinguish natural well-formed clause structure, like (6a) from
non-natural ones, like (6b), in stories read to them (where /
represents clause breaks through pauses, and the stimuli are
matched in semantic and lexical content) (Hirsh-Pasek et al.
1987).
(6) a. Cinderella lived in a great big house/ but it was sort of dark/
because she had this mean, mean, mean stepmother
b. in a great big house, but it was/ sort of dark because she
had/ this mean

Experimental research has begun to reveal how infants accomplish this discrimination in the speech stream, suggesting that the mapping of prosodic phrasing to linguistic units may
play an important role.
Again, sensitivities become more language specific after the
sixth month. Although 6-month-olds did not distinguish natural (e.g., 7a) and non-natural phrasal (verb phrase) structures
(e.g., 7b), that is, phrasal constituents smaller than the clause,
9-month-olds did:

(7) a. The little boy at the piano/ is having a birthday party
b. The little boy at the piano is having/ a birthday party

In all these cases, phonological and syntactic development reduces neither to simple loss of initial sensitivities nor to simple accrual or addition of new ones, but rather reflects the gradual integration of a specific language grammatical system.
More recently, research has begun to reveal even more precisely how these early sensitivities are related to later language
development, thus foreshadowing a truly comprehensive theory
of language acquisition (Newman et al. 2006; Kuhl et al. 2005).
BEYOND FIRST WORDS: LINGUISTIC STRUCTURE. Continuous with
advances in our understanding of early infant development, we
are now also seeing a potential revolution in our understanding of the early stages of overt language acquisition, that is,
those periods within the first three years of life, where the child
is beginning to overtly manifest language knowledge in terms
of language production and comprehension. Child language in
these early periods has traditionally been referred to as "holophrastic" or, a little later, "telegraphic" (see two-word stage)
in nature. Even in these early periods, numerous studies are now
revealing children's very early sensitivity to functional categories
(FCs) in language, that is, to grammatical elements that function
formally to a large degree, often with little or no semantic content. These functional categories play a critical role in defining
constituent structure in language, and in defining the locus of
syntactic operations. Thus, the evidence that infants and toddlers are accessing these FCs in their early language knowledge
begins to provide crucial data on the foundations for linguistic
systems in the child. (Such FCs are reflected in different ways across languages. In English they are reflected in determiners such as the, auxiliary verbs like do, complementizers like that, or inflection of verbs, for example.)
Early research had revealed that young children perceive and
consult functional categories, such as determiners, even when
they are not overtly producing them regularly in their natural
speech. For example, L. Gerken, B. Landau, and R. Remez (1990)
showed that 2-year-olds recognize the distinction between
grammatical and ungrammatical function words, contrasting
these, for example (8a) and (8b), in their elicited imitation of
these sentences. Gerken and Bonnie McIntosh (1993) showed
that 2-year-olds used this knowledge in a picture identification
task, discriminating between (9a) and (9b), where grammatical
forms facilitated semantic reference.
(8) a. Pete pushes the dog
b. Pete pusho na dog
(9) a. Find the bird for me
b. Find was bird for me

More recently, a wide range of experimental researchers working with expanded infant testing methods have replicated these
results, and also revealed similar functional category distinctions
even in younger children. For example Yarden Kedar, Marianella
Casasola, and Barbara Lust (2006) showed that infants as young
as 18 months also distinguish sentences like (9a) and (9b) in a
preferential looking task, and again, object reference is facilitated by the grammatical form. Precursors to these functional category sensitivities appear to be available even within the first 12 months (e.g., Shady 1996; Demuth 1994).
Contrary to the widespread view that the contentful lexicon
(involving nouns and verbs) is the privileged basis for acquisition
of syntax in early language acquisition, these results are beginning to suggest that, in fact, functional categories are fundamental (see lexical learning hypothesis).
PRINCIPLES AND PARAMETERS. Principles and parameters that
are hypothesized to provide the linguistic content of UG and
of the human language faculty provide leading hypotheses for
language acquisition research. (See principles and parameters theory and language acquisition.) Investigators
continue to search not only for theoretical motivation for such
principles and parameters but also for empirical evidence of the
role of UG-based principles and parameters in early language
acquisition. A wide range of empirical research has now accrued
in this paradigm, paralleling theoretical developments (e.g.,
Snyder 2007; Crain and Lillo-Martin 1999; Guasti 2002; Roeper
2007; Lust 2006, among others). This research reveals a wide
array of evidence regarding very fundamental linguistic principles, including the central and perhaps most fundamental UG
principle of structure dependence:
The rules operate on expressions that are assigned a certain structure in terms of a hierarchy of phrases of various types. (Chomsky
1988, 45)

Evidence for this fundamental linguistic principle has been adduced in studies of children's acquisition of several areas,
including various types of question formation (e.g., Crain and
McKee 1985; Crain and Nakayama 1987; deVilliers, Roeper, and
Vainikka 1990), empty category and pronoun interpretation (e.g.,
Cohen Sherman and Lust 1993; Núñez del Prado, Foley, and Lust
1993; Lust and Clifford 1986), and quantifier scope (e.g., Chien
1994).
Results from young children at early stages of development
across languages have shown that they very early distinguish
coordinate and subordinate structures and that they differentiate syntactic processes in these different structures accordingly.
For example, in English, they very early distinguish sentences like
(10a) and (10b) in both comprehension and production (Cohen
Sherman and Lust 1993):
(10) a. [The turtle tickles the skunk] and [0 bumps the car]
b. Tom [promises/tells Billy [0 to eat the ice cream cone]]

Chinese children differentiate coordinate and embedded structures and differentiate subjects and topics (see topic
and comment) in Chinese accordingly (Chien and Lust 1985).
Across English, Japanese, and Sinhala, children differentiate possibilities for anaphora according to the embedded or adjoined
structure in which proforms appear (Lust and Clifford 1986;
Oshima and Lust 1997; Gair et al. 1998; Eisele and Lust 1996).
In general, very early linguistic knowledge, including knowledge of language involving diminished content (where direct overt phonetic information is not available), is attested in studies of children's acquisition of sentences with ellipses. For example, in sentences such as (11), young children have been found
to reveal not only competence for empty category interpretation


(in the "does too" clause, where the second clause does not
state what Bert did), but also competence for construction of the
multiple interpretations allowed in this ambiguous structure (as in 11a–d), and constraint against the ungrammatical possibilities (as in 11e–i). (See Foley et al. 2003 for an example.) In other
words, they evidence early control of and constraints on empty
category interpretation and on other forms of ellipsis. Studies
of Chinese acquisition show similar results (Guo et al. 1996).
Without structure dependence and grammatical computation
over abstract structure underlying such sentences (e.g., reconstructing the verb phrase [VP] in the second clause), children
would not be expected to evidence this competence (see Foley
et al. 2003). All interpretations are pragmatically possible.
(11) Oscar bites his banana and Bert does too.
Possible Interpretations:
a. O bites O's banana and B bites B's banana.
b. O bites O's banana and B bites O's banana.
c. O bites B's banana and B bites B's banana.
d. O bites E's banana and B bites E's banana.
Impossible Interpretations:
*e. O bites O's banana and B bites E's banana.
*f. O bites B's banana and B bites O's banana.
*g. O bites B's banana and B bites E's banana.
*h. O bites E's banana and B bites O's banana.
*i. O bites E's banana and B bites B's banana.
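The pattern of readings in (11) can be derived mechanically. The following is a hypothetical Python sketch (the encoding and names are mine, not the source's): if the elided VP must be a copy of the antecedent VP, with the pronoun either held constant ("strict") or re-bound to the new subject ("sloppy," available only when the pronoun is bound by the first subject), then exactly the four attested readings are generated and the five starred ones are excluded.

```python
# Hypothetical sketch (not from the source): enumerating strict/sloppy readings
# of VP ellipsis in "Oscar bites his banana and Bert does too."
# A reading is a pair (owner of banana in clause 1, owner in clause 2).

SUBJ1, SUBJ2 = "O", "B"          # Oscar, Bert
REFERENTS = ["O", "B", "E"]      # E = some external (extra-sentential) referent

def readings():
    possible = set()
    for his in REFERENTS:                # resolve "his" in the antecedent VP
        possible.add((his, his))         # strict: the elided VP copies the referent
        if his == SUBJ1:                 # bound-variable reading only
            possible.add((his, SUBJ2))   # sloppy: re-bind to the second subject
    return possible

all_combinations = {(x, y) for x in REFERENTS for y in REFERENTS}
possible = readings()
impossible = all_combinations - possible

print(sorted(possible))    # the four attested readings (11a-d)
print(sorted(impossible))  # the five starred readings (11e-i)
```

The point of the sketch is the one made in the text: without computation over an abstract reconstructed VP, nothing rules out the five remaining combinations, all of which are pragmatically sensible.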

As is the case in this study of VP ellipsis acquisition, empirical research results in the Principles and Parameters framework
mutually inform theoretical development; they contribute to and
help to resolve theoretical debates on the representation of this
area of linguistic knowledge.
Evidence for early parameter setting with regard to linear
order in natural language is now provided by a wide range of
cross-linguistic studies showing very early language-specific
sensitivity to the clausal head direction and directionality of
RECURSION in the language being acquired (e.g., Lust, in preparation; Lust and Chien 1984). Very early differentiation of the
pro drop (i.e., argument omission, wherein subjects need not be overtly expressed) possibilities of a language have been attested
across languages (e.g., Italian and English, Spanish and English)
(Valian 1991; Austin et al. 1997, after Hyams 1986). In fact, children have been found to critically consult the complementizer
phrase, and the subordinate structure domain in order to make
this differentiation (Núñez del Prado, Foley, and Lust 1993).
FROM DATA TO GRAMMAR. Our understanding of how the infant,
from birth, consults the surrounding speech stream is now also
expanding quickly. Research has revealed very fine sensitivities
to particular aspects of the speech stream (e.g., Saffran, Aslin,
and Newport 1996), and research has begun to isolate the precise role of particular cues in this process, for example, STRESS,
phonotactic constraints, and statistical distributions (e.g., Johnson
and Jusczyk 2001). Various forms of bootstrapping may be available to the child (phonological, prosodic, or syntactic bootstrapping, for example). Here, bootstrapping generally refers to the
process by which the child, in the initial state, might initiate new
learning by hanging on to some aspect of the input that it can
access. Research is beginning to provide evidence on how and
when these forms of bootstrapping may work in the young child
(e.g., the work of Gleitman 1990 and others on syntactic bootstrapping and its role in early lexical learning and the collection of
papers in Morgan and Demuth 1996). Precise hypotheses are now
being formed regarding the mechanisms by which certain parameters may be set very early by the infant, even before the first word,
by consulting prosodic and other aspects of the speech stream
(Mazuka 1996).
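The statistical-distribution cue mentioned above can be made concrete. The following is a minimal sketch of the transitional-probability computation investigated by Saffran, Aslin, and Newport (1996); the function name is the author's own, and the toy syllable stream is merely modeled on the study's pseudo-word stimuli, not taken from them:

```python
# Transitional probability (TP) between adjacent syllables:
#   TP(y | x) = count(x followed by y) / count(x).
# Syllable pairs inside a word recur consistently (high TP), while
# pairs spanning a word boundary vary (lower TP) -- one statistical
# cue an infant could exploit to segment words from fluent speech.
from collections import Counter

def transitional_probabilities(syllables):
    pair_counts = Counter(zip(syllables, syllables[1:]))
    first_counts = Counter(syllables[:-1])
    return {(x, y): n / first_counts[x] for (x, y), n in pair_counts.items()}

# A continuous stream built from three three-syllable pseudo-words
# (bidaku, padoti, golabu), presented with no pauses between them.
stream = "bi da ku pa do ti go la bu bi da ku go la bu pa do ti".split()
tps = transitional_probabilities(stream)

print(tps[("bi", "da")])  # word-internal pair: TP = 1.0
print(tps[("ku", "pa")])  # pair spanning a word boundary: TP = 0.5
```

A learner tracking these probabilities can posit word boundaries wherever TP dips, without any prior lexicon.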
CROSS-SPECIES COMPARATIVE METHOD. Advances in cross-species comparisons now provide an additional dimension to the
study of language acquisition, allowing refinement of our specification of what is particularly human and of what is particularly
linguistic about human acquisition of language (e.g., Hauser,
Chomsky, and Fitch 2002; Call and Tomasello 2007; Ramus et al.
2000). For example, comparative studies with cotton-top tamarin
monkeys revealed certain capabilities in this species to discriminate language stimuli (e.g., Dutch and Japanese) that were comparable to those of human infants (Ramus et al. 2000). This implied that
a general auditory process was involved in the discrimination. In
contrast, other processes discovered in early acquisition of phonology have been found not to generalize. (See, for example, the work
of Kuhl et al. 2005.)
RESILIENCE. Finally, the tremendous resilience of the language
acquisition feat in the face of varying input, including dearth of
input, has been revealed through important work on young deaf
children's spontaneously created home sign (Goldin-Meadow
2003). The role of community in sign language creation is also
revealed through in-depth studies of the creation of Nicaraguan
Sign Language in young children (Senghas 1995; Kegl, Senghas,
and Coppola 1999).

Toward the Future


FIRST AND SECOND LANGUAGE IN CHILD AND ADULT. Current and
future studies that seek to triangulate the language faculty by
precise comparative studies between child first language acquisition and adult second language acquisition, and between child monolingual
first language acquisition and multilanguage acquisition promise
to be able to dissociate factors related to biological maturation,
universal grammar, and specific language grammar in ways not
achievable by studies of first language acquisition alone (e.g.,
Flynn and Martohardjono 1994; Yang and Lust 2005).
PRAGMATIC AND COGNITIVE DEVELOPMENT AND GRAMMATICAL
DEVELOPMENT. Studies of the integration of children's developing
pragmatic knowledge with their grammatical knowledge have only
begun; yet this integration characterizes every aspect of a child's
use of language (cf. Clark 2003). Such studies may critically inform
our understanding of children's early syntax development (e.g.,
Blume 2002). Similarly, studies involving interactions between
general cognition and specific aspects of linguistic knowledge (e.g.,
Gentner and Goldin-Meadow 2003) will be critical for resolving
issues of modularity in language acquisition. (See constraints
in language acquisition.)

61

The Cambridge Encyclopedia of the Language Sciences


A NEUROSCIENCE PERSPECTIVE AND BEYOND. Current advances
in brain imaging methods (such as fMRI) and neurophysiological measures (such as EEG [electroencephalography] and MEG
[magnetoencephalography]), and in genome mapping (Greally
2007) promise new discoveries regarding fundamental issues
still open in the field of language acquisition: How are language
knowledge and language acquisition represented in the brain?
(See neuroimaging and genes and language). More precisely, what is the content of the initial state and how is it biologically represented? How is development in language knowledge
either determined by biological changes or a cause of them?
There have been many neuroscientific results regarding language
in the adult state (e.g., Friederici 2002; Binder and Price 2001;
see syntax, neurobiology of; phonetics and phonology,
neurobiology of; and semantics, neurobiology of, for
example), including multilingual language (e.g., Paradis 2000; see
bilingualism, neurobiology of). Research on brain development has also begun to yield fundamental descriptive results
(e.g., Casey, Galvan and Hare 2005). Nevertheless, it appears that
the issue of the language-brain relationship during early development is a terra incognita. The issue of how brain development
and cognitive development in the area of language development
co-occur in early development and over a lifetime will be one of
the key issues in the coming decades of the third millennium
(Friederici 2000, 66). Advances in the description of brain development (e.g., Almli et al. 2007) have not yet been related precisely
to advances in language acquisition.
Advances in this area will require not only technology applicable to a young and changing brain but also advances in the
interdisciplinary field of cognitive science. They will depend
on the development of strong linguistic theory, but also on the
development of a theory that maps the elements of linguistic theory to the activation of language in the brain. At present, there is
still no theory of how an NP relates to a neuron (Marshall 1980)
or how an abstract linguistic principle like structure dependence
or a particular linguistic constraint could be represented at a
neural level. Most imaging results to date provide evidence on
the processing of language, rather than on the knowledge of, or
representations of, language per se.
METHODOLOGY. Advances in this area will also require advancement on issues of methodology: Simply put, we need to understand, precisely, how our behavioral methodologies reflect on,
and interact with, the language processes we are attempting to
study and define (Swinney 2000, 241). Advancement in this
area is a prerequisite to advancement in our understanding of
language processing, language knowledge, and their interaction
in language acquisition, as well as in the interpretation of brain
imaging results. Many persistent sources of disagreement in
the field of language acquisition depend on resolution of issues
related to methodology (e.g., Crain and Wexler 1999; Lust et al.
1999), including those surrounding new infant-based methods
related to preferential looking.
LINGUISTIC THEORY AND LANGUAGE ACQUISITION. Finally, the
study of language acquisition will be advanced when linguistic
studies of the knowledge that must underlie the adult state of
language knowledge, no matter what language, are brought into
line more fully with studies of the language acquisition process.
(See Baker 2005 for an example of an argument for this integration.) Many current disputes regarding the
fundamentals of language acquisition cannot be resolved until
disputes regarding the nature of the adult state of language
knowledge are further resolved, and until the fields of Linguistics
and Psychology are more fully integrated in the field of Cognitive
Science.

WORKS CITED AND SUGGESTIONS FOR FURTHER READING


Aitchison, Jean. 2007. The Articulate Mammal. 5th ed. London: Routledge.
Almli, C. R., M. J. Rivkin, R. C. McKinstry, and Brain Development
Cooperative Group. 2007. The NIH MRI study of normal brain development (Objective-2): Newborns, infants, toddlers, and preschoolers.
NeuroImage 35: 308–25.
Anderson, Stephen R., and David W. Lightfoot. 2002. The Language
Organ. Cambridge: Cambridge University Press.
Austin, Jennifer, Maria Blume, David Parkinson, Zelmira Núñez del
Prado, Rayna Proman, and Barbara Lust. 1997. The status of prodrop in the initial state: Results from new analyses of Spanish. In
Contemporary Perspectives on the Acquisition of Spanish. Vol. 1:
Developing Grammars, ed. Anna Perez-Leroux and W. Glass, 35–51.
Boston: Cascadilla Press.
Baker, Mark C. 2005. Mapping the terrain of language learning.
Language Learning and Development 1.1: 93–129.
Baron, Naomi. 1977. Language Acquisition and Historical Change.
Amsterdam: North Holland.
Binder, Jeffrey R., and Cathy J. Price. 2001. Functional neuroimaging of
language. In Handbook of Functional Neuroimaging of Cognition, ed.
Roberto Cabeza and Alan Kingstone, 187–251. Cambridge, MA: MIT
Press.
Blake, Joanna. 2000. Routes to Child Language: Evolutionary and
Developmental Precursors. Cambridge: Cambridge University Press.
Blume, Maria. 2002. Discourse-morphosyntax in Spanish non-finite
verbs: A comparison between adult and child grammars. Unpublished
Ph.D. diss., Cornell University.
Boysson-Bardies, Bénédicte de. 1999. How Language Comes to
Children: From Birth to Two Years. Cambridge, MA: MIT Press.
Boysson-Bardies, Bénédicte de, and Marilyn M. Vihman. 1991.
Adaptation to language: Evidence from babbling and first words in
four languages. Language 67.2: 297–319.
Call, Josep, and Michael Tomasello. 2007. The Gestural Communication
of Apes and Monkeys. Mahwah, NJ: Lawrence Erlbaum.
Casey, B. J., Adriana Galvan, and Todd Hare. 2005. Changes in cerebral functional organization during cognitive development. Current
Opinion in Neurobiology 15: 239–44.
Chien, Yu-Chin. 1994. Structural determinants of quantifier scope: An
experimental study of Chinese first language acquisition. In Syntactic
Theory and First Language Acquisition: Cross-Linguistic Perspectives.
Vol. 2: Binding, Dependencies, and Learnability, ed. B. Lust, G. Hermon,
and J. Kornfilt, 391–416. Mahwah, NJ: Lawrence Erlbaum.
Chien, Yu-Chin, and Barbara Lust. 1985. The concepts of topic and
subject in first language acquisition of Mandarin Chinese. Child
Development 56: 1359–75.
Chierchia, Gennaro. 1999. Linguistics and language. In The MIT
Encyclopedia of the Cognitive Sciences, ed. R. Wilson and F. Keil, xci–cix. Cambridge, MA: MIT Press.
Chomsky, Noam. 1988. Language and Problems of Knowledge.
Cambridge, MA: MIT Press.
Clark, Eve V. 2003. First Language Acquisition. Cambridge: Cambridge
University Press.

Acquisition of Language
Cohen Sherman, Janet, and Barbara Lust. 1993. Children are in control.
Cognition 46: 1–51.
Crain, Stephen. 1991. Language acquisition in the absence of experience. Behavioral and Brain Sciences 14.4: 597–650.
Crain, Stephen, Takuya Goro, and Rosalind Thornton. 2006. Language
acquisition is language change. Journal of Psycholinguistic Research
35: 31–49.
Crain, Stephen, and Diane Lillo-Martin. 1999. An Introduction to Linguistic
Theory and Language Acquisition. Malden, MA: Basil Blackwell.
Crain, Stephen, and Cecile McKee. 1985. Acquisition of structural restrictions on anaphors. In Proceedings of the Sixteenth Annual Meeting
of the North Eastern Linguistics Society, ed. S. Berman, J. Chloe, and
J. McDonough, 94–110. Montreal: McGill University.
Crain, Stephen, and Mineharu Nakayama. 1987. Structure dependence
in grammar formation. Language 63: 522–43.
Crain, Stephen, and Paul Pietroski. 2002. Why language acquisition is a
snap. Linguistic Review 19: 163–84.
Crain, Stephen, and Kenneth Wexler. 1999. Methodology in the study
of language acquisition: A modular approach. In Handbook of First
Language Acquisition, ed. William C. Ritchie and Tej K. Bhatia,
387–426. San Diego, CA: Academic Press.
DeGraff, Michel. 1999. Creolization, language change and language
acquisition: A prolegomenon. In Language Creation and Language
Change, ed. M. DeGraff, 1–46. Cambridge, MA: MIT Press.
Demuth, Katherine. 1994. On the underspecification of functional categories in early grammars. In Syntactic Theory and First Language
Acquisition: Cross-Linguistic Perspectives. Vol. 1. Ed. B. Lust, M. Suner,
and J. Whitman, 119–34. Hillsdale, NJ: Lawrence Erlbaum.
deVilliers, Jill, Tom Roeper, and Ann Vainikka. 1990. The acquisition of long-distance rules. In Language Processing and Language
Acquisition, ed. Lynn Frazier and Jill deVilliers, 257–98. Dordrecht, the
Netherlands: Kluwer.
Edelman, Shimon. 2008. Computing the Mind: How the Mind Really
Works. Oxford: Oxford University Press.
Eisele, Julie, and Barbara Lust. 1996. Knowledge about pronouns: A
developmental study using a truth-value judgment task. Child
Development 67: 3086–100.
Elman, Jeffrey L., Elizabeth A. Bates, Mark H. Johnson, Annette
Karmiloff-Smith, Domenico Parisi, and Kim Plunkett. 1996. Rethinking
Innateness. Cambridge, MA: MIT Press.
Flynn, Suzanne, and Gita Martohardjono. 1994. Mapping from the initial state to the final state: The separation of universal principles and
language specific principles. In Syntactic Theory and First Language
Acquisition: Cross Linguistic Perspectives, ed. Barbara Lust, Magui
Suner, and John Whitman, 319–36. Hillsdale, NJ: Lawrence Erlbaum.
Foley, Claire, Zelmira Núñez del Prado, Isabella Barbier, and Barbara
Lust. 2003. Knowledge of variable binding in VP-ellipsis: Language
acquisition research and theory converge. Syntax 6.1: 52–83.
Friederici, A. D. 2000. The developmental cognitive neuroscience of language: A new research domain. Brain and Language 71: 65–8.
. 2002. Towards a neural basis of auditory sentence processing.
Trends in Cognitive Sciences 6.2: 78–84.
Gair, James, Barbara Lust, Lewala Sumangala, and Milan Rodrigo. 1998.
Acquisition of null subjects and control in some Sinhala adverbial
clauses. In Studies in South Asian Linguistics: Sinhala and Other South
Asian Languages, 271–85. Oxford: Oxford University Press.
Gentner, Dedre, and Susan Goldin-Meadow, eds. 2003. Language in
Mind. Cambridge, MA: MIT Press.
Gerken, L., B. Landau, and R. Remez. 1990. Function morphemes in
young children's speech perception and production. Developmental
Psychology 26: 204–16.
Gerken, Louann, and Bonnie McIntosh. 1993. Interplay of function morphemes and prosody in early language. Developmental Psychology
29.3: 448–57.

Gleitman, Lila. 1990. The structural sources of verb meanings. Language
Acquisition 1.1: 3–55.
Goldin-Meadow, Susan. 2003. The Resilience of Language: What Gesture
Creation in Deaf Children Can Tell Us About How All Children Learn
Language. New York: Psychology Press.
. 2005. What language creation in the manual modality tells us
about the foundations of language. Linguistic Review 22: 199–226.
Greally, John M. 2007. Encyclopaedia of humble DNA. Nature
447: 782–3.
Grimes, Barbara, ed. 2000. Ethnologue: Languages of the World.
Dallas: SIL International.
Guasti, Maria Teresa. 2002. Language Acquisition: The Growth of
Grammar. Cambridge, MA: MIT Press.
Guo, F. F., C. Foley, Y.-C. Chien, B. Lust, and C.-P. Chiang. 1996.
Operator-variable binding in the initial state: A cross-linguistic
study of VP ellipsis structures in Chinese and English. Cahiers de
Linguistique Asie Orientale 25.1: 3–34.
Hauser, Marc, Noam Chomsky, and Tecumseh Fitch. 2002. The faculty of language: What is it, who has it, and how did it evolve? Science
298: 1569–79.
Hirsh-Pasek, Kathy, Diane Kemler-Nelson, Peter Jusczyk, K. Wright Cassidy, B. Druss, and L. Kennedy. 1987. Clauses are perceptual units
for young children. Cognition 26: 269–86.
Hirsh-Pasek, Kathy, and Roberta Michnick Golinkoff. 1996. The Origins of
Grammar. Cambridge, MA: MIT Press.
Hyams, Nina. 1986. Language Acquisition and the Theory of Parameters.
Dordrecht and Boston: Reidel.
Johnson, Elizabeth, and Peter Jusczyk. 2001. Word segmentation by
8-month-olds: When speech cues count more than statistics. Journal
of Memory and Language 44: 548–67.
Jusczyk, Peter. 1997. The Discovery of Spoken Language. Cambridge,
MA: MIT Press.
Jusczyk, P. W., A. D. Friederici, J. M. I. Wessels, V. Y. Svenkerud, and A. M.
Jusczyk. 1993. Infants' sensitivity to the sound patterns of native language words. Journal of Memory and Language 32: 402–20.
Jusczyk, Peter W., and Elizabeth A. Hohne. 1997. Infants' memory for
spoken words. Science 277: 1984–6.
Kedar, Yarden, Marianella Casasola, and Barbara Lust. 2006. Getting
there faster: 18- and 24-month-old infants' use of function words to
determine reference. Child Development 77.2: 325–38.
Kegl, Judith, Ann Senghas, and M. Coppola. 1999. Creation through contact: Sign language emergence and sign language change in Nicaragua.
In Language Creation and Language Change: Creolization, Diachrony
and Development, ed. M. DeGraff, 179–237. Cambridge, MA: MIT
Press.
Kuhl, Patricia, Barbara Conboy, Denise Padden, Tobey Nelson, and
Jessica Pruitt. 2005. Early speech perception and later language development: Implications for the critical period. Language Learning and
Development 1.3/4: 237–64.
Lightfoot, David. 1989. The child's trigger experience: Degree 0 learnability. Behavioral and Brain Sciences 12: 321–34.
Lust, Barbara. 1999. Universal grammar: The strong continuity hypothesis in first language acquisition. In Handbook of First Language
Acquisition, ed. William C. Ritchie and Tej K. Bhatia, 111–55. San
Diego, CA: Academic Press.
. 2006. Child Language: Acquisition and Growth. Cambridge:
Cambridge University Press.
. Universal Grammar and the Initial State: Cross-Linguistic Studies
of Directionality. In preparation.
Lust, Barbara, and Yu-chin Chien. 1984. The structure of coordination in
first language acquisition of Mandarin Chinese. Cognition 17: 49–83.
Lust, Barbara, and Teresa Clifford. 1986. The 3-D study: Effects of
depth, distance and directionality on children's acquisition of anaphora: Comparison of prepositional phrase and subordinate clause
embedding. In Studies of the Acquisition of Anaphora: Defining the
Constraints, ed. B. Lust, 203–44. Dordrecht, the Netherlands: Reidel
Press.
Lust, Barbara, Suzanne Flynn, and Claire Foley. 1996. What children
know about what they say: Elicited imitation as a research method.
In Methods for Assessing Children's Syntax, ed. Dana McDaniel, Cecile
McKee, and Helen Cairns. Cambridge, MA: MIT Press.
Lust, Barbara, Suzanne Flynn, Claire Foley, and Yu-Chin Chien. 1999. How
do we know what children know? Problems and advances in establishing scientific methods for the study of language acquisition and linguistic theory. In Handbook of First Language Acquisition, ed. William C.
Ritchie and Tej K. Bhatia, 427–56. San Diego, CA: Academic Press.
Marshall, John C. 1980. On the biology of language acquisition.
In Biological Studies of Mental Processes, ed. D. Caplan, 301–20.
Cambridge, MA: MIT Press.
Mazuka, Reiko. 1996. Can a grammatical parameter be set before the first
word? Prosodic contributions to early setting of a grammatical parameter. In Signal to Syntax: Bootstrapping from Speech to Grammar in
Early Acquisition, ed. James Morgan and Katherine Demuth, 313–30.
Hillsdale, NJ: Lawrence Erlbaum.
Mehler, Jacques, Emmanuel Dupoux, Thierry Nazzi, and G. Dehaene-Lambertz. 1996. Coping with linguistic diversity: The infant's viewpoint. In Signal to Syntax: Bootstrapping from Speech to Grammar in
Early Acquisition, ed. James L. Morgan and Katherine Demuth, 101–16.
Mahwah, NJ: Lawrence Erlbaum.
Mehler, Jacques, Peter Jusczyk, Ghislaine Lambertz, Nilofar Halsted,
Josiane Bertoncini, and Claudine Amiel-Tison. 1988. A precursor of
language acquisition in young infants. Cognition 29: 143–78.
Morgan, James L., and Katherine Demuth, eds. 1996. Signal to
Syntax: Bootstrapping from Speech to Grammar in Early Acquisition.
Mahwah, NJ: Lawrence Erlbaum.
Newman, Rochelle, Nan Bernstein Ratner, Ann Marie Jusczyk, Peter W.
Jusczyk, and Kathy Ayala Dow. 2006. Infants' early ability to segment
the conversational speech signal predicts later language development: A
retrospective analysis. Developmental Psychology 42.4: 643–55.
Núñez del Prado, Zelmira, Claire Foley, and Barbara Lust. 1993. The significance of CP to the pro-drop parameter: An experimental study of
Spanish-English comparison. In Proceedings of the Twenty-Fifth Child
Language Research Forum, ed. Eve Clark, 146–57. Stanford, CA: CSLI.
Oshima, Shin, and Barbara Lust. 1997. Remarks on anaphora in Japanese
adverbial clauses. In Papers on Language Acquisition: Cornell
University Working Papers in Linguistics, ed. Shamita Somashekar,
Kyoko Yamakoshi, Maria Blume, and Claire Foley. Ithaca, NY: Cornell
University Press.
Paradis, Michel. 2000. The neurolinguistics of bilingualism in the next
decades. Brain and Language 71: 178–80.
Piattelli-Palmarini, Massimo, ed. 1980. Language and Learning: The
Debate Between Jean Piaget and Noam Chomsky. Cambridge: Harvard
University Press.
Pinker, Steven. 1994. The Language Instinct. New York: William Morrow.
Radford, Andrew. 1990. Syntactic Theory and the Acquisition of English
Syntax. Cambridge: Cambridge University Press.
Ramus, Franck, Marc D. Hauser, Cory Miller, Dylan Morris, and Jacques
Mehler. 2000. Language discrimination by human newborns and by
cotton-top tamarin monkeys. Science 288.5464: 349–51.
Roeper, Tom. 2007. The Prism of Grammar: How Child Language
Illuminates Humanism. Cambridge, MA: MIT Press.
Saffran, Jenny R., Richard N. Aslin, and Elissa L. Newport. 1996. Statistical
learning by 8-month-old infants. Science 274: 1926–8.
Santelmann, Lynn, Stephanie Berk, Shamita Somashekar, Jennifer
Austin, and Barbara Lust. 2002. Continuity and development in the
acquisition of inversion in yes/no questions: Dissociating movement
and inflection. Journal of Child Language 29: 813–42.
Senghas, A. 1995. Children's Contribution to the Birth of Nicaraguan Sign
Language. Cambridge, MA: MIT Press.
Seuss, Dr. [1957] 1985. The Cat in the Hat. New York: Random House.
. [1963] 1991. Hop on Pop. New York: Random House.
Shady, M. E. 1996. Infants' sensitivity to function morphemes. Ph.D.
diss., State University of New York at Buffalo.
Snyder, William. 2007. Child Language. Oxford: Oxford University Press.
Swinney, David. 2000. Understanding the behavioral-methodology/
language-processing interface. Brain and Language 71: 241–4.
Tomasello, Michael. 1995. Language is not an instinct. Cognitive
Development 10: 131–56.
. 2002. The New Psychology of Language: Cognitive and Functional
Approaches to Language Structure. Mahwah, NJ: Lawrence Erlbaum.
. 2003. Constructing a Language: A Usage-Based Theory of Language
Acquisition. Cambridge: Harvard University Press.
. 2005. Beyond formalities: The case of language acquisition.
Linguistic Review 22: 183–98.
Tomasello, Michael, Ann Cale Kruger, and Hilary Horn Ratner. 1993.
Cultural learning. Behavioral and Brain Sciences 16: 495–552.
Valian, Virginia. 1991. Syntactic subjects in the early speech of American
and Italian children. Cognition 40: 21–49.
. 1999. Input and language acquisition. In Handbook of Child
Language Acquisition, ed. William Ritchie and Tej Bhatia, 497–530. San
Diego, CA: Academic Press.
Van Valin, Robert D. 1991. Functionalist theory and language acquisition. First Language 11: 7–40.
Werker, Janet F. 1994. Cross-language speech perception:
Developmental change does not involve loss. In The Development
of Speech Perception: The Transition from Speech Sounds to Spoken
Words, ed. J. Goodman and H. Nusbaum, 93–120. Cambridge, MA:
MIT Press.
Werker, Janet F., and Richard C. Tees. 1984. Cross-language speech perception: Evidence for perceptual reorganization during the first year of
life. Infant Behavior and Development 7: 49–63.
Wexler, Kenneth. 1999. Maturation and growth of grammar. In
Handbook of First Language Acquisition, ed. T. Bhatia and Wm. Ritchie,
55–110. San Diego, CA: Academic Press.
Yang, Charles D. 2006. The Infinite Gift. New York: Scribner.
Yang, S., and B. Lust. 2005. Testing effects of bilingualism on executive attention: Comparison of cognitive performance on two nonverbal tests. BUCLD 29 Proceedings Online Supplement. Somerville,
MA: Cascadilla. Available online at http://www.bu.edu/linguistics/
APPLIED/BUCLD/supp29.html.
. 2007. Cross-linguistic differences in cognitive effects due to
bilingualism: Experimental study of lexicon and executive attention
in 2 typologically distinct language groups. BUCLD 31 Proceedings.
Somerville, MA: Cascadilla.

7
ELABORATING SPEECH AND WRITING:
VERBAL ART
Patrick Colm Hogan

The past half century has seen considerable interaction between
the sciences of language and the study of literature. But this
interaction has been largely unidirectional, with influence
flowing from language science to literature. This may be seen
most clearly in the massive impact of Ferdinand de Saussure on
literary study since the 1960s. However, generative grammar, cognitive linguistics, connectionism, and other
approaches have also had effects on poetics and literary theory.
In the following pages, I wish to consider the general issues of
what distinguishes verbal art as an object of study in language
science. However, before I turn to this, it is important to get a
sense of how the analysis of literature and the analysis of language have been interconnected since the current phase of language study began 50 years ago.

THEORIES OF LANGUAGE AND THEORIES OF LITERATURE
When the influence of language science on literary theory is
considered, it is helpful to begin with a division between literary
theorists who have drawn on broad methodological principles
and literary theorists who have taken up particular linguistic
theories. For example, my own work on literary universals
(Hogan 2003) does not rely on any particular account of language universals. However, I do follow the general methodological principles for isolating genetic and areal distinctness (though
see areal distinctness and literature), distinguishing
different varieties of statistically unexpected cross-cultural patterns, and so on.
This type of influence, however, is the exception rather than
the rule. Other writers have drawn on particular theories of
language, using them literally or analogically to explore literature. Taking up the structure set out in the Preface, we may
distinguish neurobiological, mentalistic, and social theories,
as well as theories that bear on acquisition and evolution.
Mentalistic theories have been the most prominent. As noted
in the Preface, within mentalistic theories, we may distinguish
intentionalist accounts of language (referring to human subjectivity and intention) from representationalist accounts (treating algorithmic operations on mental symbols). Intentionalist
theories of language have been developed most prominently
within ordinary language philosophy. The ideas of Ludwig
Wittgenstein and the principles of speech-act theory have
been taken up by literary theorists, such as Mary Louise Pratt and
Stanley Fish. Some philosophers working in other areas of the
analytic philosophy of language, such as Donald Davidson, have
also had considerable influence (see Dasenbrock 1993).
While intentionalist theories of language have certainly been
influential in literary study, their use has been limited and, so
to speak, instrumental. They tend to be taken up for particular
interpretive aims. This is often the case with literary borrowings
from linguistics and the philosophy of language. However, the
uses of representationalist theories have been different. In these
cases, the literary and linguistic theories have been much more
thoroughly integrated. This is the result of two factors. First, the
research areas of the linguistic and literary theories overlap.
Second, there has been collaboration between linguists and literary theorists in treating these areas.
More exactly, there are two important representationalist
schools that have had significant impact in literary study. One
is cognitive linguistics. Some of the most important work in
cognitive linguistics has treated metaphor. While cognitive linguists were initially interested in ordinary uses of metaphor, they
quickly extended their analyses to poetic metaphor. This was
facilitated by the collaboration of a linguist, George Lakoff, and a
literary theorist, Mark Turner.
A similar point may be made about Chomskyan or generative
representationalism. One part of Chomskyan theory treats phonology. Patterns in sound and stress are of obvious importance
in verse. Thus, certain aspects of poetic form may be included
within the framework of generative linguistic theories. Work in
this area has been facilitated by collaborations between linguists
and literary critics as well (see Fabb and Halle 2006). (Theorists
have used Chomskyan generative principles as a model for other
aspects of literary theory also; see generative poetics.)
Brain-related theorization on language and literature is less
developed, in part because it is highly technical and in part
because neuroscientific theorization about language itself is
much more recent. Turning again to the divisions in the Preface,
we may distinguish between connectionist approaches and neurobiological approaches proper.
There has been some work on verbal art and parallel distributed processing. For example, some writers have used connectionist models to discuss creativity (see Martindale 1995) and
there has been some work on connectionism and metaphor
(e.g., Chandler 1991). Though limited, the work done in this field
is generally well integrated into connectionist theories (i.e., it is
not solely instrumental).
Recently, there has been considerable interest in neurobiology and art. This work has addressed many nonlinguistic
aspects of brain function. However, some has focused specifically on language. Much of this addresses hemispheric specialization, exploring distinctive features of verbal art (see Kane 2004;
poetic language, neurobiology of).
Given the general orientation of literary critics, it is unsurprising that social aspects of speech and language have figured
more prominently in literary study. At the level of dialogue, literary theorists have drawn on, for example, Paul Grice's account
of conversational implicature, as well as various ideas of
Mikhail Bakhtin (see dialogism and heteroglossia). In
terms of larger groups, literary theory has been highly influenced
by historian Michel Foucault's ideas about discourse and social
power (see discourse analysis [foucaultian]). They
have also drawn on the sociological ideas of Pierre Bourdieu
(see field; market, linguistic; and habitus, linguistic)
and others. Broader social movements, such as Marxism and
feminism (see marxism and language, gender and language, and sexuality and language) and their associated
theories, have also contributed importantly to literary discussion, though not necessarily in a way that bears particularly on
language science.
The literary use of most social theories has tended to be
instrumental. However, more narrowly sociolinguistic analyses of literature have been integrated into research programs
in sociolinguistics. This is largely because, here too, the areas of
research overlap, and language scientists have been involved in
research along with literary interpreters. For example, William
Labov's studies of oral narrative and the research of writers in
corpus linguistics have contributed to the advancement of
both sociolinguistics and literary study.
The same general point holds for the study of acquisition.
There has been valuable work done on, for example, the acquisition of metaphor and the development of verbal
humor.
Finally, evolutionary theory has inspired many literary theorists in recent years (see verbal art, evolution and). Its
advocates propose sweeping evolutionary explanations for a
wide range of literary phenomena. It is not clear that this program has gone beyond the stage of conjecture. In any case, it is
general and often not tied specifically to language.
As I have suggested, much work in the preceding areas is
very significant. However, much of it begins, like the classical
European epic, in medias res. It does not set out a clear field of
study for a language science of verbal art. Rather, the most successful work tends to focus on those areas of verbal art that fall
within the purview of nonliterary research programs. Put differently, it tends to treat verbal art as a case of something else (e.g.,
cognitive metaphor or phonology). In the remainder of this essay,
then, I do not explore these particular approaches and connections in more detail. Such a review is, in any case, redundant, as
this material is covered in the following entries. Instead, I consider what is distinctive about verbal art and why, as a result, it is
an important area of study for language science.

THE PARTIAL AUTONOMY OF VERBAL ART: INDIRECT ADDRESS, SIDE PARTICIPATION, AND PLAY
Perhaps the most obvious differentiating characteristics of verbal
art are that it is normative and rare. While all cultures have verbal
art (see Kiparsky 1987, 195–6), few people in any culture produce
works of verbal art (though, of course, they do produce many constituents of such works: novel metaphors, allusions, wit, and so
forth). On the other hand, these differences are not definitive in
themselves. Rather, they seem to result from other factors.
Consider two samples of speech actions: 1) the following excerpt from recorded speech: "Go away, I'm cooking. Excuse me please, I'm trying to cook. I haven't got enough potatoes" (Biber, Conrad, and Reppen 2006, 69); and 2) the following excerpt from Shakespeare's Sonnet 97: "How like a winter hath my absence been / From thee, the pleasure of the fleeting year!"
The first piece of speech calls to mind a particular, active context. The second is removed from any such context. Perhaps,
then, works of verbal art are more separable from their immediate context. As a first approximation, we might say that verbal
art has other verbal art as its primary, distinctive context, and
the current material context has largely inhibitory force. In other
words, our understanding and evaluation of a work of verbal art
are distinguished from other sorts of understanding and evaluation, first of all, by their relation to other works of verbal art: our
sense of storytelling techniques, our awareness of the larger story
cycle in which a particular narrative occurs, our expectations
about characters, and so forth.
This does not mean that verbal art is entirely insensitive to
immediate context. Our response to stories may be inflected, primarily in an inhibitory way, by current material circumstances.
Consider jokes, a form of verbal art. One standard type of joke
is ethnic. The standard formats of such jokes (e.g., "How many xs does it take to screw in a light bulb?"), the general function of
jokes, and so forth provide the broad, distinctive context for interpretation and response. The most obvious function of the immediate, material context comes when a member of the relevant ethnic
community is present – and that function is usually inhibitory.
Removal from immediate context cannot be quite the crucial
property, however. Consider, for example, the present essay.
Physically, I am alone as I am writing. Anyone who reads this
essay will be removed from the material context in which I am
writing, and that material context will be irrelevant to the reader's
response and understanding. But that does not make this essay
verbal art. Perhaps, then, the most important difference between
the aforementioned speech actions is not context per se but
something closely related to context. The joke suggests that this
has something to do with the audience. Perhaps the difference is
a matter of the way the speaker addresses his or her audience.
To consider this, we might return to those cases. It is clear that
the person who says, "Go away, I'm cooking," is talking to his or her audience. I, too, am addressing my readership in writing this essay, even if my idea of that readership is rather amorphous.
But the sonnet, as a socially circulated poem, is not addressing
its readership. Even if Shakespeare initially drafted the poem as
a private message to a particular person, he made the decision
that its readership would not be its addressee when he made it
a public poem.
More exactly, works of verbal art tend to be marked by indirect address, rather than direct address. When considered from
the perspective of the reader rather than the author, indirect
address is roughly the same as side participation, as discussed
by Richard Gerrig and Deborah Prentice (1996). Gerrig and
Prentice distinguish several possible roles in a conversation.
Obvious roles include speaker, addressee, and overhearer. The
authors add a fourth role side participant. Suppose I am with
my wife at the grocery store. She sees von Humboldt, a colleague
of hers. The colleague says, Oh, that meeting with de Saussure,
our new provost, is on the twelfth. When she says the new provost she is doing so for my benefit. My wife knows perfectly well

Elaborating Speech and Writing


who de Saussure is. The conversation does not really concern
me. If I had been standing a few feet away, I would have merely
been an overhearer, and von Humboldt would have said only, "Oh, that meeting with de Saussure is on the twelfth." But since I
was closer, I became a side participant and von Humboldt had to
take my knowledge and interest into account when speaking.
The difficulty with the account of Gerrig and Prentice is that
it is, for the most part, analogical. It suggests an area of research
and theorization. However, it does not develop this in algorithmic specificity (i.e., spelling out how it will work, step by step)
in relation to the structures, processes, and contents of human
cognition. To explore the idea further, we might consider a simple model of speech (oral or written). This model begins from
the premise that language is a form of action. Thus, it has the
usual components of action: goals, motivations or emotions,
anticipated outcomes, and so forth.
Here, I wish to isolate two stages in the production of speech.
One is the basic generative phase in which the initial utterance
is formulated. The second is an adjustment phase, which follows
our realization of just what we are saying. Intuitively, we might
expect that awareness of speech would precede generation.
But it does not. Indeed, a moments reflection suggests why. In
order to realize that I am about to say "Hello," I must in some sense have already generated the "Hello," even if my lips have
not yet moved. More importantly, empirical research indicates
that human action generally involves just this sort of duality. For
example, as Henrik Walter points out, our brains initiate or project actions approximately .5 to .7 seconds before the actions are
performed. We are able to modify or inhibit the action .3 to .5
seconds after it is projected (see Walter 2001, 248–50). In keeping
with this temporal sequence, adjustment may precede, interrupt,
or follow the execution of an action.
Suppose someone asks me if I need a ride. I begin to say, "No, thanks. Duns is picking me up." I realize that the person does not know who Duns is. I may adjust the sentence before speaking: "My cousin Duns is picking me up." Or I may succeed in making the change only after beginning the sentence: "Duns – uh, he's my cousin – he's picking me up." When the adjustment takes
place before speaking, we might refer to it as implicit. When
it occurs in the course of speaking or after speaking, we may
refer to it as explicit. Finally, it is important to note that actions
have two broad sources. One is extrinsic, or externally derived;
the other is intrinsic, or internally derived (see MacNeilage 1998,
225–6 on the neural substrates for this division).
Now we are in a position to clarify the nature of indirect
address or, more generally, indirect action. Indirect action,
as I am using the phrase, involves a relative decrease in the
extrinsic aspects of action. Thus, the sources of both generation
and adjustment are more intrinsic than is commonly the case.
Moreover, when they occur, extrinsic adjustments tend to be
implicit rather than explicit.
To get a better sense of how indirect action operates, it is useful to look at a paradigmatic case of such action: play. Indeed,
play and side participant interaction have a great deal in common. When Jane and Jill play school, each of them must keep
in mind that the other person has a real identity outside the role
she is playing. Moreover, each of them must continually adjust
her speech and behavior to take that into account. For example, suppose Jane is in third grade and Jill is in second grade. Jane begins to play a third-grade teacher. She starts by referring to what the class did last week. However, she has to keep in mind that Jill does not know what the third-grade class did last week. Thus, she may have to say, "Now, class, you remember that last week we began by discussing government and binding theory." As this suggests, play is a purer case of indirection than
ordinary side participation. In the case of side participation,
our adjustments are more likely to be explicit. For example, my
wife's colleague is likely to turn to me when explaining that de
Saussure is the new provost. In play, explicit adjustments occur
when the pretense of play is disrupted. Suppose Jill accidentally
addresses Jane as "Jane," then explicitly adjusts that to "I mean, Frau Doktor Wittgenstein." Jane is likely to get annoyed with this breach in the play, perhaps responding, "Jill, you should never call your teacher by her first name. Now you have to sit in the corner and wear the Aversive Stimulus Operant Conditioning Cap."
Following this usage, we may say that verbal art is a form of
indirect speech action in the same sense as one finds in play.
Indeed, verbal art is, in certain respects, a form of play. Note,
however, that in verbal art, in play, and elsewhere, there is not an
absolute separation between direct and indirect action. Again,
indirect action reduces extrinsic factors. It does not eliminate
them. The difference is one of degree.
Indeed, the difference between extrinsic and intrinsic is not
fine grained enough to clarify the distinctiveness of verbal art.
This becomes clear as soon as we notice that almost all writing
is predominantly intrinsic in that it is not generated or adjusted
primarily by reference to external circumstances. Moreover,
the opportunities for extensive revision in writing allow for virtually all adjustments to be implicit in the final text. How, then, do
we distinguish ordinary writing from writing that is verbal art?
Here, too, the distinction is a matter of degree. Writing
involves different gradations of removal from direct address.
Take, for example, a letter. Suppose von Humboldt is not speaking to my wife about a faculty meeting, but is instead writing her
a note. In an obvious way, von Humboldts address to my wife is
indirect and intrinsic. After all, my wife is not present. However,
von Humboldts address to my wife is direct in another way, for it
is oriented precisely toward her. It remains guided by her as von
Humboldt imagines her.
In order to understand how this works, and how it bears on
verbal art, we need to have a clearer understanding of imagination and action. When I begin to act, I tacitly project or imagine
possible outcomes for my action. For example, when I see a car
coming toward me, I run to the curb. This is bound up with a tacit
imagination of where I should be in order to fulfill my purpose
of not being run over. More exactly, we may understand imagination, like speech, as involving two (continuously interacting)
processes. One generates possible scenarios. The other makes
adjustments. I project running to the curb, but then notice a manhole in my path. This leads me to adjust my imagination of the
precise trajectory. The nature of the generation and the nature of
the adjustment will change, depending on the guiding purposes
of the action. In some cases of speech action, the purpose involves
a real addressee. In some cases, it does not. In a face-to-face dialogue, a speaker will continually generate, adjust, and regenerate what he or she is saying in part due to the addressee's actual response. In writing a letter, a writer will be guided by his or her
tacit imagination of the addressee. Though that imagination is not
adjusted by reference to actual responses, it nonetheless serves as
a guide for the generation of the speech. Crucially, this imagined
addressee maintains his or her external, independent existence
for the speaker. As such, that addressee is, so to speak, a pseudoextrinsic guide to the generation of the speech.
Here, then, we may adjust our account of indirect address
in verbal art. Verbal art minimizes both extrinsic and pseudoextrinsic elements in the production and adjustment of
speech and in the imaginations that are connected with speech.
Moreover, it minimizes explicit markers of adjustments for side
participants.
We began with the idea that verbal art is relatively independent of direct, material context. That context involves authors
and readers. In this connection, we have been considering
the relation between the author and readers as one of indirect
address. The other crucial element of material context is, of
course, reference. Indeed, verbal art, as commonly understood,
has an even more obvious relation to reference than to address,
for verbal art is paradigmatically conceived of as fiction. As such,
it has an unusual degree of referential autonomy. In other words,
it tends to be characterized by indirect reference as well as indirect address. This, too, is illustrated most clearly in play. Suppose
Jane and Jill are playing telephone. Jane picks up a banana, puts
it up to the side of her face, then holds it out to Jill and says, "It's for you." In saying "it," she is not referring to a banana. She is
referring to a telephone.
In sum, verbal art is not just partially autonomous with
respect to immediate circumstances. It is largely independent of
the extrinsic and pseudoextrinsic aspects of the generation and
adjustment of speech action and associated imagination. This is
true with respect to both address and reference.

ART AND ACTION: THE PURPOSES OF VERBAL ART


In discussing action-related imagination, we noted that the precise nature of imagination varies according to the purposes of the
action. Generally speaking, the goals of physical action concern
the alteration of some situation. To a great extent, speech actions
have the same function. One crucial difference between speech
actions and bodily actions is that speech actions have their effects
only through minds. However much I plead with the packet of
noodles, they won't turn themselves into pad thai. But, if I sweetly
ask someone to make the noodles into pad thai, perhaps that
person will do so. Speech actions, then, tend to aim at altering
the world by altering peoples actions. In order to alter peoples
actions, they aim at altering two things: first, the way those people understand the world and, second, the way they feel about it,
including the way they feel about the speaker himself or herself.
More exactly, we may distinguish two psychological purposes of speech actions. These are informational and emotional.
Informational purposes may be further divided into pragmatic
and regulative. Pragmatic information concerns factual, directional, or other information that facilitates our pursuit of goals.
Regulative information concerns broad ethical, prudential, or
related principles, which tend to serve an adjusting function.
For example, feeling hungry, I form the goal of eating something. This combines with pragmatic information about the presence of cookies in the cookie jar to initiate the action of getting a cookie
out of the cookie jar. However, before I reach for the cookie jar,
prudential information stops me as I recall the deleterious effects
that cookies are likely to have on my lithe and shapely form.
For imaginations or actions to result from pragmatic information, there must be some motivation present as well. In other
words, I must have some emotion. Emotion is what leads us to
act, whether the action is speech or larger bodily movement. It is
also what leads us to refrain from acting. We may divide emotions
into two sorts, depending on their function in action sequences.
The first might be called the initiating emotion. An initiating
emotion is any emotion that one feels in current circumstances
and that impels one to act. Action here includes the initiation of
movement and the initiation of imagination. We imagine various
outcomes of our actions. For example, feeling hungry, I swiftly
and unreflectively imagine eating a cookie. But my imagination
does not stop there. I may imagine my wife seeing me with the
crescent of cookie in my hand and the telltale crumbs sticking to
my lips, then chastising me, reminding me of the doctor's warnings, and explaining once again that the cookies are for my nieces
and nephew. Or I may suddenly see an image of myself with wobbly love-handles. In each case, I experience what might be called
a hypothetical emotion. A hypothetical emotion is a feeling that
I experience in the course of imagining the possible trajectories
of my action. While initiating emotions give rise to (or generate)
the action sequence initially, hypothetical emotions qualify (or
adjust) that action sequence. Hypothetical emotions may intensify the initiating emotion; they may inhibit it; they may respecify
the precise goals of the action sequence (e.g., from eating cookies to eating carrots), or they may affect the precise means that I
adopt (e.g., checking that my wife is not around).
From the preceding example, it may seem that hypothetical
emotions are all egocentric. However, hypothetical emotions
may also be empathic. For example, I may forego my plan to eat
cookies because I imagine my tiny nephew's disappointed face
when he reaches into the empty jar. Empathic hypothetical emotions may have the same qualifying effects on initiating emotions
and actions as do egocentric hypothetical emotions.
Hypothetical emotions are critical for all types of action,
including verbal action. Consider again the case where von
Humboldt explains that de Saussure is the new provost. In doing
this, von Humboldt tacitly imagines the conversation from my perspective and – sensing that I may not follow and, more importantly, that I might feel left out – she provides the information.
If I wish to alter someone's emotions through speech, I will
appeal primarily to initiating emotions. Thus, the alteration of
initiating emotions is usually a central purpose of speech action.
However, certain sorts of hypothetical emotions are important
as well. For example, when requesting a favor, I may foreground
how grateful I will be. One purpose is to encourage my addressee
to imagine my gratitude and experience the related hypothetical
emotion (roughly, feeling appreciated).
In sum, ordinary speech actions involve an informational
aim and an emotional aim, usually coordinated to produce some
change in the world, currently or at some time in the future. The
informational aim involves both pragmatic and regulatory components, though in ordinary speech, the pragmatic component is probably dominant. The emotional aim involves both initiating and hypothetical emotions. In ordinary speech, the initiating emotion is probably dominant.
In being removed from direct address and direct reference,
works of verbal art commonly do not have a goal of directly altering particular material conditions. Nonetheless, verbal art is animated by the same two psychological goals that animate other
forms of speech action. Verbal art, too, has an informational
component and an emotional component. Indeed, cross-culturally, aestheticians and critics tend to see verbal art as successful
insofar as it affects us emotionally and insofar as it develops significant ideas or themes.
The emphasis in verbal art, however, tends to be different
from that of other speech actions in both cases. Specifically, verbal art commonly does not stress pragmatic aspects of information. Certainly, there are, for example, political works that set
out to give the audience pragmatic information. However, this
is not the general tendency of verbal art. Rather, verbal art tends
to develop regulative concerns. These regulative concerns make
their appearance in literary themes. When we interpret a work
and seek to understand its point, we are commonly looking
for a theme, which is to say, some sort of regulative information. Moreover, in terms of emotion, verbal art tends to inspire
not initiating emotions but hypothetical emotions – particularly
empathic hypothetical emotions.

THE MAXIMIZATION OF RELEVANCE


Up to this point, I have been speaking of the content of speech
action generally and verbal art in particular. But the form of verbal art is widely seen as crucial, perhaps its definitive feature. The
content/form division is somewhat too crude to form the basis
for a sustained analysis of verbal art. However, it does point to
an important aspect of verbal art, and an important differentiating tendency: the partial autonomy and patterning of distinct
linguistic levels in verbal art.
All linguistic levels – phonology, morphology, syntax, and so forth – are, of course, relevant to speech actions of
all types. However, most components are, in most cases, only
instrumentally relevant. Morphology is relevant for communicating whether I want one cookie or two cookies. But it has no
separate function. Put differently, the specification of phonology,
morphology, and syntax is a sort of by-product of my pragmatic
aims. I want to get two cookies. In English, I signal that I want
two cookies rather than one by using the word "two," putting that word before "cookie," adding "s" to "cookie," and so forth.
I do not set out to do anything with phonology, morphology, or
syntax. One might even argue that I do not set out to do anything
special with semantics or with discourse principles. I just run
my language processors to achieve the active goal. Features of
phonology and so forth are relevant to my action only insofar
as my speech processor makes them relevant. They have, in this
way, minimal relevance.
A number of literary theorists, prominently the Russian
Formalists, have stressed that verbal art foregrounds its language (see foregrounding). In cognitive terms, we might say
that verbal art tends to enhance linguistic patterns to the point
where they are likely to be encoded by readers (i.e., perceived

and integrated with other information; see encoding) and


may even draw attentional focus. This occurs most obviously
in the phonetic/phonological aspect of language. Poets pattern
stress beyond what occurs spontaneously. They organize vowel
sounds and initial consonants to produce assonance and alliteration. The point is less clear in morphology, though corpus
linguistic studies have pointed to differential tendencies (see
Biber, Conrad, and Reppen 2006, 58–65).
Syntax is a complex and thus particularly interesting case.
Verbal art does undoubtedly tend to pattern syntactic usage in
encodable ways. But the poetic patterns in these cases tend to
be intensifications of the ways in which ordinary speech actions
pattern syntax. For example, in ordinary speech, we have some
tendency to use parallel syntactic structures. In poetry, we are
more likely to use these. A more obtrusive foregrounding of syntax occurs when we violate expectations of patterning in verbal
art. Such violation is the other primary way in which verbal art
treats linguistic levels more autonomously. In the case of syntax, there are several ways in which this may occur. One obvious
way is through disjoining syntactic units and line units, such that
line breaks do not coincide with syntactic breaks (see Tsur 2006,
146–52). Another is by rejecting standard principles of word
order.
Verbal art also manifests distinctive tendencies in semantics.
These are found most obviously in lexical preferences and metaphor. In the case of lexical preferences, verbal art may draw on
rarer or more unexpected words. Perhaps more importantly, it
may pattern the suggestions and associations of terms. In ordinary speech, we tend to concern ourselves with the associative
resonance of a term only in extreme cases (e.g., when its connotations may be offensive to a particular addressee). In verbal
art, however, the writer is much more likely to organize his or her
lexical choices so that the suggestions of the terms are consistent
(e.g., in emotional valence; see dhvani and rasa).
As to metaphor, our use of tropes in ordinary speech is surprisingly constrained. There are aspects of everyday metaphor that
are creative. However, on the whole, we follow well-worn paths.
George Lakoff and Mark Turner have argued that a great deal of literary metaphor draws on the same broad structures as ordinary speech.
However, literary metaphors extend, elaborate, and combine
these structures in surprising ways (see poetic metaphor).
Finally, we find parallel tendencies in discourse practices.
Consider the principles of conversation articulated by Grice
(see cooperative principle). These principles form a set of
practical conditions for any sort of interaction between speakers. For example, it is a fundamental principle of conversation
that one should not say things that are irrelevant to the conversation. Grice points out, however, that one may flout these
principles, violating them in gross and obvious ways. Flouting a
principle of conversation gives rise to interpretation. Jones and
Smith are discussing who should be hired as the new assistant
professor in the Hermeneutics of Suspicion. Jones says, "There's a new applicant today – Heidegger. What do you think of him?" Smith replies, "Nice penmanship." Since penmanship is clearly irrelevant to the topic, Smith may be said to be flouting the principle of relevance. Jones is likely to interpret Smith's comment as indicating a dim view of Heidegger's qualifications. Literary
works often flout conversational principles.


All these points indicate significant differences between verbal art and other sorts of speech action. Again, these differences
do not create a hard-and-fast division between verbal art and
all other types of speech. There is a continuum here, with many
parameters and degrees of variation. Nonetheless, there is a clear
differential tendency.
The differences we have just been considering are a matter
of the various linguistic levels of a literary text bearing autonomously or separately on our understanding and experience of
the work. This is first of all and most obviously a matter of creating patterns. However, given the violations of syntactic rules and
the flouting of conversational principles, it seems clear that verbal art does not simply create extra patterns on top of the usual,
instrumental patterns produced by ordinary language processes. It also violates patterns of ordinary language processes.
Most importantly, in both cases, the result renders the linguistic level in some way directly (rather than just instrumentally)
relevant to our experience of the work. Thus, it maximizes the
relevance of language features.
But in what way are these features relevant? As with any sort
of speech action, relevance is, first of all, relevance to the aims of
the action. Again, the primary aims of verbal art are thematic and
emotional. Thus, the maximization of relevance is the thematic
or emotional use of (ordinarily unexpected) noninstrumental
patterns or violations of (ordinarily expected) instrumental
patterns from different linguistic levels.
In touching on these issues, literary critics have tended to
emphasize thematic relevance. Some writers have seen patterns
and violations of patterns in phonology, morphology, and syntax
as consequential for interpretation. It is probably true that such
formal features are thematically interpretable in some cases.
However, it seems doubtful that such features are generally relevant to interpretation. In contrast, extended patterns or violations
in semantics and pragmatics are almost always interpretively
relevant. In these aspects, the main difference between verbal art
and other forms of speech action is where the interpretive process ends. We will consider this issue later.
If anything, the maximization of relevance applies more fully
to the communication of emotion than to the communication of
themes. There are two types of affective response that bear importantly on verbal art, and thus on the maximization of relevance. We
might refer to the first as pre-emotional and the second as specific
emotional. Pre-emotional effects are effects of interest. Interest is
pre-emotional in two senses. First, it is often an initial stage in an
emotion episode. Second, it is a component of all specific emotions after those emotions have been activated. Specific emotions
are simply our ordinary feelings: sorrow, joy, disgust, and so on.
More exactly, interest is the activation of our attention system.
That activation occurs whenever we experience something new
or unexpected (see Frijda 1986, 272–3, 318, 325, 386). Such activation prepares us for events that have emotional significance.
It directs our attention to aspects of the environment or our own
bodies that may be emotion triggers. Once a specific emotion is
activated, that emotion system reactivates our attention system,
focusing it on properties relevant to that emotion in particular.
For example, suppose I am out in the woods and hear something
move. That arouses my attention. I carefully listen and look. If
I see a friend at a distance, I feel some joy. That joy keeps my attention on the friend and simultaneously directs my attention to ways in which I can reach him. If I see some sort of animal, I
feel fear. That fear keeps my attention on the animal and simultaneously directs my attention to ways that I can escape.
The arousal of interest is obviously crucial to literary experience. It is also important to other speech actions. However, in
nonliterary cases, a great deal of the interest comes from the
direct relation between the people involved (thus, the speaker
and addressee), as well as the practical situation referenced in
the speech action. In other words, a great deal comes from direct
address and direct reference, both of which are greatly inhibited
in verbal art. Nonetheless, verbal art has other means of producing interest. We have just been discussing two such means: the
multiplication of non-normal patterns and violations of normal
or expected patterns at various linguistic levels. Both are precisely
the sorts of deviation from normalcy that produce interest.
Specific emotional effects are fostered most obviously by
semantics and pragmatics. For example, a great deal of our
emotional response to verbal art seems to be bound up with the
patterning of associative networks that spread out from lexical
items (see Oatley 2002 and suggestion structure). The
extension and elaboration of metaphorical structures are clearly
consequential in this regard, particularly as the concreteness
of metaphors often enhances associations with concrete experiential memories, including emotional memories. The specific
emotional impact of phonological, morphological, and syntactic
features is less obvious, but no less real, at least in some cases.
For example, the interplay between nonlinguistic organization
(e.g., in line divisions) and linguistic organization (e.g., in sentence divisions) may serve to convey a sense of a speakers voice
and, along with this, an emotional tone.
Thus, once more, we see both continuity and difference
between verbal art and other sorts of speech action. Most speech
actions involve a minimal, incidental patterning of linguistic levels. In contrast, the general tendency of literary art is toward the
maximization of the relevance of different linguistic levels. This
relevance is a function of the main purposes of the text, thematic
and emotional. Again, this need not be relevance that increases
informational content. Indeed, I suspect that it most often is not.
It need not even contribute to the specific emotions of the text.
It may be a matter of enhancing interest or of qualifying the specific emotions fostered by the text. In each case, though, there is
some partially autonomous variation in the organization of the
linguistic level in question, through the addition of unexpected
patterning, through the violation of expected patterning, or both.

ON INTERPRETATION AND THE USES OF TEXTS


In the preceding section, I referred briefly to the point at which
interpretation ends. When verbal art is considered in relation to
interpretation, the first thing to remark is that verbal art is notorious for its hermeneutic promiscuity (see philology and hermeneutics). It is widely seen as almost infinitely interpretable.
At one level, this is surprising. Novels, for example, are developed in great detail and with considerable elaboration of the
characters' attitudes and actions. It might seem that this would
constrain interpretation relative to the much vaguer and more
elliptical speech actions of ordinary life. But that is not generally

Elaborating Speech and Writing


believed to be the case. This suggests that the degree of interpretability of a text is not a function of its elaboration. Rather, interpretability appears to be a function of a text's distance from direct
address and direct reference to practical conditions, particularly
as these are connected with pragmatic information. Put differently, the limits of interpretation are not so much a matter of
the words themselves. They are, first of all, a matter of action.
Here as elsewhere, action is animated by the agents goals,
emotions, expectations, and so forth. In ordinary life, then, we
usually understand validity in interpretation as a function of
the speaker's intention (see intentionality). More technically, our prototype of interpretive validity almost certainly
includes the intention of the speaker or author as a critical norm.
(The point is related to the argument of Steven Knapp and Walter
Benn Michaels that we invariably interpret for speaker's intention, though it is not identical, for there are cases where we do
not adhere to this prototype and thus do not interpret for the
speaker's intention.) However, intention is an abstract criterion.
We do not have access to speakers' intentions. So even interpreting for intention, we need an operational criterion for validity
as well. That is where our own action enters. Action commonly
guides our sense of when we have gotten an interpretation right.
At the dinner table, someone says, "Could you pass the that?" I
am not sure what he means by "that." Unable to read his mind, I
engage in an action: either a bodily action (passing the beans) or
a speech action (asking if he meant the beans). When I pass the
beans, he knows that I have understood his intention. When he
accepts the beans, I infer that I understood his intention.
The interpretation of verbal art is as removed from such practical action as possible. Thus, our ordinary operational criterion
is rendered ineffective. The only obvious practical behaviors
relating to literary interpretation are professional, for example,
the acceptance or rejection of articles in academic journals.
(The point is codified in Fish's contention that validity in literary interpretation is defined by the norms of interpretive
communities.)
A number of further problems for intentional inference arise
in connection with this removal of literary interpretation from
practical action. First, many texts are read or performed, and
thus interpreted, far from their authors and even after their
authors are dead. If we take a strict intentionalist view of validity, then the author is the only one who has the authority to
determine that a given action does indeed satisfy an operational
criterion. Suppose Jones leaves instructions for his funeral. The
funeral director follows them as well as she can. But Jones is not
around to confirm that she got things right. Obviously, there
are things that she might do to ascertain Jones's intention. For
example, she might talk to Jones's friends and relatives or she
might try to learn something about Jones's religion and ethnic
background. These are the sorts of concerns that lead writers
such as Hans-Georg Gadamer to stress tradition as a crucial
guide to interpretation.
A second problem is more distinctively connected with verbal art per se. Both informational and affective patterns are
more complex in verbal art than in most other speech actions.
For example, a literary work communicates thematically relevant information by maximizing the relevance of virtually
every semantic and discursive detail in the text. If I am meeting

someone for the first time, I may describe my outfit so that he


or she can recognize me. This description has pragmatic consequences. But
if an author describes a character's outfit, that description may
bear on our understanding of the character's emotions, class
status, or religious beliefs. Those features may in turn bear on
our broader understanding of human relations, class conflict,
or religious practice as portrayed in the work. In short, ordinarily incidental details may have thematic (thus regulative)
consequences.
Moreover, literary narratives tend to develop subtle and variable affinities and ambivalences (see Hogan 2003, 122–51). In
some cases, the development of these affinities actually runs contrary to the author's self-conscious sense of his or her own aims.
The most famous case may be Milton's portrayal of Satan. Satan
has often been seen as the most engaging figure in Paradise Lost,
but this was certainly not Milton's self-conscious intention.
This is part of a larger point that intention is not a single, unified operation. There are different sorts of intention with different
objects, constraints, and processes. Perhaps the most important
form of intention for verbal art is what we might call "aesthetical intent" (see Hogan 1996, 163–93). The aesthetical intent of an
author is to produce a work that has the right sort of experiential
effect. This right effect is not something that the author is likely
to be able to articulate separately from the work itself. It is simply
what he or she experiences when he or she feels that the work
is now complete. In composing the work, the author generates
and adjusts the text, testing the outcome against his or her own
response. The author's sense that the work is complete need not
mean that the work conforms to the author's self-conscious attitudes and commitments.
One way of explaining aesthetical intent is that the author
adopts an aesthetical attitude toward the work, or a "dhvani attitude," as Anand Amaladass put it. This is a matter of broadening one's attention to the work, expanding one's encoding of the
work, to include its multiple resonances and emotive patterns.
In short, it involves approaching the text as a work in which different linguistic levels are maximally relevant. Correlatively, it
involves approaching the text as a work that is removed from
constraints of direct reference or address, constraints that ordinarily orient our judgment of informational relevance and our
construal of emotional bearing. When approaching a work as
verbal art, readers and audience members adopt this attitude as
well to varying degrees.
The mention of readers' approaches to texts brings us to a
final complication. We have been considering ways in which
works produced as verbal art tend to be different from other
speech actions. But the nature of a given text is not determined
solely by authorial intent. Despite our prototypical concern for
authorial intent, we are free to approach works of verbal art in
pragmatic ways and to approach other works with an aesthetical
attitude. As writers in cultural studies have stressed, practices of
literary analysis may be applied to a wide range of texts that were
not initially intended to be literary. In short, literariness may
be defined by interpretation or reception no less than it may be
defined by production; it may be defined by readers no less than
by authors. In this way, the usual characteristics of verbal art may
be extended to other texts or withdrawn from texts (as when a
novel is studied for its author's psychology).


The Cambridge Encyclopedia of the Language Sciences


On the other hand, the expansion of hermeneutic liberality
results from the existence of verbal art. Interpretive autonomy
arises in the first place through the removal of speech action
from direct address and direct reference, along with the attenuation of pragmatic information and initiating emotion. In this
way, interpretive practices that bridge literary and nonliterary
speech actions are themselves a distinctive product of literary
speech action.

SHAKESPEAREAN INDIRECTION
As the preceding discussions have been rather abstract, it is
valuable to end with a more developed example. Consider
Shakespeare's Sonnet 97:
How like a winter hath my absence been
From thee, the pleasure of the fleeting year!
What freezings have I felt, what dark days seen!
What old December's bareness every where!
And yet this time removed was summer's time,
The teeming autumn big with rich increase,
Bearing the wanton burthen of the prime,
Like widowed wombs after their lords' decease:
Yet this abundant issue seem'd to me
But hope of orphans, and unfathered fruit,
For summer and his pleasures wait on thee,
And thou away, the very birds are mute;
Or if they sing, 'tis with so dull a cheer
That leaves look pale, dreading the winter's near.

The poem presents a straightforward instance of indirect


address. There is, of course, a narratee in the poem, an explicit
"thee." But the poem only plays at addressing this "thee." The
point is clear, for example, when the reader is brought in as a side
participant with the otherwise superfluous information, "this
time removed was summer's time." If there were a defining, material context and if there were direct address in this speech action,
the speaker would not need to explain that the separation had
occurred over the summer. The beloved would surely remember
this. The point is reinforced more subtly by the shifts in spatial orientation in the poem (what some writers see as particular types of
deixis [see Stockwell 2002, 41–57]). These may be understood as a
matter of indirect reference. If this speech action were grounded
in a particular situation, then there would almost certainly be
some fixed reference point, a specific home that would define
which of the lovers was stationary and which had departed. But
the poem is contradictory on this score. In the opening lines,
the speaker refers to "my absence / From thee" and his "time
removed." But toward the end of the poem, he reverses the spatial
orientation and related direction of movement. Now, the beloved
is "away" and summer will wait for her return.
The introduction of summer in this context suggests something else. The poet is not only like little Jane, covertly explaining
what the third graders did in their last class. He may also be like
Jane in referring to a telephone by way of a banana. Put differently, the indirectness of both address and reference makes the
reference of "summer" itself uncertain. Without a context, readers are free to associate summer tacitly with anything that can,


in play, be summer (e.g., a time of professional success for the


speaker). In short, "summer" becomes a metaphor.
The mention of metaphors points us toward the maximization
of relevance. However, before going on to this, it is important to
remark on some other contextual features of the poem, contextual features that are themselves bound up with metaphor. As I
indicated earlier, the entire tradition of poetry forms an implicit
context for any given poem (a point stressed by T. S. Eliot). One
obvious way in which Shakespeare suggests a context of verbal
art is in his use of seasonal imagery. Winter is a standard literary
image of romantic separation. This is, of course, related to the
conceptual metaphor, discussed by Lakoff and others: LIFE
IS A YEAR. As Lakoff and Turner explain, there are several processes poets use to create novel instances of these schemas.
Shakespeare is clearly creating such a novel instance when he
maps the source metaphor of winter onto the target, summer
(see source and target). Moreover, in making summer into
(metaphorical) winter, he is intensifying the effect of the metaphor by contrast. But he does not say summer only. He expands
the target time to summer and autumn. In a way, this is peculiar. The contrast would have been more obviously enhanced
by making the target time period spring and summer. Why does
Shakespeare choose autumn? There are several reasons. Two
are closely interrelated. First, he wishes to intensify the tacit
emplotment of the speaker's isolation. That speaker is now
anticipating the very worst, for if summer was like winter, how
devastating will winter be? Second, he wishes to hold out hope
for something beyond that period of terrible loneliness: spring.
That hope is possible only through an indirect reference to the
broader, literary context in which spring is the season when the
lovers are finally reunited.
The indirection of the poem makes it unlikely that it will
have any pragmatic information as its aim. Pragmatic information most often involves reference to particular situations, or to
forms of general knowledge. Of course, there is not a strict division between general pragmatic knowledge and, say, prudential
regulatory information. But to the degree that the poem communicates thematic points, those points do incline toward prudential regulatory information. For example, suppose "summer"
is metaphorical (i.e., "winter" is metaphorical for "summer,"
which is itself metaphorical for, say, professional success). Then
the poem suggests that the enjoyment of external conditions
(e.g., professional success) is lost when it cannot be shared, a
point with clear regulatory consequences.
The indirection of the poem also indicates that it is unlikely
to be aimed at the elicitation of initiating emotions. This can be
seen if we contrast it with a similar letter, sent by the poet to his
beloved. Such a letter could be aimed at convincing the beloved
to return home. The poem, in contrast, does not have any such
initiating emotional aim. Its emotional aim is, rather, confined to
provoking empathic hypothetical emotions.
The maximization of relevance contributes to the poems
achievement of its primary aims. As usual, this maximization is
most obvious at the phonological level. Consider only the first
two lines. The poem is in iambic pentameter. However, when
spoken naturally, these lines do not follow the meter. Indeed,
there is considerable tension between spontaneous stress



patterns and the meter. One natural way of speaking the lines
would be as follows:
Hów líke ă wíntĕr hăth my̆ ábsĕnce bĕen
Frŏm thée, thĕ pléasŭre ŏf thĕ fléetĭng yéar!

There are several things one might remark on here. First, in


natural speech, the lines are in a somewhat irregular tetrameter. However, this irregularity is not unpatterned. There is a
striking rhythmic motif that occurs in the middle of both lines.
The sequences "líke ă wíntĕr hăth my̆ ábsĕnce" and "thée, thĕ
pléasŭre ŏf thĕ fléetĭng" have the same stress pattern. This is not
monotonous because the lines also manifest three variations.
First, a caesura appears in only one of the sequences. Second,
the first word of the line changes from stressed to unstressed.
Finally, the last word changes from unstressed to stressed. Thus
we find novelty, therefore the triggering of at least mild interest,
at two levels.
In addition, there are interpretive and specific emotional consequences. The disjunction of syntactic and verse breaks may help
to give the poem a sense of voice. Specifically, it suggests to me a
speaker who pauses before saying what is most painful about his
absence. Being away from home means that one is absent from
many things. But, here, there is only one crucial attachment. The
line break imitates the emotion that so often interrupts one's
speech when such occasions are real and directly addressed.
There may also be effects of tempo in these lines. "Hów líke
ă wíntĕr" is heavy with accents and thus slower than "thĕ fléetĭng
yéar." This provides an instance of sound echoing sense. It also
suggests an interpretive point: time flies in the sense that life
passes quickly, leaving us little time together; but, being apart, we
experience each moment of that fleeting time as a slow drag.
We would have many further points to interpret if we were
to consider lexical choice, metaphorical patterning, the flouting
of conversational principles, and so forth. Each of these directs
us toward the endless interpretability of verbal art. As in other
works, there is no clear operational criterion that would tell us
that we have reached the end of our interpretation or that our
interpretation is correct. Of course, as Fish indicates, there are
professional constraints on our judgments in these areas. A
psychoanalytic critic might discuss how the poet seems to shift
between the position of a separated lover and that of an orphan,
suggesting an oedipal relation to the beloved. A writer in queer
theory might stress that the poet puts his beloved in the position
of the (deceased) father, not the mother, thus suggesting a male
beloved.
This brings us back to interpretation and reception. Again, we
are always free to take indirect address and put it into a more
directly referential context. For example, we may seek to read
through the sonnets to Shakespeare's own life and sexual feelings. Indeed, as Walter Ong pointed out many years ago, we have
a strong tendency to do just that, placing the decontextualized
voice of the poem back in a human body at a particular place in a
particular time. That sense of concrete human embodiment is
itself no doubt a crucial part not only of literary response but also
of all human communication.
The preceding point suggests once again the distinctiveness of verbal art and its continuity with other speech actions.

Verbal art both fosters the proliferation of interpretations


and sharpens our sense of the human embodiment that limits those interpretations. We find the same tension between
sameness and difference in all the characteristics we have considered. Although I have been stressing difference, the sameness is no less consequential. For example, in the context of
an encyclopedia of language sciences, it is important that in
studying verbal art, we are likely to isolate properties and relations in language that we might otherwise have passed over:
properties and relations of address, reference, informational
structure and orientation, and type and force of emotional
consequence or function. In short, verbal art is a critical element of human life. As such, it is a critical object of study in its
own right. It is also a crucial part of human speech action. As
such it is a crucial, if sometimes neglected, part of the language
sciences as well.

WORKS CITED AND SUGGESTIONS FOR FURTHER READING


Amaladass, Anand. 1984. Philosophical Implications of Dhvani: Experience of Symbol Language in Indian Aesthetics. Vienna: De Nobili Research Library.
Biber, Douglas, Susan Conrad, and Randi Reppen. 2006. Corpus Linguistics: Investigating Language Structure and Use. Cambridge: Cambridge University Press.
Chandler, Steven. 1991. Metaphor comprehension: A connectionist approach to implications for the mental lexicon. Metaphor and Symbolic Activity 6.4: 227–58.
Dasenbrock, Reed Way, ed. 1993. Literary Theory After Davidson. University Park: Pennsylvania State University Press.
Eliot, T. S. 2001. Tradition and the individual talent. In The Norton Anthology of Theory and Criticism, ed. Vincent B. Leitch, 1092–8. New York: W. W. Norton.
Fabb, Nigel, and Morris Halle. 2006. Metrical complexity in Christina Rossetti's verse. College Literature 33.2: 91–114.
Fish, Stanley. 1980. Is There a Text in This Class? The Authority of Interpretive Communities. Cambridge, MA: Harvard University Press.
Frijda, Nico. 1986. The Emotions. Cambridge: Cambridge University Press.
Gadamer, Hans-Georg. 1989. Truth and Method. 2d ed. Trans. Joel Weinsheimer and Donald Marshall. New York: Crossroad.
Gerrig, Richard, and Deborah Prentice. 1996. Notes on audience response. In Post-Theory: Reconstructing Film Studies, ed. David Bordwell and Noël Carroll, 388–403. Madison: University of Wisconsin Press.
Grice, Paul. 1989. Studies in the Way of Words. Cambridge, MA: Harvard University Press.
Hogan, Patrick Colm. 1996. On Interpretation: Meaning and Inference in Law, Psychoanalysis, and Literature. Athens: University of Georgia Press.
———. 2003. The Mind and Its Stories: Narrative Universals and Human Emotion. Cambridge: Cambridge University Press.
Jakobson, Roman. 1987. Language in Literature, ed. Krystyna Pomorska and Stephen Rudy. Cambridge, MA: Belknap Press.
Kane, Julie. 2004. Poetry as right-hemispheric language. Journal of Consciousness Studies 11.5/6: 21–59.
Kiparsky, Paul. 1987. On theory and interpretation. In The Linguistics of Writing: Arguments Between Language and Literature, ed. Nigel Fabb, Derek Attridge, Alan Durant, and Colin MacCabe, 185–98. New York: Methuen.
Knapp, Steven, and Walter Benn Michaels. 1985. Against theory. In Against Theory: Literary Studies and the New Pragmatism, ed. W. J. T. Mitchell, 11–30. Chicago: University of Chicago Press.
Labov, William. 1972. Language in the Inner City: Studies in the Black English Vernacular. Philadelphia: University of Pennsylvania Press.
Lakoff, George, and Mark Turner. 1989. More Than Cool Reason: A Field Guide to Poetic Metaphor. Chicago: University of Chicago Press.
MacNeilage, Peter. 1998. Evolution of the mechanism of language output: Comparative neurobiology of vocal and manual communication. In Approaches to the Evolution of Language: Social and Cognitive Bases, ed. James Hurford, Michael Studdert-Kennedy, and Chris Knight, 222–41. Cambridge: Cambridge University Press.
Martindale, Colin. 1995. Creativity and connectionism. In The Creative Cognition Approach, ed. Steven Smith, Thomas Ward, and Ronald Finke, 249–68. Cambridge, MA: MIT Press.
Oatley, Keith. 2002. Emotions and the story worlds of fiction. In Narrative Impact: Social and Cognitive Foundations, ed. Melanie Green, Jeffrey Strange, and Timothy Brock, 39–69. Mahwah, NJ: Erlbaum.
Ong, Walter J., S.J. The jinnee in the well wrought urn. In The Barbarian Within and Other Fugitive Essays and Studies, 15–25. New York: Macmillan.
Pratt, Mary Louise. 1977. Toward a Speech Act Theory of Literary Discourse. Bloomington: Indiana University Press.
Shakespeare, William. 2006. The Sonnets, ed. G. Blakemore Evans. Cambridge: Cambridge University Press.
Stockwell, Peter. 2002. Cognitive Poetics: An Introduction. London: Routledge.
Tsur, Reuven. 2006. "Kubla Khan": Poetic Structure, Hypnotic Quality and Cognitive Style: A Study in Mental, Vocal and Critical Performance. Amsterdam: John Benjamins.
Walter, Henrik. 2001. Neurophilosophy of Free Will: From Libertarian Illusions to a Concept of Natural Autonomy. Trans. Cynthia Klohr. Cambridge, MA: MIT Press.

THE CAMBRIDGE ENCYCLOPEDIA OF THE LANGUAGE SCIENCES

A

ABDUCTION

Albert Atkin

Abduction is a form of reasoning first explicated by the nineteenth-century philosopher C. S. Peirce. The central concept he wishes to introduce is that of generating new hypotheses to explain observed phenomena, partly by guesswork or speculation. In his early work, Peirce tried to explain abductive reasoning, as distinct from deductive and inductive reasoning, by reference to syllogistic form. For instance, the following schema is an example of deductive reasoning:

All the beans in the bag are white
These beans came from this bag
Therefore, these beans are white

This is distinct from inductive reasoning which, Peirce argues, follows this pattern:

These beans came from this bag
These beans are white
Therefore, all the beans in this bag are white

And both these forms are distinct from abductive reasoning which, Peirce argues, follows this pattern:

These beans are white
All the beans in this bag are white
Therefore, the beans came from this bag

In later work, however, Peirce felt that trying to fit abductive reasoning into such a strict syllogistic form was restrictive, and instead he opted for the following schema to explain abduction:

The surprising fact C is observed
But if A were true, C would be a matter of course
Hence, there is a reason to suspect that A is true.
(Peirce 1935, 189)

For example, suppose I observe that my car will not start. One good explanation for this would be that it is out of fuel. Consequently, it seems that we have a good reason to think that my car's refusal to start is due to its being out of fuel. Of course, we may very quickly discover that my car has plenty of fuel, and a different hypothesis must be adopted, but Peirce always intended that abductive reasoning was fallible and conjectural, awaiting confirmation from other testing.

Peirce's account of abduction has been widely adopted in the philosophy of science, but it has also been of some interest to linguists. One particularly prominent use of abduction has been in historical linguistics for explaining language change (see, for instance, Anderson 1973). The systematic features of a language that govern the use of one generation are opaque to the following generation as they acquire that language; the only access is through language output. It appears, then, that following generations must use abductive inferences to access the rules of language before applying those rules to new cases. And, of course, since abduction is fallible, the rules and system of language that following generations infer often differ from the system earlier generations are using. This often results in semantic change, syntactic change, and sound change.

WORKS CITED AND SUGGESTIONS FOR FURTHER READING

Anderson, Henning. 1973. Abductive and deductive change. Language 49: 765–93.
Burks, Arthur. 1946. Peirce's theory of abduction. Philosophy of Science 13: 301–6.
McMahon, April. 1994. Understanding Language Change. Cambridge: Cambridge University Press.
Peirce, C. S. 1935. The Collected Papers of Charles S. Peirce. Vol. 5. Cambridge, MA: Harvard University Press.

ABSOLUTE AND STATISTICAL UNIVERSALS

Language universals are statements that are true of all languages; for example, all languages have stop consonants. But
beneath this simple definition lurks deep ambiguity, and this
triggers misunderstanding in both interdisciplinary discourse
and within linguistics itself. A core dimension of the ambiguity
is captured by the opposition absolute versus statistical universal, although the literature uses these terms in varied ways.
Many textbooks draw the boundary between absolute and statistical according to whether a sample of languages contains exceptions to a universal. But the notion of an exception-free sample
is not very revealing, even if the sample contained all known
languages: There is always a chance that an as yet undescribed
language, or an unknown language from the past or future, will
provide an exception.
It is impossible, in principle, to survey all languages of our species. If we nevertheless want to make claims about all languages,
only two routes are open: a priori deduction of necessarily true
statements or statistical extrapolation from empirical samples to
the entire set. Absolute universals can then be defined as those
that are necessarily true, statistical universals as those that are
extrapolated from samples.

Absolute Universals
For statements to be necessarily true, they must follow from a
priori assumptions. The assumptions that linguists make are
diverse and heavily debated. An example is the assumption that
words consist of morphemes, that is, minimal form-meaning pairs. If one accepts this, then it is necessarily true that all
languages have morphemes, and there cannot be exceptions.
Why? Suppose someone claims to have discovered a language
without morphemes. One can of course simply analyze the language without mentioning morphemes, but obviously that cannot challenge the universal just because one can always defend
it by reanalyzing the language with morphemes. The only true
challenge would be to show that analyzing some data in terms
of morphemes leads to structures that are in conflict with other
assumptions, for example, that form-meaning pairs combine
exclusively by linear concatenation. The conflict can be illustrated
by languages with morphologies like the English plural geese,
where the meanings plural and goose do not correspond to linear

(iii) add additional assumptions that reconcile the conflict.

On any of these options, the universal remains exceptionless: On solution (i), no language has morphemes; on solutions (ii) and (iii), all languages have morphemes. As a result, absolute universals can never be falsified by individual data. Their validity can only be evaluated by exploring whether they are consistent with other absolute universals that are claimed simultaneously.

Absolute universals can also be thought of as those aspects of one's descriptive metalanguage, often called a theoretical framework, that are necessarily referred to in the analysis of every language, that is, that constitute the descriptive a priori. Depending on one's a priori, this includes, apart from the morpheme, such notions as distinctive feature, constituent (see constituent structure), argument, predicate (see predicate and argument), reference, agent, speaker, and so on. In some metalanguages, the a priori also includes more specific assumptions, for example, that constituents can only be described by uniform branching (all to the left, or all to the right), or only by binary branching, and so on.

The status of absolute universals is controversial. For many linguists, especially in typology and historical linguistics, absolute universals are simply the descriptive a priori, with no additional claim on biological or psychological reality. The choice between equally consistent universals/metalanguages, for example, among options (i), (ii), and (iii) in the previous example, is guided by their success in describing structures and in defining variables that capture distributional patterns, an evaluation procedure comparable to the way in which technical instruments for analyzing objects are evaluated in the natural sciences. In the morphology problem, typologists would most likely choose option (ii) because it allows for defining a variable of stem-internal versus affixal plural realization that has an interesting distribution (suggesting, for example, that within-stem realization is favored by a few families in Africa and the Near East).

In generative grammar, by contrast, absolute universals are thought of not only as descriptively a priori but also as biologically given in what is called universal grammar: they are claimed to be innate (see innateness and innatism) and to be identical to the generalizations that a child makes when learning language. Thus, if the morpheme is accepted as a universal, that is, as an a priori term of our metalanguage, it will also be claimed to be part of what makes languages learnable (see learnability) and to be part of our genetic endowment. An immediate consequence of such an approach is that something can be claimed as universal even if it is not in fact necessary in the analysis of every language. For example, even if some language (e.g., the Rotokas language of Bougainville) lacks evidence for nasal sounds, one could still include a distinctive feature [±nasal] in Universal Grammar. Rotokas speakers are then said to have the feature as part of their genetic endowment even if they don't use it.

This view of absolute universals is highly controversial: Many linguists limit absolute universals to what is descriptively necessary in every language; many psychologists propose that children apply different and much more general principles in acquiring a language than those found in linguists' metalanguages; and to date, no absolute universal has been confirmed by genetic research.

Statistical Universals

What is not an absolute universal is a variable (or character, or parameter): some languages have a certain structure, or they don't have it, or have it to different degrees. It is interesting to note that
most variables show some skewing in their distribution; some values of a variable are favored only in certain geographical areas (relative pronouns in Europe) or only in certain families (stem-internal inflection in Afroasiatic). But some values are globally favored (e.g., nasals) or, what is more typical, globally favored under certain structural conditions (e.g., postnominal relative clauses among languages with objects following the verb). These global preferences are called unconditional (unrestricted) and conditional (restricted) statistical universals, respectively. (An alternative term for conditional statistical universals is implicational universals, but this invites confusion because their probabilistic nature differentiates them from logical implications; cf. Cysouw 2005.)
Statistical universals are mostly motivated by theories of how
languages develop, how they are used, how they are learned,
and how they are processed. One such theory, for example, proposes that processing preferences in the brain lead to a universal
increase in the odds for postnominal structures among verb-object languages (Hawkins 2004).
Statistical universals take the same forms as statistical hypotheses in any other science; for example, they can be formulated in terms of regression models. They can be tested with the same range of statistical methods as in any other science, and, again as in other sciences, the appropriate choice of models, population assumptions, and testing methods is an issue of ongoing research (e.g., Cysouw 2005; Janssen, Bickel, and Zúñiga 2006; Maslova 2008).
A central concern when testing statistical universals is to ascertain true globality, that is, independence of area and family. Areas can be controlled for by standard factorial analysis, but it is an unsettled question just what the relevant areal relations are; for example, should one control for the influence of Europe, of Eurasia as a whole, or of both? A quick solution is to assume a standard set of five or six macroareas in the world and accept a distribution as universal if it is independent of these areas (Dryer 1989). But the rationale for such a set is problematic, and this has led to a steep surge of interest in research on areas and their historical background (e.g., Nichols 1992; Haspelmath et al. 2005).
Controlling for family relations poses another problem.
Under standard statistical procedures, one would draw random samples of equal size within each family and then model
families as levels of a factor. However, over a third of all known
families are isolates, containing only one member each. And
picking one member at random in larger families is impossible
if at the same time one wants to control for areas (e.g., admitting
an Indo-European language from both Europe and South Asia).


In response to this problem, typologists seek to ensure representativity of a sample not by random selection within families but
by exhaustive sampling of known families, stratified by area. In
order to then control for unequal family sizes, one usually admits
only as many data points per family as there are different values
on the variables of interest (Dryer 1989; Bickel 2008).
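The capping step can be sketched in a few lines. The family inventory below is invented for illustration, and sample_per_family is a hypothetical helper, not code from Dryer or Bickel; it only illustrates the idea of admitting at most as many languages per family as the variable has values:

```python
import random
from collections import defaultdict

def sample_per_family(languages, n_values, seed=0):
    """Admit at most n_values data points per family, where n_values is
    the number of distinct values on the variable of interest."""
    random.seed(seed)
    by_family = defaultdict(list)
    for lang in languages:
        by_family[lang["family"]].append(lang)
    sample = []
    for family, members in sorted(by_family.items()):
        random.shuffle(members)          # pick members at random...
        sample.extend(members[:n_values])  # ...but cap at n_values per family
    return sample

# Invented inventory: a binary variable allows at most 2 points per family.
langs = [
    {"name": "A1", "family": "Indo-European"},
    {"name": "A2", "family": "Indo-European"},
    {"name": "A3", "family": "Indo-European"},
    {"name": "B1", "family": "Basque"},  # isolate: only one member anyway
]
print(len(sample_per_family(langs, n_values=2)))  # 3 (2 Indo-European + 1 Basque)
```

Isolates are unaffected by the cap, which is why exhaustive sampling of known families can include them without inflating the weight of large families.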
Samples that are not based on random sampling do not support parametric inference by statistical tests. An alternative to this is randomization methods (Janssen, Bickel, and Zúñiga 2006): The null hypothesis in these methods is that an observed preference can be predicted from the totals of the sample (e.g., that an observed 90% postnominal relatives in VO [verb-object] languages could be predicted if 90% of the entire sample had postnominal relatives), not that the sample stems from a population without the observed preference. Extrapolation to the total population (the entire set of human languages) can then only
be based on plausibility arguments: If a preference significantly
deviates from what is expected from the totals of the observed
sample, it is likely that the preference holds in all languages. A
key issue in such argumentation is whether the tested variables
are sufficiently unstable over time so that a present sample can
be assumed to not reflect accidental population skewings from
early times in prehistory (Maslova 2000). In response to this, typologists now also seek to test universals by sampling language changes instead of language states, a move that is sometimes called the dynamization of typology (Greenberg 1995; Croft 2003).
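The totals-based null hypothesis can be sketched as a simple randomization test. All counts below are invented for illustration; for the actual procedures, see Janssen, Bickel, and Zúñiga (2006):

```python
import random

def randomization_test(vo_post, vo_total, ov_post, ov_total,
                       n_iter=2000, seed=1):
    """Estimate how often a totals-based null reproduces the observed
    VO preference: shuffle all outcomes across the VO/OV groups and
    count shuffles with at least as many postnominal VO languages."""
    random.seed(seed)
    n_post = vo_post + ov_post
    n_other = (vo_total - vo_post) + (ov_total - ov_post)
    outcomes = [1] * n_post + [0] * n_other
    extreme = 0
    for _ in range(n_iter):
        random.shuffle(outcomes)
        # the first vo_total shuffled outcomes play the role of the VO group
        if sum(outcomes[:vo_total]) >= vo_post:
            extreme += 1
    return extreme / n_iter  # approximate p-value

# Invented counts: 90 of 100 VO languages, but only 40 of 100 OV
# languages, have postnominal relatives.
p = randomization_test(90, 100, 40, 100)
print(p < 0.01)  # the skew is far beyond what the sample totals predict
```

A small p-value licenses only the plausibility argument described above, not a parametric inference to the population of all human languages.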
While the number of proposed statistical universals is impressive (the Universals Archive at Konstanz has collected more than 2,000; Plank and Filimonova 2000), very few of them have been rigorously tested for independence of area, family, and time.
Balthasar Bickel
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Bickel, B. 2008. A refined sampling procedure for genealogical control. Sprachtypologie und Universalienforschung 61: 22–33.
Croft, W. 2003. Typology and Universals. 2d ed. Cambridge: Cambridge University Press.
Cysouw, M. 2005. Quantitative methods in typology. In Quantitative Linguistics: An International Handbook, ed. G. Altmann, R. Köhler, and R. Piotrowski, 554–78. Berlin: Mouton de Gruyter.
Cysouw, M., ed. 2008. Special issue on analyzing The World Atlas of Language Structures. Sprachtypologie und Universalienforschung 61.
Dryer, M. S. 1989. Large linguistic areas and language sampling. Studies in Language 13: 257–92.
Greenberg, J. H. 1995. The diachronic typological approach to language. In Approaches to Language Typology, ed. M. Shibatani and T. Bynon, 143–66. Oxford: Clarendon.
Haspelmath, M., M. S. Dryer, D. Gil, and B. Comrie, eds. 2005. The World Atlas of Language Structures. Oxford: Oxford University Press.
Hauser, M. D., N. Chomsky, and W. T. Fitch. 2002. The faculty of language: What is it, who has it, and how did it evolve? Science 298: 1569–79. This paper and the response by S. Pinker and R. Jackendoff (2005) launched an ongoing debate on the nature and extent of absolute universals in generative grammar.
Hawkins, J. A. 2004. Efficiency and Complexity in Grammars. Oxford: Oxford University Press.
Janssen, D., B. Bickel, and F. Zúñiga. 2006. Randomization tests in language typology. Linguistic Typology 10: 419–40.
Maslova, E. 2000. A dynamic approach to the verification of distributional universals. Linguistic Typology 4: 307–33.
———. 2008. Meta-typological distributions. Sprachtypologie und Universalienforschung 61: 199–207.
Newmeyer, F. J. 2005. Possible and Probable Languages: A Generative Perspective on Linguistic Typology. New York: Oxford University Press.
Nichols, J. 1992. Language Diversity in Space and Time. Chicago: The University of Chicago Press.
Pinker, S., and R. Jackendoff. 2005. The faculty of language: What's special about it? Cognition 95: 201–36.
Plank, F., and E. Filimonova. 2000. The Universals Archive: A brief introduction for prospective users. Sprachtypologie und Universalienforschung 53: 109–23.
Plank, F., ed. 2007. Linguistic Typology 11.1. (Special issue treating the state of typology.)

ACCESSIBILITY HIERARCHY
Edward L. Keenan and Bernard Comrie (1972, 1977) introduce
the accessibility hierarchy (AH) as a basis for several crosslinguistic generalizations regarding the formation of relative
clauses (RCs).
AH: SUBJ > DO > IO > OBL > GEN > OCOMP

The terms of the AH are main clause subject, direct object, indirect object, object of pre- or postposition, genitive (possessor), and
object of comparison. Keenan and Comrie cross-classified RCs
along two parameters: 1) the head noun precedes or follows the
restrictive clause (RestCl), and 2) the case of the position relativized, NPrel, is pronominally marked or not. In (1), from German, the RestCl, underlined, follows the head in (1a) and precedes it in (1b). In (2a, b), from Hebrew and Russian, NPrel is pronominally marked, whereas it is not in English.
(1) a. der Mann, der in seinem Büro arbeitet
       the man, who in his study is+working
       the man who is working in his study
    b. der in seinem Büro arbeitende Mann
       the in his study working man
       the man who is working in his study

(2) a. ha-isha she Dan natan la et ha-sefer
       the-woman that Dan gave to+her acc the-book
       the woman that Dan gave the book to
    b. devushka, kotoruju Petr ljubit
       girl, who(acc) Peter loves
       the girl who Peter loves

A given choice of values for the two parameters defines an RC-forming strategy. A strategy that applies to SUBJ is called primary. German has two primary strategies: a postnominal, +case one, (1a), and a prenominal, −case one, (1b). Keenan and Comrie support three hierarchy generalizations:

(3) a. All languages have a primary strategy.
    b. A given RC-forming strategy must apply to a continuous segment of the AH.
    c. A primary strategy may cease to apply at any position on the AH.
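The hierarchy generalizations in (3) lend themselves to a simple computable statement. The position sets below are hypothetical, not drawn from any particular language:

```python
# Keenan and Comrie's accessibility hierarchy, highest position first.
AH = ["SUBJ", "DO", "IO", "OBL", "GEN", "OCOMP"]

def applies_to_continuous_segment(positions):
    """(3b): the positions an RC-forming strategy applies to must form
    a continuous segment of the AH."""
    idx = sorted(AH.index(p) for p in positions)
    return idx == list(range(idx[0], idx[-1] + 1))

def is_primary(positions):
    """A primary strategy is one that applies to SUBJ (cf. (3a), (3c))."""
    return "SUBJ" in positions

# A strategy covering SUBJ through OBL is a well-formed primary strategy:
print(applies_to_continuous_segment(["SUBJ", "DO", "IO", "OBL"]))  # True
# A gapped strategy (SUBJ and OBL, skipping DO and IO) violates (3b):
print(applies_to_continuous_segment(["SUBJ", "OBL"]))  # False
print(is_primary(["DO", "IO"]))  # False
```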

For example, many West Austronesian languages, such as Malagasy (Madagascar), only have primary strategies. So we can only relativize the agent in (4a).

(4) a. Manolotra (m+aN+tolotra) vary ho an'ny vahiny amin'ny lovia vaovao ny tovovavy
       offers (pres+act+offer) rice for+the guests on+the dishes new the young+woman
       The young woman offers rice to the guests on the new dishes
    b. ny tovovavy (izay) manolotra vary ho an'ny vahiny amin'ny lovia vaovao
       the woman (that) offers rice to+the guests on+the dishes new
       the young woman who offers rice to the guests on the new dishes
    c. *ny vary (izay) manolotra ho an'ny vahiny amin'ny lovia vaovao ny tovovavy
       the rice (that) offers to+the guests on+the dishes new the young+woman
       the rice that the young woman offers to the guests on the new dishes

The first four words in (4c) claim that the rice is doing the offering, a nonsense. Malagasy does not, however, have an expressivity gap here, since it has a rich voice system allowing any major NP in a clause as subject. The form of offer that takes theme subjects is atolotra, recipient subjects tolorana, and oblique subjects anolorana. (5a, b) illustrate Theme and Instrument RCs.
(5) a. ny vary (izay) atolo-dRasoa ho an'ny vahiny amin'ny lovia vaovao
       the rice (that) offered-by+Rasoa for+the guests on+the new dishes
       the rice that the young woman offers to the guests on the new dishes
    b. ny lovia vaovao (izay) anoloran-dRasoa vary ho an'ny vahiny
       the dishes new (that) offered-by+Rasoa rice for the guests
       the new dishes on which the young woman offered rice to the guests

Bantu languages, such as Luganda, (6), illustrate the DO cutoff. Only subjects and objects are directly relativizable. Obliques can be promoted to object using applicative affixed verbs. So the instrumental in (6a) is only relativizable from (6c).

(6) a. John yatta enkoko n (= na) ekiso
       John killed chicken with knife
    b. *ekiso John kye-yatta enkoko (na)
       knife John rel-killed chicken (with)
    c. John yattisa (yatt+is+a) ekiso enkoko
       John kill+with knife chicken
       John killed+with a knife the chicken
    d. ekiso John kye-yattisa enkoko
       knife John rel-kill+with chicken
       the knife John killed the chicken with

Independent support for the AH: Keenan (1975) shows that stylistically simple texts used RCs formed high on the AH proportionately more than texts independently judged stylistically hard. Second, Comrie's work (1976) supports the conclusion that the positioning of demoted subjects in morphological causatives tends to assume the highest function on the AH not already filled. Thus, in the French J'ai fait rire les enfants 'I made-laugh the children', the children surfaces as a DO, as laugh lacks a DO. But in causativizing a transitive verb, its agent argument may surface as an IO (J'ai fait manger les épinards aux enfants 'I made-eat the spinach to the children'). Lastly, S. Hawkins and Keenan (1987) show psycholinguistically that recall of RCs formed on high positions on the AH was better than recall of ones formed on low positions.
One interesting modification to the hierarchy generalizations concerns syntactic ergativity. Keenan and Comrie noted that Dyirbal (Dixon 1972) relativizes absolutives (intransitive subjects and transitive objects) but not transitive subjects. A verbal affix (antipassive) derives intransitive verbs from transitive ones, with the agent as subject and hence relativizable. Mayan languages such as Jacaltec (Craig 1977, 196) are similar. This is an elegant solution to the requirement that agents be relativizable, analogous to Bantu applicatives or Austronesian voice affixes.
(7) a. x s watxe naj hun-ti
       asp 3abs 3erg make cl:man one-this
       He made this
    b. naj x watxe n hun-ti
       cl:man asp 3abs make ap one-this
       the man (who) made this

Edward L. Keenan
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Comrie, Bernard. 1976. The syntax of causative constructions: Crosslanguage similarities and divergences. In The Grammar of Causative
Constructions, Syntax and Semantics 6, ed. Masayoshi Shibatani, 261
312. Amsterdam: Academic Press.
Craig, Colette. 1977. Jacaltec. Austin: University of Texas Press.
Dixon, Robert M. W. 1972. The Dyirbal Language of North Queensland.
Oxford: Cambridge University Press.
Hawkins, Sarah, and Edward L. Keenan. 1987. The psychological validity
of the accessibility hierarchy. In Universal Grammar: 15 Essays, ed. E.
L. Keenan, 6089. London: Croom Helm.
Keenan, Edward L. 1975. Variation in Universal Grammar. In Analyzing
Variation in English, ed. R. Fasold and R. Shuy, 13648. Washington,
DC: Georgetown University Press.
Keenan, Edward L., and Bernard Comrie. 1972. Noun phrase accessibility and Universal Grammar. Paper presented at the Annual Meetings
of the Linguistic Society of America, Atlanta.
. 1977. Noun phrase accessibility and Universal Grammar.
Linguistic Inquiry 8.1: 6399.

ACOUSTIC PHONETICS
Like the rest of linguistics, acoustic phonetics combines description with theory. Descriptions are images of acoustic properties and quantitative measures taken from these images, and theory accounts for the way in which a sound's articulation determines its acoustics.

Description
The three most commonly used images of speech are the waveform, spectrum, and spectrogram (Figure 1). The waveform displays differences in sound pressure level (in pascals) over time (Figure 1a, d), the spectrum differences in sound pressure level (in decibels) over frequency (Figure 1b, e), and the spectrogram differences in frequency over time (Figure 1c, f); darkness indicates the sound pressure level at particular frequencies and moments in the spectrogram.

Figure 1. Waveforms, spectra, and spectrograms of 30 ms intervals of the vowel [i] (a–c) and the fricative [s] (d–f).
The images in Figures 1a–c differ from those in Figures 1d–f in every conceivable respect: Sound pressure level varies more or less regularly and repetitively, every 0.0073 second, in the vowel [i] (as in heed), while in the fricative [s] (as in see), it instead varies nearly randomly. The vowel is thus nearly periodic, while the fricative is decidedly aperiodic. This difference gives the vowel a clear pitch, while making the fricative instead sound noisy.
A single cycle's duration in a periodic sound is its period (T); the distance it travels in space is its wavelength (λ). As measures of a single cycle's extent, both period and wavelength are reciprocally related to frequency (F), the number of cycles per second, or Hz:

(1) F (cycles/sec) = 1 / T (sec/cycle)

(2) F (cycles/sec) = c (cm/sec) / λ (cm/cycle)

Note that the numerator in (2) is not 1 but instead c, the speed
of sound.
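Applied to the [i] waveform described above, whose cycle repeats every 0.0073 second, relations (1) and (2) give (taking c = 35,000 cm/sec, the value used later in this entry):

```python
T = 0.0073          # period of the [i] waveform, sec/cycle (from Figure 1)
c = 35_000          # speed of sound, cm/sec

F = 1 / T           # (1): frequency in cycles/sec (Hz)
wavelength = c / F  # rearranging (2): wavelength in cm/cycle

print(round(F))              # 137 (Hz), the vowel's fundamental frequency
print(round(wavelength, 1))  # 255.5 (cm)
```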
The spectrum and spectrogram of [i] (Figures 1b, c) show peaks and horizontal bands, respectively, known as formants, at roughly 300, 2,200, and 2,800 Hz. The corresponding images of [s] (Figures 1e, f) show a broad energy band spanning 4,000–7,000 Hz.
Whether a sound is periodic and where in its spectrum energy
is concentrated are nearly sufficient to distinguish all speech
sounds from one another acoustically, and these two properties also reflect the two components of the theoretical model for
transforming articulations into acoustics. All voiced sounds are
periodic, as are trills. Sonorants (vowels, glides, liquids, nasals)
are usually periodic, while obstruents (fricatives, stops, affricates) are aperiodic. Voiced obstruents are both periodic and
aperiodic. Differences in vowel quality and consonantal place of
articulation are both realized acoustically as differences in where
energy is concentrated in their spectra. The remaining property is duration, which besides conveying short:long contrasts also contributes to conveying the tense:lax contrasts between vowels and the voicing and manner contrasts between consonants.

Theory
Speech sounds are the product of the application of a filter, which determines the frequencies in which energy is concentrated in the sound's spectrum, to a periodic and/or aperiodic sound source.

SOUND SOURCES. Sound sources are produced by using valves that control air flow through the vocal tract to transform the energy in that flow into sound. In periodic sound sources, the flow of air causes a valve to open and close rapidly and regularly, which in turn causes air pressure to rise and fall just downstream. The repeated opening and closing of the glottis, known as vocal fold vibration, is the most common periodic sound source; others are uvular, alveolar, and bilabial trills.

Aperiodic sound sources are produced by keeping a valve completely or nearly closed, in stops and fricatives, respectively. Either way, oral air flow is obstructed enough that oral air pressure rises behind the obstruction. This pressure rise speeds up flow enough to turn it into a turbulent and thus noisy jet, in either a brief burst when a stop closure is released or continuously through a fricative's narrow constriction. In strident sounds, the jet breaks up against a baffle just downstream, increasing turbulence and noise intensity considerably.

Figure 2. The oral cavity as a tube closed at one end and open at the other: (a–c) standing waves corresponding to the first three resonances, each with a velocity minimum at the closed end and a velocity maximum at the open end. [Each panel plots the standing wave against distance from the glottis, 0–17.5 cm, from the closed glottis (velocity minimum) to the open lips (velocity maximum).]

RESONANCE. Both periodic and aperiodic sound sources introduce acoustic energy into the oral cavity across a broad enough range of frequencies to excite any resonance of the air inside the oral cavity. If the articulators are in their rest positions and the vocal folds are in the voicing position, this cavity's shape approximates a uniform tube, closed at the glottis and open at the lips (Figure 2). A resonance is produced by the propagation of acoustic energy away from the source and its reflection back and forth off the two ends of the tube, which establishes a standing wave. In a standing wave resonance, the locations of zero and maximum pressure variation are fixed. To understand how air resonates, it is easier to consider the change in pressure level in the standing wave, rather than pressure level itself, that is, the extent to which the air molecules are being displaced longitudinally, or equivalently the velocity of the pressure change, rather than the extent of their instantaneous compression or rarefaction. Air is most freely displaced longitudinally at the open end, the lips, and least freely at the closed end, the glottis. As a result, the standing waves that fit best inside the oral cavity are those whose wavelengths, and thus frequencies, are such that they have a velocity maximum (antinode) at the lips and a velocity minimum (node) at the glottis. Because the air resonates more robustly at these frequencies than at others, the oral cavity's resonant response filters the sound source, passing energy in the source at some frequencies and stopping it at others.

Figures 2a–c show the three lowest-frequency standing waves that fit these boundary conditions. How are their frequencies determined? Figures 2a–c show that one-quarter of the first resonance's wavelength spans the distance from the glottis to the lips (the oral cavity's length, Loc), three-quarters of the second's, and five-quarters of the third's. More generally:

(3) Loc = ((2n − 1) / 4) · λn

where n is the resonance number. Solving for wavelength and substituting into (2) yields:

(4) Fn = (2n − 1) · c / (4 · Loc)

Substituting 35,000 cm/sec for c and 17.5 cm for Loc (the average adult male's oral cavity length) yields 500, 1,500, and 2,500 Hz as the first three resonances' frequencies, values close to schwa's.
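These values follow directly from (4):

```python
c = 35_000    # speed of sound, cm/sec
L_oc = 17.5   # average adult male oral cavity length, cm

def resonance(n, L=L_oc):
    """(4): F_n = (2n - 1) * c / (4 * L), the resonances of a tube
    closed at one end (the glottis) and open at the other (the lips)."""
    return (2 * n - 1) * c / (4 * L)

print([resonance(n) for n in (1, 2, 3)])  # [500.0, 1500.0, 2500.0]
# A shorter tube (e.g., a child's, here half the length) raises every resonance:
print(resonance(1, L=8.75))  # 1000.0
```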
Because the variable Loc is in the denominator, resonance frequencies are lower in adults' and men's longer oral cavities than in children's or women's shorter ones, and likewise when the lips are protruded in a rounded vowel, such as the [u] in who'd, rather than spread in an unrounded one, such as the [i] in heed. These observations yield the length rule: Resonance frequencies vary inversely with resonating cavity length.

Figure 3. Spectra of the vowels (a) [i] as in heed, (b) [u] as in who'd, and (c) [a] as in hod. The individual peaks are the harmonics of the fundamental frequency (F0) of the voice sound source, and the formants are the ranges of amplified harmonics. Peaks corresponding to F1–F3 are labeled at the top of each panel.
The first three formants of [i] (Figures 1b, c) differ decidedly in frequency from those of schwa, because raising the tongue body toward the front of the palate decreases the oral cavity's cross-sectional area there and increases it in the pharynx, while spreading the lips shortens the oral cavity. Although the length rule predicts how shortening changes formant frequencies, an additional rule is needed to predict how decreasing and increasing cross-sectional area affect formant frequencies.
The predictions of two heuristics that are widely used for
this purpose are tested here against the observed formant
frequency differences between the three vowels [i, u, a], and
between the three places of articulation of the stops in [b,
d, g]. F1 is lower in the high vowels [i, u] (Figures 3a, b)
than in schwa, but higher in the low vowel [a] (Figure 3c). F2 is higher in front unrounded [i] than schwa but lower in low back unrounded [a] and especially in high back rounded [u]. F1 varies inversely with tongue height, and F2 varies directly with tongue backness and lip rounding. The formant frequencies of all other vowel qualities lie between the extremes observed in these vowels, just as all other vowels' lingual and labial articulations lie between those of these vowels. F1 starts low and rises following [b, d, g] (Figure 4), both F2 and F3 start low following [b] (Figure 4a), both formants start higher following [d] (Figure 4b), and they diverge from very similar frequencies following [g] (Figure 4c). Although consonants are articulated at other places of articulation, these three places are distinguished in nearly all languages, and many distinguish only these three.

Figure 4. Spectrograms of the first 150 ms of the words (a) bad, (b) dad, and (c) gad. The onsets of F1–F3 are labeled.
The first heuristic treats the constriction as dividing the oral
cavity into separate resonating cavities (Figure 5), and it applies
the length rule independently to each of them. The first three formants are the three lowest of the six resonances produced by the
two cavities. This heuristic may be called the cavity association
heuristic because each formant can be associated with the cavity
from which it came. There are two complications. First, the cavity behind the constriction is effectively closed at both ends, and

so its resonances must have velocity minima at both ends.

Figure 5. The configuration of the oral cavity with a constriction partway along its length. [The figure shows cross-sectional areas of 8 cm² (back cavity) and 1 cm² (constriction), and cavity lengths of 3 cm, 6 cm, and 8.5 cm.]

Their frequencies are predicted by:

(5) Fn = n · c / (2 · Lrc)

where Lrc is the length of the resonating cavity. The second complication is that the acoustic interaction of the constriction with the cavity behind it produces a Helmholtz resonance. Its frequency (Fh) is:

(6) Fh = (c / 2π) · √(Ac / (Ab · Lb · Lc))

Ac is the constriction's cross-sectional area, Lc is its length, and Ab and Lb are the cross-sectional area and length of the cavity behind the constriction. If a 3 cm-long constriction with a cross-sectional area of 1 cm² is moved incrementally from 3 cm above the glottis to 0.5 cm back of the lips along a 17.5 cm oral cavity, the back and front cavities produce the resonance frequencies displayed in Figure 6, along with the Helmholtz resonance. The arrows projected down from the intersections between back and front cavity resonances show where F2 and F3 change association from the front to the back cavity.
The constriction centers in [a, u, i] are roughly one-quarter (4.5 cm), two-thirds (11.5 cm), and three-quarters (13 cm) of the distance from the glottis to the lips. The constriction of [g] is close to [u]'s, while [d]'s is about seven-eighths of the distance from the glottis (15.5 cm), and [b]'s is of course at the lips (17.5 cm).
The Helmholtz resonance is lowest for all constriction locations and thus constitutes F1. It also lowers progressively as the constriction is moved forward, because the cavity behind the constriction lengthens. The cavity-association heuristic's successful predictions include the following: 1) the low or pharyngeal vowel [a] has a higher F1 than the high or velar and palatal vowels [u, i]; 2) F1 is low following [g, d, b]; 3) [a]'s F2 (the front cavity's first resonance) is low; 4) F2 and F3 (the front and back cavities' first resonances) start at very similar frequencies following [g], because a velar constriction is close to where the front and back cavities' first resonances cross at 11 cm from the glottis; 5) F2 and F3 start low following [b] (the back cavity's first and second resonances). It incorrectly predicts: 6) the F2 of [i] (the back cavity's first resonance) is low, indeed lower than [u]'s; and 7) F2 and F3 (the back cavity's first and second resonances) are
low following [d]. For [u], the calculations leave out the acoustic effects of lip rounding, which closes the front cavity at both ends and introduces another Helmholtz resonance. None of the resonances produced by this front cavity are lower than the back cavity resonances, but the additional Helmholtz resonance is low enough to constitute the F2 observed in [u] (657 Hz if the labial constriction has a cross-sectional area of 1 cm² and a length of 2 cm, and the front cavity is 4.5 cm long).

Figure 6. The first three resonance frequencies of the back cavity (filled symbols) and front cavity (empty symbols) and the Helmholtz resonance (crosses) produced by incremental movement of the constriction in Figure 5.
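Formula (6) can be checked against the 657 Hz value cited for [u]. The entry does not give the front cavity's cross-sectional area, so 8 cm² (the unconstricted area shown in Figure 5) is assumed here:

```python
import math

c = 35_000  # speed of sound, cm/sec

def helmholtz(Ac, Lc, Ab, Lb):
    """(6): F_h = (c / (2*pi)) * sqrt(Ac / (Ab * Lb * Lc)), the resonance
    of a constriction (neck) of area Ac and length Lc coupled to a
    cavity of area Ab and length Lb."""
    return (c / (2 * math.pi)) * math.sqrt(Ac / (Ab * Lb * Lc))

# [u]: 1 cm2, 2 cm labial constriction; 4.5 cm front cavity; the
# 8 cm2 cavity area is an assumption, not stated in the entry.
print(round(helmholtz(Ac=1, Lc=2, Ab=8, Lb=4.5)))  # 656, close to the cited 657 Hz
```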
In the second, perturbation, heuristic, a constriction's proximity to a resonance's velocity minimum or maximum determines how it perturbs that resonance's frequency away from its schwa-like value: A constriction near a velocity minimum raises the formant's frequency, while one near a maximum lowers it instead (expansions have the opposite effects). Figures 2a–c show that minima occur at even quarters of a resonance's wavelength and maxima at odd quarters, and that their locations are at fixed proportions of the length of the oral cavity. Because constriction locations are also a certain proportion of the distance from the glottis to the lips, whether they coincide with a minimum or maximum can be calculated by multiplying both sides of (3) by the proportion of the oral cavity's length that corresponds to the constriction's location and rounding the result on the right-hand side to the nearest quarter (Table 1).
The perturbation heuristic successfully predicts the effects of
the bilabial, palatal, velar, and pharyngeal constrictions on all
three formants of [b, i, g, a], and likewise the effects of the alveolar and velar constrictions on F1 and F3 in [d, u], but it fails to
predict F2 raising after [d], and F2 lowering in [u]. The latter can

again be predicted once the acoustic effects of lip rounding are added, as the simultaneous labial constriction, together with the protrusion of the lips, lowers F2 along with all other formants.

Table 1. Calculating a constriction's proximity to a resonance's velocity minimum or maximum from the constriction's proportional distance from the glottis to the lips.

Place of constriction | Segment | Proportion of oral cavity length | Calculation | Odd/even | Lower/higher
Labial | b | 1 | 1 × 1/4 = 1/4 | Odd | F1 lower
  |   |   | 1 × 3/4 = 3/4 | Odd | F2 lower
  |   |   | 1 × 5/4 = 5/4 | Odd | F3 lower
Alveolar | d | 7/8 | 7/8 × 1/4 = 7/32 ≈ 1/4 | Odd | F1 lower
  |   |   | 7/8 × 3/4 = 21/32 ≈ 3/4 | Odd | F2 lower
  |   |   | 7/8 × 5/4 = 35/32 ≈ 4/4 | Even | F3 higher
Palatal | i | 3/4 | 3/4 × 1/4 = 3/16 ≈ 1/4 | Odd | F1 lower
  |   |   | 3/4 × 3/4 = 9/16 ≈ 2/4 | Even | F2 higher
  |   |   | 3/4 × 5/4 = 15/16 ≈ 4/4 | Even | F3 higher
Velar | g, u | 2/3 | 2/3 × 1/4 = 2/12 ≈ 1/4 | Odd | F1 lower
  |   |   | 2/3 × 3/4 = 6/12 = 2/4 | Even | F2 higher
  |   |   | 2/3 × 5/4 = 10/12 ≈ 3/4 | Odd | F3 lower
Pharyngeal | a | 1/4 | 1/4 × 1/4 = 1/16 ≈ 0/4 | Even | F1 higher
  |   |   | 1/4 × 3/4 = 3/16 ≈ 1/4 | Odd | F2 lower
  |   |   | 1/4 × 5/4 = 5/16 ≈ 1/4 | Odd | F3 lower
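The rounding arithmetic in Table 1 can be mechanized as a sketch (this reproduces the table's quarter-wavelength calculation; it is not code from the source):

```python
def perturbation(proportion):
    """For resonances n = 1..3, the constriction sits at
    proportion * (2n - 1)/4 of the resonance's wavelength. Rounded to
    the nearest quarter, an odd quarter is near a velocity maximum
    (formant lowered), an even quarter near a minimum (formant raised)."""
    effects = []
    for n in (1, 2, 3):
        quarters = round(proportion * (2 * n - 1))  # nearest quarter-wavelength
        effects.append("lower" if quarters % 2 == 1 else "higher")
    return effects

# Velar constriction at two-thirds of the glottis-to-lips distance:
print(perturbation(2 / 3))  # ['lower', 'higher', 'lower'] for F1-F3
# Pharyngeal constriction at one-quarter:
print(perturbation(1 / 4))  # ['higher', 'lower', 'lower']
```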

Summary
The articulations of speech sounds produce sound sources by transforming aerodynamic energy into acoustic form, and those sound sources in turn cause the air inside the oral cavity to resonate, at frequencies determined by the length of the resonating cavities and by where they are constricted.
John Kingston
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Fant, Gunnar. 1960. Acoustic Theory of Speech Production. The Hague: Mouton.
Jakobson, Roman, C. Gunnar M. Fant, and Morris Halle. 1952. Preliminaries to Speech Analysis. Cambridge, MA: MIT Press.
Ladefoged, Peter, and Ian Maddieson. 1996. The Sounds of the World's Languages. Oxford: Blackwell Publishers.
Stevens, Kenneth N. 1998. Acoustic Phonetics. Cambridge, MA: MIT Press.

ADAPTATION
An adaptation is a characteristic in an organism that evolved
because it helped the organism or its relatives to survive and
reproduce. Examples include the vertebrate eye, claws, mammary glands, the immune system, and the brain structures that
underlie the human capacity for language. More completely,
an adaptation is 1) a reliably developing set of characteristics
2) whose genetic basis became established and organized in the
species (or population) over evolutionary time because 3) the
adaptation interacted with recurring features of the body or environment 4) in a way that, across generations, typically caused
this genetic basis to increase its gene frequency.

If a characteristic lacks any of these features, it is not an adaptation. An adaptation is not, therefore, simply anything in an
individual with a good or functional outcome, or that has
useful effects by intuitive standards. Rice cultivation, useful as it
is, is not a biological adaptation because it lacks a specific genetic
basis. Similarly, the English language is not an adaptation,
however useful it might be. In contrast, if a mutation occurred
that modified a neural structure so that the vocal cords could
more reliably produce distinct phonemes, and this gene spread
throughout the species because its bearers prospered due to the
advantages resulting from a lifetime of more efficient communication, then the modified neural structure would qualify as an
adaptation.
Researchers judge whether something is an adaptation by
assessing how likely or unlikely it is that its functional organization was produced by random mutation and spread by genetic
drift. For example, the eye has hundreds of elements that are
arranged with great precision to produce useful visual inputs. It is
astronomically unlikely that they would have arrived at such high
levels of mutual coordination and organization for that function
unless the process of natural selection had differentially retained
them and spread them throughout the species. Consequently,
the eye and the visual system are widely considered to be obvious examples of adaptations. For the same reason, evolutionary
scientists consider it overwhelmingly likely that many neurocognitive mechanisms underlying language are adaptations for
communication (a proposition that Noam Chomsky has disputed; see Lyle Jenkins's essay, "Explaining Language," in this
volume). Language competence reliably develops, is believed
to have a species-typical genetic basis, and exhibits immensely
complex internal coordination that is functionally organized to
produce efficient communication, which vastly enhances the
achievement of instrumental goals, plausibly including those
linked to fitness.

Within the evolutionary sciences, the concept of adaptation
plays an indispensable role not only in explaining and understanding how the properties of organisms came to be what they
are, but also in predicting and discovering previously unknown
characteristics in the brains and bodies of species. Evolutionary
psychologists, for example, analyze the adaptive problems our
ancestors were subjected to, predict the properties of previously unknown cognitive mechanisms that are expected to have
evolved to solve these adaptive problems, and then conduct
experimental studies to test for the existence of psychological
adaptations with the predicted design (see evolutionary
psychology). An understanding that organisms embody sets
of adaptations rather than just being accidental agglomerations
of random properties allows organisms to be properly studied as
functional systems. If language is accepted as being the product
of adaptations, then there is a scientific justification for studying
the underlying components as part of a functional system.
The concept of adaptation became more contentious when
human behavior and the human psychological architecture
began to be studied from an adaptationist perspective. Critics
have argued that not every characteristic is an adaptation, an error that adaptationists themselves also criticize. More substantively, critics
have argued that it is impossible to know what the past was like
well enough to recognize whether something is an adaptation.
Adaptationists counter that we know many thousands of things
about the past with precision and certainty, such as the three-dimensional nature of space and the properties of chemicals, the existence of predators, genetic relatives, eyes, infants, food and fertile matings, and the acoustical properties of the atmosphere, and that these can be used to gain an engineer's insight into why organisms (including humans) are designed as they are.
Julian Lim, John Tooby, and Leda Cosmides
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Gould, S. J., and R. C. Lewontin. 1979. The spandrels of San Marco and the Panglossian paradigm: A critique of the adaptationist programme. Proceedings of the Royal Society of London, Series B 205.1161: 581–98.
Pinker, Steven. 1994. The Language Instinct. New York: Morrow.
. 2003. Language as an adaptation to the cognitive niche. In Language Evolution, ed. M. Christiansen and S. Kirby, 16–37. New York: Oxford University Press.
Tooby, John, and I. DeVore. 1987. The reconstruction of hominid behavioral evolution through strategic modeling. In The Evolution of Human Behavior: Primate Models, ed. W. G. Kinzey, 183–237. Albany: SUNY Press.
Williams, George C. 1966. Adaptation and Natural Selection: A Critique
of Some Current Evolutionary Thought. Princeton, NJ: Princeton
University Press.

AD HOC CATEGORIES
An ad hoc category is a novel category constructed spontaneously to achieve a goal relevant in the current situation (e.g., constructing tourist activities to perform in Beijing while planning a
vacation). These categories are novel because they typically have
not been entertained previously. They are constructed spontaneously because they do not reside as knowledge structures in
long-term memory waiting to be retrieved. They help achieve a relevant goal by organizing the current situation in a way that supports effective goal pursuit.
Ad hoc categories contrast with thousands of well-established
categories associated with familiar words (e.g., cat, eat, happy).
Extensive knowledge about these latter categories resides in
memory and may often become active even when irrelevant to
current goals. When ad hoc categories are used frequently, however, they, too, become highly familiar and well established in
memory. The first time that someone packs a suitcase, the category things to pack in a suitcase is ad hoc. Following many trips,
however, it becomes entrenched in memory.
Ad hoc categories constitute a subset of role categories,
where roles provide arguments for verbs, relations, and schemata. Some role categories are so familiar that they become
lexicalized (e.g., seller, buyer, merchandise, and payment name
the agent, recipient, theme, and instrument roles of buy). When
the conceptualization of a role is novel, however, an ad hoc category results (e.g., potential sellers of gypsy jazz guitars). Pursuing
goals requires the constant specification and instantiation of
roles necessary for achieving them. When a well-established category for a role doesn't exist, an ad hoc category is constructed
to represent it.
Both conceptual and linguistic mechanisms appear central to
forming ad hoc categories. Conceptually, people combine existing concepts for objects, events, settings, mental states, properties, and so on to form novel conceptual structures. Linguistically,
people combine words in novel ways to index these concepts.
Sometimes, novel concepts result from perceiving something
novel and then describing it (e.g., seeing a traditional opera set in
a modern context and describing this newly encountered genre
as modernized operas). On other occasions, people combine
words for conceptual elements before ever encountering an
actual category instance (e.g., describing mezzo sopranos who
have power, tone, and flexibility before experiencing one). The
conceptual and linguistic mechanisms that formulate ad hoc
categories are highly productive, given that components of these
categories can be replaced systematically with alternative values
from semantic fields (e.g., tourist activities to perform in X, where
X could be Rome, Florence, Venice, etc.). Syntactic structures
are also central for integrating the conceptual/linguistic components in these categories (e.g., the syntax and accompanying
closed class words in tourist activities to perform in Rome).
Lawrence Barsalou (1983) introduced the construct of ad hoc
categories in experiments showing that these categories are not
well established in memory and do not become apparent without
context. Once constructed, however, they function as coherent
categories, exhibiting internal structures as indexed by typicality
gradients. Barsalou (1985) showed that these gradients are organized around ideal values that support goal achievement and
also around frequency of instantiation. He also showed (1987)
that these internal structures are generally as stable and robust
as those in familiar taxonomic categories.
Barsalou (1991) offered a theoretical framework for ad hoc
categories (see also Barsalou 2003). Within this framework, ad
hoc categories provide an interface between roles in knowledge structures (e.g., schemata) and the environment. When a
role must be instantiated in order to pursue a goal but knowledge of possible instantiations does not exist, people construct an ad hoc category of possible instantiations (e.g., when going camping for the first time, constructing and instantiating activities to perform on a camping trip). The particular instantiations selected reflect their fit with a) ideals that optimize
goal achievement and b) constraints from the instantiations
of other roles in the knowledge structure (e.g., activities to
perform on a camping trip should, ideally, be enjoyable and
safe and should depend on constraints such as the vacation
location and time of year). Once established, the instantiations
of an ad hoc category are encoded into memory and become
increasingly well established through frequent use (e.g., establishing touring back roads and socializing around the campground as instances of activities to perform on a camping trip).
Barsalou (1999) describes how this framework can be realized
within a perceptual symbol system. Specifically, categories
(including ad hoc categories) are sets of simulated instances
that can instantiate the same space-time region of a larger
mental simulation (where a simulation is the reenactment of
modality-specific states, as in mental imagery).
Ad hoc categories have been studied in a variety of empirical contexts. S. Glucksberg and B. Keysar (1990) proposed that
ad hoc categories underlie metaphor (e.g., the metaphor jobs are
jails conceptualizes the category of confining jobs). C. J. Cech, E.
J. Shoben, and M. Love (1990) found that ad hoc categories are
constructed spontaneously during the magnitude comparison
task (e.g., forming the ad hoc category of small furniture, such
that its largest instances anchor the upper end of the size dimension). F. Vallée-Tourangeau, S. H. Anthony, and N. G. Austin
(1998) found that people situate taxonomic categories in background settings to form ad hoc categories (e.g., situating fruit
to produce fruit in the produce section of a grocery store). E. G.
Chrysikou (2006) found that people rapidly organize objects into
ad hoc categories that support problem solving (e.g., objects that
serve as platforms).
Research has also addressed ad hoc categories that become
well established in memory, what Barsalou (1985, 1991) termed
goal-derived categories (also called script categories, slot filler
categories, and thematic categories). J. Lucariello and K. Nelson
(1985) found that children acquire goal-derived categories associated with scripts before they acquire taxonomic categories (e.g.,
places to eat). B. H. Ross and G. L. Murphy (1999) examined how
taxonomic and goal-derived concepts simultaneously organize
foods (e.g., apples as belonging simultaneously to fruit and snack
foods). D. L. Medin and colleagues (2006) found that goal-derived
categories play central roles in cultural expertise (e.g., tree experts
form categories relevant to their work, such as junk trees).
Although ad hoc and goal-derived categories are ubiquitous
in everyday cognition, they have been the subject of relatively
little research. Much further study is needed to understand
their structure and role in cognition. Important issues include
the following: How do productive conceptual and linguistic
mechanisms produce ad hoc categories? How do these categories support goal pursuit during situated action? How do these
categories become established in memory through frequent use?
How does the acquisition of these categories contribute to expertise in a domain?
Lawrence W. Barsalou

WORKS CITED AND SUGGESTIONS FOR FURTHER READING


Barsalou, L. W. 1983. Ad hoc categories. Memory & Cognition 11: 211–27.
. 1985. Ideals, central tendency, and frequency of instantiation as determinants of graded structure in categories. Journal of Experimental Psychology: Learning, Memory, and Cognition 11: 629–54.
. 1987. The instability of graded structure: Implications for the nature of concepts. In Concepts and Conceptual Development: Ecological and Intellectual Factors in Categorization, ed. U. Neisser, 101–40. Cambridge: Cambridge University Press.
. 1991. Deriving categories to achieve goals. In The Psychology of Learning and Motivation: Advances in Research and Theory. Vol. 27, ed. G. Bower, 1–64. San Diego, CA: Academic Press.
. 1999. Perceptual symbol systems. Behavioral and Brain Sciences 22: 577–660.
. 2003. Situated simulation in the human conceptual system. Language and Cognitive Processes 18: 513–62.
Cech, C. J., E. J. Shoben, and M. Love. 1990. Multiple congruity effects in judgments of magnitude. Journal of Experimental Psychology: Learning, Memory, and Cognition 16: 1142–52.
Chrysikou, E. G. 2006. When shoes become hammers: Goal-derived categorization training enhances problem-solving performance. Journal of Experimental Psychology: Learning, Memory, and Cognition 32: 935–42.
Glucksberg, S., and B. Keysar. 1990. Understanding metaphorical comparisons: Beyond similarity. Psychological Review 97: 3–18.
Lucariello, J., and K. Nelson. 1985. Slot-filler categories as memory organizers for young children. Developmental Psychology 21: 272–82.
Medin, D. L., N. Ross, S. Atran, D. Cox, J. Coley, J. Proffitt, and S. Blok. 2006. Folkbiology of freshwater fish. Cognition 99: 237–73.
Ross, B. H., and G. L. Murphy. 1999. Food for thought: Cross-classification and category organization in a complex real-world domain. Cognitive Psychology 38: 495–553.
Vallée-Tourangeau, F., S. H. Anthony, and N. G. Austin. 1998. Strategies for generating multiple instances of common and ad hoc categories. Memory 6: 555–92.

ADJACENCY PAIR
conversation analysis, an inductive approach to the microanalysis of conversational data pioneered by Harvey Sacks
(1992), attempts to describe the sequential organization of
pieces of talk by examining the mechanics of the turn-taking system. Adjacency pairs reflect one of the basic rules for turn-taking
(Sacks, Schegloff, and Jefferson 1974), in which a speaker allocates the conversational floor to another participant by uttering
the first part of a paired sequence, prompting the latter to provide the second part. Examples are question-answer, greeting-greeting as in (1), and complaint-excuse:
(1) A: Hi there
B: Oh hi

The constitutive turns in adjacency pairs have the following structural characteristics:
(i) They are produced by two different speakers.
(ii) They are, as the term suggests, adjacent. This is not a strict
requirement, as the two parts can be separated by a so-called
insertion sequence, as in (2):

(2) A: What's the time now? (Question 1)
B: Don't you have a watch? (Question 2)
A: No. (Answer 2)
B: I think it's around three. (Answer 1)

(iii) They are organized as a first and a second part, that is,
they are nonreversible. This is the case, incidentally, even in
ostensibly identical first and second parts, such as the greeting-greeting pair in (1), where reversing the order results in
an aberrant sequence.
(iv) They are ordered, so that a particular first part requires a
relevant second part (e.g., greetings do not follow questions).
The fact that the second part is conditionally relevant on the first
part does not mean that only one option is available; in fact, certain first parts typically allow for a range of possible second parts.
If two (or more) options are possible, one will be the more socially
acceptable, preferred response, the other(s) being dispreferred;
this phenomenon is known as preference organization, as in:
(3) A: Have a piece of cake (first part)
B1: Great thanks I will (preferred second)
B2: Ehm actually I've just eaten but thanks anyway (dispreferred second)

As illustrated in (3), dispreferred second parts tend to be structurally different from preferred seconds (B2 being indirect, including an explanatory account, and containing hesitation markers,
unlike B1).
Another, related phenomenon that merits mention here is
presequencing: Certain adjacency pairs can be introduced or
foreshadowed by a preceding exchange, as in:
(4) A1: Do you sell fresh semi-skimmed milk?
B1: We sure do
A2: I'll have two bottles then please
B2: OK

This whole exchange forms one unit, in the sense that the occurrence of the question-answer pair A1-B1 is only interpretable
given the subsequent request-compliance adjacency pair A2-B2.
Phenomena such as this are indicative of a level of sequential
organization in conversation beyond two-turn sequencing (see
Schegloff 2007).
Ronald Geluykens
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Sacks, Harvey. 1992. Lectures on Conversation. Oxford: Blackwell.
Sacks, Harvey, Emanuel A. Schegloff, and Gail Jefferson. 1974. A simplest systematics for the organization of turn-taking in conversation.
Language 50: 696735.
Schegloff, Emanuel A. 2007. Sequence Organization in Interaction: A Primer in Conversation Analysis.
Cambridge: Cambridge University Press.

AGE GROUPS
Age is one of the primary independent variables in
sociolinguistics, along with social class, sex, ethnicity, and
region. Age is the primary social correlate for determining that
language is changing and for estimating the rate of change and
its progress.


Static versus Dynamic Theories
Mundane observations make it abundantly clear that language change is not punctual but gradual, and not categorical but variable. Traditional views of language change imposed methodological restrictions to avoid viewing change while it was progressing. Linguists had little confidence in their ability to discern change precisely and accurately amid "the apparent lawlessness of social phenomena," as Edward Sapir incisively put it (1929, 213). However, in the 1960s, linguists began studying language in
its social context. By that time, economics, anthropology, sociology, and other social sciences were well established, and linguistic studies belatedly admitted sociolinguistics, its social science
adjunct (Chambers 2002b).
Viewing language changes as they progressed entailed the
admission of coexisting linguistic entities as data. Linguists
were required to study the social distribution of, for example,
both sneaked and snuck as variants of the past tense of sneak
(as in the example to be discussed). Dealing coherently with
variables necessitated determining the distribution of variants with certain social factors, including the age of the speakers. Uriel Weinreich, William Labov, and Marvin I. Herzog, in
the document that became the manifesto for viewing language
change in its social context, said: "A model of language which accommodates the facts of variable usage and its social and stylistic determinants not only leads to more adequate descriptions of linguistic competence, but also naturally yields a theory of language change that bypasses the fruitless paradoxes with which historical linguistics has been struggling for half a century" (1968, 99). Primary among those paradoxes, of course,
were concepts of change as punctual, categorical, and static. By
admitting social variables, it became possible to view change as
gradual, variable, and dynamic, consistent with commonsense
observations.

Real-Time Change
The study of the progress of change can be carried out in real time
by revisiting survey sites at intervals and observing changes in the
social distribution of variants from one time to the next. Because
changes will not necessarily be completed in the interval but will
be continuing, their progress must be calculated by some kind
of proportional measure, such as the percentage of the variants,
their relative frequency, or their probabilities. The proportional
differences from one visit to the next provide a quantitative measure of the progress of the change.
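The proportional measure just described can be sketched as follows. The counts below are invented for illustration, not data from any actual survey of the kind discussed in this entry.

```python
# Hypothetical real-time comparison: counts of speakers using each variant
# of the past tense of "sneak" at two survey visits. All counts are
# illustrative, not drawn from the Golden Horseshoe surveys.

def variant_shares(counts):
    """Relative frequency of each variant, given raw counts."""
    total = sum(counts.values())
    return {variant: n / total for variant, n in counts.items()}

first_visit = variant_shares({"snuck": 126, "sneaked": 74})
second_visit = variant_shares({"snuck": 162, "sneaked": 38})

# The difference in proportions from one visit to the next quantifies
# the progress of the change over the interval.
progress = second_visit["snuck"] - first_visit["snuck"]
print(round(progress, 2))  # -> 0.18
```

With these invented counts, the incoming variant's share rises from 63 to 81 percent, a proportional measure of progress even though the change has not run to completion in the interval.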
Studying change in real time has a number of methodological disadvantages. Most obvious is the time required before
attaining a result. Locating subjects on subsequent visits also
poses obvious problems because of mobility, cooperation, or
death. Furthermore, subsequent visits require the addition of
new subjects at the youngest age group each time. This is necessary because of the manner in which linguistic innovations
typically diffuse (see diffusion) throughout the population.
Instead of spreading outward from the source and affecting
almost everybody in their sphere of influence, as infectious
diseases do in epidemics and technological adoptions do
when, say, spin dryers replace washboards, linguistic innovations tend to be stratified. Under ordinary circumstances,
people acquire their accents and dialects in their formative years, between 8 and 18, and maintain them throughout their lives. People who grow up saying sneaked as the past tense
of sneak tend to use that form all their lives, even after they
come to know that younger people in their region say snuck
instead. Because of this stratification, the progress of a linguistic change is not measurable in the life span of an individual
or one age group but only in comparison between individuals
whose formative years are not the same, that is, between different age groups.

Change in Progress
Correlating linguistic variants with their use by age groups in
the community as the change is taking place is a way of measuring its progress. Inductively, change is evident when a linguistic
variant occurs with greater frequency in the speech of younger
people than in the speech of their elders. There are some exceptions (such as age grading), but the inference of change can be
made with reasonable confidence when the frequency of the
variant is stratified from one age group to the next (Labov 2001).
That is, linguistic changes are almost never bimodal, with one
variant occurring in the speech of younger people and a different one in the speech of older people. Instead, the variants are
typically dispersed along the age continuum in a progressive
gradation.
[Figure 1. Percentage of people in different age groups, from 14–19 to over 80, who say snuck, not sneaked, as past tense of sneak in the Golden Horseshoe, Canada (Chambers 2002a, 364–66).]
Figure 1 provides a case study. The variable is the past tense of
the verb sneak, with variants sneaked and snuck. The community
is the Golden Horseshoe, the densely populated region in southern Ontario, Canada. Figure 1 shows the percentage of people
who say snuck, not sneaked, correlated with their age, from octogenarians to teenagers. The correlation shows a progression from
18 percent in the oldest group to 98 percent in the youngest, with
gradation in the intermediate age groups (29 percent of 70s, 42
percent of 60s, and so on).
Other things being equal, it is possible to draw historical
inferences from apparent-time displays like Figure 1. The survey from which the data are drawn took place in 1992. Among
people born 80 or more years prior, that is, before 1913,
sneaked was the standard variant and snuck had very little

currency. It gained currency steadily thereafter, however, and accelerated most rapidly in the speech of the 50-year-olds, people born in 1933–42, increasing by some 25 percent and becoming the variant used by almost 70 percent of them. In subsequent decades, it was adopted by ever-greater numbers. In the 1980s, the formative years for people born in 1973–8, the teenagers in this survey, snuck virtually eliminated sneaked as
a variant.
Changes normally take place beneath the level of consciousness. Young people seldom have a sense of the history of the
variants. In this case, people under the age of 30 often consider the obsolescent form sneaked to be a mistake when they
hear someone say it. There is no communal sense that sneaked
was the historical standard and accepted form for centuries.
Occasionally, changes become self-conscious in the early stages
if teachers, writers, or parents openly criticize them. Such criticisms almost never succeed in reversing trends, though they
may slow their momentum. When the incoming variant gains
enough currency, usually around 20–30 percent, its use accelerates rapidly. It then slows again as it nears completion. The graphic pattern is known as the S-curve in innovation diffusion, with relatively slow (or flat) movement up to 20–30 percent, a
rapid rise through the middle stages, and flattening again in the
final stages.
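The S-curve just described is commonly modeled with a logistic function. The sketch below is an illustration added here; the growth rate and midpoint are arbitrary values, not parameters estimated from any survey.

```python
import math

# Logistic model of the S-curve of innovation diffusion: slow early
# movement, a rapid rise through the middle stages, and flattening
# near completion. Rate r and midpoint t0 (in arbitrary time units)
# are illustrative values only.

def adoption(t, r=1.5, t0=4.0):
    """Proportion of the community using the incoming variant at time t."""
    return 1.0 / (1.0 + math.exp(-r * (t - t0)))

for t in range(9):
    bar = "#" * round(adoption(t) * 40)
    print(f"t={t}: {adoption(t):.3f} {bar}")
```

The printed bars trace the characteristic shape: the curve is steepest around the midpoint, matching the rapid acceleration once the incoming variant gains sufficient currency, and flattens again in the final stages.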
Lack of communal consciousness of linguistic changes
in progress is a consequence of its social stratification (see
inequality, linguistic and communicative). Changes
generally progress incrementally, so that differences between
the most proximate age groups are small and barely noticeable.
In Figure 1, 30-year-olds differ from 40-year-olds by about 10
percent and from 20-year-olds almost not at all. The difference between 30-year-olds and 70-year-olds, by contrast, is
over 60 percent. Social relations are closest among age-mates,
and the gradation of differences so that proximate age groups
are most like one another blunts the perception of generation gaps within the community. By minimizing awareness of
changes as they progress, social gradation is a unifying force in
communities.


Apparent-Time Hypothesis
Figure 1 draws historical inferences of change based on the
behavior of age groups surveyed at the same time. The replacement of sneaked by snuck is not directly observed as it would be
in real-time studies, in which researchers go to communities at
intervals and track the changes from one time to the next. Instead,
the inference of change is based on the assumption that under
normal circumstances, people retain the accents and dialects
acquired in their formative years. That assumption is known as
the apparent-time hypothesis. Common experience tells us that
the hypothesis is not without exceptions. People are sometimes
aware of expressions that they once used and no longer do, and
sometimes they will have changed their usage after their formative years. If such linguistic capriciousness took place throughout
the entire community, it would invalidate historical inferences
drawn from apparent-time surveys.
However, community-wide changes beyond the formative
years are rare. Real-time evidence, when available, generally
corroborates apparent-time inferences. In the case of sneaked/
snuck, for instance, earlier surveys made in the same region in
1950 and 1972 show proportional distributions of the variants
that approximate the apparent-time results. However, inferring
linguistic changes from the speech of contemporaneous age
groups is not a direct observation of that change. It remains a
hypothesis, and its validity must be tested wherever possible.

Age-Graded Changes
Deeper understanding of linguistic change in progress should
ultimately lead to predictable classes of deviation from the
apparent-time hypothesis. One known deviation is age-graded
change. These are changes that are repeated in each generation,
usually as people reach maturity (Chambers 2009, 200–206).
Age-graded changes are usually so gradual as to be almost
imperceptible, so that tracking their progress is challenging.
As an example, in all English-speaking communities, there is a
rule of linguistic etiquette that requires compound subject noun
phrases (NPs) to list the first-person pronoun (the speaker) last.
Adults say "Robin and I went shopping," and never say "I and Robin went shopping." There is no linguistic reason for this rule
(that is, the sentences mean the same thing either way), but
putting oneself first is considered impolite (see politeness).
Children, however, do not know this and tend to say "Me and Robin went shopping." At some as-yet-undetermined age, children become aware of the rule and change their usage to conform to adult usage.
Age-graded changes like these violate the apparent-time
hypothesis because the variants used by young people do not
persist throughout their lifetimes. Instead, young people change
as they reach maturity and bring their usage into line with adults.
The occurrence of age-graded changes does not refute the apparent-time hypothesis, but such changes provide a well-defined exception
to it. Failure to recognize them as age-graded can lead to an erroneous inference that change is taking place.

Age and Language Change


Introducing age groups into linguistic analysis as an independent variable yielded immediate insights into the understanding of how languages change, who the agents of change are, and how
changes diffuse throughout communities. The sociolinguistic
perspective on language change as dynamic, progressive, and
variable represents an advance in language studies. The principal theoretical construct, the apparent-time hypothesis, provides a comprehensive view of historical sequences from a single
methodological vantage point.
J. K. Chambers
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Chambers, J. K. 2002a. Patterns of variation including change. In The Handbook of Language Variation and Change, ed. J. K. Chambers, Peter Trudgill, and Natalie Schilling-Estes, 349–72. Oxford: Blackwell Publishing.
. 2002b. Studying language variation: An informal epistemology. In The Handbook of Language Variation and Change, ed. J. K. Chambers, Peter Trudgill, and Natalie Schilling-Estes, 3–14. Oxford: Blackwell Publishing.
. 2009. Sociolinguistic Theory: Linguistic Variation and Its Social Significance. 3rd ed. Oxford: Blackwell.
Labov, William. 2001. Principles of Linguistic Change: Social Factors. Oxford: Blackwell Publishing.
Sapir, Edward. 1929. The status of linguistics as a science. Language 5: 207–14.
Weinreich, Uriel, William Labov, and Marvin I. Herzog. 1968. Empirical foundations for a theory of language change. In Directions for Historical Linguistics: A Symposium, ed. Winfred P. Lehmann and Yakov Malkiel, 95–188. Austin: University of Texas Press.

AGING AND LANGUAGE


It is well-documented that in healthy aging, some aspects of linguistic ability, for example, phonology, syntax, and vocabulary, remain generally preserved into very old age. However,
other domains of language, for example, naming, comprehension, and spoken discourse, undergo declines, albeit only at very
late stages in the adult life span. It is interesting to note that
although these linguistic changes occur generally in the older
adult population as a group, there are many older adult individuals who do not experience these deficits but, rather, continue to
perform as well as younger individuals. Thus, the finding of great inter-individual variability should be kept in mind.
This entry presents a detailed review of the three main linguistic domains experiencing decline with age. Additionally, a review
of language changes due to cognitive deterioration, for example,
dementia, is presented.

Naming
A common complaint among the healthy aging population is the
increased frequency of word-finding problems in their everyday
speech. A subset of these naming problems is often colloquially described as the tip-of-the-tongue (TOT) phenomenon. In
terms of cognitive models of lexical access, TOTs are described
as a type of word retrieval failure whereby individuals are able to
access the conceptual, semantic, syntactic, and even some
phonological/orthographic information (e.g., number of
syllables or initial sound/letters) of the target word, but not enough information to fully phonologically encode the word

Aging and Language


for articulation. Research clearly supports the view that there is a
general breakdown in phonological encoding (James and Burke
2000). However, since this stage of processing involves various
substages (e.g., segment retrieval, syllable encoding) that occur
at very fast rates, as per Willem J. M. Levelt, Ardi Roelofs, and Antje S. Meyer's (1999) model, behavioral methods are limited
in their ability to identify the locus of processing difficulty.
Evidence from priming studies has strongly demonstrated
that TOTs are due to a failure in transmission of available
semantic information to the phonological system, as explained
in the transmission deficit hypothesis (TDH; Burke et al. 1991).
This theory proposes that older people are especially prone to
word-retrieval problems due to weakened connections at the
phonological level. The phonological level, as compared to
the semantic level, is particularly vulnerable to breakdowns
in retrieval because this level generally has fewer connections
(e.g., phoneme-sound), whereas the semantic system has multiple joined connections (e.g., many words/concepts linked to
a given word). Factors such as word frequency or recency of
use influence the strength of phonological connections; that
is, the lower the word frequency and the less recently used the
word, the weaker the connections, leading to greater retrieval
difficulties.
Both younger and older individuals benefit from phonological primes as opposed to unrelated primes during moments of a
TOT state, supporting the TDH model. The finding that priming
leads to better retrieval is consistent with the claim that priming
strengthens the inherently weak phonological connections and
thus facilitates resolution of the TOT state. Furthermore, studies using semantic versus phonological cues have demonstrated
that in both younger and older people, provision of phonological
information was more effective, as a retrieval aid, than semantic
cues (Meyer and Bock 1992). This illustrates that in older individuals, semantic information is intact, although there is some
degradation in the use of phonological information.
In summary, much evidence supports the TDH and the claim
that the locus of the breakdown in TOT states is at the phonological stage. The exact phonological substage responsible for
this problem still remains unclear; however, there are indications from phonological cuing studies, self-reports of individuals
experiencing TOT states (Brown 1991), and an electrophysiological study of lexical retrieval in healthy younger and older adults
(Neumann 2007) that the first two substages (segmental and syllabic retrieval) are particular points at which breakdowns occur.

Language Comprehension
It is relatively well known that comprehension in older adults is compromised in comparison to younger adults. These problems in comprehension may arise from individual or combined effects of decline in their sensory/perceptual, linguistic, or cognitive domains. Current research is focused on disambiguating the effects of these processes on phonological, lexical-semantic, and syntactic aspects of language decline in the elderly.
Research shows that speech perception in older adults is especially affected by noise. This means that even with normal hearing, older adults experience difficulties understanding speech (e.g., sentences) under noisy conditions such as a cocktail party or cafeteria. Experiments focusing on sentence perception show that older adults benefit more from context in noise than do their younger cohorts.
Lexical comprehension is one aspect of language that is largely preserved or even enhanced with age. Studies on vocabulary comprehension in older adults using a picture selection task for auditorily presented words show that they are comparable to younger adults in this task (Schneider, Daneman, and Pichora-Fuller 2002). However, lexical-semantic integration at the level of sentence comprehension may be affected in older adults.
Sentence comprehension in the elderly is known to be poor in comparison to younger listeners. A number of reasons, both linguistic and cognitive, have been discussed, including decline in auditory perceptual skills and lexical-semantic and syntactic processing capacity, as well as working memory capacity, speed of processing, and ability to process with competing stimuli and inhibit noise (Wingfield and Stine-Morrow 2000). Studies on syntactic comprehension of language in older adults demonstrate that they are slower at judging sentences that are syntactically complex and semantically improbable (Obler et al. 1991). It has also been found to be relatively more difficult for this population than for younger adults to take advantage of constraining context in a sentence. This leads to difficulties in situations where older adults need to rely on context but are not able to do so skillfully. However, in other aspects of sentence comprehension, such as disambiguation, no age-related differences have been reported.
It is apparent that sentence comprehension is largely mediated by the ability to hold sentences in working memory while they are being processed. Working-memory decline has been reported in older adults (Grossman et al. 2002). Moreover, syntactic-processing difficulties have also been attributed to reduction in working-memory capacity (Kemptes and Kemper 1997). Executive functions such as inhibition and task switching have also been reported as negatively affected in the elderly.
Comprehension problems eventually affect older adults' discourse abilities. Often, the elderly find that it gets harder to follow discourse with advancing age. Studies have shown that older adults are significantly poorer than younger adults in fully understanding and inferring complex discourse, and this difficulty is enhanced further with increased perceptual or cognitive load (Schneider, Daneman, and Pichora-Fuller 2002), such as noise and length or complexity of the material. A combination of general cognitive decline and a deterioration of specific linguistic and sensory-perceptual processes contributes to the general slowing observed in the elderly while they engage in conversation and discourse.

Spoken Discourse
Results from many types of tasks requiring sentence production, such as cartoon description tasks (Marini et al. 2005), have
indicated that older adults tend to produce grammatically simpler, less informative and more fragmented sentences than do
younger adults. Middle-aged and young elderly adults tend to
be better and more efficient story constructors than younger
or older adults. However, older adults usually produce a larger
number of words in their narrative speech, but they can show
difficulties in morpho-syntactic and lexico-semantic processing
by making more paragrammatic errors and semantic paraphasias than younger adults.

Older adults also show a decreased ability to coherently link
adjacent utterances in a story. Adults older than 75 use a larger
number of missing or ambiguous referents and more units of
irrelevant content that affect the coherence of the narratives
(Marini et al. 2005). Older people show a huge variation in storytelling abilities, and they can also compensate for their storytelling difficulties due to their greater accumulated life experience,
which they can use to combine different themes and to emphasize relevant details.
In spoken discourse, some changes in conversational-interaction style can occur in young elderly people (60–74 years), but the most noticeable changes are likely to take place in older elderly people (77–88 years), who show excessive verbosity, failure to maintain topic, poor turn-taking, and unclear referencing (James et al. 1998).
Difficulties in grammatical processing in aging adults can be attributed to cognitive deterioration involving reduced working
memory (Kemper, Herman, and Liu 2004) and inhibitory deficits
(James et al. 1998), but they can also be a sign of impaired access
to the lemma level during lexical retrieval (Marini et al. 2005).
In summary, there are clear differences between younger and
older adults in their sentence production, storytelling, and conversational abilities. These differences are manifest as changes
in morpho-syntactic and lexico-semantic processing, excessive
verbosity, and reduced informativeness and coherence.

Dementia
Difficulties in language production, as well as comprehension,
become obvious in dementia. Word-finding difficulties are characteristic of mild cognitive impairment (MCI), Alzheimer's disease (AD), and vascular dementia (VaD).
Arto Nordlund and colleagues (2004) indicated that 57.1 percent of the individuals with MCI had significantly lower scores
in different language tasks than did typical aging adults. In AD,
language-processing difficulties are early signs of the disease.
In particular, AD appears to cause a breakdown of the semantic domain of language, which can be reflected in the impaired
comprehension and use of semantic relations between words
(Garrard et al. 2001) and in the reduced semantic noun and verb
fluency performance (Pekkala 2004) in this population.
Comparative studies between AD and VaD have indicated that
cognitive and linguistic performance cannot clearly differentiate
the two types of dementia from each other. Elina Vuorinen, Matti
Laine, and Juha Rinne (2000) indicated that both AD and VaD
involved similar types of semantic deficits early in the disease,
including difficulties in comprehension, naming, and production of semantic topics in narrative speech, while word repetition, oral reading, and fluency of speech output were preserved
in both types of dementia.
Yael Neumann, Seija Pekkala, and Hia Datta
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Brown, Alan S. 1991. A review of the tip-of-the-tongue experience. Psychological Bulletin 109.2: 204–23.
Burke, Deborah M., Donald G. MacKay, Joanna S. Worthley, and Elizabeth Wade. 1991. On the tip of the tongue: What causes word finding failures in young and older adults? Journal of Memory and Language 30: 542–79.
Garrard, Peter, Matthew A. Lambon Ralph, John R. Hodges, and Karalyn Patterson. 2001. Prototypicality, distinctiveness, and intercorrelations: Analysis of the semantic attributes of living and nonliving concepts. Cognitive Neuropsychology 18.2: 125–74.
Grossman, Murray, Ayanna Cooke, Christian DeVita, David Alsop, John Detre, Willis Chen, and James Gee. 2002. Age-related changes in working memory during sentence comprehension: An fMRI study. NeuroImage 15: 302–17.
James, Lori E., and Deborah M. Burke. 2000. Phonological priming effects on word retrieval and tip-of-the-tongue experiences in young and older adults. Journal of Experimental Psychology: Learning, Memory, and Cognition 26.6: 1378–91.
James, Lori E., Deborah M. Burke, Ayda Austin, and Erika Hulme. 1998. Production and perception of verbosity in younger and older adults. Psychology and Aging 13: 355–67.
Kemper, Susan, Ruth E. Herman, and Chiung-Ju Liu. 2004. Sentence production by young and older adults in controlled contexts. Journal of Gerontology Series B: Psychological Sciences and Social Sciences 59: 220–4.
Kemptes, Karen A., and Susan Kemper. 1997. Younger and older adults' on-line processing of syntactically ambiguous sentences. Psychology and Aging 12: 362–71.
Levelt, Willem J. M., Ardi Roelofs, and Antje S. Meyer. 1999. A theory of lexical access in speech production. Behavioral and Brain Sciences 22: 1–75.
Marini, Andrea, Anke Boewe, Carlo Caltagirone, and Sergio Carlomagno. 2005. Age-related differences in the production of textual descriptions. Journal of Psycholinguistic Research 34: 439–63.
Meyer, Antje S., and Kathryn Bock. 1992. The tip-of-the-tongue phenomenon: Blocking or partial activation? Memory and Cognition 20: 715–26.
Morris, John, Martha Storandt, J. Phillip Miller, Daniel W. McKeel, Joseph L. Price, Eugene H. Rubin, and Leonard Berg. 2001. Mild cognitive impairment represents early-stage Alzheimer's disease. Archives of Neurology 58.3: 397–405.
Neumann, Yael. 2007. An Electrophysiological Investigation of the Effects of Age on the Time Course of Segmental and Syllabic Encoding during Implicit Picture Naming in Healthy Younger and Older Adults. Publications of the Department of Speech, Language, and Hearing Sciences. New York: City University of New York.
Nordlund, Arto, S. Rolstad, P. Hellström, M. Sjögren, S. Hansen, and A. Wallin. 2004. The Goteborg MCI study: Mild cognitive impairment is a heterogeneous condition. Journal of Neurology, Neurosurgery, and Psychiatry 76: 1485–90.
Obler, Loraine K., Deborah Fein, Marjorie Nicholas, and Martin L. Albert. 1991. Auditory comprehension and aging: Decline in syntactic processing. Applied Psycholinguistics 12: 433–52.
Pekkala, Seija. 2004. Semantic Fluency in Mild and Moderate Alzheimer's Disease. Publications of the Department of Phonetics 47, University of Helsinki. Available online at: http://ethesis.helsinki.fi/.
Reuter-Lorenz, Patricia A., John Jonides, Edward E. Smith, Alan Hartley, Andrea Miller, Christina Marshuetz, and Robert A. Koeppe. 2000. Age differences in the frontal lateralization of verbal and spatial working memory revealed by PET. Journal of Cognitive Neuroscience 12: 174–87.
Schneider, Bruce A., Meredith Daneman, and M. Kathleen Pichora-Fuller. 2002. Listening in aging adults: From discourse comprehension to psychoacoustics. Canadian Journal of Experimental Psychology 56: 139–52.
Vuorinen, Elina, Matti Laine, and Juha Rinne. 2000. Common pattern of language impairment in vascular dementia and in Alzheimer's disease. Alzheimer Disease and Associated Disorders 14: 81–6.
Wingfield, Arthur, and Elizabeth A. L. Stine-Morrow. 2000. Language and aging. In The Handbook of Cognition and Aging, ed. Craik and Salthouse, 359–416. Mahwah, NJ: Psychology Press.

AGREEMENT

Agreement is a form of featural dependency between different parts of a sentence: The morphological shape of a word is a function of particular morphological features of a different, often distant, word. Since the Middle Ages, agreement has been taken to be in complementary distribution with government, and, hence, it became important to determine both the context of each type of relation and the reasons why this difference exists (Covington 1984). This rich tradition has survived in generative grammar, all the way up to the minimalist program (see minimalism), where it is embodied under the notion agree.
Depending on the features and occurrences in an expression,
agreement can be characterized as external or internal (Barlow
and Ferguson 1988). External agreement typically involves person and number features, taking place between verbs and corresponding dependents. We can witness it in "you are friendly"
versus "he is friendly." Internal agreement (concord) normally
involves gender and number features, and typically takes place
internal to nominal expressions, between adjectives or relative
clauses and the head noun, freely iterating. Concord is easily
observed in modified nominal expressions in Spanish: atractivas damas "attractive ladies" versus atractivos caballeros "attractive gentlemen" (agreeing elements are boldfaced). Genitive agreement internal to nominal expressions normally falls within the external (not the internal) rubric.
The principles and parameters system concentrated on
external agreement, through the relation (head, specifier) (Aoun
and Sportiche 1983). However, since agreement is possible also
in situations where no such relation seems relevant, the minimalist program (Chomsky 2000) proposes a relation between
a probe and a goal. The probe contains a value-less attribute in
need of valuation from a distant feature of the same type, which
the probing mechanism achieves. The goal cannot be contained
within a derivational cycle (a phase) that is too distant from the
probe. When the probe finds an identical category within its complement domain, it attempts to get its valuation from it, thereby
sanctioning the relevant agreement. To illustrate, observe the
Spanish example in (1); note also the internal agreement manifested within the noun phrase:

(1) Parecen [haber quedado [los locos soldados] en la guarnición]
    seem-3rd/pl. have remained the-m./pl. crazy-m./pl. soldiers-m./pl. in the garrison
(2) Probe1 [ ... [Goal] ] (plus iteration of -os within the nominal)
    Probe: ( ) person, ( ) number (unvalued attributes)
    Goal: pl. number, m. gender (valued features)
Agreement adds a strange redundancy to the language faculty. In languages where the phenomenon is overt, the extra layer of manifest dependency correlates with differing surface orders. But it is unclear whether that justifies the linguistic emergence of the agreement phenomenon, particularly since the probe/goal mechanism can be present without overt manifestations. This results in much observed variation, from the almost total lack of overt agreement of Chinese to the poly-personal manifestation of Basque.
Juan Uriagereka
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Aoun, J., and D. Sportiche. 1983. On the formal theory of government. Linguistic Review 2: 211–36.
Barlow, M., and C. Ferguson, eds. 1988. Agreement in Natural Language. Stanford, CA: CSLI Publications.
Boeckx, C., ed. 2006. Agreement Systems. Amsterdam: Benjamins.
Chomsky, N. 2000. Minimalist inquiries: The framework. In Step by Step, ed. R. Martin, D. Michaels, and J. Uriagereka, 89–155. Cambridge, MA: MIT Press.
Covington, M. 1984. Syntactic Theory in the High Middle Ages. Cambridge: Cambridge University Press.

AGREEMENT MAXIMIZATION
Maximizing, or optimizing, agreement between speaker and
interpreter is part of the principle of charity (see charity, principle of) that, according to philosophers of language in the
tradition of W. V. O. Quine and Donald Davidson, governs the
interpretation of the speech and thought of others and guides
the radical interpretation of a radically foreign language.
According to Davidson, correct interpretation maximizes truth
and coherence across the whole of the beliefs of a speaker. For an
interpreter, maximizing truth across the beliefs he ascribes to a
speaker necessarily amounts to maximizing agreement between
himself and the speaker: He can only go by his own view of what
is true.
Take one of Davidson's own examples: Someone says, "There is a hippopotamus in the refrigerator," and he continues: "It's roundish, has a wrinkled skin, does not mind being touched. It has a pleasant taste, at least the juice, and it costs a dime. I squeeze two or three for breakfast" (1968, 100). The simplest way of maximizing agreement between this speaker and us will probably be interpreting his expression "hippopotamus" as meaning the same as our expression "orange." Davidson himself, however,
soon came to consider maximizing agreement a "confused ideal" (1984, xvii) to be substituted with "optimizing agreement" (1975, 169). The idea here is that "some disagreements are more destructive of understanding than others" (1975, 169). Very generally, this is a matter of epistemic weight; the more basic a belief is, and the better reasons we have for holding it, the more destructive disagreement on it would be: "The methodology of interpretation is, in this respect, nothing but epistemology seen in the mirror of meaning" (1975, 169). According to Davidson, it is impossible to codify our epistemology in simple and precise form, but general principles can be given: "[A]greement on laws and regularities usually matters more than agreement on cases; agreement on what is openly and publicly observable is more to be favored than agreement on what is hidden, inferred, or ill observed; evidential relations should be preserved the more they verge on being constitutive of meaning" ([1980] 2004, 157).
Agreement optimization does not exclude the possibility of error; speakers are to be interpreted as right only when plausibly possible (Davidson 1973, 137). In certain situations this prevents the interpreter from ascribing beliefs of his own to the speaker, for instance, perceptual beliefs about objects the speaker is in no position to perceive. Moreover, if the speaker has other beliefs that provide him or her with very good reasons for believing something false, optimizing agreement across all of his or her beliefs might well require ascription of outright mistakes.
Optimizing agreement provides an interpreter with a method
for arriving at correct interpretations because of the way belief
content is determined, Davidson holds. The arguments for this
claim have changed over the years; initially, the idea was that a belief is identified by its location in a pattern of beliefs; "it is this pattern that determines the subject matter of the belief, what the belief is about" (1975, 168). Later, however, the role played by
causal connections between objects and events in the world and
the beliefs of speaker and interpreter becomes more and more
prominent: In the most basic, perceptual cases, the interpreter interprets sentences held true (which is not to be distinguished from attributing beliefs) "according to the events and objects in the outside world that cause the sentence to be held true" ([1983] 2001, 150). In the later Davidson, the account of content determination underlying the method of charitable interpretation
takes the form of a distinctive, social and perceptual meaning
externalism: In the most basic cases, the objects of thought
are determined in a sort of triangulation as the shared causes
of the thoughts of two interacting persons, for instance, a child
and its teacher (cf. Davidson [1991] 2001, 2001). According to
Davidson, such triangulation is a necessary condition for thought
with empirical content; moreover, he derives a quite far-reaching
epistemic antiskepticism from it and claims that "belief is in its nature veridical" ([1983] 2001, 146; cf. also [1991] 2001, 211ff).
Probably the most influential argument against the idea that
any kind of maximizing agreement results in correct interpretation derives from Saul Kripke's attack on description theories of proper names. According to such theories, the referent of a proper name, for instance, "Gödel," is determined by a description, or cluster of descriptions, held true by the speaker(s), for instance, the description "the discoverer of the incompleteness of arithmetic." Kripke argued, among other things, that such theories fail because all of the relevant descriptions, all of the relevant beliefs that a speaker, or even a group of speakers, holds about Gödel could turn out to be false (cf. 1972, 83ff). Kripke gave an analogous argument for natural-kind terms such as "gold" or "tiger," and many philosophers today believe that these arguments can be generalized even further. While it is quite clear, however, that most of the descriptions a speaker associates, for instance, with a name could turn out to be false when taken one by one, it is far less obvious that all (or most, or a weighted majority) of them could do so at the same time. According to Davidson, for instance, optimizing agreement amounts to reference determination by epistemically weighted beliefs. A significant number of these are very elementary beliefs like the belief that Gödel was a man, that he was human, that he worked on logic, that he lived on Earth, and so on. If a speaker did not believe any of these things about Gödel, it has been argued with Davidson, it is far less clear that this speaker was in fact talking about Gödel: "Too much mistake simply blurs the focus" (Davidson 1975, 168).
Kathrin Glüer
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Davidson, Donald. 1968. On saying that. In Davidson 1984, 93–108.
———. 1973. Radical interpretation. In Davidson 1984, 125–39.
———. 1975. Thought and talk. In Davidson 1984, 155–70.
———. [1980] 2004. A unified theory of thought, meaning, and action. Problems of Rationality, 151–66. Oxford: Clarendon Press.
———. [1983] 2001. A coherence theory of truth and knowledge. Subjective, Intersubjective, Objective, 137–53. Oxford: Clarendon Press.
———. 1984. Inquiries into Truth and Interpretation. Oxford: Clarendon Press.
———. [1991] 2001. Three varieties of knowledge. Subjective, Intersubjective, Objective, 205–20. Oxford: Clarendon Press.
———. 2001. Externalisms. Interpreting Davidson, ed. P. Kotatko, P. Pagin, and G. Segal, 1–16. Stanford, CA: CSLI.
Glüer, Kathrin. 2006. Triangulation. The Oxford Handbook of Philosophy of Language, ed. E. Lepore and B. Smith, 1006–19. Oxford: Oxford University Press.
Grandy, Richard. 1973. Reference, meaning, and belief. Journal of Philosophy 70: 439–52.
Kripke, Saul. 1972. Naming and Necessity. Cambridge, MA: Harvard University Press.

ALLITERATION
Linguistically, alliteration, also known as initial or head rhyme,
is defined as the selection of identical syllable onsets within a
specific phonological, morphosyntactic, or metrical domain.
It is usually coupled with stress, as in the three Rs in education: reading, writing, and arithmetic. Etymologically, the term
alliteration (from L. ad- "to" + littera "letter") includes the repetition of the same letters at word beginnings; its dual association with sounds and letters reflects a common cognitive crisscrossing between spoken and written language in highly literate (see literacy) societies, well illustrated by the famous phrase "apt alliteration's artful aid," where the alliteration is primarily orthographic.
Alliteration based on the sameness of letters is found in visual
poetry, advertising, and any form of playful written language.
Phonologically based alliteration is a frequent mnemonic and
cohesive device in all forms of imaginative language: Examples
from English include idioms (beat about the bush), reduplicative
word-formation (riffraff), binominals (slowly but surely), catch
phrases, refrains, political slogans, proverbs, and clichés. In
verse, alliteration serves both as ornamentation and as a structural device highlighting the metrical organization into feet, cola,
verses, and lines. Along with rhyme, alliteration is a common feature of folk and art verse in languages as diverse as Irish, Shona,
Mongolian, Finnish, and Somali.
The most frequent type of alliteration requires identity of the
onsets of stressed syllables, which makes it a preferred poetic
device in languages with word-initial stress, such as the older
Germanic languages. Within the Germanic tradition, metrically
relevant alliteration occurs on the stressed syllables of the first
foot of each verse (or half-line), where it is obligatory. For Old
English, the language of the richest and most varied surviving alliterative poetry in Germanic, the second foot of the first
half-line may also alliterate. Alliteration is disallowed on the last
stressed syllable in the line.
In Old English verse, alliteration appears with remarkable regularity: Only 0.001% of the verses lack alliteration and
less than 0.05% contain unmetrical alliteration (Hutcheson
1995, 169). Alliteration is, therefore, a reliable criterion used by

Ambiguity
modern editors to determine the boundaries of verses and lines,
though no such divisions exist in the manuscripts. The reinvented alliterative tradition of fourteenth-century England also
uses alliteration structurally, while its ornamental function is
enhanced by excessive verse-internal and run-on alliteration.
As a cohesive device in verse, alliteration refers to the underlying distinctions in the language and relies on identity of phonological categories. The interpretation of onset identity for the
purpose of poetic alliteration varies from tradition to tradition
and can include whole clusters, optionally realized segments,
and even the whole syllable up to the coda. In Germanic, consonants alliterated only with identical consonants, the clusters st-,
sp-, sk- could not be split, and all orthographic stressed vowels
alliterated freely among themselves, most likely because their
identity was signaled by the presence of a prevocalic glottal stop
in stressed syllable onsets.
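The onset-identity conventions just described lend themselves to a compact sketch. The following toy Python function (illustrative only; the function names and drastic simplifications are mine, not the entry's) encodes two of the Germanic rules mentioned above: the clusters st-, sp-, sk- may not be split, and stressed vowels alliterate freely with one another.

```python
def onset(word):
    """Return the alliterating onset of a word under simplified Germanic rules."""
    word = word.lower()
    if word[:2] in ("st", "sp", "sk"):   # these clusters may not be split
        return word[:2]
    if word[0] in "aeiou":               # all vowels alliterate freely with one another
        return "V"
    return word[0]                       # otherwise, the initial consonant

def alliterate(w1, w2):
    """Do two words alliterate under these conventions?"""
    return onset(w1) == onset(w2)

print(alliterate("stone", "storm"))   # True: st- matches st-
print(alliterate("stone", "sand"))    # False: st- does not match plain s-
print(alliterate("apple", "ever"))    # True: vowels alliterate among themselves
```

A real metrical analysis would, of course, work on phonological transcriptions rather than spellings and would vary the onset-identity rules by tradition, as the entry notes.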
Donka Minkova
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Fabb, Nigel. 1997. Linguistics and Literature: Language in the Verbal Arts
in the World. Oxford: Blackwell.
Hutcheson, Bellenden Rand. 1995. Old English Poetic Metre.
Cambridge: D. S. Brewer.
Minkova, Donka. 2003. Alliteration and Sound Change in Early English.
Cambridge: Cambridge University Press.

AMBIGUITY
Ambiguity refers to the potential of a linguistic expression to have
more than one meaning. Although many expressions (words,
phrases, and even sentences) are ambiguous in isolation, few
remain so when used in a particular context. In fact, people typically resolve all ambiguities without even detecting the potential
for other interpretations. Ambiguity does not imply vagueness;
rather, ambiguity gives rise to competing interpretations, each of
which can be perfectly concrete. Although ambiguity is pervasive
and unavoidable in natural languages, artificial languages developed for mathematics, logic, and computer programming strive
to eliminate it from their expressions.
Ambiguity can be lexical, structural, referential, scopal,
or phonetic. The examples of these phenomena that follow
include well-known classics in English.
Lexical ambiguity refers to the fact that some words, as written or spoken, can be used in different parts of speech (see
word classes) and/or with different meanings. For example,
duck can be used as a noun or a verb and, as a noun, can refer to
a live animal or its meat. Structural ambiguity arises when different syntactic parses give rise to different interpretations. For
example, in addition to being lexically ambiguous, They saw her
duck is also structurally ambiguous:
1. They saw [NP her duck] (the bird or its meat belongs to her)
2. They saw [NP her] [VP duck] (the ducking is an action she carries
out)

A common source of structural ambiguity involves the attachment site for prepositional phrases, which can be at the level
of the nearest noun phrase (NP) or the clause. In the sentence

Danny saw the man with the telescope, either Danny used the
telescope to help him see the man (3) or the man whom Danny
saw had a telescope (4).
3. Danny [VP saw [NP the man] [PP with the telescope]]
4. Danny [VP saw [NP the man [PP with the telescope]]]

Referential ambiguity occurs when it is not clear which entity in a context is being referred to by the given linguistic expression.
Although deictics (see deixis), such as pronouns, are typical
sources of referential ambiguity, full noun phrases and proper
nouns can also give rise to it.
5. (at a boys' soccer game) "He kicked him!" "Who kicked who?"
6. (at a boat race) "That boat seems to be pulling ahead." "Which one?"
7. (in a university corridor) "I'm off to meet with Dr. Sullivan." "Chemistry or math?" (There are two Dr. Sullivans in different departments.)

Scopal ambiguity occurs when a sentence contains more than one quantified NP and the interpretation depends on the relative scopes of the quantifiers. For example, Some children saw
both plays can mean that a) there exist some children such that
each of them saw both plays or b) both plays were such that each,
individually, was seen by some children but not necessarily the
same children. Phonetic ambiguity arises when a given sound
pattern can convey different words, for example, two ~ too ~ to;
new deal ~ nude eel.
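The two scopal readings of Some children saw both plays can be spelled out in first-order notation (a standard textbook-style rendering supplied here for concreteness; the formulas are not from the entry itself):

```latex
% Reading (a): wide scope for "some children" -- one group of children saw both plays
\exists x\,[\mathrm{child}(x) \wedge \mathrm{saw}(x, p_1) \wedge \mathrm{saw}(x, p_2)]
% Reading (b): wide scope for "both plays" -- each play was seen by possibly different children
\forall y\,[\mathrm{play}(y) \rightarrow \exists x\,(\mathrm{child}(x) \wedge \mathrm{saw}(x, y))]
```

Reading (a) entails reading (b) but not conversely, which is why the two are genuinely distinct interpretations rather than mere paraphrases.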
Although people typically do not notice the ambiguities
that they effortlessly resolve through context, they are certainly
aware of the potential for ambiguity in language. In fact, such
awareness is a precondition for getting the joke in Abbott and Costello's "Who's on First?" skit or in headlines like "Iraqi Head Seeks Arms."
Whereas ambiguity does not frequently hinder effective
communication among people, it is among the biggest hurdles
for the machine processing of language. This is not surprising if
one considers how much reasoning is required to resolve ambiguity and how much knowledge of language, the context, and
the world must underpin such reasoning. As an example of the
large scale of the task, consider the short sentence "The coach lost a set," which you probably interpreted to mean the person
who is the trainer of some athletic team experienced the loss of
a part of a match in an athletic competition (whether the coach
was playing or the team was playing is yet another ambiguity).
Other interpretations are also valid, given specific contexts. For
example, the person who is the trainer of some team might have
lost a set of objects (keys, golf clubs) or a railroad car might have
lost a set of objects (door handles, ball bearings). If this sentence
were used as input to an English-Russian machine translation
system that relied on a standard English-Russian dictionary,
that system would have to select from among 15 senses of coach,
11 senses of lose, and 91 senses of set, a grand total of 15,015
combinations, if no further knowledge were brought to bear. Of
course, all machine translation systems incorporate some heuristic knowledge, and lexicons developed for natural language
processing typically do not permit the amount of sense splitting
found in dictionaries for people. On the other hand, it is common for sentences to contain upward of 20 words, in which case there
is still the threat of combinatorial explosion.
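The arithmetic behind this combinatorial explosion is easy to reproduce; the sketch below uses the sense counts cited above, plus an assumed five senses per word for the 20-word case.

```python
from math import prod

# Sense counts for the content words of "The coach lost a set,"
# as cited in the entry for a standard bilingual dictionary.
senses = {"coach": 15, "lose": 11, "set": 91}
print(prod(senses.values()))  # 15015 candidate combinations

# With ~20 words at even an assumed modest 5 senses each,
# the search space explodes:
print(5 ** 20)  # 95367431640625
```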
Marjorie J. McShane
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Cruse, D. A. 1986. Lexical Semantics. Cambridge: Cambridge University
Press. Includes features of and tests for ambiguity.
Small, Steven, Garrison Cottrell, and Michael Tanenhaus, eds. 1988.
Lexical Ambiguity Resolution: Perspective from Psycholinguistics,
Neuropsychology and Artificial Intelligence. San Mateo, CA: Morgan
Kaufmann.
Zwicky, Arnold M., and Jerrold M. Sadock. 1975. Ambiguity Tests and
How to Fail Them. In Syntax and Semantics, ed. J. Kimball, IV: 1–36.
New York: Academic Press. Discusses tests to distinguish ambiguity
from lack of specification.

AMYGDALA
Studies in animals have established a clear role for the amygdala
in social and emotional behavior, especially as related to fear
and aggression (Le Doux 1996). Human studies, including lesion
studies, electrophysiology, and functional neuroimaging,
have further elucidated the role of the amygdala in the processing of a variety of emotional sensory stimuli, as well as its relationship to behavioral and cognitive responses (Adolphs 2001).
These responses not only guide social behavior but also aid in the
acquisition of social knowledge. The focus of this entry is on the
amygdala and its role in the processing of language, in particular language relevant to social and emotional behavior (see also
emotion and language and emotion words).
The amygdala is an almond-shaped group of neurons located
in the rostral medial temporal region on both left and right sides
of the brain (see left hemisphere language processing and right hemisphere language processing). It
has reciprocal connections to regions, such as the hypothalamus, that are important for coordinating autonomic responses
to complex environmental cues for survival, as well as premotor and prefrontal areas that are necessary for rapid motor and
behavioral responses to perceived threat. Visual, somatosensory, and auditory information is transmitted to the amygdala
by a series of indirect, modality-specific thalamocorticoamygdalar pathways, as well as by direct thalamoamygdalar pathways.
Within the amygdaloid complex, information processing takes
place along numerous highly organized parallel pathways with
extensive intraamygdaloid connections. The convergence of
inputs in the lateral nucleus enables stimulus representations
to be summated. Specific output pathways from the central
nucleus and amygdalohippocampal area mediate complementary aspects of learning and behavioral expressions connected
with various emotional states. The amydgala is thus well positioned to play a role in rapid cross-modal emotional recognition.
It is important for the processing of emotional memory and for
fear conditioning.
In addition, anatomical studies of the primate amygdala demonstrate connections to virtually all levels of visual processing in
the occipital and temporal cortex (Amaral 2003). Therefore,
the amygdala is also critically placed to modulate visual input,
based on affective significance, at a variety of levels along the cortical visual processing stream. Hence, through its extensive connectivity with sensory processing regions, the amygdala is
ideally located to influence perception based on emotion.

Language
In order to survive in a changing environment, it is especially
important for the organism to remember events and stimuli
that are linked with emotional consequences. Furthermore, it is
important to be vigilant of emotional stimuli in the environment
in order to allow for rapid evaluation of and response to these
emotional stimuli. In humans, emotional cues are transmitted
linguistically, as well as through body posture, voice, and facial
expression (see gesture).
In the first imaging study to examine language and the
amygdala, a modified Stroop task was utilized, along with a
high-sensitivity neuroimaging technique, to target the neural substrate engaged specifically when processing linguistic
threat (Isenberg et al. 1999). Healthy volunteer subjects were
instructed to name the color of words of either threat or neutral
valence, presented in different color fonts, while neural activity
was measured by using positron emission tomography. Bilateral
amygdalar activation was significantly greater during color naming of threat words than during color naming of neutral words
(see Color Plate 1). Associated activations were also noted in
sensory-evaluative and motor-planning areas of the brain. Thus,
the results demonstrate the amygdala's role in the processing of danger elicited by language. In addition, the results reinforce the amygdala's role in the modulation of the perception of, and
response to, emotionally salient stimuli. This initial study further
suggests conservation of phylogenetically older mechanisms of
emotional evaluation in the context of more recently evolved linguistic function. In a more recent study that examines the neural
substrates involved when subjects are exposed to an event that is
verbally linked to an aversive outcome, activation is observed in
the left amygdala (Phelps et al. 2001, 437–41).
This activation correlated with the expression of the fear
response as measured by skin conductance response, a peripheral measure of arousal. The laterality of response may relate to
the explicit nature of the fear, as well as to the fact that the stimulus is learned through verbal communication. Fears that are
simply imagined and anticipated nonetheless have a profound
impact on everyday behavior. The previous study suggests that
the left amygdala is involved in the expression of fear when anticipated and conveyed in language. Another study that sought to
examine the role of the amygdala in the processing of positive as
well as negative valence verbal stimuli also demonstrated activity in the left amygdala (Adolphs, Baron-Cohen, and Tranel 2002,
1264–74). During magnetic resonance (MR) scanning, subjects
viewed high-arousal positive and negative words and neutral
words. In this study, activity was found in the left amygdala while
subjects viewed both negative and positive words in comparison
to neutral words. Taken together, these studies suggest that the
amygdala plays a role in both positive and negative emotional
responses. Furthermore, they suggest that the left amygdala may
be preferentially involved in the processing of emotion as conveyed through language.
Lesion studies have generally suggested that the amygdala
is not essential for recognizing or judging emotional and social information from explicit, lexical stimuli, such as stories (see narrative, neurobiology of) (Amaral 2003, 337–47).
However, while recognition of emotional and social information
may be relatively preserved in amygdalar damage, the awareness that unpleasant emotions are arousing appears to be lost. In
a lesion study, normal subjects judged emotions such as fear and
anger to be both unpleasant and highly arousing; however, patient
S. M. 046, who sustained early amygdalar damage, judged these
same stimuli to be unpleasant but of low arousal. For example,
when told a story about someone driving down a steep mountain
who had lost the car brakes, she identified the situation as unpleasant but also gave a highly abnormal judgment that it would make
one feel sleepy and relaxed. It is interesting to note that S. M. 046
was able to judge arousal normally from positive emotions.
The human amygdala is important both for the acquisition
and for the online processing of emotional stimuli. Its role is disproportionate for a particular category of emotional information,
that pertaining to the evaluation of potential threat in the environment. The amygdala's role in enhanced, modality-specific
processing required for the rapid evaluation and response to
threat is highlighted. Furthermore, this review suggests conservation of phylogenetically older limbic mechanisms of emotional
evaluation in the context of more recently evolved language.
Nancy B. Isenberg
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Adolphs, R. 2001. The neurobiology of social cognition. Current Opinion in Neurobiology 11.2: 231–9.
Adolphs, R., S. Baron-Cohen, and D. Tranel. 2002. Impaired recognition of social emotions following amygdala damage. Journal of Cognitive Neuroscience 14.8: 1264–74.
Amaral, D. G. 2003. The amygdala, social behavior, and danger detection. Annals of the New York Academy of Sciences 1000 (Dec.): 337–47.
Freese, J. L., and D. G. Amaral. 2006. Synaptic organization of projections from the amygdala to visual cortical areas TE and V1 in the macaque monkey. Journal of Comparative Neurology 496.5: 655–67.
Isenberg, N., D. Silbersweig, A. Engelien, S. Emmerich, K. Malavade, B. Beattie, A. C. Leon, and E. Stern. 1999. Linguistic threat activates the human amygdala. Proceedings of the National Academy of Sciences USA 96.18: 10456–9.
Le Doux, J. 1996. The Emotional Brain: The Mysterious Underpinnings of Emotional Life. New York: Touchstone.
Phelps, E. A. 2006. Emotion and cognition: Insights from studies of the human amygdala. Annual Review of Psychology 57: 27–53.
Phelps, E. A., K. J. O'Connor, J. C. Gatenby, J. C. Gore, C. Grillon, and M. Davis. 2001. Activation of the left amygdala to a cognitive representation of fear. Nature Neuroscience 4.4: 437–41.

ANALOGY
Two situations are analogous if they share a common pattern of
relationships among their constituent elements, even though
the elements are dissimilar. Often one analog, the source, is more
familiar or better understood than the second analog, the target
(see source and target). Typically, a target situation serves
as a retrieval cue for a potentially useful source analog. A mapping, or set of systematic correspondences aligning elements
of the source and target, is then established. On the basis of the
mapping, it is possible to derive new inferences about the target.

In the aftermath of analogical reasoning about a pair of cases, some form of relational generalization may take place, yielding a schema for a class of situations (Gick and Holyoak 1983).
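The mapping-and-inference cycle just described can be sketched in a few lines (a toy illustration using the familiar atom/solar-system analogy; this is not a full analogical-mapping engine, and all names below are hypothetical).

```python
# A minimal sketch of analogical mapping: align source and target
# elements, project the source relations through the mapping, and
# treat relations absent from the target as candidate inferences.

source = {("revolves_around", "planet", "sun"),
          ("more_massive", "sun", "planet")}
target = {("revolves_around", "electron", "nucleus")}

# Systematic correspondences between source and target elements.
mapping = {"planet": "electron", "sun": "nucleus"}

projected = {(r, mapping[a], mapping[b]) for (r, a, b) in source}
inferences = projected - target
print(inferences)  # {('more_massive', 'nucleus', 'electron')}
```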

Psychological Research
Within psychology, work in the intelligence tradition focused on
four-term or proportional analogies, such as ARM: HAND :: LEG: ?
Charles Spearman (1946) reviewed studies that found high correlations between performance in solving analogy problems and
the g factor (general intelligence). The ability to solve analogy-like problems depends on a neural substrate that includes subareas of the prefrontal cortex (Bunge, Wendelken, and Wagner
2005; see frontal lobe). Although there have been reports of
great apes being successfully trained to solve analogy problems,
these results are controversial (Oden, Thompson, and Premack
2001). Complex relational thinking appears to be a capacity that
emerged in Homo sapiens along with the evolutionary increase
in size of the frontal cortex. The ability to think relationally
increases with age (Gentner and Rattermann 1991). Greater
sensitivity to relations appears to arise with age due to a combination of incremental accretion of knowledge about relational
concepts (Goswami 1992), increases in working memory
capacity (Halford 1993), and increased ability to inhibit misleading featural information (Richland, Morrison, and Holyoak
2006). Analogy plays a prominent role in teaching mathematics
(Richland, Zur, and Holyoak 2007).
Dedre Gentner (1983) developed the structure-mapping theory of analogy, emphasizing that analogical mapping is guided
by higher-order relations, that is, relations between relations. Keith
Holyoak and P. Thagard (1989) proposed a multiconstraint theory, hypothesizing that people find mappings that maximize
similarity of corresponding elements and relations, structural
parallelism, and pragmatic importance for goal achievement.
Several computational models of human analogical thinking
have been developed. Two influential models are SME (Structure
Mapping Engine; Falkenhainer, Forbus, and Gentner 1989),
based on a classical symbolic architecture, and LISA (Learning
and Inference with Schemas and Analogies; Hummel and
Holyoak 2005), based on a neural-network architecture. LISA has
been used to simulate some effects of damage to the frontal and
temporal cortex on analogical reasoning (Morrison et al. 2004).

Analogy and Language


Analogy is related to metaphor and similar forms of symbolic
expression in literature and everyday language. In metaphors,
the source and target domains are always semantically distant
(Gentner, Falkenhainer, and Skorstad 1988). Rather than simply
comparing the source and target, the target is identified with the
source (Holyoak 1982), either directly (e.g., "Juliet is the sun")
or by applying a predicate drawn from the source domain to the
target (e.g., "The romance blossomed"). As a domain-general
learning mechanism linked to human evolution, analogy offers
an alternative to strongly nativist views of language acquisition
(Vallauri 2004; see innateness and innatism). Gentner and
L. L. Namy (2004) review evidence that analogical comparison
plays important roles in speech segmentation, word learning,
and possibly acquisition of grammar.
Keith Holyoak

WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Bunge, Silvia, C. Wendelken, and A. D. Wagner. 2005. Analogical reasoning and prefrontal cortex: Evidence for separable retrieval and integration mechanisms. Cerebral Cortex 15: 239–49.
Falkenhainer, Brian, K. D. Forbus, and D. Gentner. 1989. The structure-mapping engine: Algorithm and examples. Artificial Intelligence 41: 1–63.
Gentner, Dedre. 1983. Structure-mapping: A theoretical framework for analogy. Cognitive Science 7: 155–70.
Gentner, Dedre, B. Falkenhainer, and J. Skorstad. 1988. Viewing metaphor as analogy. In Analogical Reasoning: Perspectives of Artificial Intelligence, Cognitive Science, and Philosophy, ed. D. Helman, 171–7. Dordrecht, the Netherlands: Kluwer.
Gentner, Dedre, K. J. Holyoak, and B. N. Kokinov, eds. 2001. The Analogical Mind: Perspectives from Cognitive Science. Cambridge, MA: MIT Press. This book contains survey articles on topics in analogy.
Gentner, Dedre, and L. L. Namy. 2004. The role of comparison in children's early word learning. In Weaving a Lexicon, ed. D. Hall and S. Waxman, 533–68. Cambridge, MA: MIT Press.
Gentner, Dedre, and M. Rattermann. 1991. Language and the career of similarity. In Perspectives on Thought and Language: Interrelations in Development, ed. S. Gelman and J. Byrnes, 225–77. Cambridge: Cambridge University Press.
Gick, Mary, and K. J. Holyoak. 1983. Schema induction and analogical transfer. Cognitive Psychology 15: 1–38.
Goswami, Usha. 1992. Analogical Reasoning in Children. Hillsdale, NJ: Erlbaum.
Halford, Graeme. 1993. Children's Understanding: The Development of Mental Models. Hillsdale, NJ: Erlbaum.
Holyoak, Keith. 1982. An analogical framework for literary interpretation. Poetics 11: 105–26.
Holyoak, Keith, and P. Thagard. 1989. Analogical mapping by constraint satisfaction. Cognitive Science 13: 295–355.
———. 1995. Mental Leaps: Analogy in Creative Thought. Cambridge, MA: MIT Press. This book provides a broad introduction to the nature and uses of analogy.
Hummel, John, and K. J. Holyoak. 2005. Relational reasoning in a neurally plausible cognitive architecture: An overview of the LISA project. Current Directions in Psychological Science 14: 153–7.
Morrison, Robert, D. C. Krawczyk, K. J. Holyoak, J. E. Hummel, T. W. Chow, B. L. Miller, and B. J. Knowlton. 2004. A neurocomputational model of analogical reasoning and its breakdown in frontotemporal lobar degeneration. Journal of Cognitive Neuroscience 16: 260–71.
Oden, David, R. K. R. Thompson, and D. Premack. 2001. Can an ape reason analogically? Comprehension and production of analogical problems by Sarah, a chimpanzee (Pan troglodytes). In Gentner, Holyoak, and Kokinov 2001, 471–97.
Richland, Lindsey, R. G. Morrison, and K. J. Holyoak. 2006. Children's development of analogical reasoning: Insights from scene analogy problems. Journal of Experimental Child Psychology 94: 249–71.
Richland, Lindsey, O. Zur, and K. J. Holyoak. 2007. Cognitive supports for analogy in the mathematics classroom. Science 316: 1128–9.
Spearman, Charles. 1946. Theory of a general factor. British Journal of Psychology 36: 117–31.
Vallauri, Edoardo. 2004. The relation between mind and language: The innateness hypothesis and the poverty of the stimulus. Linguistic Review 21: 345–87.

ANALOGY: SYNCHRONIC AND DIACHRONIC


Analogy involves two (or more) systems, A and B, which are constituted by their respective parts, that is, a1, a2, a3 (etc.) and b1, b2, b3 (etc.). There is some relation R between a1, a2, and a3,
expressed as R(a1,a2,a3), just as there is another such relation
S between b1, b2, and b3, expressed as S(b1,b2,b3). For A and
B to be analogous, it is required that R and S be exemplifications of the same abstract structure X, as evidenced by a mapping between a1/a2/a3 and b1/b2/b3. This is what is meant by
saying that analogy (e.g., between A and B) is a structural similarity, or a similarity between relations (e.g., R and S). It is not
a material similarity, or a similarity between things (e.g., a1
and b1). More and more abstract analogies are constituted by
similarities between similarities between relations between
things.
In its purely synchronic use (see synchrony and diachrony), analogy is understood to be the centripetal force that
holds the units of a structure together. To simplify an example
given by N. S. Trubetzkoy ([1939] 1958, 60–66), in a structure
containing just /p/, /b/, /t/, and /d/, the phoneme /p/ acquires
the distinctive features voiceless and bilabial by being contrasted
with, respectively, (voiced) /b/ and (dental) /t/. The relation
between the pairs /p/ & /b/ and /t/ & /d/ is the same, and so is
the relation between /p/ & /t/ and /b/ & /d/, which means by
definition that there is in both cases an analogy between the
two pairs. This type of analogy-based analysis applies to any wellarticulated structure, linguistic or nonlinguistic: A unit is what
the other units are not (as /p/ is neither /b/ nor /t/ nor, of course,
/d/); and this otherness is based on corresponding oppositions
(like voiceless vs. voiced and bilabial vs. dental).
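Trubetzkoy's point that the analogy holds between relations rather than between things can be sketched in a few lines (the feature encoding below is an illustrative simplification, not a full phonological analysis).

```python
# Distinctive features for the four phonemes in Trubetzkoy's example.
features = {
    "p": {"voice": "voiceless", "place": "bilabial"},
    "b": {"voice": "voiced",    "place": "bilabial"},
    "t": {"voice": "voiceless", "place": "dental"},
    "d": {"voice": "voiced",    "place": "dental"},
}

def opposition(x, y):
    """The set of features on which two phonemes differ."""
    return {f for f in features[x] if features[x][f] != features[y][f]}

# The relation between /p/ & /b/ equals that between /t/ & /d/
# (voicing), and /p/ & /t/ patterns with /b/ & /d/ (place):
# a structural, not material, similarity.
print(opposition("p", "b") == opposition("t", "d"))  # True ({'voice'})
print(opposition("p", "t") == opposition("b", "d"))  # True ({'place'})
```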
Synchronic analogy may be characterized as analogy-as-structure. Its counterpart is analogy-as-process, that is, discovery,
manipulation, or invention of structural similarity. Traditionally,
language acquisition was thought to be based on analogy: "From innumerable sentences heard and understood [the child] will abstract some notion of their structure which is definite enough to guide him in framing sentences of his own" (Jespersen [1924]
1965, 19). After a period of neglect, this traditional view has
again become fashionable in some quarters (Pinker 1994, 417;
Tomasello 2003, 163–9).
Only if analogy-as-process leaves a permanent trace that deviates from the current norm is there reason to speak of language
change, the province of diachronic linguistics. Traditionally, the
term analogical change was restricted to morphology, or to cases
where irregularities brought about by sound change are eliminated so as to achieve, or to approach, the goal of "one meaning, one form." However, this same goal is involved in such large-scale
changes as have generally been ascribed to a need for harmony
or symmetry. In syntactic change, analogy consists in extending a
reanalyzed structure to new contexts (Anttila 1989, 102–4).
Esa Itkonen
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Anttila, Raimo. 1989. Historical and Comparative Linguistics.
Amsterdam: Benjamins.
Jespersen, Otto. [1924] 1965. The Philosophy of Grammar. London: Allen and
Unwin.
Pinker, Steven. 1994. The Language Instinct. New York: Morrow.
Tomasello, Michael. 2003. Constructing a Language: A Usage-Based Theory of Language Acquisition. Cambridge, MA: Harvard University
Press.

Trubetzkoy, N. S. [1939] 1958. Grundzüge der Phonologie. Göttingen: Vandenhoeck and Ruprecht.

ANALYTICITY
Analyticity is a property that a statement has when its truth is
in some special way determined by its meaning. Many believe
that no such property exists. Led by W. V. O. Quine (1953), philosophers complain that no one has been able to define the concept of analyticity in a way that is precise and that also fulfills the
purposes to which it is typically put.
To some people, it seems that "all uncles are either married or have a sibling" is true just because of the meaning of its constituent words, most prominently because of the meaning of "uncle." But "all uncles are less than eight feet tall" is true not
because of meanings but because of how the world has turned
out to be. The first sort of statement is said to be analytic, the
second synthetic.
This distinction has far-reaching interest and application.
Empiricists have always had difficulty accounting for the seemingly obvious fact that the truths of logic and mathematics are
both necessary (i.e., they could not be false) and a priori (i.e., they
are known independently of sensory experience). For most of the
twentieth century, it was agreed that analyticity could explain
this obvious fact away as a merely linguistic phenomenon.
The idea that all necessity could be explained away by analyticity fell out of fashion when S. Kripke (1980) convinced most
philosophers that some necessary truths are neither analytic nor
a priori (e.g., water = H2O). But as L. BonJour (1998, 28) points
out, many still think that the unusual modal and epistemic status of logic and mathematics is due to a special relation between
truth and meaning.
However we apply the concept, trouble for analyticity begins
when we remind ourselves that every truth depends, to some
extent, on the meanings of its constituent terms. If the word
"uncle" had meant "oak tree," then both previous examples would
be false. In response, it is said that an analytic truth is one whose
truth depends solely on the meanings of its terms. Our linguistic
conventions alone make the first example true, and the second is
true partly because of meaning and partly because of the way the
world is. But how can linguistic convention alone make something true?
We can distinguish the sentence "all uncles are either married or have a sibling" from the proposition that this sentence
now expresses. Meaning or linguistic convention alone makes
this sentence true in the following way. Given that this sentence
expresses the proposition that it does (i.e., given our current linguistic conventions), it is true. This cannot be said of our other
example. That this sentence means what it does is not sufficient
to determine its truth or falsity (see also sentence meaning).
The world plays a part.
There are three serious problems. First, if this is what it is for
a sentence to be true solely in virtue of its meaning, then it is just
another way of saying that it expresses a necessary truth, and that
tells us nothing about how we know it. Thus, appeal to analyticity
cannot explain the necessity and apriority of logic, mathematics, or anything else. Second, the proposition now expressed
by our first example would be true no matter how we ended up

expressing it. Thus, our current linguistic conventions do not


make it true. Third, our first example certainly does say something about the world. Indeed, it says something about every
object in the universe. If it is an uncle, then either it is married
or it has a sibling.
Some philosophers have called the sort of analyticity discussed so far, where things are said to be true in virtue of meaning, metaphysical analyticity. This is distinguished from epistemic
analyticity. A statement is epistemically analytic when understanding it suffices for being justified in believing it (Boghossian
1996). While these considerations make trouble for metaphysical analyticity, they allegedly leave its epistemic counterpart
untouched.
The purpose of introducing epistemic analyticity is similar to
that of its older ancestor. The hope is that mathematical, logical,
and conceptual truths can be designated as a priori without postulating a special faculty of reason or intuition. This is done by
building certain kinds of knowledge in as preconditions for possessing or understanding concepts. If part of what it is to understand the word "uncle" is to be disposed to accept that "all uncles are either married or have a sibling," then it could be argued
that once we understand that sentence, we know that it is true.
No experience (beyond what is required for understanding) is
necessary.
The best candidates for epistemically analytic truths are simple truths of logic. But even the most obvious logical truths are
not immune to challenge. For example, a few philosophers and
logicians have claimed that some statements can be both true
and false (Priest 1987) and that modus ponens is invalid (McGee
1985). Yet these sophisticated theoreticians certainly understand
the meanings of their own words. Therefore, acceptance of some
specific truth of logic is not necessary for understanding any logical concept. And since we might someday have good reason to
reject any particular truth of logic while continuing to understand
our logical concepts, understanding some logical concept is not
sufficient for being justified in believing any particular truth of
logic. And if logic is not epistemically analytic, nothing is.
These considerations make the existence of analyticity dubious. But there still appears to be a deep difference between the
two examples. If there is really a difference, it is not one of true in
virtue of meaning versus true in virtue of reality, but one of necessary and a priori versus contingent and empirical.
Michael Veber
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
BonJour, L. 1998. In Defense of Pure Reason. Cambridge: Cambridge
University Press.
Boghossian, P. 1996. Analyticity reconsidered. Noûs 30.3: 360–91.
Kripke, S. 1980. Naming and Necessity. Cambridge: Harvard University Press.
McGee, V. 1985. A counterexample to modus ponens. Journal of Philosophy 82: 462–71.
Priest, G. 1987. In Contradiction. Boston: Kluwer.
Quine, W. V. O. 1953. Two dogmas of empiricism. In From a Logical Point of View, 20–46. Cambridge: Harvard University Press.
Veber, M. 2007. Not too proud to beg (the question): Why inferentialism cannot account for the a priori. Grazer Philosophische Studien 73: 113–31. A critique of Boghossian 1996 and similar views.

ANAPHORA
Languages have expressions whose interpretation may involve
an entity that has been mentioned before: Subsequent reference to
an entity already introduced in discourse approximates a general
definition of the notion of anaphora (Safir 2004, 4). This works
well for core cases of nominal anaphora as in (1) (Heim 1982):
(1) a. This soldier has a gun. Will he shoot?
b. Every soldier has a gun. ??Will he shoot?

He in (1a) can be interpreted as the same individual as this soldier. In (1b) every soldier is quantificational, hence does not denote an
entity he can refer to, which makes anaphora impossible. Possible
discourse antecedents are as diverse as soldiers, water, beauty,
headaches, dissatisfaction, and so on. In addition to nominal
expressions, sentences, verb phrases, prepositional phrases,
adjective phrases, and tenses also admit anaphoric relations.
Thus, the notion discourse entity must be broad enough to capture all these cases of anaphora yet restrictive enough to separate
them from quantificational cases such as every soldier.
The notion of anaphora is closely related to the notion of
interpretative dependency. For instance, in (2), he can depend for
its interpretation on every soldier, and here, too, it is said that he
is anaphorically related to every soldier.
(2) Every soldier who has a gun says he will shoot.

However, (1) versus (2) shows that two different modes of interpretation must be distinguished: i) directly assigning two (or
more) expressions the same discourse entity from the interpretation domain (ID) as a value: co-reference as in (1a), and ii)
interpreting one of the expressions in terms of the other by grammatical means: binding (Reinhart 1983). This contrast is represented in (3).

Coreference in (3a) is restricted in terms of conditions on discourse entities, binding in (3b) in terms of grammatical configuration. Expr1 can only bind expr2 if it c-commands the latter
(Reinhart 1983). This condition is met in (2), but not in (1b),
hence the contrast.
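The c-command condition that separates (2) from (1b) can be sketched computationally (the tree encodings and labels below are hypothetical simplifications of those examples): expr1 binds expr2 only if some node has expr1 as a daughter and expr2 inside one of expr1's sisters.

```python
QP = ("NP", "every", "soldier")  # the quantified phrase

# (2): the QP is the subject, so its sister VP contains "he".
ex2 = ("S", QP, ("VP", "says", ("S'", "he", "will", "shoot")))
# (1b): the QP sits in the first sentence; "he" is in a separate one.
ex1b = ("Discourse",
        ("S", QP, ("VP", "has", "a", "gun")),
        ("S", "he", ("VP", "will", "shoot")))

def dominates(node, target):
    """True if `target` occurs anywhere inside `node`."""
    if node == target:
        return True
    return isinstance(node, tuple) and any(dominates(d, target) for d in node[1:])

def c_commands(tree, a, b):
    """True if some node has a as a daughter and b inside a's sister."""
    if not isinstance(tree, tuple):
        return False
    daughters = tree[1:]
    if a in daughters and any(dominates(s, b) for s in daughters if s != a):
        return True
    return any(c_commands(d, a, b) for d in daughters)

print(c_commands(ex2, QP, "he"))   # True: binding is possible
print(c_commands(ex1b, QP, "he"))  # False: hence no bound reading
```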
Virtually all languages have words and expressions that are referentially defective: they cannot be used deictically (see deixis).
In much of the linguistic literature, these are called anaphors, as
they appear specialized for anaphoric use. Examples vary from
English himself, Dutch zich(zelf), Icelandic sig, Russian sebja,
Chinese (ta) ziji, to Georgian tav tavis, and so on. Such expressions cannot be assigned a discourse value directly. Rather, they
must be bound, often in a local domain, approximately the domain of the nearest subject, but subject to variation in terms
of specific anaphor type and language. Furthermore, under restrictive conditions, some of these expressions may yet allow a
free logophoric interpretation.
It is an important task to arrive at a detailed understanding
of the ways in which languages encode interpretive relations
between their expressions and of the division of labor between
the components of the language system involved.
Eric Reuland
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Chomsky, Noam. 1981. Lectures on Government and Binding. Dordrecht,
the Netherlands: Foris.
Heim, Irene. 1982. The semantics of definite and indefinite noun phrases.
Ph.D. diss., University of Massachusetts at Amherst.
Reinhart, Tanya. 1983. Anaphora and Semantic Interpretation.
London: Croom Helm.
Reuland, Eric. 2010. Anaphora and Language Design. Cambridge: MIT
Press.
Safir, Ken. 2004. The Syntax of Anaphora. Oxford: Oxford University
Press.

ANIMACY
Languages often treat animate and inanimate nouns differently.
Animacy can affect many aspects of grammar, including word
order and verbal agreement. For example, in Navajo, the
more animate noun must come first in the sentence (Hale 1973),
and in some Bantu languages, a more animate object must come
before a less animate object. Verbs are more likely to agree with
more animate nouns (Comrie 1989). Animacy can also affect the
choice of case, preposition, verb form, determiner (article), or
possessive marker (Comrie 1989).
What counts as animate differs cross-linguistically. The grammatical category of animates may include certain objectively
inanimate things, such as fire, lightning, or wind. Languages may
make additional distinctions among pronouns, proper nouns,
and common nouns, or between definites and indefinites, and
these are sometimes viewed as part of an animacy hierarchy by
linguists (Comrie 1989). person and number distinctions may
also be included in an animacy hierarchy in this broader sense.
For example, according to Michael Silverstein (1976), subjects
with features at the less animate end of the Animacy Hierarchy in
(1) are more likely to be marked with morphological case, while
the reverse holds of objects.
(1)

Animacy Hierarchy
1pl > 1sing > 2pl > 2sing > 3human.pl > 3human.sing >
3anim.pl > 3anim.sing > 3inan.pl > 3inan.sing
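The ranking in (1) and Silverstein's marking generalization can be made concrete in a small sketch. The following Python fragment is purely illustrative: the feature labels and the simple midpoint cutoff are assumptions for the example, not part of the hierarchy itself.

```python
# Illustrative sketch: the Animacy Hierarchy in (1) as a ranked list
# (index 0 = most animate). Labels follow the entry's notation.
HIERARCHY = [
    "1pl", "1sing", "2pl", "2sing",
    "3human.pl", "3human.sing",
    "3anim.pl", "3anim.sing",
    "3inan.pl", "3inan.sing",
]

def rank(features: str) -> int:
    """Position on the hierarchy; a higher index means less animate."""
    return HIERARCHY.index(features)

def favors_case_marking(features: str, role: str) -> bool:
    """Silverstein's tendency: overt case is favored for subjects at the
    less animate end and for objects at the more animate end.
    The midpoint split used here is a simplification for illustration."""
    midpoint = len(HIERARCHY) // 2
    if role == "subject":
        return rank(features) >= midpoint
    if role == "object":
        return rank(features) < midpoint
    raise ValueError(role)

# A third-person inanimate subject is a better candidate for overt case
# marking than a first-person subject:
assert favors_case_marking("3inan.sing", "subject")
assert not favors_case_marking("1pl", "subject")
```

Dyirbal's pattern, in which only third person subjects carry morphological case while all human objects do, corresponds to drawing the subject cutoff at a different point on the scale than the midpoint assumed here.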

Dyirbal exhibits this pattern in that only third person subjects have morphological case, whereas all human objects do.
Silverstein (1976) postulates that the function of such differential case marking is to flag less animate subjects and more animate objects to avoid ambiguity. It is interesting to note that
the patterns of such differential animacy marking are far more
complex and diverse cross-linguistically for objects than for subjects (Aissen 2003). This may be traced to a relation between animacy and object shift, which produces an associated change in
case or verbal agreement (Woolford 2000, 2007). The less diverse
animacy effects on subject case, which do not affect agreement (Comrie 1991), may be purely morphological markedness effects
(Woolford 2007).
Ellen Woolford
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Aissen, Judith. 2003. Differential object marking: Iconicity vs. economy. Natural Language and Linguistic Theory 21: 435–83.
Comrie, Bernard. 1989. Language Universals and Linguistic Typology. Oxford: Blackwell.
———. 1991. Form and function in identifying cases. In Paradigms, ed. F. Plank, 41–55. Berlin: Mouton de Gruyter.
Hale, Kenneth L. 1973. A note on subject-object inversion in Navajo. In Issues in Linguistics: Papers in Honor of Henry and Renée Kahane, ed. B. B. Kachru et al., 300–309. Urbana: University of Illinois Press.
Silverstein, Michael. 1976. Hierarchy of features and ergativity. In Grammatical Categories in Australian Languages, ed. R. M. W. Dixon, 112–71. Canberra: Australian Institute of Aboriginal Studies.
Woolford, Ellen. 2000. Object agreement in Palauan. In Formal Problems in Austronesian Morphology and Syntax, ed. I. Paul, V. Phillips, and L. Travis, 215–45. Dordrecht, the Netherlands: Kluwer.
———. 2007. Differential subject marking at argument structure, syntax, and PF. In Differential Subject Marking, ed. H. de Hoop and P. de Swart, 17–40. Dordrecht: Springer.

ANIMAL COMMUNICATION AND HUMAN LANGUAGE


An understanding of the communicative capacities of other animals is important on its face both for an appreciation of the place
of human language in a broader context and as a prerequisite to
discussion of the evolution of language (see, for example, evolutionary psychology). On closer examination, however,
the differences between human language and the systems of
other animals appear so profound as to make both projects more
problematic than they appear at first.
In the 1950s and 1960s, ethologists like Konrad Lorenz and
Niko Tinbergen revolutionized behavioral biologists' views of
the cognitive capacities of animals, but consideration of animal communication focused on the properties of quite simple
systems. A prime example of communication in the texts of
the time was the stickleback. A crucial component of the mating behavior of this common fish is the pronounced red coloration of the male's underbelly when he is in mating condition,
which furnishes a signal to the female that she should follow him
to his preconstructed nest, where her eggs will be fertilized. On
this model, communication was viewed as behavioral or other
signals emitted by one organism, from which another organism
(typically, though not always, a conspecific) derives some information. The biological analysis of communication thus came to
be the study of the ways in which such simple signals arise in the
behavioral repertoire of animals and come to play the roles they
do for others who perceive them. Those discussions make little if
any contact with the analysis of human language.
In the intervening half century, we have come to know vastly
more about the nature and architecture of the human language faculty and to have good reason to think that much of it
is grounded in human biology. One might expect, therefore, to
find these concerns reflected in the behavioral biology literature.
A comprehensive modern textbook on animal communication within this field (e.g., Bradbury and Vehrencamp 1998) reveals a great deal about the communicative behavior of many species
and its origins, but within essentially the same picture of what
constitutes communication in (nonhuman) animals, confined to
unitary signals holistically transmitted and interpreted. Little if
any of what we have come to know about human linguistic communication finds a place here.
Biologists have not, in general, paid much attention to the
specifics of linguistic research (though their attention has been
caught by the notion that human language is importantly based
in human biology) and are often not as sophisticated as one
might wish about the complexity of natural language. But the
consequences may not be as serious as linguists are inclined to
think. In fact, the communicative behavior of nonhumans in general is essentially encompassed within the simple signal-passing
model. The complexities of structure displayed by human language are apparently quite unique to our species and may not
be directly relevant to the analysis of animal communication
elsewhere.

What (Other) Animals Do


Communication in the sense of emission and reception of informative signals is found in animals as simple as bacteria (quorum
sensing). Most familiar, perhaps, are visual displays of various
sorts that indicate aggression, submission, invitations to mate,
and so on. In some instances, these may involve quite complex
sequences of gestures, reciprocal interactions, and the like, as in
the case of the nesting and mating behavior of many birds. In
others, a simple facial expression, posture, or manner of walking
may provide the signal from which others can derive information
about the animal's intentions and attitudes.
These differences of internal structure are, of course, crucial
for the correct expression and interpretation of a particular signal, but they play little or no role in determining its meaning. That
is, the individual components of the signal do not in themselves
correspond to parts of its meaning, in the sense that varying one
subpart results in a corresponding variation in what is signaled.
Animal signals, however complex in form (and however elaborate
the message conveyed), are unitary wholes. An entire courtship
dance, perhaps extending over several minutes or even longer,
conveys the sense "I am interested in mating with you, providing a nesting place, and care for our offspring." No part of the dance corresponds exactly to the "providing care" part of the message; the message cannot be minimally altered to convey "I am interested in mating but not in providing care for our offspring," "I was interested in mating (but am no longer)," and so on. Variations in
intensity of expression can convey (continuous) variations in the
intensity of the message (e.g., urgency of aggressive intent), but
that is essentially the only way messages can be modulated.
The most widely discussed apparent exception to this generalization is the dance language of some species of honeybees.
The bees' dance conveys information about (a) the direction, (b) the distance, and (c) the quality of a food source (or potential
hive site), all on quasi-continuous scales and each in terms of
a distinct dimension of the dance. Although the content of the
message here can be decomposed, and each part associated
with a distinct component of the form of the signal, there is no
element of free combination. Every dance necessarily conveys exactly these three things, and it is only the relative value on each
dimension that is variable. As such, the degree of freedom available to construct new messages is not interestingly different from
that involved in conveying different degrees of fear or aggression
by varying degrees of piloerection.
Visual displays do not at all exhaust the modalities in which
animal communication takes place, of course. Auditory signals
are important to many species, including such classics of the animal communication literature as frog croaks and the calls and
songs of birds (see birdsong and human language). In
some species, portions of the auditory spectrum that are inaccessible to humans are involved, as in the ultrasound communication of bats, some rodents, and dolphins, and the infrasound
signals of elephants. Chemical or olfactory communication is
central to the lives of many animals, including moths, mice, and
lemurs, as well as our pet cats and dogs. More exotic possibilities
include the modulation of electric fields generated (and perceived) by certain species of fish.
In some of these systems, the internal structure of the signal
may be quite complex, as in the songs of many oscine songbirds,
but the general point still holds: However elaborate its internal
form, the signal has a unitary and holistic relation to the message
it conveys. In no case is it possible to construct novel messages
freely by substitutions or other ways of varying aspects of the signals form.
In most animals, the relation of communicative behavior to
the basic biology of the species is very direct. Perceptual systems
are often quite precisely attuned to signals produced by conspecifics. Thus, the frog's auditory system involves two separate
structures (the amphibian papilla and the basilar papilla) that are
sensitive to acoustic signals, typically at distinct frequencies. The
frequencies to which they are most sensitive vary across species
but are generally closely related to two regions of prominence
in the acoustic structure of that species' calls. Mice (and many
other mammals) have two distinct olfactory organs, projecting
to quite distinct parts of the mouse brain. The olfactory epithelium is responsive to a wide array of smells, but the vomeronasal
organ is sensitive primarily to the pheromones that play a major
role in communication and social organization. In this case, as in
many, many others, the perceptual system is matched to production in ways that optimize the organism's sensitivity to signals
that play a crucial ecological role in the life of the animal.
The essential connection between a species' system of communication and its biology is also manifested in the fact that
nearly all such systems are innately specified. That is, the ability
to produce and interpret relevant signals emerges in the individual without any necessary role of experience. Animal communication is not learned (or taught) but, rather, develops (in the
absence of specific pathology, such as deafness) as part of the
normal course of maturation. Animals raised under conditions
in which they are deprived of exposure to normal conspecific
behavior will nonetheless communicate in the fashion normal to
their species when given a chance.
Exceptions to this generalization are extremely rare, apart
from human language. Vocal learning, in particular, has been
demonstrated only to a limited extent in cetaceans and some bats
and, more extensively, in 3 of the 27 orders of birds. The study of
birds, especially oscine songbirds, is particularly instructive in this regard. In general, their song is learned on the basis of early exposure to appropriate models, from which they in turn compose their own songs. It is interesting to note that there appear to be
quite close homologies in the neurophysiology of vocal learning (and perhaps even in its underlying genetic basis) between birds and humans, although what is learned in birds is a unitary, holistic signal like those in other nonhuman communication systems, rather than individual lexical items subject to free
recombination to produce different meanings.
There is much variation across bird species, but a clear generalization emerges: For each, there is a specific range of song
structures that individuals of that species can learn. Experience
plays a role in providing the models on which adult song is based,
but (with the exception of a few very general mimics, such as the
lyrebird) this role is quite narrowly constrained by the song-learning system of the individual species.

What Humans Do, and How It Is Different


Like the systems of communication of other animals, human
language is deeply embedded in human biology. Unlike others,
however, it provides an unbounded range of distinct, discrete
messages. Human language is acquired at a specific point in
development from within a limited range of possibilities, similar to the acquisition of song in birds. Unlike the communicative
signals of other species, human language is under voluntary control, with its underlying neurobiology concentrated in cortical
structures, as opposed to the subcortical control characteristic of
those other species that have been studied in this regard.
Human language is structurally a discrete combinatorial system, in which elements from a limited set combine in a recursive,
hierarchical fashion to make an unlimited number of potentially
novel messages (see recursion, iteration, and metarepresentation). The combinatorial structure of language is governed by two quite independent systems: A small inventory of
individually meaningless sounds combine to make meaningful
words, on the one hand (phonology), while these words are
combined by a quite different system to make phrases, clauses,
and sentences (see syntax). These properties (discrete combination, recursive hierarchical organization, and duality of patterning) are not simply idiosyncratic ornaments that could in
principle be omitted without affecting the overall communicative
capacity of the system. Rather, they are what make large vocabularies practical and unbounded free expression possible. Contrast
the unlimited range of potentially novel utterances that any (normal) speaker of a language can produce, and another speaker of
the same language comprehend, with the strictly limited range of
meaningful signals available to other organisms. No other form of
communication found in nature has these properties. Although
song in some species of birds does display a limited amount of
phonological combinatoriality, there is no analog even here to
meaningful syntax. Human language, and especially its syntactic
organization, is quite unique in the animal world.
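The practical force of discrete combination and duality of patterning can be seen in a toy calculation. The inventory sizes and fixed lengths below are illustrative assumptions for the example, not figures from the entry:

```python
# Toy calculation (illustrative numbers): discrete combination turns
# small inventories into enormous message spaces.

phonemes = 30              # assumed phoneme inventory
segments_per_word = 5      # assumed fixed word length, in segments
words_per_sentence = 10    # assumed fixed sentence length, in words
lexicon_size = 50_000      # assumed vocabulary size

# Phonology: strings of individually meaningless segments.
possible_word_forms = phonemes ** segments_per_word      # 30**5 = 24,300,000

# Syntax: strings of meaningful words (ignoring grammatical constraints).
possible_sentences = lexicon_size ** words_per_sentence  # 50,000**10, about 1e47

print(possible_word_forms)
print(possible_sentences)
```

Even these fixed-length counts dwarf the repertoires of unitary signals reported for nonhuman systems, and once recursion is admitted the set of possible sentences is unbounded.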
Furthermore, efforts to teach systems with these essential
properties to other animals have not succeeded. Despite widespread claims to the contrary in the popular literature, there is
no evidence that any nonhuman animal is capable of acquiring
and using such a system. This should not be seen as particularly
surprising. If language is indeed embedded in human biology, there is no reason to expect it to be accessible to organisms with
a different biological endowment, any more than humans are
capable of acquiring, say, the echolocation capacities of bats, a
system that is equally grounded in the specific biology of those
animals.

Conclusion
Human language is often considered as simply one more instantiation of the general class of animal communication systems.
Indeed, like others it appears to be highly species specific.
Although relevant experience is required to develop the system
of any particular language, the overall class of languages accessible to the human learner is apparently highly constrained, and
the process of language learning is more like genetically governed maturation than like learning in general. The structural
characteristics of human language are quite different from those
of other communication systems, and it is the freedom of expression subserved by those distinctive properties that gives language
the role it has in human life. (See also primate vocalizations
and grooming, gossip, and language).
Stephen R. Anderson

WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Anderson, Stephen R. 2004. Doctor Dolittle's Delusion: Animals and the Uniqueness of Human Language. New Haven, CT: Yale University Press.
Bradbury, J. W., and Sandra Vehrencamp. 1998. Principles of Animal Communication. Sunderland, MA: Sinauer.
Hauser, Marc D. 1996. The Evolution of Communication. Cambridge, MA: MIT Press.

APHASIA
Aphasia is a language impairment caused by brain injury that affects speech content, auditory comprehension, reading, and writing to varying degrees. Mild aphasia may result in occasional word-finding problems, while more severe forms can cause profound deficits in all language domains. Aphasia differs from developmental disorders in that it occurs after a brain injury to a person with otherwise normal language skills.
Typically, aphasia results from damage to the left hemisphere of the brain due to stroke, traumatic brain injury, tumor, or degenerative neurological disease. Nearly all right-handed individuals and most left-handers are thought to have core linguistic functions (semantics, phonology, syntax, and morphology) lateralized to the left hemisphere, while other aspects of language, specifically prosody and pragmatics, are associated with the right hemisphere. Approximately one million people in the United States are afflicted with aphasia, which is a prevalence rate similar to that of Parkinson's disease. Roughly 80,000 more acquire aphasia annually.
Recent advances in the study of language have provided greater insight into aphasia syndromes. Modern neuroimaging technology has helped refine classic models of language localization, as well as our understanding of aphasia treatment and recovery. In particular, brain-imaging techniques such as magnetic resonance imaging (MRI), computerized tomography (CT), positron emission tomography (PET), and functional MRI (fMRI) have helped define functions of core speech and language areas.

Aphasia Syndromes
Aphasia has traditionally been categorized into seven subtypes, including Broca's, Wernicke's, global, anomic, conduction, transcortical sensory, and transcortical motor. These aphasia variants are characterized by different patterns of speech fluency, auditory comprehension, repetition, and naming. Although patients may be classified as having one type of aphasia in the early period after a brain injury, this classification may change as language problems resolve with time and treatment.
Broca's aphasia is characterized by a constellation of symptoms, including slow, halting speech with impaired grammar; disturbed auditory comprehension for grammatically complex phrases and sentences; and poor repetition. Word-finding problems and difficulty with reading and writing are common. Motor speech disorders, such as apraxia of speech, a disorder of articulatory planning or coordination, and dysarthria, an impairment in muscle strength, tone, or coordination, very often co-occur. Patients with Broca's aphasia often talk in a series of nouns and verbs, as is the case in the following sample from a patient describing the picnic scene from the Western Aphasia Battery (WAB):

"I know it tree house car man with uh woman kid over here with flag I can know it nice sun shiny day [unintelligible word]"

Patients with Broca's aphasia can participate relatively well in everyday conversation by using single words or phrases,
often combined with meaningful gestures and facial expressions. Some patients use writing and drawing to compensate for
restricted verbal output. As is true with all aphasia syndromes,
there is a wide range of symptom severity. Those with severe Broca's aphasia may present with such profound verbal deficits that their speech is limited to a single recurrent utterance (e.g., "yeah, yeah"). In these patients, comprehension is usually preserved for short, simple phrases, but is significantly impaired for
more complex information.
Classic aphasia models assume that lesions to broca's area (see Figure 1) result in Broca's aphasia, but research has indicated that this is not always the case. Reports as early as 1870
documented cases that did not support this linear relationship
(e.g., Bateman 1870; Marie 1906; Moutier 1908; Dronkers et al.
2007). Modern research has found that chronic Broca's aphasia typically results from large lesions that encompass left frontal brain regions, underlying white matter, the insula, and the anterior parietal lobe. Lesions restricted to Broca's area tend to
cause transient mutism that spontaneously resolves within days
or weeks (Mohr 1976). In some cases, Broca's aphasia can occur without damage to Broca's area (e.g., Basso et al. 1985; Mazzocchi
and Vignolo 1979).
Patients with Wernicke's aphasia present a reverse pattern of symptoms when compared to those with Broca's aphasia: While speech is fluent, comprehension is impaired. Patients speak at a normal or rapid rate. However, they often use meaningless
words, jargon, or semantic paraphasias (e.g., using a related word, "bike," for a target word, "car"). Reading and writing may
be similarly disrupted. The following exemplifies the speech content of a patient with Wernicke's aphasia describing the WAB picnic scene:

"And the man and hers and I'll say I don't think she's working. They're not doing the thing. Then the ladder then the tree and the /let/ [points to kite] and lady here [points to boy] have to clean that."

Figure 1. Several of the key brain regions affected in aphasia. Areas depicted as typical lesions are derived from patient data obtained at the Center for Aphasia and Related Disorders.

In contrast to patients with Broca's aphasia, those with Wernicke's aphasia may understand very little in conversation
because of their impaired comprehension of single words. In
addition, successful communication is made challenging by verbal output that is empty, coupled with an inability to monitor
speech content. Using visual information to compensate for comprehension deficits is often beneficial (e.g., providing pictures,
drawing, or writing key words during conversational exchanges).
Persisting cases of Wernicke's aphasia are not caused by injury to wernicke's area alone but, rather, by much larger lesions
affecting most of the middle temporal gyrus and underlying white
matter (Dronkers, Redfern, and Ludy 1995; see Figure 1). Such
damage amounts to a poorer prognosis for recovery. Patients
with lesions confined to Wernicke's area tend to have symptoms of Wernicke's aphasia that resolve, resulting in milder forms of
aphasia, most often conduction aphasia or anomic aphasia, if
the lesion spares the middle temporal gyrus.
Conduction aphasia is a fluent aphasia characterized by an
inability to repeat. Auditory comprehension is relatively preserved, and patients use speech that is largely understandable
but may be rife with phonemic paraphasias (substituting sounds
in words, e.g., "netter" for "letter"). While high-frequency words and short phrases may be repeated accurately (e.g., "the telephone is ringing"), low-frequency items are more difficult (e.g., "first British field artillery"). Patients may retain the meaning
of such phrases, owing to their preserved comprehension, but
the phonological trace is disrupted, thereby disturbing verbatim repetition. The following typifies the speech of a patient with
conduction aphasia, again describing the WAB picnic scene:
"Well there's a house near a clearing, evidently it's on the water. Further, there's a stick with a banner in the foreground [referring to the flag]. I don't know what that's called a pier a tier? There's a bucket and a /kovel/. It looks like there's someone playing in the water."


Initial reports that conduction aphasia arose from lesions to the arcuate fasciculus (the white matter tract connecting
Wernicke's and Broca's areas; see Figure 1) have been refined
over the years. Modern studies have shown that conduction
aphasia results most often from lesions to the posterior superior temporal gyrus (Dronkers et al. 1998), the auditory cortex
(Damasio and Damasio 1980), or periventricular white matter
underlying the supramarginal gyrus (Sakurai et al. 1998).
Global aphasia, the most severe syndrome, is characterized
by profound impairments in all language modalities. Speech,
auditory comprehension, naming, repetition, reading, and writing are all affected, leaving the patient with very little functional
communication. Speech may be limited to single stereotyped
or automatic words and phrases (e.g., "yes," "no," "I don't know").
Auditory comprehension may be impaired for even simple yes/
no questions. Such a severe loss of language typically results
from a large cortical lesion, encompassing the frontal, temporal,
and parietal lobes. Patients often rely on preserved nonverbal
skills to aid in communication (e.g., the recognition of pictures
and gestures to support auditory comprehension and the ability
to draw or gesture to aid in expression).
Anomic aphasia, the mildest of the syndromes, results in
word-finding deficits (anomia), while other language skills are
typically well preserved. When attempting to find a target word,
patients with anomic aphasia may describe its function or use a
synonym. Speech may be slow and halting, due to anomia, but
grammar is unaffected. Anomic aphasia can result from lesions
anywhere within the perisylvian region.
The transcortical aphasias are rare and characterized by a
preserved ability to repeat, despite impairments in other language domains. Transcortical motor aphasia (TCMA) is similar to Broca's aphasia, in that patients present with nonfluent
speech and relatively intact comprehension, but repetition skills
are markedly well preserved. Lesions typically spare core language areas, are smaller than those that cause Broca's aphasia,
and are restricted to anterior and superior frontal lobe regions.
Although patients may be mute initially, their symptoms tend
to resolve quickly, resulting in anomic aphasia. Patients with
transcortical sensory aphasia (TCSA) present much like patients
with Wernicke's aphasia, with empty, fluent speech and poor
comprehension, but they too retain a striking ability to repeat.
Lesions typically involve portions of the posterior temporal and parietal regions, but tend to be much smaller than those of Wernicke's aphasia. Acute symptoms usually resolve to produce
an anomic aphasia.
While aphasia most often occurs suddenly, as the result
of injury, a degenerative form of aphasia was first described
over a century ago by Arnold Pick, a Czech neurologist, and
later expanded upon by Marsel Mesulam in a landmark paper
in which he described six patients who presented with language deficits, in the absence of other behavioral abnormalities
(Mesulam 1982). Speech or language deficits remained the only
impairment for the first two years in these patients, but as the
disease progressed, more generalized dementia emerged. This
progressive disorder was distinct from other dementias, such as
Alzheimer's disease, because language problems, rather than
memory complaints, were the most salient symptoms.
Since then, numerous cases of what is now termed primary progressive aphasia (PPA) have been described, in which
patients present with both fluent and nonfluent variants of the
disorder (Snowden et al. 1992; Gorno-Tempini et al. 2004).
Neuroimaging typically shows left perisylvian atrophy, encompassing frontal regions in progressive nonfluent aphasia and
anterior temporal and temporo-parietal regions in the more fluent semantic dementia and logopenic variants. There are many
underlying pathologies that cause the clinical syndrome of PPA,
including Pick's disease, progressive supranuclear palsy, corticobasal degeneration, dementia lacking distinctive pathology, and Alzheimer's disease.

Treatment for Aphasia


Critical reviews of aphasia treatment studies (e.g., Bhogal,
Teasell, and Speechley 2003; Holland et al. 1996) have shown
that treatment can be effective in improving language skills past
the point that might be expected from spontaneous recovery
alone. Although it remains difficult to predict the treatment that
will result in the greatest amount of change for an individual,
there are many options from which to choose.
Patients with aphasia are typically referred to speech-language pathologists for diagnostic testing aimed at developing
treatment goals. Therapy may focus on improving impaired skills
or developing compensatory strategies to overcome obstacles to
successful communication. Patient-specific factors (e.g., aphasia
severity, cognitive ability, general health, and motivation) also
influence treatment decisions. Research is inconclusive, however, as to the prognostic weight that these variables contribute
to recovery and treatment planning for an individual.
Nina F. Dronkers, Jennifer Ogar
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Basso, A., A. R. Lecours, S. Moraschini, and M. Vanier. 1985. Anatomoclinical correlations of the aphasias as defined through computerized tomography: On exceptions. Brain and Language 26: 201–29.
Bateman, F. 1870. On Aphasia. London: Churchill.
Bhogal, S. K., R. Teasell, and M. Speechley. 2003. Intensity of aphasia therapy, impact on recovery. Stroke 34.4: 987–93.
Damasio, H., and A. R. Damasio. 1980. The anatomical basis of conduction aphasia. Brain 103: 337–50.
Dronkers, N. F., B. B. Redfern, and C. A. Ludy. 1995. Lesion localization in chronic Wernicke's aphasia. Brain and Language 51: 62–65.
Dronkers, N. F., O. Plaisant, M. T. Iba-Zizen, and E. A. Cabanis. 2007. Paul Broca's historic cases: High resolution MR imaging of the brains of Leborgne and Lelong. Brain 130: 1432–41.
Dronkers, N. F., B. B. Redfern, C. Ludy, and J. Baldo. 1998. Brain regions associated with conduction aphasia and echoic rehearsal. Journal of the International Neuropsychological Society 4: 234.
Gorno-Tempini, M. L., N. F. Dronkers, K. P. Rankin, et al. 2004. Cognition and anatomy in three variants of primary progressive aphasia. Annals of Neurology 55: 335–46.
Holland, A. L., D. S. Fromm, F. DeRuyter, and M. Stein. 1996. Treatment efficacy. Journal of Speech and Hearing Research 39.5: S27–36.
Marie, P. 1906. Révision de la question de l'aphasie: La troisième circonvolution frontale gauche ne joue aucun rôle spécial dans la fonction du langage. Semaine Médicale 26: 241–47.
Mazzocchi, F., and L. A. Vignolo. 1979. Localization of lesions in aphasia: Clinical CT-scan correlations in stroke patients. Cortex 15: 627–54.
Mesulam, M. M. 1982. Slowly progressive aphasia without generalized dementia. Annals of Neurology 11: 592–98.
Mohr, J. P. 1976. Broca's area and Broca's aphasia. In Studies in Neurolinguistics. Vol. 1. Ed. H. Whitaker and H. Whitaker, 201–33. New York: Academic Press.
Moutier, F. 1908. L'Aphasie de Broca. Paris: Steinheil.
Sakurai, Y., S. Takeuchi, E. Kojima, et al. 1998. Mechanism of short-term memory and repetition in conduction aphasia and related cognitive disorders: A neuropsychological, audiological and neuroimaging study. Journal of the Neurological Sciences 154.2: 182–93.
Snowden, J. S., D. Neary, D. M. Mann, et al. 1992. Progressive language disorder due to lobar atrophy. Annals of Neurology 31: 174–83.

AREAL DISTINCTNESS AND LITERATURE


There are two criteria for determining whether a linguistic property is a universal. First, it must occur across languages with a
frequency greater than chance. Second, the presence of the property in some of these languages should not have been caused by
its presence in other languages. In linguistics, the causal criterion is often operationally specified into two subcriteria, genetic
and areal distinctness, which is to say, distinctness in origin and
in cross-language interaction.
Researchers in literary universals also adopt the preceding criteria. However, literature is different from language
in being more readily open to influence. Specifically, the operational criterion of areal distinctness becomes much more difficult
to satisfy in the case of literature. Even a single work, transported
across continents, may produce significant changes in the recipient literature.
There are three ways of responding to this problem. The first is
to focus on literary works produced before the period of extensive
global interaction. Research of this sort must form the primary basis
for any serious study of literary universals. Moreover, such research
indicates that there are some significant universals, for example the
narrative universals of heroic, romantic, and sacrificial tragicomedy. However, this approach to areal distinctness cannot be as
rigorous as one might like. Global interaction extends back through
the formation of all the major literary traditions.
The second response involves a more nuanced approach for
isolating influence from a source tradition to a recipient tradition. Here, we may distinguish between self-conscious and
implicit learning. Self-conscious learning can occur with a single exposure to salient features of a literary work. Implicit learning,
however, is likely to require many exposures, commonly while
immersed in the culture and language of the source tradition. In
isolating literary universals, then, we may take into account the
degree to which a particular property is likely to have been transported from one tradition to another by learning of either sort,
given the degree of contact between the traditions. For example,
the practice of dramatic performance may be transmitted from
one tradition to another through limited interaction, as this may
be learned through a single exposure. The same point does not
hold for background imagery.
Finally, we may wish to expand our study of cross-cultural
patterns to actual borrowings. Here, too, it is crucial to distinguish different types of influence. We may roughly divide influence into two categories: hegemonic and nonhegemonic.
Hegemonic influence occurs when the source tradition has
greater economic power (e.g., in the publication and distribution of literary works), more pervasive control of government
or education, or a higher level of prestige (due, for example, to
military strength), or when it is otherwise in a position of cultural domination over the recipient society. Obvious cases are
to be found in colonialism. Common properties that result
from nonhegemonic influences are not universals themselves.
However, they may tell us something about cross-cultural aesthetic or related propensities. Common properties that result
from hegemonic influences, in contrast, may simply reflect the
effects of power.
Patrick Colm Hogan
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Comrie, Bernard. 1981. Language Universals and Linguistic Typology.
Chicago: University of Chicago Press.
Hogan, Patrick Colm. 2005. Literary universals and their cultural traditions: The case of poetic imagery. Consciousness, Literature, and
the Arts 6.2. Available online at: http://www.aber.ac.uk/cla/archive/
hogan.html.

ART, LANGUAGES OF
Languages of Art, a book by Nelson Goodman (1906–98), was
first published in 1968, with a second edition in 1976. The present entry focuses solely on this book, which raises interesting
questions about language in a general sense and its role in
aesthetic experience. This entry does not attempt to contextualize Goodman's book relative to his philosophy, for which see Daniel Cohnitz and Marcus Rossberg (2006) and Catherine Z. Elgin (1992); Goodman's later and related Ways of Worldmaking (1978) is also recommended.
By "languages" (of art), Goodman means, more generally, symbol systems; natural language is one of the symbol systems, which
include, for example, musical notation or the symbol system of
cubist painting. Certain characteristics of symbol systems, when
used in an artwork, place cognitive demands on its audience,
which make the artwork "good to think" (to borrow Claude Lévi-Strauss's term). The symbol systems from which artworks are
composed enable us to be exploratory, drawing on our cognitive
(including emotional) resources. This is because artworks are made from symbol systems that have one or more of the "symptoms of the aesthetic": syntactic density, semantic density, syntactic repleteness, and exemplification. These notions are defined
later in this entry.
According to Goodman, "A symbol system consists of a symbol scheme correlated with a field of reference" ([1968] 1976, 143). Goodman's primary interest in defining a symbol system is to
differentiate the notational from the non-notational schemes,
where "notation" is a technical notion to which his Chapter 4
is devoted. His concern about notations follows from a concern
with forgeries and fakes and with the fact that some types of
art (such as painting) can be faked while others (such as performance of a specific piece of music) cannot. Where a work is
defined by compliance to a score (i.e., it has a notation), it cannot
be faked; such works are called allographic. Where a work is not
defined by compliance to a score, as in the case of a painting, its
authenticity can be established only by tracing the history of its
production back to its origin, and this permits faking; such works
are called autographic.
A symbol system is built on a symbol scheme, which consists of characters (and usually modes of combination for these
characters). For example, for a natural language, a character is a
class of marks, where marks might include anything from single
sounds or letters up to whole spoken or written texts, as in the letter P, a character that is the class whose members are all the writings-down of the letter P. Symbol systems are either notations or
not notations. If the symbol system is a notation, the characters
of which its scheme is comprised must meet two conditions, as
follows:
(1) For a character in a notation, the members can be interchanged: different marks belonging to the same character are true copies of one another; this is called the condition of character-indifference and is true, for example, of letters of the English
alphabet.
(2) Characters in a notation must be finitely differentiated
or articulate; for a mark that does not belong to two characters, it must be theoretically possible to determine that it does
not belong to at least one of them (this is explained further
shortly).
Conditions (1) and (2) are the two syntactic requirements that define a symbol scheme as notational.
Characters in a scheme are correlated with things outside
the scheme. For example, the marks that make up a score are
correlated with elements in the performance of the score; the
mark that is a written word is correlated with a pronunciation of
that word; and the mark that is a written word is (also and independently) correlated with that word's referent. Goodman uses the term "complies" and says that the performance complies with the score, or that the referent, or pronunciation, complies with the written word. The set of things that comply with an inscription (e.g., the set of things that can be denoted by a name) is called the compliance class of the inscription. For the
symbol system to be a notation, it must first include a symbol
scheme that is notational (i.e., that satisfies the two syntactic
conditions), and it must also satisfy three semantic conditions,
as follows.
(3) Notational systems must be unambiguous; it must be
clear which object complies with each unique element of the
scheme.
(4) In a notation, compliance classes must be disjoint; for
example, a performance cannot comply with two different
scores.
(5) A notational system must be semantically finitely differentiated; for an object that does not comply with two characters, it must be theoretically possible to determine that the
object does not comply with at least one of them.
The notion of finite differentiation is important both in the
syntax and semantics of notational systems; finite differentiation
is articulation, and its lack constitutes density. As we will see,
though articulation is important for a notational system, density
is more generally important in defining works as aesthetic. Finite
differentiation requires gaps between elements in the system
(between characters, or between compliants); if between two
adjacent elements a third can always be inserted, the scheme
lacks finite differentiation. For example, a scheme lacks finite
differentiation if it has two characters, where all marks not longer than one inch belong to one character and all longer marks
belong to the other, and where marks can be of any length.
Between a mark belonging to the character of marks not longer
than one inch and a mark belonging to the character of longer
marks, it is always (theoretically) possible to have a third that
falls between them (this ever-diminishing between-space is a
kind of Derridean mise-en-abîme).
A symbol system is called a notation if it meets the five conditions. Goodman asks whether various types of symbol systems
that have been developed in the arts are notations. (A type of
artistic practice may be non-notational just because no notation
has been developed for it; in principle, notations might be developed for all of them, but in practice they have not been.) A traditional musical score is a character in a notational system. The
compliants of the score are the performances, which collectively
constitute the work of music. Similar comments are made for
Labanotation, a scoring system developed for dance. A literary
work is a character in a notational scheme (but not a character
in a notational system): Like the language from which it is composed, it meets the syntactic requirements for a notation, but not
the semantic requirements. A painting is a work that is in a symbol system that is not notational.
Having developed these notions, Goodman uses them as
a way of defining a representation, a problem raised and not
solved in the first part of the book, where, for example, he argues
that we cannot distinguish a representation by criteria such
as resemblance to the represented object. Representation for
Goodman is distinct from description (i.e., the term representation does not correspond to current cognitive science or linguistic
uses, in which propositions or tree structures are representations). A description uses a symbol scheme that is (syntactically)
articulate, whereas a representation uses a symbol system that is
dense (or uses symbols from a dense part of a symbol scheme).
He distinguishes between two types of dense (representational)
schemes, differentiating a diagram from a picture. His example is
a pair of representations that are visually identical, consisting of a
peaking line, one an electrocardiogram and the other a picture of the outline of Mount Fuji. What makes the electrocardiogram a diagram is that not every aspect of its form is relevant; the line
can vary in thickness or color without constituting a different
character. In contrast, every aspect of the form of the picture is
relevant; pictures have far fewer contingent features than diagrams, and pictures are thus said to be relatively (syntactically)
replete. The difference between diagram and picture is a matter
of degree; repleteness is a relative characteristic.
Goodman concludes his book by using these notions to
develop four "symptoms of the aesthetic." Objects have aesthetic symptoms when they use symbol systems that are syntactically dense, semantically dense, and syntactically replete.
The fourth symptom of the aesthetic is that aesthetic objects
exemplify. (In Ways of Worldmaking, Goodman explores the
notion of style and proposes that the style of an artwork is one
of the referents exemplified by its symbols, where style consists of "those features of the symbolic functioning of a work that are characteristic of author, period, place or school" (1978, 35). In the
same book, he introduces a fifth symptom of the aesthetic, which
is multiple and complex reference.) Note that the first three of
these symptoms are characteristic of non-notational symbol systems; all three are associated with density in some more general sense, which, Goodman says arises out of, and sustains, the
unsatisfiable demand for absolute precision, thus engaging our
interest in aesthetic works ([1968] 1976, 253).
Goodman concludes his discussion by asking what gives an
aesthetic object its value, both relative to other aesthetic objects
and, more generally, to us: What makes us want to know it? He
argues that aesthetic objects invite our interest by asking us to
understand what they are, including how their symbol systems
operate, and what they exemplify; these tasks are made particularly difficult by the four symptoms of the aesthetic, which thus
particularly stimulate our interest in aesthetic objects. He summarizes three criteria, drawn from general ideas about aesthetics: Engagement with artworks improves our fitness to cope with
the world, merely manifests our playfulness (i.e., homo ludens),
or communicates special kinds of knowledge to us. These are
partial insights into the primary purpose of our engagement with
aesthetic objects: "The primary purpose is cognition in and for itself; the practicality, pleasure, compulsion, and communicative utility all depend on this" ([1968] 1976, 258). The symbol
systems, or "languages," of art serve this purpose, allowing
for the possibility of producing symbolic objects that engage us.
Furthermore, the characteristic density of the symbolic systems
used in artworks, and the characteristic unparaphrasability of
what they express both permit a person to reenter the same artwork and repeatedly to discover new things in it.
Nigel Fabb
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Cohnitz, Daniel, and Marcus Rossberg. 2006. Nelson Goodman. Montreal: McGill-Queen's University Press.
Elgin, Catherine Z. 1992. Depiction. In A Companion to Aesthetics, ed. David Cooper, 113–16. Oxford: Blackwell.
Elgin, Catherine Z. 1992. Nelson Goodman. In A Companion to Aesthetics, ed. David Cooper, 175–77. Oxford: Blackwell.
Goodman, Nelson. [1968] 1976. Languages of Art. 2d ed. Indianapolis: Hackett.
Goodman, Nelson. 1978. Ways of Worldmaking. Indianapolis: Hackett.
Goodman, Nelson, and Catherine Z. Elgin. 1988. Reconceptions in Philosophy and Other Arts and Sciences. London: Routledge.

ARTICULATORY PHONETICS
Articulatory phonetics is that part of phonetics that studies how
speech is produced by the lips, tongue, velum (soft palate), larynx, and lungs to alter the air pressures and airflows and turbulent
noises in the vocal tract and to create the air spaces that yield the
resonances that differentiate speech sounds. The basic vocabulary
of articulatory phonetics is used for the taxonomic, that is, classificatory, description of speech sounds. For example, the initial
sound in the French word père, "father," would be described as a voiceless bilabial stop, symbolized using the IPA (International Phonetic Alphabet) symbol [p]. The initial sound in the English word pear would be described as a voiceless aspirated bilabial stop and symbolized using the IPA symbol [pʰ]. However, besides
this essential taxonomic function, articulatory phonetics studies the mechanisms (e.g., muscular, aerodynamic) that produce
speech and especially the how and why of variability in speech
sounds. The following is a brief overview of the subject; for in-depth accounts, the readings listed below should be consulted.

History
The study of speech articulation and the development of a
descriptive terminology has an impressive history, with the first
surviving instance being the Aṣṭādhyāyī of the Sanskrit grammarian Pāṇini (ca. 500 B.C.E.), who gave an articulatory account
of the relatively large sound inventory of Sanskrit. Other notable
achievements in the characterization of speech sounds were
given by many Greek grammarians, notably Dionysius Thrax
(first cent. B.C.E.); the Arab and Persian grammarians al-Khalil Ibn Ahmad and Sībawaihi of the eighth century, who described
the Arabic of their times; the First Grammarian of Iceland
(twelfth cent.); and the work commissioned by and credited to
the Korean King Sejong (fifteenth cent.), which provided not only
an articulatory description of Korean as spoken then but also a
transcription, now the official orthography for Korean, hangul,
which is partially iconic in its representation of how the sounds
are produced.
In Europe, the Baroque and modern eras saw dozens of
proposals for the description of speech sounds, for example,
by John Wilkins, Johan Conrad Amman, William Holder,
Francis Lodwick, Alexander J. Ellis, Robert Nares, Ernst Brücke,
Richard Lepsius, Alexander Melville Bell, Henry Sweet, and Otto
Jespersen. Although there is still some variation in the descriptive terms, works such as Catford (1977) and Maddieson (1984)
have helped to standardize the terminology.

The Basics
Speech articulations enable communication between speakers
and listeners because they create sound; it is the sound transmitted to listeners and the perception of these sounds that are
the ultimate goal in speaking. Descriptions of articulation are
intended to capture the gestures that create these distinctive elements in the speech code.

Sound is short-term variations or disturbances in ambient air pressure. These pressure disturbances are created when air
moves from a region of high pressure to a region of low pressure.
There are three piston-like articulatory movements that can
create such pressure differentials with respect to atmospheric
pressure. These, which J. C. Catford calls the initiation mechanisms, are pulmonic, glottalic, and velaric. These mechanisms
can either create a positive pressure vis-à-vis atmospheric pressure, in which case they are called egressive, or a negative pressure, in which case they are called ingressive.
Pulmonic egressive initiation is by far the most common. All
languages use it and most use it exclusively. The chest cavity, by
virtue of decreasing its volume as in normal respiratory expiration, compresses the air in the lungs, thus raising lung pressure
above that of the atmospheric pressure. Since speech necessarily involves valves that impede the exiting airflow (e.g., the
adducted vocal cords and/or whatever articulations are made in
the oral cavity), the pulmonic or subglottal pressures developed
in speech are much larger than those seen in quiet respiratory
expiration. Because such initiation is so common, it is normally
not included in the usual phonetic descriptions; for example,
the [p] in French pre, which would otherwise be described as
pulmonic expiratory voiceless bilabial stop, is usually designated
simply as voiceless bilabial stop.
Pulmonic ingressive initiation (so-called "ingressive voice")
is possible and is encountered in many cultures, notably in
Scandinavia and France, where short interjections, ja, oui, non,
can be uttered on ingressive voice (usually with some breathiness), but although some sociolinguistic or pragmatic contrast
may be associated with this trait, no language documented so
far uses pulmonic ingressive initiation to make lexical contrasts.
Ingressive phonation may also be encountered as a (not very
effective) vocal disguise, and it is universally encountered as a
kind of coda to very young babies' cries where the vocal cords are
still approximated but the respiration has shifted from expiratory
to inspiratory.
If the vocal cords are tightly closed and the larynx as a whole
is raised, acting like a piston, while there is a complete closure in
the oral cavity (and with the velum raised), a positive pressure
may be developed. Such sounds, glottalic egressives or ejectives,
are not uncommon, being found in various African languages
(from different language families), in some languages of South
and Central America and the Pacific Northwest (in the Americas),
and in the Caucasus. For example, Quechua "bread" is [tʼanta].
Glottalic ingressives or implosives involve the larynx most
commonly when the vocal cords are in voicing position being
lowered during the stop closure, thus creating a negative pressure in the oral cavity or at least moderating the buildup of positive pressure. Historically, such stops often come from voiced,
especially geminated (long), stops, for example, Sindhi /paɓuni/ "lotus plant fruit" < Prakrit *paba. Enlarging the oral cavity helps
to maintain a positive pressure drop across the glottis, which
favors voicing. Although ejective fricatives are attested, there are
no implosive fricatives probably because the noise of a fricative
is generated when the air jet expands after leaving the narrow
constriction. Such expansion would occur inside the vocal tract
if made implosively, and the sound would be attenuated by the
oral constriction.

If an air pocket is trapped between the tongue and palate or
the tongue dorsum and lips, and the tongue is lowered, a large
negative pressure can be generated, which, when released, can
create quite a loud sound. Such sounds, so-called clicks, have
velaric ingressive initiation. They are common in cultures all over
the world as interjections, signals to animals, and so on. However,
they are used as speech sounds to differentiate words only in a
few languages of southern and East Africa. They are very common in the Khoisan languages and a few neighboring Bantu languages. For example, Khoi one is [|ui] where the [|] is a dental
affricated click (the sound often symbolized in Western orthographies as tsk or tut). In the Khoisan languages, clicks can be
freely combined with pulmonic and/or glottalic egressives either
simultaneously or in clusters. Velaric egressives, where a positive pressure is created by upward and forward movement of the
tongue, are not found in any language's lexicon but are used in
some cultures as a kind of exaggerated spitting sound, where the
positive pressure creates a brief bilabial trill.
In general, after the mechanism of initiation is specified for
speech sounds, there are three main categories of terms to further characterize them: place, manner, qualifiers. For example,
the Russian [bratʲ] "to take" has in word-final position a voiceless palatalized dental stop. The manner is "stop," the place is "dental," and "voiceless" and "palatalized" are qualifiers.

Manners
There are two fundamental categories of manners, obstruent and
sonorant, each with subcategories. An obstruent is a sound that
substantially impedes the flow of air in the vocal tract to a degree
that turbulent noise is generated either as continuous frication or
as a noise burst. Obstruents may be stops or fricatives. (Ejectives,
implosives, and clicks are inherent stops.) Sonorants, which do
not impede airflow, are subdivided generally into laterals, glides,
approximants, nasals, and vowels.
Stops present a complete blockage of the airflow, for example, the glottal stop in the name of the Hawaiian island Oʻahu [oʔahu]. A special subclass of stops is the affricates, which are stops with a fricative release, as in the initial and final consonants of English judge [dʒʌdʒ]. Another subclass, often categorized in other ways, comprises the trills, for example, the initial sound of Spanish roja "red" [roxa].
Fricatives present a partial blockage, but with noise generated due to air being forced through a relatively narrow constriction, for example, the velar fricative in Dutch groot "large" [xroːt]. Fricatives may be further divided into sibilants (s-like fricatives), made by an apical constriction on or near the alveolar ridge, and nonsibilants. The essential characteristic of the sibilants, as opposed to the nonsibilant fricatives, is relatively loud high-frequency noise that exploits the small downstream resonating cavity, for example, English schist "a category of rock" [ʃɪst], which contains two different sibilant fricatives. Nonsibilant fricatives either have no downstream resonating cavity (e.g., the bilabial fricative in Japanese Fuji, name of the famous mountain, [ɸuʑi]) or, like the velar
fricative, are made further upstream of the alveolar region and so
have a longer downstream resonating cavity and, thus, lower frequencies. Presence or absence of voicing also affects fricatives'
loudness: Even if paired voiced and voiceless fricatives have the
same degree of constriction, voiceless fricatives will have more intense frication noise, because with no resistance to the airflow at the glottis, the velocity of airflow will be greater at the oral constriction, and that also affects the degree and loudness of the
air turbulence.
Subcategories of sonorants include laterals, where the constriction is on one side of the palate, the other being open, for
example, the medial geminate (or long) alveolar lateral in Hindi
palla "loose end of a sari used as a head covering" [pəlːaː]. Nasals
are consonants made with a complete closure in the oral cavity (at any place farther forward of the uvular region) but with a
lowered velum, for example, Tswana [ŋku] "sheep" with an initial velar nasal. Glides and approximants have nonlateral oral
constrictions that are not sufficient to generate turbulence, for
example, the labial-velar glide at the beginning of the English
word [wd]. Vowels are considered to have the least constriction
(descriptive terms follow).

Place
The primary places of articulation of speech sounds, proceeding from the farthest back to the farthest forward, are: glottal, pharyngeal, uvular, velar, palatal, alveolar, dental, and labial. Some
of these places have already been illustrated. Finer place distinctions can easily be made if necessary by appending the prefixes
pre- and post-, and multiple simultaneous constrictions can be
differentiated by concatenating these terms as was done with the
labial-velar glide [w]. In most cases, these anatomical landmarks
on the upper side of the vocal tract are sufficient; if necessary, an
indication of the lower (movable) articulator can be specified, for
example, the voiced labial-dental fricative [v] as in French voir "to see" [vwaʁ] (as opposed to, say, the voiced bilabial fricative [β] in Spanish cerveza "beer" [seɾβesa]).

State of the Glottis


In most cases, specifying whether the vocal cords are apart and
not vibrating (voiceless), lightly approximated and vibrating
(voiced), or tightly pressed together (glottal stop) is sufficient.
However, voicing itself occasionally needs to be further differentiated as breathy (a more lax type of voicing lacking energy in the
higher harmonics), tense (rich in higher harmonics), or creaky
(irregular staccato type of phonation, also with much energy in
the higher frequencies, though since it is irregular, one cannot
clearly identify harmonics as such). Many of the Indic languages
employ a distinctive breathy voice associated with voiced stops,
for example, Hindi bhāshā "language" [bʱaːʃaː]; and in many languages and for many speakers, voiced phonation changes to creaky
at a point of low F0 in intonation. (Low F0 is a low rate of vibration of the vocal cords due to lesser tension giving rise to low
pitch or note of the voice.) Creaky voice is also a common variant of glottal stop.

Vowels
The descriptors for vowels deviate from those for consonants.
An imaginary quadrilateral space in the mouth (seen sagittally)
is posited, and vowels are said to have the high point of tongue
at regions in this space whose dimensions are, vertically, high, mid, and low and, horizontally, front, central, and back. (These may
also have further qualifiers. In French, the high front unrounded vowel [i] contrasts with a high front rounded vowel [y] and with a high back rounded vowel [u], e.g., dit "(s/he) said" [di] vs. du "of the" [dy] vs. doux "soft" [du].) Although ostensibly referring to
anatomical features, it is now accepted that these descriptors
actually correspond to acoustic-auditory features of vowels: The
height dimension correlates inversely with their first formant,
the front dimension correlates directly with their second formant, and unrounded-rounded correlates roughly with their
third formant. It is still technically possible to apply the traditional anatomical-physiological descriptors to vowels, in which
case [i] would be a close palatal vowel with spread lips, and [u] a
close velar vowel with lip rounding, and [ɑ] a pharyngeal vowel
with a widely open mouth shape. Other vowels would just be
variants of these with either less constriction or intermediate
places of constriction. There is merit in applying the anatomical-physiological labels, for example, to explain the Danish dialectal variant pronunciations [bi] ~ [biç] "bee." The latter variant with
the voiceless palatal fricative can arise simply from the vowel terminus being devoiced (since, it will be recalled, the degree of turbulence is determined not only by the degree of closure but also
by the velocity of the airflow, which, in a voiceless vowel, is high enough to generate turbulence at the point of constriction).

Secondary Articulations or Modifications


There are dozens of ways a speech sound can be qualified.
Typically, these are additional modifications or lesser constrictions that can be done simultaneously with the primary constriction, or are in such close temporal proximity to or invariably
linked to it that they are considered inherent to it. The label "breathy voiced" is an example. Some additional examples (where the italicized term is the qualifier): voiceless aspirated velar stop, as in English key [kʰi]; the English phoneme /ʃ/, often phonetically a voiceless post-alveolar labialized fricative, as in ship [ʃʷɪp]; the nasalized mid-front vowel as in French faim "hunger" [fɛ̃].

Prosody, Tone, Intonation


The terminology describing distinctive uses of voice pitch and
relative timing of sounds (and, perhaps, different voice qualities)
is still relatively nonstandardized except in the case of tones. The
International Phonetic Association's transcription recognizes
a variety of possible tone shapes, for example, Thai [kʰa\|] (with falling tone) "servant" versus [kʰa/|] (with rising tone) "leg." Here,
the vertical line is supposed to represent the range of voice pitch
characteristic of the speaker, the sentence context, and so on,
and the attached line the range and direction of the distinctive
pitch modulation.

Beyond Taxonomy
The conventional descriptions of speech just reviewed form the
basis for scientific investigations of considerable sophistication
and with applications in fields as diverse as medicine (especially
speech pathology), man-machine communication, first (and
subsequent) language learning, and phonology. These investigations involve study not just of the anatomical-physiological character of speech sounds but also, as was hinted at in the preceding discussion, of speech aerodynamics, speech acoustics,
preceding discussion, speech aerodynamics, speech acoustics,
speech perception, and neurophonetics. Space allows just one
example in the area of phonology: Medial stops emerge seemingly
out of nowhere in words such as glimpse < gleam + s (nominalizing suffix), dempster "judge" < deem + ster, Thompson < Thom + son, youngster [jʌŋkstə] < young [jʌŋ] + ster. One has to ask where the
medial stop came from in these nasal + fricative clusters, neither
element of which is a stop. The answer emerges when one considers that these speech sounds are opposite in the state of the
oral and velic exit valves. The nasal has all oral exit valves closed
and the velic valve open whereas the fricative has an oral valve
open and the velic valve closed. If in the transition between the
nasal and fricative the velic valve should close prematurely, then
all exit valves will be closed and thus a brief epiphenomenal stop
will emerge. (For more examples, see Ohala 1997.)
John Ohala
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Browman, C. P., and L. Goldstein. 1986. Towards an articulatory phonology. In Phonology Yearbook 3, ed. C. Ewan and J. Anderson, 219–52.
Cambridge: Cambridge University Press.
Catford, J. C. 1977. Fundamental Problems in Phonetics. Bloomington,
IN: Indiana University Press.
Goldstein, L., and C. P. Browman. 1986. Representation of voicing contrasts using articulatory gestures. Journal of Phonetics 14: 339–42.
Hardcastle, W. J., and J. Laver. 1999. The Handbook of Phonetic Sciences.
Oxford: Blackwell.
Huffman, M. K., and R. Krakow, eds. 1993. Nasals, Nasalization and the
Velum: Phonetics and Phonology, Vol. 5. San Diego, CA: Academic
Press.
Ladefoged, P. 1964. A Phonetic Study of West African Languages: An
Auditory-Instrumental Survey. Cambridge: Cambridge University
Press.
Ladefoged, P., and I. Maddieson. 1995. The Sounds of the World's
Languages. Oxford: Blackwell.
MacNeilage, P. F., ed. 1983. The Production of Speech. New York: Springer-Verlag.
Maddieson, I. 1984. Patterns of Sounds. Cambridge: Cambridge University
Press.
Ohala, J. J. 1990. Respiratory activity in speech. In Speech Production and
Speech Modelling, ed. W. J. Hardcastle and A. Marchal, 23–53. Dordrecht,
the Netherlands: Kluwer.
———. 1997. Emergent stops. Proc. 4th Seoul International Conference on Linguistics [SICOL], 11–15 Aug. 1997: 84–91.
Rothenberg, M. 1968. The Breath-Stream Dynamics of Simple-Released Plosive Production. Basel: Karger.
Silverman, D. 2006. A Critical Introduction to Phonology: Of Sound, Mind,
and Body. London and New York: Continuum International Publishing
Group.
Solé, M.-J. 2002. Aerodynamic characteristics of trills and phonological patterning. Journal of Phonetics 30: 655–88.

ARTIFICIAL LANGUAGES
An artificial language can be defined as a language, or language-like system, that has not evolved in the usual way that natural
languages such as English have; that is, its creation is due to
conscious human action. However, this definition leaves open
some questions. For one thing, what do we mean by language
or language-like system? Among the systems of communication that could be, and have been, called artificial languages
are systems of logic, for example, predicate calculus, and computer languages, such as BASIC. However, the functions of these
languages are different from the function of natural languages, which is communication among humans. I, therefore, focus on
artificial languages that have this latter function, for example,
Esperanto. Under the heading of artificial languages, one might
also include languages that have been made up in connection
with novels, films, television programs, and so on, for example,
Klingon (fictional or imaginary languages), or as part of some
other imaginary world, or those that have been created for the
enjoyment of their designer (personal languages).
Some languages (philosophical languages) were designed
to reflect the real world better than natural languages. Some of
the earliest known ideas on artificial languages, from the seventeenth century, involve this type. The terms constructed language (or conlang) and planned language are roughly equivalent
to artificial language (although one could point out that some
natural languages have undergone a degree of planning), while
(international) auxiliary language covers only those languages
intended for international communication (of course, some natural languages are also used for this); many, if not most, artificial
languages have been created for this purpose.
Another question concerns our notions of artificial and natural. On the one hand, many (arguably all) natural languages have
been subjected to some human manipulation. Consider, for example, the long line of English prescriptivists who have tried to eliminate some constructions of the language, or organizations such as
the French Academy, which has attempted to keep some English
words out of French. Although many of these manipulations have
not completely succeeded, they have had some effect, and therefore one could argue that English and French are partly artificial.
On the other hand, many consciously created languages were
built from elements of one or several natural languages and could
thus be considered not entirely artificial. Therefore, the boundary
between natural and artificial languages is not entirely clear.
In fact, a common classification of artificial languages is in
terms of whether they are based on natural languages: a posteriori languages are, while a priori languages are not (the philosophical languages belonging to the second group). That is, a
priori languages are (supposedly) built from scratch, not taking anything from natural languages. This is a simplification, as
few, if any, languages are entirely a priori; many contain both a
posteriori and a priori components. Therefore, the distinction
should, rather, be seen as a spectrum, with languages at different
points having differing ratios of a priori and a posteriori components. Artificial languages consisting of substantial proportions
of both types are called mixed languages.
Esperanto stands far above other artificial languages in terms
of success: it has vastly more speakers than any other (and
even some native speakers). It has been claimed to have more
than a million speakers, though some would disagree with such
a large number, and of course, the question hinges on how one
defines a speaker. Only a relatively small number of artificial languages have achieved much of a community of speakers. These
include Volapük, Interlingua, and Ido, the latter being a modified
Esperanto. Many artificial languages were not used by anyone
other than their designer and perhaps several other people. In
fact, a large number of artificial languages were never fully developed, with only incomplete descriptions having been published.
Let us now see some examples of sentences in artificial languages. Because the a priori languages do not (intentionally) use elements from any natural languages, on the surface they may
seem rather strange, as shown by the following examples from the
language Oz (which, in spite of its name, was a serious project):
(1)

ap if-blEn-vOs
he HABITUAL-seldom-study
'he seldom studies' (Elam 1932, 20)

(2)

ep ip-Qks ap
I PAST-see him
'I saw him' (ibid.)

However, one could assert that since even a priori languages are human creations, they cannot be that different from natural
languages.
A posteriori languages can draw from several languages or
from just one. In the latter case, they are usually or always simplifications of the language. There have been many such simplifications of Latin, some of English, and some of other languages.
Following is a Latin sentence and its equivalent in SPL (or SIMPLATINA), an artificial language created from Latin.
(3)

Nuntium audiverat antequam domum venit.


Fin audit nūntium āntequam fin venit in domus.
'He had heard the news before he came home.' (Dominicus
1982, 21)

One might be surprised to learn that there are a thousand or more artificial languages, even excluding fictional ones (but
including languages that were not fully elaborated). It might also
be unexpected that people have continued to devise new artificial languages for international communication, given how many
have already been proposed and not achieved their goal. The
existence of the Internet may have served as an impetus, since it
is now easy for language creators to present their languages to a
wide audience. The number of artificial languages will probably
keep increasing, though with none of them achieving the status
of a universal second language.
Alan Reed Libert
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Albani, Paolo, and Berlinghiero Buonarroti. 1994. Aga Magéra Difúra.
Bologna: Zanichelli. An encyclopaedia with many entries for artificial
languages, including fictional languages, and their creators.
Dominicus, Richardius. 1982. SPL. Wisconsin: Dominicus Publishing
House.
Elam, Charles Milton. 1932. The Case for an A Priori Language. Cincinnati,
OH: The Open Sesame Press.
Large, Andrew. 1985. The Artificial Language Movement. Oxford: Basil
Blackwell.
Pei, Mario. 1968. One Language for the World. New York: Biblo and
Tannen.

ASPECT
Situations unfold over time. When we talk about them, we often
specify how they unfold over time (or not). There are many ways
in which language conveys such temporal information. While
tense specifies location of an event in relation to other points
in time (e.g., past, present, future, pluperfect), aspect specifies internal temporal structure of a situation (e.g., whether it is
ongoing or completed). This is important information to convey
for our linguistic communication to be successful, and many languages convey it by various means: lexical, grammatical, and/
or pragmatic. English grammatically marks tense (-ed, -s, will),
while Chinese does not, relying instead on lexical and pragmatic
means. English also grammatically marks aspect (progressive
be V-ing), while Hebrew does not.
Grammatical marking of aspect, often encoded in auxiliaries and inflections, is known as grammatical aspect, or viewpoint aspect. It is called viewpoint aspect because it signifies
the speaker's viewpoint. When one chooses to say 'He ran a mile', one is viewing the situation from outside, disregarding its internal structure (perfective aspect), while if one says 'He was running a mile', the beginning and end of this situation
are disregarded and one is focusing on the internal structure of
this situation (imperfective aspect) (Comrie 1976; Smith 1997).
The former is often used to push the narrative storyline forward
(foreground), while the latter is associated with background
information (Hopper 1979).
Equally important in conveying aspectual information is lexical aspect, also known as inherent (lexical) aspect, situation
aspect (or situation type), aktionsart, event type, and so on. This
is defined by the temporal semantic characteristic of the verb
(and its associated elements) that refers to a particular situation.
Although there are numerous proposals, the most well known is
the classification proposed by Zeno Vendler (1957):
Achievement: that which takes place instantaneously, and is
reducible to a single point in time (e.g., recognize, die, reach
the summit)
Accomplishment: that which has dynamic duration, but has a
single clear inherent endpoint (e.g., run a mile, make a chair,
walk to the store)
Activity: that which has dynamic duration, but with an arbitrary endpoint, and is homogeneous in its structure (e.g., run,
sing, play, dance)
State: that which has no dynamics, and continues without
additional effort or energy being applied (e.g., see, love, hate,
want)
Lexical aspect has proved to be important in linguistic analysis, acquisition, and processing of aspect. In linguistic analysis,
Carlota S. Smith (1997) proposed the two-component theory, a
system in which the aspectual meaning of a sentence is determined by the interaction between lexical aspect and grammatical aspect. For example, imperfective aspect (e.g., progressive in
English) takes the internal view, and, therefore, it is compatible
with durative predicates of activity and accomplishment and
yields progressive meaning. In contrast, achievement, since it is
nondurative, is not so compatible with imperfective aspect, and
such pairing is often anomalous (e.g., *He is noticing the error) or
results in a 'preliminary stages' meaning (e.g., He is dying).
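Vendler's classes and their interaction with grammatical aspect can be summarized in a small feature table. The sketch below is schematic (the three-feature encoding follows common practice in the aspect literature, but the names and values here are this example's own illustration, not taken from the entry):

```python
# Vendler's four classes encoded as semantic features, plus a Smith-style
# check of whether the English progressive combines naturally with each.

VENDLER = {
    # class:           (dynamic, durative, telic)
    "state":           (False,   True,    False),  # love, want
    "activity":        (True,    True,    False),  # run, sing
    "accomplishment":  (True,    True,    True),   # run a mile
    "achievement":     (True,    False,   True),   # die, notice
}

def progressive_ok(cls):
    """The progressive takes an internal view, so it wants a dynamic,
    durative predicate; achievements, being nondurative, are anomalous
    or shift to a preliminary-stages reading."""
    dynamic, durative, _ = VENDLER[cls]
    return dynamic and durative

for cls in VENDLER:
    print(cls, progressive_ok(cls))
# state False / activity True / accomplishment True / achievement False
```

The same table can be extended with compatibility rules for perfective marking, mirroring the prototypical pairings discussed below.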
In acquisition, this interaction of lexical and grammatical
aspect has been observed since the 1970s. Cross-linguistically,
when children acquire tense-aspect morphology, they show strong association between (perfective) past tense marking and telic verbs (achievements and accomplishments), between general imperfective marking (such as French imparfait) and atelic verbs (states and activities), and between progressive (i.e., dynamic imperfective) marking and
activity verbs (see Li and Shirai 2000 for a review). Psychologists
and linguists alike have tried to explain this observation. One
important proposal relies on innateness (the language bioprogram hypothesis, Bickerton 1981), while an alternative proposal
is based on input frequency (Shirai and Andersen 1995).
The notion of compatibility is crucial when we discuss the
interaction of lexical and grammatical aspect since some combinations are more natural, prototypical, and frequent. Telic verbs
are more compatible with perfective aspect, while activities
are most naturally associated with progressive marking. This is
reflected in frequency distribution cross-linguistically (Andersen
and Shirai 1996). For example, about 60 percent of past tense
markers in child-directed speech in English were attached to
achievement verbs, while almost 95 percent of past tense forms
in children's speech were used with achievement verbs (e.g.,
broke, dropped) when children started using them (Shirai and
Andersen 1995).
This frequency effect is not yet well recognized in the area of
language processing. Carol J. Madden and Rolf A. Zwaan (2003)
and Todd Ferretti, Marta Kutas, and Ken McRae (2007) found
the strong effect of grammatical aspect in their experiments on
aspectual processing, but they did not manipulate the effect
of lexical aspect. Although Madden and Zwaan (2003) found a
facilitating effect of perfective aspect on processing but not of
imperfective aspect, their experiments used only accomplishment verbs, which are telic and more compatible with perfective
aspect (i.e., past tense in English). Foong Ha Yap and colleagues
(2009) replicated facilitating effects of perfective aspect with
accomplishments and, in addition, of imperfective (progressive)
aspect with activities in Cantonese. Thus, the interaction of lexical and grammatical aspect is pervasive and cannot be ignored
in any research involving aspectual phenomena.
Yasuhiro Shirai
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Andersen, Roger W., and Yasuhiro Shirai. 1996. Primacy of aspect in
first and second language acquisition: The Pidgin/Creole connection.
In Handbook of Second Language Acquisition, ed. W. Ritchie and T.
Bhatia, 527–70. San Diego, CA: Academic Press.
Bickerton, Derek. 1981. Roots of Language. Ann Arbor, MI: Karoma.
Comrie, Bernard. 1976. Aspect. Cambridge: Cambridge University Press.
Ferretti, Todd R., Marta Kutas, and Ken McRae. 2007. Verb aspect
and the activation of event knowledge. Journal of Experimental
Psychology: Learning, Memory, and Cognition 33.1: 182–96.
Hopper, Paul J. 1979. Aspect and foregrounding in discourse. In Syntax
and Semantics. Vol. 12: Discourse and Syntax, ed. T. Givón, 213–41.
New York: Academic Press.
Li, Ping, and Yasuhiro Shirai. 2000. The Acquisition of Lexical and
Grammatical Aspect. Berlin: Mouton de Gruyter.
Madden, Carol J., and Rolf A. Zwaan. 2003. How does verb aspect constrain event representation? Memory & Cognition 31: 663–72.
Shirai, Yasuhiro, and Roger W. Andersen. 1995. The acquisition of tense-aspect morphology: A prototype account. Language 71: 743–62.
Smith, Carlota S. 1997. The Parameter of Aspect. 2d ed. Dordrecht, the
Netherlands: Kluwer.
Vendler, Zeno. 1957. Verbs and times. Philosophical Review 66: 143–60.

Yap, Foong Ha, Patrick Chun Kau Chu, Emily Sze Man Yiu, Stella Fay
Wong, Stella Wing Man Kwan, Stephen Matthews, Li Hai Tan, Ping Li,
and Yasuhiro Shirai. 2009. Aspectual asymmetries in the mental representation of events: Role of lexical and grammatical aspect. Memory
& Cognition 37: 587–95.

AUDITORY PROCESSING
Auditory processing refers to the cognitive processes that enable
a listener to extract a message from the raw material of a speech
signal. The study of auditory processing draws upon a range of
sources within the linguistic sciences: most notably cognitive
psychology, discourse analysis, phonetics, phonology
and neurolinguistics (see brain and language). It is distinct
from theories of empathetic listening (what makes a good listener) in areas such as counseling.
It was not until the 1960s that a significant body of listening research developed with the advent of more sophisticated
recording equipment and the increased availability of spectrograms to display the physical characteristics of the speech
signal (see acoustic phonetics). The many advances in our
understanding of the skill since then include early work on
phoneme perception by the Haskins Laboratories; the recognition that processing occurs on line rather than waiting for
an utterance to be completed; and insights into how listeners
identify word boundaries in connected speech. Evidence of
the extent to which listeners have to build a message for themselves on the basis of inference has resulted in a sharp move
away from a view of listening as a passive skill and toward the recognition that a listener actively engages in a process of meaning
construction.
Auditory processing falls into two closely linked operations:
Decoding, where acoustic stimuli are translated into linguistic
units;
Meaning construction, which embellishes the bare meaning
of the utterance by reference to knowledge sources outside
the signal. It also requires listener decisions as to the importance of what has been said and how it is linked to the discourse that preceded it.
The listener thus draws upon four information sources. The first
is perceptual, based on the signal reaching the listener's ear. The second is linguistic, consisting of the listener's stored knowledge of the phonology, lexis, and syntax of the language being spoken. The third is external: drawing upon the listener's knowledge of the world, the speaker, the topic, and the type of situation. A final component is the listener's ongoing model of what
has been said so far.

Decoding
Decoding is principally a matching operation in which evidence
from the signal is mapped onto stored representations in the
listener's mind of the phonemes, words, and recurrent chunks
of a language (see mapping). The process was once represented
as taking place in a linear and bottom-up way, with phonemes
shaped into syllables, syllables into words, and words into
clauses. In fact, listeners appear to draw upon several levels of
representation at once, with their knowledge of higher-level units, such as syllables, words, and formulaic chunks, influencing their decisions as to what has been heard. Decoding is also assisted
by context, in the form of world knowledge, knowledge of the
speaker, and recall of what the speaker has said so far.
There has been much discussion as to whether these pieces of
information combine in the mind of the listener during decoding
or whether they are handled separately. The argument behind
the first (interactive) view is that all the evidence can be considered simultaneously; the argument behind the second view
(modularity) is that the processor operates more rapidly if it
employs localized criteria specific to the phoneme, the word, or
the context.
Decoding can be discussed at three levels.
The first is phoneme recognition (see speech perception).
Translating acoustic evidence into the sounds of the target language does not involve simple one-to-one matching. There
is, firstly, an issue of noninvariance: researchers have not succeeded in finding clusters of cues that uniquely identify individual phonemes. Indeed, they have discovered that the same set of
cues may be interpreted differently according to the phonemes
that precede and follow them. There is also an issue of nonlinearity: The phonemes within a word are not clearly bounded units
but blend into each other in a process known as co-articulation.
A further complication is speaker variation: A listener has to
adjust to differences between individual speakers in pitch of
voice, accent, speech rate, and so on.
One solution holds that listeners employ a more reliable
unit of analysis than the phoneme. They might map direct from
acoustic stimuli to words stored in their minds, or they might use
the syllable as their principal perceptual unit. Another solution
views phoneme recognition as the outcome of cue trading, where
the listener weighs competing evidence until a particular candidate emerges as the most likely.
These accounts tend to assume that we have in our minds
a set of idealized templates for the sounds of a language and
match imperfect real-life examples to them by normalization, that is, by editing out features that are nonstandard or irrelevant. An
alternative approach shifts the focus away from processing and
onto how the listener represents the sounds. A variant of the
template notion suggests that a phoneme may be represented
in an underspecified way that stores only a few features essential to its recognition. A second possibility is that phonemes
are stored as prototypes with a range of permissible variation associated with each. But there is now increasing support
for a view that we do not construct a central representation of a
phoneme but instead store multiple examples of the words we
hear, uttered in their variant forms. This exemplar account
accords with evidence that the human mind is better at storing
massive amounts of information than was previously supposed.
It explains how we are able to adjust to unfamiliar accents in
our native language and to recognize the same word uttered in a
range of different voices.
The second level is word recognition. The listening process takes place on line, with a listener able to shadow (repeat)
what a speaker says at a delay of about a quarter of a second.
Cohort theory (Marslen-Wilson 1987) postulates that a listener
retrieves a bank of possible word matches when the initial phonemes of a word are uttered, then gradually narrows them down as more of the word is heard. The correct item is identified when
the word's uniqueness point is reached and the cohort is reduced
to one possible match.
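The narrowing process that cohort theory postulates can be illustrated with a toy lexicon. The sketch below is invented for illustration (letters stand in for phonemes, and the word list is this example's own):

```python
# Toy cohort model: candidates are pruned as each segment arrives, and a
# word is recognized at its uniqueness point, where one candidate survives.

LEXICON = ["man", "manner", "manager", "trespass", "tress", "trestle"]

def recognize(word):
    """Return the number of segments heard when the word becomes the
    sole surviving cohort member, or None if it has no uniqueness point."""
    cohort = list(LEXICON)
    for i in range(1, len(word) + 1):
        prefix = word[:i]
        cohort = [w for w in cohort if w.startswith(prefix)]
        if len(cohort) == 1 and cohort[0] == word:
            return i
    return None

print(recognize("trespass"))  # 5: "tresp" already identifies it uniquely
print(recognize("man"))       # None: still a prefix of "manner"/"manager"
```

The second call illustrates the problem raised in the next paragraph: "man" never reaches a uniqueness point, since it remains a prefix of longer lexical entries.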
However, many words do not have a uniqueness point (the
sequence man might be a complete word or the first syllable of
manner or manager), and there are no consistent gaps between
words in connected speech to mark where boundaries fall.
Locating boundaries (lexical segmentation) is unproblematic
when one is listening to languages that bear fixed stress on the
first, penultimate, or last syllable of the word but becomes an
issue when processing languages with variable stress. Research
suggests that listeners exploit certain prosodic features of these
languages in order to establish the most likely points for words to
begin or end. In English, they take advantage of the fact that the
majority of content words in running speech are monosyllabic or
begin with a strong syllable (Cutler 1990).
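The metrical segmentation strategy described for English can be sketched as a simple rule: posit a likely word onset at every strong (full-vowel) syllable. The toy example below is a schematic illustration (the syllable coding and sample input are invented):

```python
# Metrical segmentation sketch: hypothesize word onsets at strong syllables.

def likely_onsets(syllables):
    """syllables: list of (syllable, is_strong) pairs from running speech.
    Return the indices at which a content word is likely to begin."""
    return [i for i, (_, strong) in enumerate(syllables) if strong]

# Toy stretch of running speech; True marks a strong (full-vowel) syllable.
speech = [("con", True), ("duct", False), ("as", False), ("cends", True)]
print(likely_onsets(speech))  # [0, 3]
```

Such a heuristic misparses weak-initial words, which is consistent with the finding that most English content words in running speech begin with a strong syllable.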
A further problem for lexical recognition is that many words
in connected speech (particularly function words) occur in
a reduced form. They might be brief, of low saliency, and very
different from their citation forms. Using the gating method,
which presents connected speech in gradually increasing segments, researchers have demonstrated that word identification
in listening is sometimes a retrospective process, with listeners
unable to identify a word correctly and confidently until two or
more syllables after its offset.
There have also been attempts to model lexical recognition
using connectionist computer programs that analyze spoken
input in brief time slices rather than syllables. Early matches
are continually revised as evidence accumulates across slices
(see connectionist models, language structure, and
representation). The most well known of these programs is
TRACE (McClelland and Elman 1986).
The speed with which a word is identified by a listener is
subject to variation. High-frequency words are more rapidly
retrieved than low-frequency ones and are said to be more easily activated. Recognition is also faster when the listener has
recently heard a word that is closely associated with the target.
Thus, encountering a word like doctor facilitates (or primes; see
priming, semantic) later recognition of nurse, patient, and
hospital. This process, known as spreading activation,
is highly automatic and is distinct from the normal effects of
context.
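Spreading activation is often modeled as a weighted network in which activation flows from a recognized word to its associates. The fragment below is a deliberately simplified sketch (the association weights and decay factor are invented for illustration):

```python
# Minimal spreading-activation sketch: hearing a word boosts the activation
# of semantically associated words, so they are recognized faster afterwards.

ASSOCIATES = {
    "doctor": {"nurse": 0.8, "patient": 0.7, "hospital": 0.6},
    "nurse": {"doctor": 0.8, "hospital": 0.5},
}

def prime(activations, heard, decay=0.5):
    """Activate the heard word and spread a decayed fraction of that
    activation to each of its associates."""
    activations[heard] = activations.get(heard, 0.0) + 1.0
    for neighbor, weight in ASSOCIATES.get(heard, {}).items():
        activations[neighbor] = activations.get(neighbor, 0.0) + decay * weight

acts = {}
prime(acts, "doctor")
# "nurse" is now pre-activated relative to an unrelated word like "table".
print(acts["nurse"] > acts.get("table", 0.0))  # True
```

A fuller model would add the frequency effects mentioned above (higher resting activation for high-frequency words) and decay over time.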
Syntactic parsing is the third level. Though speech is received
linearly, syllable by syllable, a listener needs to build larger-scale
syntactic structures from it. Listeners appear to retain a verbatim
record of the words heard until a major syntactic boundary. A
wrap-up process then turns the words into an abstract proposition, and they cease to be available to report. Intonation contours frequently coincide with syntactic units and assist listeners
in locating clause and sentence boundaries.
Where there is ambiguity early in an utterance, a listener
has to carry forward parallel hypotheses about how the utterance will end, with one prioritized and the others held in reserve.
Researchers employ garden path sentences (example: The lawyer questioned by the judge apologized) to establish the criteria that influence the preferred interpretation. Listeners appear
to be swayed by multiple factors, including syntactic simplicity,
semantic probability, and argument structure.

Decoding in one's first language (L1) is highly automatic, which is why decoding skills in a second language are difficult
to acquire. Studying vocabulary and syntax is not sufficient; a listener needs to recognize the relevant linguistic forms as they occur
in connected speech. The listener is likely to perceive the sounds
of the second language by reference to the phoneme categories of
the first and may also transfer processing routines, such as the L1s
lexical segmentation strategy or the relative importance it accords
to word order, inflection, and animacy. Second language listeners often find themselves heavily dependent upon contextual
cues in order to compensate strategically for failures of decoding.

Meaning Construction
The outcome of decoding is an abstract proposition, which
represents the literal meaning of the utterance independently of
the context. A listener has to build a more complex meaning representation (or mental model), which
(a) adds to and contextualizes the proposition;
(b) links it conceptually to what has gone before. This operation takes place locally as well as at a discourse level. Listeners
need to make local connections between ideas, associating
pronouns with their referents and recognizing logical connectives (in addition, however). But they also have to carry forward
a developing representation of the whole discourse so far.
Meaning construction embraces several different processes.
Many of them are more cognitively demanding in listening than
in reading because the listener is entirely dependent upon the
mental representation that he/she has built up and cannot look
back to check understanding.
ENRICHMENT. The listener adds depth and relevance to the
proposition by drawing upon external information: knowledge
of the world, the topic, the speaker, and the current situation.
Understanding is also deepened by relating the proposition to
the current topic and to the points made so far by the speaker.
INFERENCE. Listeners supply connections that the speaker does
not make explicitly. They might employ scripts to provide
default components for common activities. If a speaker mentions going to a restaurant, the listener takes for granted a waiter,
a menu, and a conventional procedure for ordering.
SELECTION. Listeners do not simply record facts; they select
some, they omit some, and they store some in reduced form. The
same utterance may result in differently constituted messages
in the minds of different listeners. One important factor is the
listener's perception of the intentions of the speaker. Another is the listener's own purpose for listening. A further consideration
is redundancy: Spoken discourse is often repetitive, with the
speaker reiterating, rephrasing, or revisiting information that has
already been expressed.
INTEGRATION. Listeners integrate the incoming information into
what has been heard so far. Heed is paid to whether it extends an
established topic or whether it initiates a new one.
SELF-MONITORING. Listeners check to see whether incoming
information is consistent with the meaning representation built up so far. If it is not, one or the other needs to be adjusted, or a
comprehension check needs to be made.
DISCOURSE STRUCTURE. Listeners impose an argument structure upon the meaning representation, with major points distinguished from minor. Here, they may be assisted by analogy
with previous speech events.
John Field
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Brown, Gillian. 1995. Speakers, Listeners and Communication.
Cambridge: Cambridge University Press. Discourse-based account of
meaning construction in listening.
Cutler, Anne. 1990. Exploiting prosodic possibilities in speech segmentation. In Cognitive Models of Speech Processing, ed. G. Altmann,
105–21. Cambridge, MA: MIT Press.
Garrod, Simon, and Martin Pickering, eds. 1999. Language Processing.
New York: Psychology Press. Papers on lexical, syntactic, and discourse processing.
Marslen-Wilson, William. 1987. Functional parallelism in spoken word-recognition. Cognition 25: 71–102.
McClelland, J. L., and J. L. Elman. 1986. The TRACE model of speech perception. Cognitive Psychology 18: 1–86.
Miller, Joanne L., and Peter D. Eimas, eds. 1995. Speech, Language and
Communication. San Diego, CA: Academic Press. Papers by L. Nygaard
et al. and Anne Cutler review issues in perception.
Pisoni, David B., and Robert E. Remez. 2005. The Handbook of Speech
Perception. Oxford: Blackwell. Papers covering most major issues in
decoding.

AUTISM AND LANGUAGE


Autism is a neurodevelopmental disorder that is among the most
prominent of disorders affecting language. While its causes are
unknown, research has focused on cognitive, neurological, and
genetic explanations. Autism affects more than one domain of
functioning, with language and communication as primary
deficits.
Since Leo Kanner published the first account of children with
autism in 1943, widening diagnostic criteria have increased the
identification of cases. There have also been dramatic changes in
classification: Autism is no longer regarded as an isolated disorder but includes Asperger syndrome and atypical autism under
the rubric autism spectrum disorder (ASD) in order to reflect
the variability in its expression. Diagnoses along the spectrum
are characterized by a common set of features: impairments in
social communication, restricted interests, and repetitive activities, with behaviors varying at different ages as well as different
levels of functioning (DSM-IV, American Psychiatric Association
1994). Autism occurs in at least 0.2 percent of the population,
affecting three times more males than females, while the other
disorders on the spectrum are estimated to affect another 0.4 percent
(Fombonne et al. 2006).
The social communication problems in ASD vary widely.
Parents of young children later diagnosed with ASD often
observe an absence of simple communicative behaviors, such
as shared attention (e.g., pointing to something to share interest) and make-believe play. Although many children with autism
never acquire functional speech, others develop speech that

differs remarkably from that of age-matched peers. Speech characteristics typical of autism include pronoun reversal (referring
to self as "you"); unvaried or atypical intonation; neologisms
(Volden and Lord 1991); the use of stereotyped, repetitive, and
idiosyncratic language; and echolalia. Barry Prizant and Judith
Duchan (1981, 246) suggest that echolalia may serve important
communicative and cognitive functions, such as turn-taking for
people with autism.
Significantly, social communication in ASD often fails even
in the presence of apparently intact grammatical skills. This
can be seen in Asperger syndrome, where language skills can
be advanced, vocabulary extensive, and syntax formally correct, even bookish. The speech of individuals with Asperger
syndrome is often pedantic, exhibiting unvaried, stereotyped
phrases and expressions associated with contexts or registers
not presupposed by the immediate situation of talk. The speech
patterns associated with ASD are part of the broader spectrum of
impaired reciprocal social interaction.
Conversation may be the most difficult area of communication for people with ASD. Conventional rules of turn-taking are
often ignored. Speakers may fail to sustain conversation beyond
yes/no answers or speak at length on circumscribed interests,
and they may resist attempts to shift topic. Speakers may also
fail to attend to the conversational needs of listeners and may
have difficulty applying contextual and cultural knowledge in
conversation. They may thus encounter problems interpreting
deictic references, as the following example illustrates:

Speaker 1: What did you do on the weekend?
Speaker 2: What weekend?

Here, the conventional response to the question posed would be
to interpret the weekend as the one that had just passed. Such
problems with relevance appear to be related to the tendency
in ASD toward an overliteral understanding of communication,
including difficulties interpreting indirect requests and metaphor (Happé 1993).
A number of cognitive theories are currently being explored
to explain the core features of ASD. Executive dysfunction is one
widely accepted cognitive explanation for some behavior difficulties in ASD. Executive function refers to the decision-making processes that are necessary for performing goal-directed activities, which
are thought to originate in the frontal lobes (Russell 1997).
Weak central coherence theory posits a detail-oriented processing style at the expense of global and contextual information
and alludes to poor connectivity between brain regions (Happé and Frith 2006). Intriguingly, this information-processing style
can often lead to superior performance on certain tasks, such as
the Embedded Figures Task (Witkin et al. 1971), underscoring
the fact that ASD is not merely a set of impairments but involves
unique ways of processing information. The theory most frequently cited to explain communication difficulties in ASD is
theory of mind (ToM) (Baron-Cohen 1995). ToM explains these difficulties in terms of a cognitive mechanism underlying the ability to recognize others' mental states. Many of the
pragmatic impairments that are known to occur in ASD can
be linked to a lack of intuitive mentalizing ability, for example,
difficulties understanding pretense, irony, deception, and
nonliteral language. The ToM hypothesis does not preclude
the presence of assets and islets of ability as suggested by weak
central coherence theory. Cognitive theories and hypothesized neural correlates with respect to facial and emotion information processing in the amygdala have so far provided the most
compelling explanations for the communication impairments
seen in ASD. Research into genetic causes appears promising, since some of the strongest genetic effects in autism seem
related to language abilities.
Jessica de Villiers
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
American Psychiatric Association. 1994. Diagnostic and Statistical Manual
of Mental Disorders (DSM-IV). 4th ed. Washington, DC: American
Psychiatric Association.
Baron-Cohen, Simon. 1995. Mindblindness: An Essay on Autism and
Theory of Mind. Cambridge, MA: MIT Press.
Fombonne, Eric, Rita Zakarian, Andrew Bennett, Linyan Meng, and
Diane McLean-Heywood. 2006. Pervasive developmental disorders
in Montreal, Quebec, Canada: Prevalence and links with immunizations. Pediatrics 118.1: 139–50.
Frith, Uta. 2003. Autism: Explaining the Enigma. 2d ed. Oxford:
Blackwell.
Happé, Francesca. 1993. Communicative competence and theory of mind in autism: A test of relevance theory. Cognition 48.2: 101–19.
Happé, Francesca, and Uta Frith. 2006. The weak coherence account: Detail-focused cognitive style in autism spectrum disorders. Journal of Autism and Developmental Disorders 36.1: 5–25.
Kana, Rajesh K., Timothy A. Keller, Vladimir L. Cherkassky, Nancy J.
Minshew, and Marcel Adam Just. 2006. Sentence comprehension in
autism: Thinking in pictures with decreased functional connectivity.
Brain 129: 2484–93.
Kanner, Leo. 1943. Autistic disturbances of affective contact. Nervous
Child 2: 217–50.
Prizant, Barry, and Judith Duchan. 1981. The functions of immediate
echolalia in autistic children. Journal of Speech and Hearing Disorders
46: 241–9.
Russell, James, ed. 1997. Autism as an Executive Disorder. Oxford: Oxford
University Press.
Volden, Joanne, and Catherine Lord. 1991. Neologisms and idiosyncratic
language in autistic speakers. Journal of Autism and Developmental
Disorders 21.2: 109–30.
Witkin, H., P. Oltman, E. Raskin, and S. Karp. 1971. A Manual for the
Embedded Figures Test. Palo Alto, CA: Consulting Psychologists Press.

AUTONOMY OF SYNTAX
Autonomy of syntax refers to what in recent times has been the
dominant assumption concerning the formulation of syntactic
regularities: syntax is determined independently of phonological realization or semantic interpretation. The formal properties of syntax are manipulated purely formally.
Such an assumption is familiar to modern students of linguistics from numerous textbook presentations, such as Andrew
Radford's (1988, 31):
Autonomous Syntax Principle: No syntactic rule can make reference to pragmatic, phonological, or semantic information.

Some such assumption is already in place in Noam Chomsky (1957, 17): "I think that we are forced to conclude that grammar is autonomous and independent of meaning." And in a later espousing of the assumption, Chomsky traces the idea back to what he refers to as "structural linguistics" (1972, 119):

A central idea of much of structural linguistics was that the formal devices of language should be studied independently of their use. The earliest work in transformational-generative grammar took over a version of this thesis, as a working hypothesis. I think it has been a fruitful hypothesis. It seems that grammars contain a substructure of perfectly formal rules operating on phrase-markers in narrowly circumscribed ways. Not only are these rules independent of meaning or sound in their function …

This passage is very pertinent in seeking to understand the assumption and its status and origins (and cf. Chomsky 1975, 18–22; 1977, 38–58). But let us observe some things about origins that it doesn't entirely convey.
Firstly, Chomsky's "much of structural linguistics" should not be taken to include most of the work done in early structural linguistics in Europe, even by self-declared autonomists (see
Anderson 2005). It is notably the followers of Leonard Bloomfield
(1926) and proponents of transformational grammar who
insist on the autonomy of syntax from meaning. And, even for
the post-Bloomfieldians, syntax is far from autonomous from
phonology.
Perhaps more significantly, we should fully register the extent
to which the autonomy assumption is an innovation (Anderson
2005). Grammar or syntax before structuralism was not conceived of as autonomous; syntactic rules and principles could refer to semantically and/or phonologically defined categories.
Consider as an example of this earlier tradition Otto Jespersen's description of the syntax of the SPEECH-ACT category of question: "[T]he formal means by which questions are expressed, are (1) tone; (2) separate interrogative words …; (3) word-order" (1924, 305). Syntax (and intonation) is expressive of meaning.
However, autonomy requires that elements that participate
in purely syntactic regularities be syntactic themselves. Thus, the
feature Q for interrogative clauses of Chomsky (1995, 4.5.4)
is part of syntax. It is interpretable, but its interpretation is
not pertinent to syntax (see illocutionary force and sentence types). But from a traditional point of view, it is the categorial meaning of Q that, as with other syntactic elements,
drives its syntax. Its status as (prototypically) a request for information is what demands, for instance, the presence in the sentence of an open element, marked, for example, by wh-, or by
intonation or some such indication of openness of the truth of the
sentence itself.
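The contrast can be made concrete with a toy rewrite system: in the sketch below, every rule mentions only formal category symbols, so a derivation, including one triggered by a formal Q feature, proceeds without consulting meaning or sound. This is purely illustrative; the category names and rules are invented for the demonstration and are not drawn from Radford or Chomsky.

```python
# Illustrative toy grammar: every rule references only formal category
# symbols; no rule consults meaning or pronunciation. All category and
# rule names here are invented for the sketch.

RULES = {
    "S":      [["NP", "VP"]],             # declarative clause
    "Q":      [["WH", "AUX", "NP", "V"]],  # interrogative clause type
    "NP":     [["John"]],
    "VP":     [["V_PAST", "OBJ"]],
    "OBJ":    [["the", "cat"]],
    "V_PAST": [["saw"]],
    "V":      [["see"]],
    "WH":     [["what"]],
    "AUX":    [["did"]],
}

def expand(symbol):
    """Expand a category via its first alternative; terminals pass through."""
    if symbol not in RULES:
        return [symbol]
    words = []
    for child in RULES[symbol][0]:
        words.extend(expand(child))
    return words

# The Q rule fires on the formal clause-type symbol alone; that Q is
# interpreted as a request for information plays no role in the derivation.
print(" ".join(expand("S")))   # John saw the cat
print(" ".join(expand("Q")))   # what did John see
```

On the traditional view sketched in the text, by contrast, the interrogative meaning itself would be what demands the wh- element and inversion; here that work is done entirely by the uninterpreted symbol "Q".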
The autonomy hypothesis is falsifiable only if there is an independent notion of what constitutes syntax; otherwise, any apparent counterexample can be relegated to interaction between
syntax and some other module (see modularity). An unfalsifiable assumption of autonomy defines a research program, rather
than constituting an empirical hypothesis: It is methodological
rather than ontological. The program, as well as the hypothesis,
is based on the premise that it is fruitful to operate as if syntax
is autonomous, in contrast with the more traditional view that
nonreference by a syntactic regularity to interpretation is exceptional, involving demotivation (grammaticalization) within
a syntax whose central concern is with the role of sound and
structure as expressive of meaning.
Opponents of the autonomy assumption, whatever its status, tend to interpret it in the absolute form described here (as does Langacker 1987, 155). Chomsky, however, envisages autonomy
theses of varying degrees of strength (1977, 43), whereby syntax is not necessarily exhausted by the substructure of perfectly
formal rules (1972, 119) of his formulation. Thus, the significant
question with regard to the autonomy thesis may not be a question of yes or no, but rather of more or less, or more correctly,
where and how much (Chomsky 1977, 42). Certainly, provided
again that we have an independent characterization of syntax,
the extent of autonomy and its accommodation are in themselves
interesting empirical questions, with consequences for modularity, universal grammar, and the autonomy of language itself.
And seeking to answer them might be more comprehensible to
opponents of a strong interpretation of autonomy.
Work within the autonomist program(s), whatever the status
of the assumption, has undoubtedly had important results, but
there is room for debate as to how fruitful has been the pursuit of
the autonomy assumption as such. And in addition to the question of how it relates to independent notions of what syntax is, a
major difficulty in evaluating the assumption, and its contribution to these results, is the changing nature of the grammatical
enterprise(s) in which autonomy has been invoked, as well as the
varying degrees of emphasis with which it has been put forward
or denied.
John M. Anderson
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Anderson, John M. 2005. Structuralism and autonomy: From Saussure
to Chomsky. Historiographia Linguistica 32: 117–48.
Bloomfield, Leonard. 1926. A set of postulates for the science of language. Language 2: 153–64.
Chomsky, Noam. 1957. Syntactic Structures. The Hague: Mouton.
. 1972. Some empirical issues in the theory of transformational
grammar. In Goals of Linguistic Theory, ed. Stanley Peters, 63–130.
Englewood Cliffs, NJ: Prentice-Hall.
. 1975. The Logical Structure of Linguistic Theory. New York: Plenum
Press.
. 1977. Essays on Form and Interpretation. Amsterdam: North-Holland.
. 1995. The Minimalist Program. Cambridge, MA: MIT Press.
Jespersen, Otto. 1924. The Philosophy of Grammar. London: George Allen
and Unwin.
Langacker, Ronald W. 1987. Foundations of Cognitive Grammar,
I: Theoretical Prerequisites. Stanford, CA: Stanford University Press.
Radford, Andrew. 1988. Transformational Grammar: A First Course.
Cambridge: Cambridge University Press.

B
BABBLING
Babbling can be defined as infant vocal production that is broadly
adultlike in phonetic shape but that lacks any identifiable
adult model or intended meaning. The formal criterion "broadly adultlike" limits study to the period that follows the child's first
temporally regular or rhythmic C-V (Consonant-Vowel)-syllable
production, also known as canonical babbling (Oller 1980); these

vocal forms include both a complete or nearly complete supraglottal closure and a transition to a recognizable vocalic nucleus,
for example, [dada], [babababa], [aaa]. Prior to that, the
child vocalizes in more primitive ways that are not thought to be
directly related to language.
The lack of an adult model is a question of interpretation
since the first word forms are highly similar in form to concurrent babbling. In fact, there is in most cases a gradual shift from
a predominance of unidentifiable vocalizations, beginning with
the emergence of canonical babbling (at 6 to 10 months), to a
predominance of word use identifiable in the situational context (16 to 22 months). The extent to which the shift is relatively
abrupt or gradual is highly individual.
Finally, there are vocal forms that do not appear to be
based on an adult model but that are nevertheless used with
consistent broad communicative meaning such as request,
rejection, or interest; these transitional forms or protowords
(Vihman and Miller 1988) should thus be distinguished from
babbling, which lacks any apparent communicative goal.
Babbling is a largely self-directed process of exploration
(Elbers 1982, 45).

Brief Modern History


THE CONTINUITY ISSUE: THE RELATIONSHIP OF BABBLE TO
WORDS. Roman Jakobson ([1941] 1968) was the first linguist to
pay serious theoretical attention to babbling, if only to deny its
relevance to language learning. On the basis of the diary accounts
available to him, Jakobson developed the (discontinuity) view
that babbling was merely random sound production, expressing the full range of human phonetic possibility but unrelated to
the more austere or constrained repertoire of the first words.
Jakobson saw the latter as reflecting a well-ordered universal
scheme for the emergence of phonological oppositions, such
that the low vowel /a/ is primary, with contrast with /i/ following,
while anterior stops are the first consonants produced (labial /b/
or dental /d/), followed by nasals and only later by other places
and manners of articulation.
This impressively articulated universalist theory held sway
for many years but was challenged in the 1970s when diary data
began to be supplemented by planned recordings of infants (typically in free interaction with an adult). Charles A. Ferguson and
Olga K. Garnica (1975) and Paul Kiparsky and Lise Menn (1977)
were among the first to raise objections to Jakobson's ideas of
gradual phonemic differentiation, which disregarded the effect
of word position on order of segment acquisition and which would
be difficult to defend on the basis of the very few words produced
in a child's earliest lexical period. On the other hand, Jakobson's
claims regarding the phones that occur in the first words were, on
the whole, quite accurate, based as they were on decades of diary
records provided by linguists and psychologists.
What was not supported by later studies was the strong
separation required by Jakobson's theory between words (or
phonology) and babble (or phonetic production). Far from
babbling being unrelated to word production, later studies have
established that the first words draw their phonetic resources
from the particular inventory of sound patterns developed by
the individual child through babbling (Vihman et al. 1985; continuity has also been reported for babbled gesture and first signs: Cheek et al. 2001). For example, a French child whose
prelinguistic babbling made considerable use of liquids (mainly
[l]) was found to develop several first words with [l(j)], which is
uncommon in early phonology: allo "hello" (on the telephone) [ailo], [hailo], [haljo], [alo]; lolo "bottle" (babytalk term) [ljoljo]; donne (le) "give (it)" [d], [dl], [ld], [heldo] (Vihman 1993).
BABBLING DRIFT: THE EFFECT OF PERCEPTION ON PRODUCTION. A
second issue that has aroused interest for half a century is that of possible "drift" in babbling toward the sounds of the native language (Brown 1958). The issue has generated considerable heat
and is important since it concerns the extent to which infants can
be taken to be capable of translating their perceptual experience
of the sound patterns of the ambient language into their limited
production repertoire. That is, any identifiable ambient language
influence on prelinguistic vocalizations means that infants have
both perceived the typical sounds of their language and adjusted
their vocal production accordingly.
Many studies, from Atkinson, MacWhinney, and Stoel
(1968) to Engstrand, Williams, and Lacerda (2003), have used
adult perceptual judgments of recorded vocalizations to determine whether infants' language of exposure can be identified,
as that would provide evidence of drift; the findings remain
inconclusive, however. Meanwhile, Bénédicte de Boysson-Bardies and colleagues, using acoustic analyses of vowels (1989) and tallies of transcribed consonant types (Boysson-Bardies and Vihman 1991), established significant prelinguistic adult language influence, although the mechanism for such
an effect remained unclear. More recent work demonstrating
the extent of early implicit or distributional learning (Saffran,
Aslin, and Newport 1996) suggests that infants are capable of
registering dominant patterns of their language within the first
year. Thus, the mechanism needed to account for drift may be
the effect of implicit perceptual learning on production: Those
vocalizations that, as the producing child perceives them, activate perceptual responses already familiar from input patterning would strengthen perceptuomotor connections, leading to
their repeated use.
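The distributional computation a learner would need is simple to sketch: transitional probabilities between adjacent syllables are high within recurring "words" and lower at their boundaries. The sketch below is illustrative only; the syllable stream and the function name are invented, following the logic of Saffran, Aslin, and Newport (1996).

```python
from collections import Counter

# Illustrative sketch: transitional probability P(B|A) between adjacent
# syllables, the statistic 8-month-olds appear to track in statistical
# learning studies. The syllable stream below is invented for the demo.

def transitional_probabilities(syllables):
    """Map each adjacent pair (A, B) to P(B | A) over the stream."""
    pair_counts = Counter(zip(syllables, syllables[1:]))
    first_counts = Counter(syllables[:-1])
    return {(a, b): n / first_counts[a] for (a, b), n in pair_counts.items()}

# Invented "words" ba-by and go-ta repeated: within-word transitions are
# reliable, while across-word transitions vary.
stream = ["ba", "by", "go", "ta", "ba", "by", "ba", "by", "go", "ta"]
tp = transitional_probabilities(stream)
print(tp[("ba", "by")])   # 1.0 : "by" always follows "ba" (within-word)
print(tp[("by", "go")])   # lower: a word boundary
```

A dip in transitional probability is thus a usable cue to a word boundary, which is the pattern the infant experiments exploited.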

Theoretical Approaches
FRAME AND CONTENT: THE ARTICULATORY BASIS OF BABBLING. The
most widely accepted current model of babbling is that of
Peter F. MacNeilage and Barbara L. Davis (MacNeilage and Davis 1990; Davis and MacNeilage 1990, 1995; for a review of competing ideas, see Chen and Kent 2005). The articulatory basis of babbling is claimed to
be frame dominance, meaning that the patterns produced largely
reflect mandibular oscillation without independent control of
lip and tongue movement. The result is strong C-V associations,
such that alveolars are followed by front vowels, labials by central
vowels (the pure frames, requiring no particular tongue setting),
and velars by back vowels. Furthermore, the model predicts that
changes in the mandibular cycle will result in height changes for
vowels and manner changes for consonants in variegated babbling sequences. The work of this team and collaborators investigating a range of other languages (e.g., Dutch, French, Romanian,
Turkish: Davis et al. 2005) has largely supported the predictions and has demonstrated a tendency for adult languages to show the C-V associations as well (MacNeilage and Davis 2000), but Chen and Kent (2005) report an association of labials with back vowels in their extensive Mandarin data, both child and adult.
The balance of ambient language (perceptual) influence versus
universal (physiological or motoric) tendencies thus remains
controversial. Any early C-V associations can be expected to
fade with lexical growth as infants follow their individual paths
toward segmental independence (freeing the content from the
frame).
VOCAL MOTOR SCHEMES AND THE EFFECT OF PRODUCTION ON
PERCEPTION. Lorraine McCune and Marilyn M. Vihman (2001)
introduced the concept of vocal motor schemes (VMS), or "generalized action patterns that yield consistent phonetic forms" (p. 673), identified on the basis of repeated high-frequency production of one or more consonants over the course of several recordings. VMS index emergent stability in consonant production, a
reliable predictor of lexical advance.
Vihman's articulatory filter model (1993) posits that an infant's babbling patterns will effectively highlight related forms
in the input. Once one or more VMS are established, it is possible
to test the model by measuring infants attentional response to
a series of short sentences featuring nonwords that do or do not
include that child's VMS. Capitalizing on wide infant variability in the timing and nature of first vocal forms (within the limits of the strong universal constraints), Rory A. DePaolis (2006) established an effect of infant production on the perception of speech.
His findings support the idea that the first words, typically produced in priming situations ("context-limited words": McCune
and Vihman 2001), are based on infant experience of a rough
match between vocal forms established through babbling practice and words heard frequently in input speech (Vihman and
Kunnari 2006). Such selection of words to attempt based on
the vocal forms available for matching would account for the relative accuracy of first words (Ferguson and Farwell 1975), their
constrained shapes (e.g., one or two syllables in length, with little
variegation across word position or syllables), and their strong
rootedness in the biomechanical basis of babbling as established
by Davis and MacNeilage. It also explains the difficulty of distinguishing words from babble (continuity) and the subtlety of the
ambient language effect on babbling and early words (drift).
Marilyn Vihman
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Atkinson, Kay, Brian MacWhinney, and Carol Stoel. 1968. An experiment on the recognition of babbling. Language Behavior Research
Laboratory Working Paper 14. University of California, Berkeley.
Boysson-Bardies, Bénédicte de, Pierre Hallé, Laurent Sagart, and Catherine Durand. 1989. A crosslinguistic investigation of vowel formants in babbling. Journal of Child Language 16: 1–17.
Boysson-Bardies, Bénédicte de, and Marilyn M. Vihman. 1991. Adaptation to language. Language 67: 297–319.
Brown, Roger. 1958. Words and Things. Glencoe, IL: Free Press.
Cheek, Adrianne, Kearsy Cormier, Ann Repp, and Richard P. Meier. 2001.
Prelinguistic gesture predicts mastery and error in the production of
early signs. Language 77: 292–323.
Chen, L. M., and Raymond D. Kent. 2005. Consonant-vowel cooccurrence patterns in Mandarin-learning infants. Journal of Child
Language 32: 507–34.

Davis, Barbara L., and Peter F. MacNeilage. 1990. Acquisition of correct
vowel production. Journal of Speech and Hearing Research 33: 16–27.
. 1995. The articulatory basis of babbling. Journal of Speech and
Hearing Research 38: 1199–1211.
Davis, Barbara L., Sophie Kern, Dilara Koba, and Inge Zink. 2005.
Vocalizations in canonical babbling. Paper presented at Symposium,
10th International Congress of the Association for the Study of Child
Language, Berlin.
DePaolis, Rory A. 2006. The influence of production on the perception
of speech. In Proceedings of the 30th Boston University Conference on
Language Development, ed. D. Bamman, T. Magnitskaia, and C. Zaller,
142–53. Somerville, MA: Cascadilla Press.
Elbers, L. 1982. Operating principles in repetitive babbling. Cognition
12: 45–63.
Engstrand, Olle, Karen Williams, and Francisco Lacerda. 2003. Does
babbling sound native? Phonetica 60: 17–44.
Ferguson, Charles A., and Carol B. Farwell. 1975. Words and sounds in
early language acquisition. Language 51: 419–39.
Ferguson, Charles A., and Olga K. Garnica. 1975. Theories of phonological development. In Foundations of Language Development, ed. Eric
H. Lenneberg and Elizabeth Lenneberg, 153–80. New York: Academic
Press.
Jakobson, Roman. [1941] 1968. Child Language, Aphasia, and
Phonological Universals. The Hague: Mouton. Eng. translation of
Kindersprache, Aphasie und allgemeine Lautgesetze, Uppsala.
Kiparsky, Paul, and Lise Menn. 1977. On the acquisition of phonology.
In Language Learning and Thought, ed. John Macnamara, 47–78. New
York: Academic Press.
MacNeilage, Peter F., and Barbara L. Davis. 1990. Acquisition of speech
production: Frames, then content. In Attention and Performance. Vol.
13: Motor Representation and Control. Ed. Marc Jeannerod, 453–75.
Hillsdale, NJ: Lawrence Erlbaum.
. 2000. On the origin of internal structure of word forms. Science
288: 527–31.
McCune, Lorraine, and Marilyn M. Vihman. 2001. Early phonetic
and lexical development. Journal of Speech, Language and Hearing
Research 44: 670–84.
Oller, D. Kimbrough. 1980. The emergence of the sounds of speech
in infancy. In Child Phonology. Vol. 1: Production. Ed. Grace Yeni-Komshian, James F. Kavanagh, and Charles A. Ferguson, 93–112. New
York: Academic Press.
. 2000. The Emergence of the Speech Capacity. Mahwah, NJ: Lawrence
Erlbaum. This book provides a thorough review of babbling studies
conducted with hearing, hearing-impaired, premature, and low SES
(socioeconomic status) infants, as well as provocative ideas about the
evolution of language based on evidence from ontogeny.
Saffran, Jenny R., Richard N. Aslin, and Elissa L. Newport. 1996. Statistical
learning by 8-month-old infants. Science 274: 1926–8.
Vihman, Marilyn M. 1993. Variable paths to early word production.
Journal of Phonetics 21: 61–82.
. 1996. Phonological Development. Oxford: Blackwell. This book
provides an overview of research in infant speech perception and production and their interactions, as well as of theories of phonological
development, early word patterning, and the nature of the transition
into language.
Vihman, Marilyn M., and Sari Kunnari. 2006. The sources of phonological knowledge. In Recherches Linguistiques de Vincennes 35: 133–64.
Vihman, Marilyn M., Marlys A. Macken, Ruth Miller, Hazel Simmons,
and James Miller. 1985. From babbling to speech: A re-assessment of
the continuity issue. Language 61: 397–445.
Vihman, Marilyn M., and Ruth Miller. 1988. Words and babble at the
threshold of lexical acquisition. In The Emergent Lexicon, ed. Michael
D. Smith and John L. Locke, 151–83. New York: Academic Press.


BASAL GANGLIA
Nothing in biology makes sense except in the light of evolution.
Dobzhansky 1973

The basal ganglia are subcortical structures that can be traced back to frogs and are traditionally associated with motor control.
However, current studies show that complex behaviors generally
are regulated by neural circuits that link local processes in different parts of the brain. In humans, the basal ganglia play a critical
role in neural circuits regulating cognitive processes, including
language, as well as motor control and emotion. The capacities
that differentiate humans from other species, such as being able
to talk, forming and comprehending sentences that have complex syntax, and possessing cognitive flexibility, devolve from
neural circuits that link activity in different regions of the cortex
through the basal ganglia. The neural bases of human language
thus involve the interplay of processes that regulate motor control, other aspects of cognition, mood, and personality. Given that these circuits engage multiple regions of the brain, each involved in many activities, it is difficult to see how any organ of the brain could be specific to language and language alone, such as the narrow faculty of language that, according to Marc D. Hauser, Noam Chomsky, and W. T. Fitch (2002), yields the recursive properties of syntax.
Evidence from experiments-in-nature that attempt to link
specific behavioral deficits with damage to a particular part of
a patient's brain led to the traditional Broca-Wernicke theory.
This traditional theory claims that linguistic processes are localized in these two regions of the neocortex, the outermost part of
the brain. However, evidence from brain-imaging techniques,
such as computer augmented tomography (CT scans), demonstrated that aphasia, permanent loss of language, never occurs
in the absence of subcortical damage (Stuss and Benson 1986).
Subsequent findings from techniques such as functional magnetic resonance imaging (fMRI; see neuroimaging) that indirectly map neural activity show that although Broca's area and Wernicke's area are active when neurologically intact subjects perform various linguistic tasks, these areas are elements of complex neural circuits that link activity in other cortical regions
and subcortical structures (Kotz et al. 2003). Studies of neurodegenerative disorders, such as Parkinson's disease (Lieberman,
Friedman, and Feldman 1990; Grossman et al. 1992), revealed
the role of the basal ganglia in regulating speech and language.
Speech production and the comprehension of distinctions in
meaning conveyed by syntax deteriorated when basal ganglia
function was impaired. Basal ganglia dysfunction is implicated
in seemingly unrelated conditions, such as obsessive-compulsive disorder, schizophrenia, Parkinson's disease, and verbal apraxia, a condition in which orofacial, laryngeal, and respiratory control during speech is impaired (Lieberman 2006).

Neural Circuits
These syndromes follow from the basal ganglia's activity in different neural circuits. Neural circuits that link activity in different
parts of the brain appear to be the bases for most, if not all, complex mammalian behaviors. In humans, a class of neural circuits
that links activity in different regions of the cortex through the basal ganglia and other subcortical structures appears to play a
key role in regulating aspects of human linguistic ability, such
as talking and comprehending the meaning of a sentence, as
well as such seemingly unrelated phenomena as decision making, walking, attention, and emotional state. To understand the
nature of neural circuits, we must take account of the distinction
that exists between local operations that are carried out within
some particular part of the brain and an observable behavior
that results from many local operations linked in a neural circuit. Complex brains, including the human brain, perform local
operations involving tactile, visual, or auditory stimuli in particular regions of the brain. Other neural structures perform local
operations that regulate aspects of motor control or hold information in short-term (working) memory, and so on.
The basic computational elements of biological brains are
neurons. Local operations result from activity in an anatomically
segregated population (a group) of neurons. A given part of the brain may contain many distinct anatomically segregated neuronal populations that each carry out similar local operations.
But these local operations do not constitute observable behaviors. Each anatomically segregated neuronal population projects
to anatomically distinct neuronal populations in other regions of
the brain, forming a neural circuit. The linked local operations
performed in the circuit constitute the neural basis of an observable aspect of behavior, such as striking the keys of a computer
keyboard.

Basal Ganglia Operations


Research that initially focused on Parkinson's disease, a neurodegenerative disease that affects the operation of the basal
ganglia, largely sparing cortex, demonstrated their role in motor
control, syntax, and cognitive flexibility. In their review article, C.
D. Marsden and J. A. Obeso (1994) noted that the basal ganglia constitute a "sequencing engine" for both motor and cognitive acts. The basal ganglia regulate routine motor acts by activating
and linking motor pattern generators that each constitute an
instruction set for a submovement to the frontal regions of the
brain involved in motor control. As each submovement reaches
its goal, the pattern generator for the next appropriate submovement is activated. Therefore, motor control deficits characterize
neurodegenerative diseases such as Parkinson's that degrade
basal ganglia operations.
The basal ganglia have other motor functions; in changing
circumstances, they can switch to a set of motor pattern generators that constitute a better fit to the changed environment, constituting adaptive motor control. Basal ganglia operations
involving "cognitive pattern generators" (Graybiel 1997) account for the subcortical dementia associated with Parkinson's disease.
Afflicted individuals perseverate: They are unable to switch to a
new train of thought when circumstances change. On cognitive
tests such as the Wisconsin Card Sorting Test (WCST), they have
difficulty switching to a new cognitive criterion. For example, a
subject who has been successfully sorting cards by their color
will have difficulty switching to sorting them by the number of
symbols printed on each card. Neurophysiologic studies that
trace the linkages between the segregated neuronal populations
of the basal ganglia and cortex confirm circuits that project from
the basal ganglia to regions of the brain that are implicated in
cognitive as well as motor acts. Brain imaging studies reveal
increased basal ganglia activity during the processing of syntactically complex sentences, as well as at the points where a person must switch from one criterion to another, as is the case in studies using tests of cognition such as the WCST (Monchi et al. 2001).
Thus, basal ganglia dysfunction arising from neurodegenerative diseases, lesions, or the effects of oxygen deprivation (Lieberman et al. 2005) also can result in an inability to comprehend distinctions in meaning conveyed by complex syntax. Afflicted individuals appear to have difficulty switching the cognitive pattern generators that code syntactic operations at clause boundaries or in sentences that depart from a simple canonical form. These subjects typically have difficulty sequencing motor acts, including those involved in speech. Their motor acts are slower, resulting in longer vowel durations, and these subjects have difficulty rapidly sequencing the tongue, lip, and laryngeal maneuvers necessary to differentiate stop consonants, such as [b] from [p] or [d] from [t].
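The criterion-switching ability that the WCST probes can be sketched in a toy simulation. The card features, trial counts, and feedback rule below are hypothetical illustrations, not the clinical protocol: a flexible sorter uses negative feedback to adopt the new criterion, whereas a perseverating sorter, like the patients described above, keeps sorting by the old one.

```python
# Toy sketch of WCST-style criterion switching (hypothetical features
# and scoring; not the clinical test). The active rule changes halfway
# through the run without warning, as in the WCST.

CARDS = [{"color": c, "number": n}
         for c in ("red", "green") for n in (1, 2, 3)] * 2  # 12 trials

def run(flexible):
    """Return the number of correct trials for a sorter that either
    switches criterion on negative feedback (flexible=True) or
    perseverates on the old criterion (flexible=False)."""
    active = "color"       # the experimenter's scoring rule
    criterion = "color"    # the sorter's internal criterion
    correct = 0
    for trial, card in enumerate(CARDS):
        if trial == len(CARDS) // 2:
            active = "number"          # unannounced rule change
        right = card[criterion] == card[active]  # sorted into the right pile?
        correct += right
        if not right and flexible:
            # Simplification: with only two features, a single failure
            # identifies the new rule, so the sorter adopts it at once.
            criterion = active
    return correct

# run(True) scores 11/12 (one error at the switch);
# run(False) perseverates on color and scores only 6/12.
```

The gap between the two scores is the toy analogue of the set-switching deficit described above.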

Motor Control and Syntax


Linguists have long realized that the syntactic operations (i.e.,
the rules that they use to describe the structure of a sentence)
yield hierarchical structures. In describing the syntax of the sentence John saw the cat, the words the cat are part of a constituent
that includes the verb saw. The rules that can be used to describe
seemingly simple motor acts such as walking also yield hierarchical structures. Both motor control and syntax involve selectional
constraints that result in hierarchical structures. For example, the
motor pattern generator for heel strike cannot be activated before
or much after your foot meets the ground. This yields a hierarchical tree diagram similar to those commonly used to convey the
grammatical structure of a sentence. The syntactic tree diagram
for a square dance in which swing your partner occurred again
and again would not differ in principle from that of a sentence
having embedded relative clauses. (For more on the similarities
between motor control rules and those of generative syntax, see
Lieberman 2006).
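The parallel between syntactic and motor hierarchy can be made concrete with nested structures. The bracketing below is one conventional, simplified constituent analysis, and the decomposition of the dance step into submovements is a hypothetical illustration:

```python
# Nested tuples as minimal tree diagrams: the first element labels a
# constituent, the rest are its parts. The same representation serves
# a sentence and a hierarchically organized motor routine.
# (Simplified analyses for illustration only.)

sentence = ("S",
            ("NP", "John"),
            ("VP", ("V", "saw"),
                   ("NP", ("Det", "the"), ("N", "cat"))))

swing = ("Swing", "step-forward", "turn", "heel-strike")  # hypothetical submovements
dance = ("Dance", swing, swing)  # "swing your partner" again and again

def height(tree):
    """Height of the hierarchy; a bare terminal counts as 0."""
    if isinstance(tree, str):
        return 0
    return 1 + max(height(part) for part in tree[1:])

# Both objects are genuinely hierarchical, not flat strings:
# height(sentence) == 4, height(dance) == 2.
```

In this representation the repeated *swing your partner* figure and an embedded clause are the same kind of object: a labeled constituent nested inside a larger one.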

Genetic Findings
Studies of the regulatory gene FOXP2 provide a starting point for
understanding the evolution of the cortical-striatal-cortical circuits that confer human linguistic ability (see genes and language). Other genes undoubtedly are involved, and FOXP2 is not a "language gene." FOXP2 governs the embryonic development of the basal ganglia, other subcortical structures, lung tissue, and other structures. Its discovery resulted from a long-term study
of an extended family in which many individuals are marked by
a genetic anomaly. A syndrome comprising a suite of speech and orofacial movement disorders and cognitive and linguistic deficits marks these individuals. They are not able to protrude their tongues while closing their lips, cannot repeat two-word sequences, and
have difficulty comprehending distinctions in meaning conveyed
by syntax (Vargha-Khadem et al. 1998). On standardized intelligence tests, they have significantly lower scores than their nonafflicted siblings. MRI imaging shows that the caudate nucleus (a
basal ganglia structure) is abnormal. fMRI imaging, which provides a measure of neural activity, shows underactivation in the
putamen (the principal basal ganglia input structure), Broca's
area, and its right homolog (Watkins et al. 2002; Liegeois et al.
2003). These structures are connected by neural circuits through
the striatum (Lehericy et al. 2004). The behavioral deficits of
afflicted individuals are similar to those seen in Parkinson's disease and oxygen deprivation (cf. Lieberman 2006 for details).
The role of FOXP2 during early brain development in humans
and of the mouse version (foxp2) in mice was established by C.
S. Lai and colleagues (2003). The gene governs the expression of
other genes during embryonic development. In both the human
and mouse brain, the gene is active in the interconnected neural structures that constitute the cortical-striatal-cortical circuits
regulating motor control and cognition in humans, including the
caudate nucleus and putamen of the basal ganglia, the thalamus, inferior olives, and cerebellum. Despite the high degree
of similarity, the mouse and human versions are separated by
three mutations. The chimpanzee and human versions are separated by two mutations. W. Enard and colleagues (2002), using
the techniques of molecular genetics, estimate that the human
form appeared somewhere in the last 200,000 years, in the time
frame (Stringer 1998) associated with the emergence of anatomically modern Homo sapiens. The appearance of human speech
anatomy 50,000 years ago presupposes the prior appearance of
this neural substrate (see speech anatomy, evolution of).
In short, the basal ganglia are neural structures that were initially adapted for one function: motor control. In the course of
evolution, the human basal ganglia were modified, taking on
additional cognitive and linguistic tasks.
Philip Lieberman
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Dobzhansky, Theodosius. 1973. Nothing in biology makes sense except in the light of evolution. American Biology Teacher 35: 125–9.
Enard, W., M. Przeworski, S. E. Fisher, C. S. Lai, V. Wiebe, T. Kitano, A. P. Monaco, and S. Paabo. 2002. Molecular evolution of FOXP2, a gene involved in speech and language. Nature 418: 869–72.
Graybiel, Ann M. 1997. The basal ganglia and cognitive pattern generators. Schizophrenia Bulletin 23: 459–69.
Grossman, Murray, S. Carvell, S. Gollomp, M. B. Stern, G. Vernon, and H. I. Hurtig. 1992. Sentence comprehension and praxis deficits in Parkinson's disease. Neurology 41: 1620–8.
Hauser, Marc D., N. Chomsky, and W. T. Fitch. 2002. The faculty of language: What is it, who had it, and how did it evolve? Science 298: 1569–79.
Kotz, Sonia A., M. Meyer, K. Alter, M. Besson, D. Y. von Cramon, and A. D. Friederici. 2003. On the lateralization of emotional prosody: An fMRI investigation. Brain and Language 86: 366–76.
Lai, C. S., D. Gerrelli, A. P. Monaco, S. E. Fisher, and A. J. Copp. 2003. FOXP2 expression during brain development coincides with adult sites of pathology in a severe speech and language disorder. Brain 126: 2455–62.
Lehericy, S. M., M. Ducros, P. F. Van de Moortele, C. Francois, L. Thivard, C. Poupon, N. Swindale, K. Ugurbil, and D. S. Kim. 2004. Diffusion tensor tracking shows distinct corticostriatal circuits in humans. Annals of Neurology 55: 522–9.
Lieberman, Philip. 2006. Toward an Evolutionary Biology of Language. Cambridge: Harvard University Press.
Lieberman, Philip, J. Friedman, and L. S. Feldman. 1990. Syntactic deficits in Parkinson's disease. Journal of Nervous and Mental Disease 178: 360–5.
Lieberman, Philip, A. Morey, J. Hochstadt, M. Larson, and S. Mather. 2005. Mount Everest: A space analogue for speech monitoring of cognitive deficits and stress. Aviation, Space and Environmental Medicine 76: 198–207.
Liegeois, F., T. Baldeweg, A. Connelly, D. G. Gadian, M. Mishkin, and F. Vargha-Khadem. 2003. Language fMRI abnormalities associated with FOXP2 gene mutation. Nature Neuroscience 6: 1230–7.
Marsden, C. D., and J. A. Obeso. 1994. The functions of the basal ganglia and the paradox of stereotaxic surgery in Parkinson's disease. Brain 117: 877–97.
Monchi, O., M. Petrides, V. Petre, K. Worsley, and A. Dagher. 2001. Wisconsin Card Sorting revisited: Distinct neural circuits participating in different stages of the task identified by event-related functional magnetic resonance imaging. Journal of Neuroscience 21: 7733–41.
Stringer, Christopher B. 1998. Chronological and biogeographic perspectives on later human evolution. In Neanderthals and Modern Humans in Western Asia, ed. T. Akazawa, K. Aoki, and O. Bar-Yosef, 29–38. New York: Plenum.
Stuss, Donald T., and D. F. Benson. 1986. The Frontal Lobes. New York: Raven.
Vargha-Khadem, Faraneh, K. E. Watkins, C. J. Price, J. Ashburner, K. J. Alcock, A. Connelly, R. S. Frackowiak, K. J. Friston, M. E. Pembrey, M. Mishkin, D. G. Gadian, and R. E. Passingham. 1998. Neural basis of an inherited speech and language disorder. PNAS USA 95: 12695–700.
Watkins, Kate, F. Vargha-Khadem, J. Ashburner, R. E. Passingham, A. Connelly, K. J. Friston, R. S. Frackowiak, M. Mishkin, and D. G. Gadian. 2002. MRI analysis of an inherited speech and language disorder: Structural brain abnormalities. Brain 125: 465–78.

BASIC LEVEL CONCEPTS


A concept is a mental representation that allows people to
pick out a group of equivalent things or a category (see categorization). For example, people use their concept of dog to pick
out members of the category of things that are called dogs.
Concepts are also organized into hierarchical taxonomies,
or sequences of progressively larger categories, in which each
category includes all the previous ones. For example, an object
driven on a highway with four wheels and a top that folds back
can be called a convertible, a car, or a vehicle. The category car is
more general than convertible because it includes other objects
(e.g., station wagons) as well as the members of convertible. The
category vehicle is more general than convertible and car because
it contains other objects (e.g., trucks) as well as the members of
these categories.
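The inclusion relations in such a hierarchy can be stated directly as set containment. The specific exemplars below are hypothetical fillers around the entry's own category examples:

```python
# A taxonomy as nested sets of members: each more general category
# contains every member of the categories below it.
# Exemplar names are hypothetical illustrations.

taxonomy = {
    "convertible": {"roadster"},
    "car":         {"roadster", "station wagon"},
    "vehicle":     {"roadster", "station wagon", "truck"},
}

def is_more_general(general, specific):
    """A category is more general if it includes all of the other
    category's members (and, at a higher level, some additional ones)."""
    return taxonomy[specific] < taxonomy[general]   # proper subset

# is_more_general("vehicle", "car") and is_more_general("car", "convertible")
# hold; is_more_general("convertible", "car") does not.
```

The proper-subset test captures exactly the asymmetry described above: *vehicle* contains everything *car* contains plus trucks, and *car* contains everything *convertible* contains plus station wagons.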
Strong evidence from cognitive psychology (Rosch et al.
1976) and anthropology (Berlin 1992) suggests that one level of
such hierarchies is cognitively privileged. Eleanor Rosch and
colleagues (1976) used a wide range of converging methods
that singled out the basic level as playing a central role in many
categorization processes. For example, the category level represented by chair and dog is typically considered the basic level, in
contrast to more general superordinate concepts, such as furniture and animal, and more specific subordinate concepts, such
as recliner and labrador retriever.
Basic level concepts have advantages over other concepts.
Pictures of objects are categorized faster at the basic level than
at other levels (Jolicoeur, Gluck, and Kosslyn 1984). As noted
by Rosch and her colleagues, people primarily use basic level
names in naming tasks, and the basic level is the highest level

121

Basic Level Concepts

Bilingual Education

for which category members have similar overall shape (cf. car
versus vehicle). Children learn basic level concepts sooner than
other concepts (Brown 1958; Horton and Markman 1980). Basic
level advantages are found in many other domains, including
environmental scenes (Tversky and Hemenway 1983), social
categories (Cantor and Mischel 1979), and actions (Morris and
Murphy 1990).
One explanation for the advantages of basic level categories
over other categories is that they are more differentiated (Rosch
et al. 1976; Murphy and Brownell 1985). Members of basic level
categories have many features in common. These features are
also distinct from those of other categories at this level. In contrast, although members of more specific, subordinate categories (e.g., sports car) have slightly more features in common than
do those of basic level categories, many of these features are not
distinctive. That is, members of a subordinate category share
their features with other subordinates (e.g., members of sports
car share a number of features with other subcategories of car).
In contrast, the members of more general, superordinate categories (e.g., vehicle) have few common features.
Differentiation explains the basic level advantage because
it reflects a compromise between two competing functions of
concepts. Categories should be informative so that one can draw
inferences about an entity on the basis of its category membership. Emphasizing this function leads to the formation of large
numbers of categories with the finest possible discriminations
between categories (Rosch et al. 1976, 384). However, the formation of categories should only preserve important differences
between them that are practical: "It is to the organism's advantage not to differentiate one stimulus from others when that differentiation is irrelevant to the purposes at hand" (ibid.). This function
counteracts the tendency to create large numbers of categories
and reflects the principle of cognitive economy (Rosch et al.
1976). Overall, basic level categories have an advantage because
they are relatively general and informative, whereas superordinate categories, though general, are not informative, and subordinate categories, though informative, are not general.
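The compromise that differentiation captures can be shown with a toy score. The binary features below are hypothetical illustrations, not the empirical feature norms Rosch and colleagues collected: informativeness is the number of features category members share, and distinctiveness is reduced by every feature also shared with a sister category at the same level.

```python
# Toy differentiation score: features shared within a category
# (informativeness) minus features also shared with a sister category
# at the same level (lost distinctiveness). Feature sets are
# hypothetical illustrations, not empirical norms.

features = {
    "vehicle":    {"moves"},
    "furniture":  {"has-legs"},                        # sister of vehicle
    "car":        {"moves", "wheels", "engine", "doors",
                   "windshield", "trunk"},
    "truck":      {"moves", "wheels", "engine", "cab",
                   "cargo-bed"},                       # sister of car
    "sports car": {"moves", "wheels", "engine", "doors",
                   "windshield", "trunk", "fast", "low-slung"},
    "sedan":      {"moves", "wheels", "engine", "doors",
                   "windshield", "trunk", "roomy"},    # sister of sports car
}

def differentiation(category, sister):
    informative = len(features[category])              # common features
    overlap = len(features[category] & features[sister])  # non-distinctive ones
    return informative - overlap

# The basic level wins the compromise:
# vehicle: 1 - 0 = 1;  car: 6 - 3 = 3;  sports car: 8 - 6 = 2.
```

The superordinate is distinctive but uninformative, the subordinate informative but not distinctive, and the basic level scores highest, mirroring the verbal account above.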
The basic level may change with expertise in a way that is
consistent with the differentiation explanation. For example,
James Tanaka and Marjorie Taylor (1991) investigated expertise effects on the basic level in expert dog breeders and birdwatchers. Using a number of tasks, they tested each expert in
both the dog and bird domains. For instance, in a speeded categorization task, experts in their novice domain were fastest at the
basic level and slowest at the subordinate level (as Rosch et al.
1976 found). However, in their area of expertise, categorization
was equally fast at the basic and subordinate levels. (For more detailed reviews of this literature, see Lassaline, Wisniewski, and Medin 1992; Murphy and Lassaline 1997.)
Edward Wisniewski
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Berlin, Brent. 1992. Ethnobiological Classification: Principles of Categorization of Plants and Animals in Traditional Societies. Princeton, NJ: Princeton University Press.
Brown, Roger. 1958. How shall a thing be called? Psychological Review 65: 14–21.
Cantor, Nancy, and Walter Mischel. 1979. Prototypes in person perception. In Advances in Experimental Social Psychology, ed. L. Berkowitz, 452. New York: Academic Press.
Horton, Marjorie, and Ellen Markman. 1980. Developmental differences in the acquisition of basic and superordinate categories. Child Development 51: 708–15.
Jolicoeur, Pierre, Mark Gluck, and Steven Kosslyn. 1984. Pictures and names: Making the connection. Cognitive Psychology 16: 243–75.
Lassaline, Mary, Edward Wisniewski, and Douglas Medin. 1992. Basic levels in artificial and natural categories: Are all basic categories created equal? In Percepts, Concepts, and Categories: The Representation and Processing of Information, ed. B. Burns, 328–80. North Holland: Elsevier.
Morris, Michael, and Gregory Murphy. 1990. Converging operations on a basic level in event taxonomies. Memory and Cognition 18: 407–18.
Murphy, Gregory, and Hiram Brownell. 1985. Category differentiation in object recognition: Typicality constraints on the basic category advantage. Journal of Experimental Psychology: Learning, Memory and Cognition 11: 70–84.
Murphy, Gregory, and Mary Lassaline. 1997. Hierarchical structure in concepts and the basic level of categorization. In Knowledge, Concepts, and Categories, ed. K. Lamberts and D. Shanks, 93–131. London: Psychology Press.
Rosch, Eleanor, Carolyn Mervis, Wayne Gray, David Johnson, and Penny Boyes-Braem. 1976. Basic objects in natural categories. Cognitive Psychology 8: 382–439.
Tanaka, James, and Marjorie Taylor. 1991. Object categories and expertise: Is the basic level in the eye of the beholder? Cognitive Psychology 23: 472–82.
Tversky, Barbara, and Kathy Hemenway. 1983. Categories of environmental scenes. Cognitive Psychology 15: 121–49.

BILINGUAL EDUCATION
In principle, bilingual education is just the use of two languages
in instruction in a school setting. However, in practice, it covers
a wide array of programs. Bilingual education programs range
from high-status schools promoting international education
through prestige languages, such as English and French, to
highly marginalized schools devoted to the bare-bones schooling of immigrant children.
In its weak form, bilingual education may involve transitional or subtractive bilingualism, leading to monolingualism
(e.g., teaching Spanish-speaking children English to ensure their
assimilation and integration into mainstream America). In its
strong form, bilingual education aims at maintaining the language of a minority child in addition to the learning of a majority language, thus leading to additive bilingualism. Heritage
bilingual schools often practice an ideal version of additive
bilingualism, which stresses bicultural education in addition to
bilingualism.
One of the central concerns of bilingual education is to
address the educational needs/performance of minority children by maintaining their mother tongue. The proponents of
maintaining the mother tongue claim that such maintenance
is critical for linguistic and cognitive growth of the child, school
performance, psychological security, ethnic and cultural identity (see ethnolinguistic identity), self-esteem, and many
other positive personal and intellectual characteristics. The supporters of transitional bilingualism claim that only transitional
bilingualism is capable of saving children from poor academic
performance and allowing for assimilation.
Bilingual education has been steadily gaining strength around
the globe since the era of decolonization. It is further fueled by
the growth of ethnic awareness and the movement to prevent
the extinction of languages of the world. Many countries,
particularly in Europe, that earlier fostered monolithic ideology
have begun to recognize their diversity as a source of social and
economic capital, thus marking a new era of bilingual/multilingual education (e.g., in the United Kingdom, France, and Spain,
among others; see language policy).
Multilingual countries of Asia and Africa continue to nurture
a long tradition of bilingual education. Since 1956, India, for
example, has had as official policy a three-language formula in
education. This formula calls for multilingual education. In addition to learning the two national languages, Hindi and English, students are expected to learn a third or a fourth language.
Because of its deep-rooted association with immigrants, bilingual education in the United States is particularly notable for its
turbulent history. On June 2, 1998, the people of California voted
to end a tradition of bilingual education by passing Proposition
227, which gives immigrant children just one year to learn English
before they enroll in regular classes. Many school systems in
other states are waiting either to put in place severe restrictions
on bilingual instruction or to eliminate it completely by passing English-only policies (for details, see Genesee 2006).
While bilingual education is often associated with the education of minority students (e.g., in the United States), the
Canadian immersion programs in French devoted to the
majority Anglophones serve as a model of bilingual education
for majority students (for details, see Genesee 2006).
Tej K. Bhatia
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Crawford, James. 2004. Educating English Learners. Los Angeles: Bilingual
Educational Services.
Genesee, Fred. 2006. What do we know about bilingual education for
majority-language students? In Handbook of Bilingualism, ed. Tej K.
Bhatia and William C. Ritchie, 547–76. Oxford: Blackwell.

BILINGUALISM, NEUROBIOLOGY OF
Neurobiology of bilingualism refers to the study of the cerebral organization of multiple languages in the human brain.
From early accounts of selective loss and recovery in bilingual
aphasia (i.e., loss of language due to a brain lesion) to recent
electrophysiological and functional neuroimaging studies, issues
inherent to the bilingual brain have inspired researchers for more
than a century. Investigations into the neural basis of bilingualism focus not only on how two languages (L1 and L2) are represented in the brain (i.e., the anatomical location) but also on how
these languages are processed. Indeed, the main assumption is
that a weaker L2 may be processed through brain mechanisms
that may differ from those underlying L1 processing. After a brief
historical overview, I illustrate findings inherent to the representation of languages, followed by a section focusing on language
processing.

Historical Overview
From a historical standpoint, the first approach to studying
brain organization for bilingualism was the study of bilingual
aphasics. Several clinical aphasia studies have shown that bilingual aphasics do not necessarily manifest the same language
disorders with the same degree of severity in both languages. In
some cases, L1 is recovered better than L2. In other cases, the
converse obtains. Since the landmark 1895 study of the French
neurologist Albert Pitres, who was the first to draw attention to
the relative frequency of differential language recovery following aphasia in bilinguals, many different recovery patterns have
been described: from selective recovery of a given language
(i.e., one language remains impaired while the other recovers); parallel recovery of both languages; successive recovery
(i.e., after the recovery process of one language, the other language recovers); alternating recovery (i.e., the language that
was first recovered will be lost again due to the recovery of the
language that was not first recovered); and alternating antagonistic recovery (i.e., on one day the patient is able to speak in
one language while on the next day only in the other); to the
pathological mixing of two languages (i.e., the elements of the
two languages are involuntarily mixed during language production). The study of bilingual aphasia is important because
it indicates the cortical regions necessary for performance of a
linguistic task (e.g., speaking in L1).
Clinical case reports indicate a set of relevant factors and
have led to theoretical conjectures. However, at present we lack a
causal account for the various recovery patterns and cannot predict clinical outcomes. Concerning the possible factors involved,
no correlation has been found between the pattern of recovery
and neurological, etiological, experiential, or linguistic parameters: not site, size, or origin of lesion, type or severity of aphasia,
type of bilingualism, language structure type, or factors related to
acquisition or habitual use.
Theoretical conjectures arising from the study of bilingual
aphasia developed along two distinct lines, a more traditional
approach and a more dynamic approach. The more traditional
localizationist view argued, for instance, that the specific loss
of one language would occur because the bilingual's languages are represented in different brain areas or even in different hemispheres, and hence, a focal brain lesion within a language-specific area may alter only that specific language, leaving the
other language intact. In contrast, according to the dynamic view, selective recovery arises from compromise to the language system rather than from damage to differential brain representations. A selective loss of a language arises because of increased
inhibition, that is, of a raised activation threshold for the affected
or lost language, or even because of an imbalance in the means
to activate the language due to the lesion. It is worth underlining
that Pitres himself proposed a dynamic explanation of language
recovery in bilingual aphasics: Language recovery could occur
only if the lesion had not entirely destroyed language areas but
temporarily inhibited them through a sort of pathological inertia. In Pitres's opinion, the patient generally first recovered the
language to which she/he was premorbidly more exposed (not
necessarily the native language) because the neural elements
subserving the more exposed language were more strongly
associated.

The dynamic view not only explains the so-called selective recovery of a language but can also explain many reported
recovery patterns in bilingual aphasia. As outlined by M. Paradis
(1998), a parallel recovery would then occur when both languages are inhibited to the same degree. When inhibition affects
only one language for a period of time and then shifts to the other
language (with disinhibition of the prior inhibited language) a
pattern of alternating antagonistic recovery occurs (see Green
1986). Selective recovery would occur if the lesion permanently
raised the activation threshold for one language, and pathological mixing among languages would occur when languages could
no longer be selectively inhibited.
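The dynamic account lends itself to a schematic model. The numbers and the classification rule below are arbitrary illustrations, not Paradis's formal proposal: each language is usable when its available activation reaches its threshold, and a lesion is modeled as raising thresholds rather than erasing representations.

```python
# Schematic sketch of the activation-threshold account of bilingual
# aphasia (after Paradis 1998). Values are arbitrary illustrations;
# a lesion raises thresholds instead of destroying representations.

THRESHOLD = 1.0  # baseline activation needed to use a language

def usable(activation, threshold_increase):
    """A language can be used when activation reaches its (raised) threshold."""
    return activation >= THRESHOLD + threshold_increase

def recovery_pattern(l1_increase, l2_increase, activation=1.0):
    """Classify the outcome when a lesion raises each language's
    activation threshold by the given amount."""
    l1 = usable(activation, l1_increase)
    l2 = usable(activation, l2_increase)
    if l1 and l2:
        return "parallel recovery"
    if l1 or l2:
        return "selective recovery"
    return "neither language recovered"

# recovery_pattern(0.0, 0.0) -> "parallel recovery"
# recovery_pattern(0.0, 0.5) -> "selective recovery" (L1 only)
# recovery_pattern(0.5, 0.5) -> "neither language recovered"
```

On this sketch, the same lesion can produce different clinical pictures depending solely on how far each language's threshold is raised, with no difference in where the languages are represented.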
In general, the aphasia data have provided a rich source of evidence on the range of language disorders and language recovery
patterns in bilinguals. However, there are limitations to the generalizability of such data to neurologically healthy individuals.
Concerns about the lesion-deficit approach include the inability
to determine whether specific language deficits are the result of
damage to a specialized language component at the lesion site,
or if the damaged area is simply part of a larger neural network
that mediates a given component of language. Likewise, aphasia
data do not allow one to separate the effects of injury from those
of neural plasticity or a reallocation of healthy cortical tissue
for the mediation of language functions lost as a result of brain
injury. Nevertheless, studying the effects of brain damage on linguistic function in bilinguals has led to a number of interesting
observations about the nature and course of language impairment and recovery, which in turn has stimulated researchers to
apply functional neuroimaging techniques to the investigation of
bilingual language processing.

The Neural Representation of L2


Since its inception, neuroimaging work on bilinguals has been
motivated by the same localizationist questions that run through
the bilingual aphasia literature: whether multiple languages are
represented in overlapping or separate cerebral systems. In addition, neuroimaging and neurophysiological data on this issue
have often been influenced by possible biases, such as lack of
information on the age of acquisition and degree of proficiency
in the experimental subjects. Both these variables indeed exert
profound influences on the brain organization of L2.
According to psycholinguistic evidence grounded on the
concept of universal grammar, the age of L2 acquisition is
expected to be crucial for grammatical processing. In fact, grammatical processing may be particularly deficient when L2 is
learned later in life. On the other hand, lexical-semantic processing seems to be less affected by age of acquisition, depending instead on the degree of L2 proficiency.
It is likely that other factors, such as usage and exposure to a
given language, can affect brain plasticity mechanisms, leading
to modifications of the neural substrate of language. I consider
separately how these variables may influence L2 processing.
An ongoing issue in neurobiology concerns the fact that the
acquisition of language seems to depend on appropriate input
during a biologically based critical period. It has also been suggested that L2 learning may be subject to such crucial time-locked constraints. However, L2 can be acquired at any time in
life, although L2 proficiency is rarely comparable to that of L1 if
L2 is acquired beyond the critical period. The dependence of
grammatical processing upon these age effects was confirmed
by early event-related potentials (ERP) studies (Weber-Fox and
Neville 1996) and by recent functional brain imaging studies
(Wartenburger et al. 2003). In particular, I. Wartenburger and
colleagues reported no differences in brain activations for grammar in L1 and L2 in very early (from birth) highly proficient bilinguals. On the other hand, late highly proficient bilinguals were
in need of additional neural resources in order to achieve a comparable nativelike performance in grammatical tasks. The same
did not apply to lexical-semantic processing, for which the only
difference in the pattern of brain activity in bilinguals appeared
to depend upon the level of attained proficiency.
As mentioned, the degree of language proficiency seems to
exert a more pervasive influence on the lexical-semantic level
of L2. According to psycholinguistics, during the early stages
of L2 acquisition there may be a dependency on L1 to mediate
access to meaning for L2 lexical items. As L2 proficiency grows,
this dependency disappears. Higher levels of proficiency in L2
produce lexical-semantic mental representations that more
closely resemble those constructed in L1. According to D. W.
Green's convergence hypothesis (2003), any qualitative differences between native and L2 speakers disappear as proficiency
increases. The convergence hypothesis claims that the acquisition of L2 arises in the context of an already specified or partially
specified system and that L2 will receive convergent neural representation within the representations of the language learned
as L1.
Whether word or sentence production or word completion was used as the experimental task, neuroimaging studies reported
common activations in the left hemisphere when the degree
of L2 proficiency was comparable to that of L1. This happened
irrespective of the differences in orthography, phonology, and
syntax among languages. Conversely, bilinguals with low proficiency in L2 engaged additional brain activity, mostly in the left
prefrontal cortex. Similar results were found in studies that did
not directly address lexical retrieval, but employed judgment
tasks in the lexical-semantic domain.
It is worth underlining that the activity found in the left prefrontal cortex is located anteriorly to the classical language areas
and, thus, not directly linked to language functions but rather
linked to other cognitive functions, such as cognitive control
and attention. Crucially, the engagement of the left prefrontal
cortex was reported for bilinguals with a low degree of L2 proficiency and/or exposure. One may conclude that the differences
found between high- and low-proficiency bilinguals are not due to
anatomical differences of L2 brain representations but instead
reflect the cognitive dynamics of processing a weaker L2 as compared to L1.

Neural Aspects of L2 Processing


One of the most salient aspects and one specific to bilingual language processing is language control. Language control refers to
the fact that there may be competition between languages and
that this competition is resolved by actively inhibiting the so-called non-target language. Consider that individuals can perform
different actions on the same stimulus. For instance, a bilingual
can name a presented word in L1 or translate it into L2. The task
goal must be maintained in the face of conflicting goals, and the
various actions required to perform the task must be coordinated
(e.g., retrieve or compute the word's phonology from its spelling
or retrieve the meaning of the word and select its translation).
Once a given task is established, however (e.g., speaking in L2),
competition with alternative possible tasks (speaking in L1) may
be resolved more automatically. Where individuals wish to alter
their goal (for example, to switch from speaking in one language
to speaking in another), they must disengage from the current
goal and switch to the new goal. Lexical concepts matching the
intended language must be selected and produced, while those
not matching the intended language must be inhibited through
language control mechanisms. For instance, in word production
studies, language control would inhibit potential interferences
from the non-target language. Psycholinguistic evidence points
to the fact that such interference is more common during production in a language that is mastered to a lower degree of proficiency, for example, a weak L2. In that case, when
asked to name a picture in L2, the bilingual speaker has to inhibit
L1 in order to prevent a prepotent interference from L1.
Functional neuroimaging studies using experimental tasks
like picture naming, switching, and translating have elegantly shown that these tasks are paralleled by the activation of
a set of brain areas that are not directly linked to language representation: the left prefrontal cortex, the left caudate nucleus, and the anterior cingulate cortex.
The engagement of these areas is even more pronounced when subjects have to process a weak L2. The functions generally ascribed
to the prefrontal cortex comprise working memory, response
inhibition, response selection, and decision making, while the
left caudate has been reported to be important for language selection
and set switching. The anterior cingulate cortex is related to such
functions as conflict monitoring, attention, and error detection.
The engagement of these structures thus provides cerebral evidence of the cognitive processes inherent in
bilingual language processing: competition and control between
languages.

Conclusions
Extensive reviews focusing on the bilingual brain as studied with
functional neuroimaging are available in the literature to which
the reader is referred (Abutalebi, Cappa, and Perani 2005; Perani
and Abutalebi 2005; but see also Paradis 2004 for a critical
viewpoint). In broad outlines, functional neuroimaging has shed
new light on the neural basis of L2 processing and on its relationship to native language (L1). First of all, the long-held assumption that L1 and L2 are necessarily represented in different brain
regions or even in different hemispheres in bilinguals has not
been confirmed. On the contrary, functional neuroimaging has
elegantly outlined that L1 and L2 are processed by the same
neural devices. Indeed, the patterns of brain activation associated with tasks that engage specific aspects of linguistic processing are remarkably consistent among different languages, which
share the same brain language system. These relatively fixed
brain patterns, however, are modulated by a number of factors.
Proficiency, age of acquisition, and exposure can affect the cerebral representations of each language, interacting in a complex
way with the modalities of language performance.

Consider as an example the complex process of L2 acquisition.
This process may be considered as a dynamic process, requiring
additional neural resources in the early stages of L2 acquisition.
These additional neural resources are mostly found within the
left prefrontal cortex (anterior to the classical language
areas), the left basal ganglia, and the anterior cingulate cortex and seem to be associated with the greater control demands
when processing a weaker L2. However, once the L2 learner
gains sufficient L2 proficiency, the neural representation of L2
converges to that of L1, at least at the macroanatomical level.
At this stage, one may suppose that L2 is processed in the same
fashion as L1, as psycholinguistic evidence points out (Kroll and
Stewart 1994).
This latter point is an important one because many functional
neuroimaging studies did not take into consideration linguistic
and psycholinguistic evidence (Paradis 2004). Yet evidence from
neuroimaging should be integrated with the psycholinguistic
findings to the mutual advantage of both research traditions.
Integrating these findings with the psycholinguistic theory may
allow us to demonstrate the biological consistency of different
models, organize and consolidate existing findings, and generate novel insights into the nature of the cerebral organization of
bilingualism.
Jubin Abutalebi
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Abutalebi, J., S. F. Cappa, and D. Perani. 2005. Functional neuroimaging
of the bilingual brain. In Handbook of Bilingualism: Psycholinguistic
Approaches, ed. J. F. Kroll and A. De Groot, 497–515. Oxford: Oxford
University Press.
Green, D. W. 1986. Control, activation and resource. Brain and
Language 27: 210–23.
———. 2003. The neural basis of the lexicon and the grammar in L2
acquisition. In The Interface Between Syntax and the Lexicon in Second
Language Acquisition, ed. R. van Hout, A. Hulk, F. Kuiken, and R.
Towell, 197–218. Amsterdam: John Benjamins.
Kroll, J. F., and E. Stewart. 1994. Category interference in translation
and picture naming: Evidence for asymmetric connections between
bilingual memory representations. Journal of Memory and Language
33: 149–74.
Paradis, M. 1998. Language and communication in multilinguals. In
Handbook of Neurolinguistics, ed. B. Stemmer and H. Whitaker, 417–30.
San Diego, CA: Academic Press.
———. 2004. A Neurolinguistic Theory of Bilingualism. Amsterdam and
Philadelphia: John Benjamins.
Perani, D., and J. Abutalebi. 2005. Neural basis of first and second language processing. Current Opinion in Neurobiology 15: 202–6.
Wartenburger, I., H. R. Heekeren, J. Abutalebi, S. F. Cappa, A. Villringer,
and D. Perani. 2003. Early setting of grammatical processing in the
bilingual brain. Neuron 37: 159–70.
Weber-Fox, C. M., and H. J. Neville. 1996. Maturational constraints on
functional specialization for language processing: ERP and behavioral
evidence in bilingual speakers. Journal of Cognitive Neuroscience
8: 231–56.

BILINGUALISM AND MULTILINGUALISM


Growing recognition that bilingualism/multilingualism is
not an exceptional or irregular phenomenon but, in fact, a
growing global one marks a new challenge and a shift
for linguistic research. For instance, the traditional domain of
psycholinguistic research, which has been the monolingual
child, is now shifting to the bilingual child and multilingual language processing (see, e.g., de Bot and Kroll 2002, 133). What is
bilingualism and who is a bilingual? The questions of identification and measurement that are considered irrelevant in the context of monolingualism become more pertinent and urgent in
the context of bilingual language acquisition, production, comprehension, and processing.

Bilingualism/Multilingualism: Two Conceptual Views


Is a bilingual a composite of two monolinguals? Does the bilingual brain comprise two monolinguals crowded into a limited
space? For some researchers, the answer to these questions has
traditionally been affirmative. Such a view of bilingualism is
termed the fractional view. According to this view, monolingualism holds a key to the understanding of bilingualism. However,
a more balanced and accurate picture of bilingualism emerges
from the holistic view of bilingualism. According to this view, neither is a bilingual person the mere sum of two monolinguals nor
is the bilingual brain a composite of two monolingual brains. The
reason for this position is that the cooperation, competition, and
coexistence of the bilingual's two languages make a bilingual a
very complex and colorful individual (for details, see Grosjean
1989).

Defining and Measuring Bilingualism: Input Conditions and Input Types


Defining and measuring bilingualism is a very complex and
uphill task due to the number and types of input conditions. For
instance, while a monolingual child receives input from his or
her parents only in one language in all settings, a bilingual child
is provided input in at least two separate languages (e.g., one-parent one-language input; one-place one-language input) in
addition to a code-mixed input in a variety of environments. In
addition, biological (age of acquisition), sociopsychological, and
other nonlinguistic factors lead to a varying degree of bilingual
language competencies. Therefore, it is natural that no widely
accepted definition or measure of bilingualism exists. Instead, a
rich range of scales, dichotomies, and categories are employed
to describe bilinguals. A bilingual who can speak and understand two languages is called a productive bilingual, whereas a
receptive bilingual is an individual who can understand but cannot speak a second language. A child who has acquired two languages before the age of five at home (natural setting) is called a
simultaneous or early bilingual, whereas those who learn a second language after the age of five, either at home or in a school setting, are described as late or sequential bilinguals. Other labels
and dichotomies, such as fluent versus nonfluent, balanced versus nonbalanced, primary versus secondary, and partial versus
complete, are based upon different types of language proficiency
(speaking, writing, listening) or on an asymmetrical relationship
between the two languages.
Compound versus coordinate bilingualism refers to the differential processing of language in the brain. Compound bilinguals
process two languages using a common conceptual system,
whereas coordinate bilinguals keep language separation at both
conceptual and linguistic levels (see bilingualism, neurobiology of). These labels and dichotomies demonstrate the complex attributes of bilingualism that make the task of defining and
measuring bilinguals a daunting one. A working definition of
bilingualism is offered by Leonard Bloomfield ([1933] 1984, 53),
who claimed that a bilingual is one who has native-like control
over two languages (i.e., balanced bilingual).

Bilinguals' Language Organization


Bilinguals' organization of a verbal repertoire in the brain is also
very different from that of monolinguals. When a monolingual
decides to speak, his/her brain does not have to make complex
decisions concerning language choice as does the bilingual. Such
a decision-making process for a monolingual is restricted at best
to the choice of a variety or style (informal vs. formal).
It is difficult for monolinguals to imagine that a multilingual person, such as this author, has to make a choice from
among four languages and their varieties while communicating
within his family in India. The language choice is not a random
one but is unconsciously governed by a set of factors. The author
is a speaker of Multani, Punjabi, Hindi, and English. Normally,
he used Multani to talk with his brothers and parents while growing up. He speaks Punjabi with two of his sisters-in-law, Hindi
with his nephews and nieces, and English with his children. In
short, each language in his brain is associated with a well-defined domain. A violation of such a domain allocation leads not only to communication mishaps but also
to strained interpersonal relationships. In addition to the language-person domain allocation, other factors such as topics and emotions
determine his language choice. While discussing an academic
topic, he switches from Multani to English with his brothers and
from English to Hindi with his children if the context is emotive.
In short, the determinants of language choice are quite complex among bilinguals, and this, in turn, presents evidence that
bilinguals' organization of their verbal repertoire is quite different from monolinguals'. It is interesting to note that
language choice (or language negotiation) is a salient feature of
bilingual linguistic competence and performance. The complexity of language choice and its unconscious determinants
pose a serious challenge for the psycholinguistic theory of bilingual language production.

Individual, Societal, and Political Bilingualism


Bilingualism can be viewed from individual, societal, and political perspectives. In a bilingual family, not all members are always
bilinguals. Parents may be monolingual, while children may be
bilinguals or vice versa. Societal factors such as the overt prestige
of a language (or the presence of a majority language) often lead
to individual or family bilingualism. However, individual or family bilingualism can persist even without societal support. Such
bilingualism can be termed covert prestige bilingualism, which is
often motivated by the consideration of group identity.
In those societies of Asia or Africa where bilingualism exists
as a natural phenomenon as the result of a centuries-long tradition of bilingualism, an ethnic or local language becomes a Low
variety, that is, it is acquired at home and/or in an informal setting outside school (e.g., on a playground), whereas a language of
wider communication or a prestige language functions as a High
variety, which is learned formally in schools. In a diglossic
society, a single language develops two distinct varieties, the
L- and the H-variety.
People become bilingual for a wide variety of reasons: immigration, jobs, marriage, or religion, among others. These factors
create a language contact situation but do not always lead to
stable bilingualism. For instance, it is well known that immigrant
communities in the United States often give up their mother
tongue in favor of English and become monolingual after a brief
period of bilingualism.
The classification of countries as monolingual, bilingual, or
multilingual often refers to the language policies of a country,
rather than to the actual incidence of bilingualism or multilingualism. Canada is a bilingual country in the sense that its language policies are receptive to bilingualism. It makes provision
for learning French in those provinces that are Anglophone.
Such a provision is called territorial bilingualism. However, it
does not mean that everybody in Canada is a bilingual, nor does
it mean that the country guarantees individual bilingualism (personality bilingualism) outside territorial bilingualism. In multilingual countries such as India, where 20 languages are officially
recognized, the government language policies are receptive to
multilingualism. India's three-language formula is the official
language policy of the country. In addition to learning Hindi and
English, the conational languages, schoolchildren can learn a
third language, spoken outside their state.

Bilingual Verbal Behavior: Language Separation and Language Integration


Language separation and language integration are the two most
salient characteristics of bilinguals and thus of the bilingual
brain. Whenever deemed appropriate, bilinguals can turn off one
language and turn on the other language. This enables them to
switch from one language to another with the ease of a driver of
a stick-shift car shifting into different gears whenever necessary.
The fractional view of bilingualism can account for such a verbal
behavior of bilinguals. In addition to keeping the two linguistic
systems separate, bilinguals can also integrate the two systems
by mixing two languages. Language mixing is a far more complex
cognitive ability than language separation. The holistic view of
bilingualism can account for these two types of competencies.
Language mixing comes naturally to bilinguals. Therefore, it is
not surprising that such mixed languages as Spanglish, Hinglish,
Japlish, and Germlish are emerging around the globe.
Contrary to the claims of earlier research, the grammar of language mixing is complex yet systematic. The search for explanations of cross-linguistic generalizations about the phenomenon
of code mixing (particularly, code mixing within sentences) in
terms of independently justified principles of language structure
and use has taken two distinct forms. One approach is formulated
in terms of the theory of linguistic competence, for example, Jeff
MacSwan (2005). The other approach, as best exemplified by
the Matrix Language Frame (MLF) model (Myers-Scotton and
Jake 1995; see codeswitching), is grounded in the theory of
sentence production, particularly that of M. Garrett (1988) and
W. Levelt (1989) (see Bhatia and Ritchie 1996, 655–7 for discussion). For further development of these ideas and a critique, see
Bhatia and Ritchie (1996) and MacSwan (2005).

Effects of Bilingualism/Multilingualism
What is the effect of bilingualism/multilingualism on an individual, particularly on a child? The research on this question
is fundamentally driven by two hypotheses: the linguistic deficit hypothesis and the linguistic augmentation hypothesis.
According to the former, bilingual children suffer serious adverse linguistic and cognitive effects from bilingualism. Exposure to two
languages leads to semilingualism; that is, such children become deficient
in both languages, which in turn leads to other disabilities (e.g.,
stuttering) and cognitive impairments (low intelligence, mental
retardation, and even schizophrenia).
Such a hypothesis has become obsolete in light of the findings of the research driven by the linguistic augmentation
hypothesis. Solid on theoretical and methodological grounds,
research by Elizabeth Peal and Wallace E. Lambert (1962) put
to rest such claims of negative and frightening effects of bilingualism.
Their research and the findings of the succeeding research provide ample evidence that the negative conclusions of the earlier
research were premature and misguided due to theoretical
and methodological flaws. Contrary to the findings of the previous research, bilingual children exhibit more cognitive flexibility
than do monolinguals and perform better on verbal and nonverbal measures. Peal and Lambert's study, which was conducted in
Montreal, revolutionized research on bilingualism and multilingualism by highlighting a positive conception of bilinguals. Their
research has been replicated in many countries, confirming
the augmenting rather than subtracting effect of bilingualism.
Beyond this research, the economic, communicative (intergenerational and cross-cultural), and relational (building relations)
advantages of bilingualism are inarguable.

Conclusion
In short, bilingualism/multilingualism is a global phenomenon
that continues to gain further momentum in the age of globalization. It is a by-product of a number of biological, sociopsychological, and linguistic factors. These factors lead to individuals with
varying degrees of language competencies. Therefore, it is not surprising that defining and measuring bilingualism/multilingualism continues to be a challenging task. Bilinguals are complex
and colorful in the way they manage and optimize their linguistic
resources. For that reason, they are not a sum of two monolinguals.
Language mixing and shifting are two defining characteristics of
bilinguals. Current socio- and psycholinguistic research attempts
to account for these two salient properties of the bilingual brain.
Tej K. Bhatia
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Bhatia, Tej K., and William C. Ritchie, eds. 2006. The Handbook of
Bilingualism. Oxford: Blackwell. This book presents a multidisciplinary
and comprehensive collection of state-of-the-art research on bilingualism and multilingualism. Chapters 7 and 8 deal with bilingual production models, including the discussion of Garrett 1988 and Levelt 1989.
Bhatia, Tej K., and William C. Ritchie. 1996. Bilingual language mixing,
Universal Grammar, and second language acquisition. In Handbook
of Second Language Acquisition, ed. W. C. Ritchie and T. K. Bhatia,
627–82. San Diego, CA: Academic Press.
Bloomfield, Leonard. [1933] 1984. Language. Chicago: University of
Chicago Press.
de Bot, Kees, and Judith F. Kroll. 2002. Psycholinguistics. In An
Introduction to Applied Linguistics, ed. Norbert Schmitt, 133–49.
London: Arnold.
Edwards, John. 2006. Foundations of bilingualism. In Bhatia and Ritchie
2006, 7–31.
Garrett, M. F. 1988. Process in sentence production. In The
Cambridge Linguistic Survey. Vol. 3. Ed. F. Newmeyer, 69–96.
Cambridge: Cambridge University Press.
Grosjean, François. 1989. Neurolinguists, beware! The bilingual is not
two monolinguals in one person. Brain and Language 36: 3–15.
Hakuta, Kenji. 1986. Mirror of Language. New York: Basic Books. This
work offers an excellent multidisciplinary account of bilingualism in
general and bilingualism in the United States in particular. Among
other topics, it presents an excellent account of the linguistic deficiency and linguistic augmentation hypotheses.
Levelt, W. 1989. Speaking: From Intention to Articulation. Cambridge,
MA: MIT Press.
MacSwan, Jeff. 2005. Remarks on Jake, Myers-Scotton and Gross's
response: There is no Matrix Language. Bilingualism: Language and
Cognition 8.3: 277–84.
Myers-Scotton, Carol, and J. Jake. 1995. Matching lemmas in a bilingual
language competence and production model: Evidence from intrasentential code switching. Linguistics 33: 981–1024.
Peal, Elizabeth, and Wallace E. Lambert. 1962. Relation of bilingualism
to intelligence. Psychological Monographs 76: 1–23.
Ritchie, William C., and Tej K. Bhatia. 2007. Psycholinguistics. In
Handbook of Educational Linguistics, ed. Bernard Spolsky and Francis
Hult, 38–52. Oxford: Blackwell.

BINDING


In quantified logic, binding names the relation between a
quantifier and one or more variables, for example, ∀x and x
in ∀x[P(x) → Q(x)]. In linguistics, the term has been used in at
least three domains: first, for the relation between quantified
expressions and pronouns that referentially depend on them,
(1); second, for coreference, the relation between two referring
expressions with the same referent, (2), including hypothesized
empty pronouns, (2c); and third, in theories that assume transformations, for the relation between a dislocated phrase and its
trace, (3) (see government and binding theory):

(1) Every cat chased its tail.
(2) a. Sue hopes that she won.
    b. Edgar spoke for himself.
    c. Wesley called PRO to apologize.
(3) a. Which book did Kim read t?
    b. Antonia was promoted t.

Semantically, only (1) and (3) are clear instances of binding
(the pronouns/traces are interpreted like variables, and their
antecedents are often nonreferring), yet coreference is almost
universally subsumed under the binding label in linguistics.
All three binding relations are frequently represented by
coindexing the binder (or antecedent) and the bound element
(e.g., Every cat₆ chased its₆ tail), though richer, asymmetrical
representations have been proposed and are arguably required
for semantic interpretation.
Semantic binding relations are subject to a structural constraint, to a first approximation the same as in quantified
logic: The bindee must be contained in the sister constituent
to the binder, a relation usually called c(onstituent)-command
(see c-command). For movement relations, this amounts to the
ban on downward or sideways movement, the proper binding
condition, which is pervasive across languages. For quantifier-pronoun relations, it blocks sideways binding as in (4a) (neither
noun phrase [NP] c-commands the other), and upward binding
as in (4b) (the putative binder is c-commanded by the pronoun);
note that in both examples the pronouns have to be interpreted
as referentially independent of no one/actress:

(4) a. If no one is here, he's elsewhere.
    b. Her calendar showed that no actress had left early.

A systematic class of exceptions to the c-command requirement is found in so-called indirect binding, for example (5), where
the object can be bound from within the subject (sideways):

(5) Somebody from every city likes its beaches.

Unlike semantic binding, coreference among two NPs does not
require c-command:

(6) His/Jacques's teacher said that he/Jacques failed.

Yet certain prohibitions against coreference, for example, that
nonreflexive pronouns in English cannot corefer with expressions in the same finite clause, only regard NPs that c-command
the pronoun, (7) (similarly for nonpronominal NPs):

(7) Your mother/*You defended you.

Likewise, reflexive pronouns in English need an antecedent
that is not just within the same finite clause but also c-commands
them:

(8) She/*Her mother defended herself.

These conditions on the distribution of reflexives and nonreflexives restrict binding by quantified nominals as well and are
indiscriminately referred to as binding conditions.
While c-command seems relevant in binding conditions
cross-linguistically, other aspects, such as the number of morphological classes (reflexives, nonreflexives, etc.) or the size of
relevant structural domains, vary widely.
Daniel Büring
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Büring, Daniel. 2005. Binding Theory. Cambridge Textbooks in Linguistics.
Cambridge: Cambridge University Press.
Chomsky, Noam. 1981. Lectures on Government and Binding. Dordrecht,
the Netherlands: Foris Publications.
Dalrymple, Mary. 1993. The Syntax of Anaphoric Binding. Stanford,
CA: CSLI.
Kuno, Susumu. 1987. Functional Syntax: Anaphora, Discourse and
Empathy. Chicago: University of Chicago Press.


BIOLINGUISTICS
Biolinguistics is the study of the biology of language. The modern biolinguistic program was initiated by Noam Chomsky in
the 1950s (Chomsky 2006), although it has much earlier historical antecedents (see cartesian linguistics). It investigates
the form and function of language, the development (ontogeny)
of language, and the evolution of language (phylogeny), among
other topics. Biolinguists study such questions as the following:
(1) What is knowledge of language?
(2) How does knowledge of language develop in the child?
(3) How does knowledge of language evolve in the species?
To answer the question of what knowledge of language is (1),
biolinguists have proposed various generative grammars,
that is, explicit models of the faculty of language.
The study of generative grammars draws from a variety
of areas, including syntax, semantics, the lexicon, morphology, phonology, and articulatory and acoustic
phonetics.
In addition, the biolinguist investigates the neurological mechanisms underlying the faculty of language (see
syntax, neurobiology of; semantics, neurobiology of; morphology, neurobiology of; phonetics and
phonology, neurobiology of). Such studies of brain and
language include studies of expressive and receptive aphasia, split brain patients, neuroimaging, and the electrical
activity of the brain.
The biolinguist also studies performance models (language
processing), including parsing, right hemisphere language processing, left hemisphere language processing, and speech perception.


To answer the question of how knowledge of language develops in the child (2), one may visualize this as the study of the
language acquisition device:
experience → [ ] → language (English, Japanese, etc.)

where the box represents what the child brings to language learning. We ask how the child maps experience (primary linguistic data)
to a particular language. It is posited that the child moves through a
number of states from an initial state, corresponding to the child's
genetic endowment, to a final state, corresponding to a particular
language. For each subarea discussed, biolinguistics studies the
development or growth of language, often referred to as language
acquisition (e.g., syntax, acquisition of; semantics, acquisition of; and phonology, acquisition of).
The initial state may be characterized by a universal grammar, which is a set of general principles with parameters that
are set by experience, thus accounting for the variation across
languages. For example, there are general principles of word
order that permit some variation; for example, the verb precedes
the object (English) or the verb follows the object (Japanese) (see
x-bar theory). Such a theory is referred to as a principles
and parameters theory. (For specific subareas, see syntax,
universals of; semantics, universals of; morphology,
universals of; and phonology, universals of.) For some
different parametric proposals, see the microparametric approach
of Richard Kayne (2004) and Charles Yang's (2004) work on competitive theories of language acquisition. (From other perspectives,
see universals, nongenetic; absolute and statistical
universals, implicational universals, and typological
universals.)
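The word-order example given earlier (the verb precedes the object in English but follows it in Japanese) can be sketched as a single parameter switch. The following toy sketch is purely illustrative and not a model from this entry; the function name `linearize` and the two-word "grammar" are hypothetical choices made for the example.

```python
# Toy sketch (illustrative only) of one principles-and-parameters idea:
# a single "head-direction" parameter determines whether the verb
# precedes its object (English-like VO) or follows it (Japanese-like OV).

def linearize(verb, obj, head_initial=True):
    """Order a verb and its object according to the head-direction parameter."""
    return f"{verb} {obj}" if head_initial else f"{obj} {verb}"

# English-like setting: verb precedes the object.
print(linearize("read", "the book", head_initial=True))   # read the book
# Japanese-like setting: verb follows the object.
print(linearize("yomu", "hon-o", head_initial=False))     # hon-o yomu
```

On this picture, "learning" the parameter amounts to flipping one switch on the basis of experience, which is why a small amount of primary linguistic data can fix a broad property of the grammar.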
In addition to comparative grammar (see also morphological typology), universals of language change, syntactic change, semantic change, pidgins, and creoles
provide additional evidence for the nature of universal grammar
and language acquisition.
Moreover, the study of genetic language disorders, as well as
familial and twin studies, has been very fruitful for the study of
language acquisition (see genes and language; specific
language impairment; see also the extensive literature on the
FOXP2 gene [Marcus and Fisher 2003]). Studies of language-isolated children provide information about the critical period
for language learning. The study of sign languages has been
invaluable for investigating language outside the modality of
sound (see also sign language, acquisition of; sign languages, neurobiology of). Finally, the study of linguistic
savants has been quite useful for delineating the modularity
of the language faculty as distinct from other cognitive faculties.
To answer the question of how knowledge of language evolves
in the species, (3), biolinguists integrate data from a variety of
areas, including comparative ethology (see Hauser, Chomsky,
and Fitch 2002; see also animal communication and human
language; speech anatomy, evolution of), comparative
neuroanatomy, and comparative genomics. Since evolution of
language took place in the distant past, mathematical modeling
of populations of speaker-hearers has recently attracted much
interest in work on dynamical systems (see self-organizing
systems). Such studies have proven useful not only for the study
of evolution but also for the study of language acquisition and
change. (For some hypotheses on origins of language, see origins of language; grooming, gossip, and language.)
Questions (1) to (3) might be called the "what" and "how" questions of biolinguistics. One can also ask why the principles of
language are what they are, a deeper and more difficult question
to answer. The investigation into such "why" questions is sometimes
referred to as the minimalist program or minimalism. In addition, there is the related question of how the study of language
can be integrated with the other natural sciences, a problem that
Chomsky calls the unification problem (see Jenkins 2000). All
of these questions are certain to continue to fascinate investigators of the biology of language for decades to come.
(For more information on other explicit models of the language faculty, see transformational grammar; standard
theory and extended standard theory; categorial grammar; head-driven phrase structure grammar;
lexical-functional grammar; optimality theory; role
and reference grammar; cognitive grammar; connectionism and grammar; construction grammars).
Lyle Jenkins
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Chomsky, Noam. 2006. Biolinguistics and the human capacity. In
Language and Mind, 173–85. Cambridge: Cambridge University Press.
Hauser, M. D., N. Chomsky, and W. Tecumseh Fitch. 2002. The faculty
of language: What is it, who has it, and how did it evolve? Science
298: 1569–79.
Jenkins, Lyle. 2000. Biolinguistics: Exploring the Biology of Language.
Cambridge: Cambridge University Press.
Kayne, Richard. 2004. Antisymmetry and Japanese. In
Variation and Universals in Biolinguistics, ed. Lyle Jenkins, 3–35.
Amsterdam: Elsevier.
Marcus, G. F., and S. E. Fisher. 2003. FOXP2 in focus: What can genes
tell us about speech and language? Trends in Cognitive Sciences
7.6: 257–62.
Yang, Charles. 2004. Toward a theory of language growth. In
Variation and Universals in Biolinguistics, ed. Lyle Jenkins, 37–56.
Amsterdam: Elsevier.

BIRDSONG AND HUMAN LANGUAGE


Language is often claimed to be uniquely human (see animal communication and human language). This belief
has discouraged efforts to identify potential animal models of
language, even though animal models have been essential in
ascertaining the neurobiology of other cognitive functions.
It is conceivable, however, that useful homologies or analogies exist between human language and the communicative
systems of other species, even if language is unique in some
respects.
One particularly interesting homology might exist between
human language and birdsong. Songbirds rely on a specialized frontal lobe–basal ganglia loop to learn, produce,
and perceive birdsong (Brenowitz and Beecher 2005) (see also
broca's area). Disruptions to this circuit disrupt the sensorimotor learning needed to acquire song, and also the sequencing
skills needed to produce and properly perceive it. Recent work
has revealed a remarkable homology in this circuit between
birds and mammals (Doupe et al. 2005). The homologous circuit
in human and nonhuman primates involves loops connecting
many regions in the frontal cortex to the basal ganglia. Afferents
from the frontal cortex densely innervate the striatum of the basal
ganglia, which also receives inputs from many other areas of the
cortex. The striatum seems to control behavioral sequencing in
many species (Aldridge and Berridge 1998). Spiny neurons, the
principal cells of the striatum, have properties that make them
ideal for recognizing patterned sequences across time (Beiser,
Hua, and Houk 1997). Damage to this loop in primates produces
problems with motor and cognitive skills that require planning and manipulating patterns of sequences over time (Fuster
1995).
These observations lend plausibility to the notion that the
frontal cortex–basal ganglia circuit might play a role in the syntax of human language. If so, then it is probably not coincidental
that the acquisition of human language and birdsong have compelling parallels (Doupe and Kuhl 1999). Humans and songbirds
learn their complex, sequenced vocalizations in early life. They
similarly internalize sensory experience and use it to shape vocal
outputs, by means of sensorimotor learning and integration.
They show similar innate dispositions for learning the correct
sounds and sequences; as a result, humans and some species
of songbird have similar critical periods for vocal learning,
with a much greater ability to learn early in life. These behavioral parallels are what one would expect if both species rely on a similar neural substrate for learning and using their communicative systems.
Relevant genetic evidence is also available. The much-discussed FOXP2 gene is similarly expressed in the basal ganglia of
humans and songbirds (Teramitsu et al. 2004; Vargha-Khadem
et al. 2005). A FOXP2 mutation in humans results in deficits in
language production and comprehension, especially aspects of
(morpho)syntax that involve combining and sequencing linguistic units (Marcus and Fisher 2003; Vargha-Khadem et al. 2005).
One of the neurobiological effects of the mutation is a notable
reduction in the gray matter of the striatum (Vargha-Khadem
et al. 2005). Perhaps, then, the combinatorial aspects of human
language were enabled by the preadaptation of an anterior neural circuit that has been highly conserved over evolutionary time
and across species, and by a genetic mutation in this circuit that
increased its computational space.
Finally, some birdsong, like human language, is compositional; songbirds learn units and rules of combination (Rose et
al. 2004), although the rules of combination are obviously far
less sophisticated than those that characterize human language.
A skeptic might argue that the syntax of human language is too
complex (too highly structured, too recursive, too creative; see
recursion, iteration, and metarepresentation) to be
modeled as a simple patterned sequence processor that relies
on associative learning mechanisms. In fact, the explanatory
burden placed on rule-based, recursive syntax has diminished
over recent decades. Modern grammars tend to be lexicalist
in nature; that is, much of the knowledge relevant to sentence
structure is stored in the lexicon with individual words, rather
than being computed by abstract phrase structure rules (see
lexical-functional grammar ). Recursion, while clearly
a characteristic of human language, is much more limited in
actual language usage than would be predicted given the standard model. And, because conceptual knowledge (see semantics) has its own structure (Jackendoff 1990), it seems plausible
that some of the burden for structuring the input rests with the
conceptual stream (Jackendoff 2002), rather than entirely with
the syntax.
Birds and humans are fundamentally different in many ways,
as are their systems of communication. Nonetheless, birds and
humans are two of only a handful of vocal learners, and recent
work points to communication-relevant homologies and similarities. It is not unreasonable to think that a comparative approach
might provide important clues to how language evolved and,
perhaps, to the nature of language itself.
Lee Osterhout
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Aldridge, J. Wayne, and Kent C. Berridge. 1998. Coding serial order
by neostriatal neurons: A natural action approach to movement
sequence. Journal of Neuroscience 18: 2777–87.
Beiser, David G., Sherwin S. Hua, and James C. Houk. 1997. Network
models of the basal ganglia. Current Opinion in Neurobiology
7: 185–90.
Brenowitz, Eliot, and Michael D. Beecher. 2005. Song learning in
birds: Diversity and plasticity, opportunities and challenges. Trends
in Neurosciences 28: 127–32.

Doupe, Allison J., and Patricia Kuhl. 1999. Birdsong and human
speech: Common themes and mechanisms. Annual Review of
Neuroscience 22: 567–631.
Doupe, Allison J., David J. Perkel, Anton Reiner, and Edward A. Stern.
2005. Birdbrains could teach basal ganglia research a new song.
Trends in Neurosciences 28: 353–63.
Fuster, Joaquin M. 1995. Memory in the Cerebral Cortex: An Empirical
Approach to Neural Networks in the Human and Nonhuman Primate.
Cambridge, MA: MIT Press.
Jackendoff, Ray. 1990. Semantic Structures. Cambridge, MA: MIT Press.
———. 2002. Foundations of Language: Brain, Meaning, Grammar,
Evolution. New York: Oxford University Press.
Lieberman, Philip. 2000. Human Language and Our Reptilian Brain.
Cambridge: Harvard University Press.
Marcus, Gary F., and Simon E. Fisher. 2003. FOXP2 in focus: What can
genes tell us about speech and language? Trends in Cognitive Sciences
7: 257–62.
Rose, Gary, Franz Goller, Howard J. Gritton, Stephanie L. Plamondon,
Alexander T. Baugh, and Brendon G. Cooper. 2004. Species-typical
songs in white-crowned sparrows tutored with only phrase pairs.
Nature 432: 753–8.
Teramitsu, Ikuko, Lili C. Kudo, Sarah E. London, Daniel H. Geschwind,
and Stephanie A. White. 2004. Parallel FOXP1 and FOXP2 expression in songbird and human brain predicts functional interaction.
Journal of Neuroscience 24: 3152–63.
Vargha-Khadem, Faraneh, David G. Gadian, Andrew Copp, and
Mortimer Mishkin. 2005. FOXP2 and the neuroanatomy of speech
and language. Nature Reviews: Neuroscience 6: 131–8.

BLENDED SPACE

A blended space is one element of the model of meaning construction proposed by conceptual blending theory. In this
framework, mental representations are organized in small, self-contained "conceptual packets" (Fauconnier and Turner 2002,
40) called mental spaces, which interconnect to form complex
conceptual networks. In a conceptual integration network, or
blend, some mental spaces serve as input spaces that contribute elements to a new, blended mental space (Fauconnier and
Turner 1994, 1998, 2002).
The minimal conceptual integration network connects four
mental spaces: two inputs, a generic space that contains all the
structures that the inputs seem to share, and a blended space. The
conventional illustration of this prototypical network, sometimes
called the Basic Diagram (Fauconnier and Turner 2002, 46–7),
shows four circles marking the points of a diamond, with the
circle representing the generic space at the top and the blended
space at the bottom. However, this four-space model is only the
minimal version of the integration network; in conceptual blending theory, networks can contain any number of input spaces.
Blended spaces can also serve as inputs to new blends, making
elaborate megablends (Fauconnier and Turner 2002, 151–3).
What makes a blended space special is that it contains newly
emergent structure that does not come directly from any of the
inputs. For example, understanding "This surgeon is a butcher"
involves selective projection from inputs of butchery and surgery, but the inference that the surgeon is incompetent arises
only in the blended space.
There is some potential for confusion regarding the terminology used to distinguish blended spaces from other theoretical
constructs in the conceptual blending framework. Some researchers use blended space and blend interchangeably to refer to the
particular kind of mental space described here (e.g., Fauconnier
and Turner 1994). Elsewhere blend is used to describe the entire
integration network, as in double-scope blend (e.g., Núñez 2005),
or the process of generating such a network, as in running the
blend (e.g., Fauconnier and Turner 2002, 48). Where the use may
be ambiguous, blended space provides maximal clarity.
Vera Tobin

WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Fauconnier, Gilles, and Mark Turner. 1994. Conceptual projection and
middle spaces. UCSD Department of Cognitive Science Technical
Report 9401.
———. 1998. Principles of conceptual integration. In Discourse
and Cognition, ed. Jean-Pierre Koenig, 269–83. Stanford, CA: CSLI
Publications.
———. 2002. The Way We Think: Conceptual Blending and the Mind's
Hidden Complexities. New York: Basic Books.
Grady, Joseph, Todd Oakley, and Seana Coulson. 1999. Conceptual
blending and metaphor. In Metaphor in Cognitive Linguistics, ed.
Raymond W. Gibbs, Jr., and Gerard J. Steen, 101–24. Amsterdam and
Philadelphia: John Benjamins.
Núñez, Rafael E. 2005. Creating mathematical infinities: Metaphor,
blending, and the beauty of transfinite cardinals. Journal of Pragmatics
37: 1717–41.

BLINDNESS AND LANGUAGE

Reading by Touch

Blind people achieve literacy by reading braille, a tactile coding system for reading and writing. Coding is based on raised dots
arranged in rectangular cells that consist of paired columns of
three dots each. Patterns of one or more dots represent letters,
numbers, punctuation marks, or partial and whole word contractions (Figure 1). Initially, braille was coded for the Latin
alphabets of French or English. For languages with non-Latin
alphabets, braille patterns are assigned according to a transliteration of the Latin alphabet. For example, the third Greek letter
gamma has the dot pattern for the third Latin letter c. Chinese
and other Asian languages use phonetic adaptations of braille.
Chinese braille codes syllables into one, two, or three patterns for, respectively, an initial consonant sound, a final vowel
sound, and a word tone. There are no braille patterns for individual Chinese ideograms. Japanese orthography is more complex, as it includes a combination of Kanji (ideograms imported
from China), Kana (phonograms), Western alphabet, and Arabic
numerals. Kanji is first converted to Kana before translation to
braille. While a letter of an alphabet represents a single sound, a Kana character represents a syllable (a consonant and a vowel).
Standard braille in English and many European languages is
usually read in a contracted form (Grade II) in which selected
single patterns signify commonly used words, part-words, or syllables. Hence, many words require only one, two, or three braille
cells and spaces, which reduce reading effort and space for text.
The same braille pattern can represent a letter or a contraction,
depending on context, thus expanding 63 to 256 interpretable
dot patterns in Grade II English braille.

Figure 1. American standard braille cell patterns for alphabet, punctuation marks, some contractions, and whole words.

Although all alphabet-based languages use the same braille patterns, associated contractions vary. Thus, multilingual reading requires the learning
of language-unique contractions.
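The cell geometry described above can be made concrete in a few lines. Treating the six dot positions as bits (numbered 1–3 down the left column, 4–6 down the right, as in standard braille practice) yields the 63 distinct non-empty patterns mentioned earlier; the letter assignments below (a, b, c, l) follow the standard English braille alphabet, and the sketch is illustrative only:

```python
# Each braille cell has six dot positions: 1-3 down the left column,
# 4-6 down the right. A character is a set of raised dots, so there
# are 2**6 - 1 = 63 distinct non-empty cell patterns.

LETTERS = {
    "a": {1},         # standard English braille assignments
    "b": {1, 2},
    "c": {1, 4},      # also reused for transliterated letters, e.g. Greek gamma
    "l": {1, 2, 3},
}

def to_bits(dots):
    """Pack a dot set into a 6-bit integer (dot n -> bit n - 1)."""
    return sum(1 << (d - 1) for d in dots)

assert to_bits(LETTERS["a"]) == 0b000001
assert to_bits(LETTERS["c"]) == 0b001001          # dots 1 and 4
assert len({to_bits(d) for d in LETTERS.values()}) == len(LETTERS)
```

In Grade II braille the same 63 physical patterns are reinterpreted by context (letter versus contraction), which is how the figure of 256 interpretable patterns cited above arises without adding any dots to the cell.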
During reading, scanning movement across text evokes
intermittent mechanical stimulation from contacting successive braille cells, which activates most low-threshold cutaneous
mechanoreceptors found in the fingertip (Johnson and Lamb
1981). A spatial-temporal transformation of the evoked peripheral activity indicates an isomorphic reproduction of braille
cell shapes across a population of mechanoreceptors (Phillips,
Johansson, and Johnson 1990). Through connecting sensory
pathways, these physiological representations of braille cell
shape are conveyed to primary somatosensory cortex (Phillips,
Johnson, and Hsiao 1988) in the parietal lobe. Despite the
expected isomorphic representation of braille cell shapes in
somatosensory cortex, we do not know whether tactile reading in
fact relies on holistically discriminating shape.
Braille cell patterns also differ in the density of dot-gaps,
which is perceived as variations in texture (Millar 1985). These
texture changes produce a dynamically shifting lateral mechanical shearing across the fingertip as it moves over braille text in
fluent reading. Good braille readers attend to these temporally
extended stimulation patterns, as opposed to global-holistic
spatial shapes (Millar 1997, 337). In addition, top-down linguistic content drives perceptual processing in skillful readers, for
whom the physical attributes of the text are subservient to lexical
content. In other words, they do not puzzle out words, letter by
letter; instead, they recognize them due in part to their physical
properties but also to semantic context, the familiarity of words
stored in their mental lexicon, and so on. Less accomplished
readers trace shape by making more disjointed trapezoidal finger movements over individual cells, a strategy that fluent readers utilize when asked to identify particular letters, which is a
shape-based task (Millar 1997, 337).

Braillists generally prefer bimanual reading (Davidson, Appelle, and Haber 1992), with each hand conveying different
information. While one hand reads, the second marks spatial
position in the text (e.g., lines, locations within lines, spaces
between braille cells or words). Photographic records reveal skin
compression of only one fingertip even during tandem movements across text; there is no coincident reading of different
braille cells by multiple fingers (Millar 1997, 337). Text and spatial layout are tracked simultaneously in bimanual reading; there
is no best hand (Millar 1984). Some individuals read an initial
line segment with the left and a final segment with the right hand
(Bertelson, Mousty, and D'Alimonte 1985). Despite bimanual
reading, the left hemisphere is generally dominant for language even in left-handed braillists (Burton et al. 2002a).

Visual Cortex Contribution to Language


Blindness requires numerous adjustments, especially for language.
These adjustments appear to involve substantial reorganization of
the visual cortex (occipital lobe), which in sighted people is
dominated by visual stimulation. In blind people, the visual cortex responds more readily to nonvisual stimulation and especially
contributes to language processing. A clinical case study of a congenitally blind, highly fluent braille reader is particularly salient.
Following a bilateral posterior occipital ischemic stroke, she lost
the ability to read braille (Hamilton et al. 2000). However, auditory
and spoken language were unimpaired, and she retained normal
tactile sensations on her braille reading hand despite a destroyed
visual cortex. A similar but transient disruption in tactile reading
occurs in congenitally blind people following repetitive transcranial magnetic stimulation (rTMS) to occipital cortex (Hamilton
and Pascual-Leone 1998; Pascual-Leone et al. 2005).
The obvious explanation for these observations is that the
visual cortex reorganizes after blindness. But things are more
complex. First, occipital cortex normally processes some tactile

Blindness and Language


information in sighted people, especially following short periods of visual deprivation. Thus, blindfolding sighted people for
five days, during which they train to discriminate braille letters,
leads to visual cortex activity to tactile stimulation and sensitivity to disrupting braille letter discrimination by occipital rTMS
(Pascual-Leone and Hamilton 2001). Even without visual deprivation, occipital rTMS impairs macrogeometric judgments of
raised-dot spacing in sighted people (Merabet et al. 2004). These
findings indicate that visual cortex normally contributes to some
tactile discrimination.
Brain-imaging studies have dramatically revealed the role of
visual cortex in language for blind people. For example, generating a verb to an offered noun activates visual cortex in blind
people (see Color Plate 2), irrespective of whether the noun is
read through braille (Burton et al. 2002a) or heard (Burton et al.
2002b). In early blind individuals, verb generation engages both
lower-tier (e.g., V1, V2, VP) and higher-tier (e.g., V7, V8, MT)
visual areas (Color Plate 2). Similar adaptations occur in late
blind individuals, though fewer areas are affected. The semantic
task of discovering a common meaning for a list of heard words
also evokes extensive visual cortex activation in the early blind
and a smaller distribution in the late blind (Color Plate 2). Similar
distributions of visual cortex activity occur when the early blind
listen to sentences with increased semantic and syntactic complexity (Röder et al. 2002) and during a semantic retrieval task
(Noppeney, Friston, and Price 2003). A phonological task of
identifying a common rhyme for heard words activates nearly all
visual areas bilaterally in early blind but few in late blind people
(Color Plate 2). The sublexical task of identifying block capital
letters translated passively across a fingertip activates only parts
of visual cortex in early and late blind people (Color Plate 2).
In general, semantic language tasks activate a greater extent
of visual cortex than lower-level language and most perceptual
tactile or auditory processing tasks. The functional relevance of
occipital cortex to semantic processing is demonstrated when
rTMS over left visual cortex transiently increases errors in verb
generation to heard nouns in the early blind without interrupting
the articulation of words (Amedi et al. 2004). Of course, performing any semantic task depends on retrieving word associations.
Thus, two studies report a stronger relationship between retention performance and visual cortex activity, predominantly in
V1, than with verb generation. V1 response magnitudes correlate more positively with verbal retention (Amedi et al. 2003) and
the accuracy of long-term episodic memory (Raz, Amedi, and
Zohary 2005) in congenitally/early blind participants.
As in sighted people, blind individuals still utilize traditional
left-lateralized frontal, temporal and parietal language areas
(Burton et al. 2002a, 2002b; Burton, Diamond, and McDermott
2003; Noppeney, Friston, and Price 2003; Röder et al. 2002).
Thus, the visual cortex activity represents an addition to the cortical language-processing areas. Visual cortex activity distributes
bilaterally in all blind people for semantic tasks. However, left
visual cortex is more active in early blind individuals (Color Plate
2). In contrast, right hemisphere responses predominate in
late blind participants when they read braille with the right hand
but are more symmetrically bilateral when verbs are generated
to heard nouns (Color Plate 2). It is currently unknown whether
reorganized visual cortex contains specific language domains.

Neuroplasticity has been observed in visual cortex of blind individuals at all ages of blindness onset. Such observations
garner surprise only when primary sensory areas are viewed as
unimodal processors that funnel computations into a cascade of
cortical areas, including multisensory regions. Activation of reorganized visual cortex by nonvisual stimulation most parsimoniously reflects innate intracortical connections between cortical
areas that normally exhibit nonvisual and multisensory responsiveness in sighted people. The demanding conditions of blindness possibly alter and expand the activity in these connections
and thereby reallocate visual cortex to language processing.
Harold Burton
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Amedi, A., N. Raz, P. Pianka, R. Malach, and E. Zohary. 2003. Early
visual cortex activation correlates with superior verbal memory performance in the blind. Nature Neuroscience 6: 758–66.
Amedi, A., A. Floel, S. Knecht, E. Zohary, and L. G. Cohen. 2004.
Transcranial magnetic stimulation of the occipital pole interferes with
verbal processing in blind subjects. Nature Neuroscience 7: 1266–70.
Bertelson, P., P. Mousty, and G. D'Alimonte. 1985. A study of braille
reading: 2. Patterns of hand activity in one-handed and two-handed
reading. Quarterly Journal of Experimental Psychology A: Human
Experimental Psychology 37: 235–56.
Burton, H., J. B. Diamond, and K. B. McDermott. 2003. Dissociating cortical regions activated by semantic and phonological tasks to heard
words: A fMRI study in blind and sighted individuals. Journal of
Neurophysiology 90: 1965–82.
Burton, H., D. G. McLaren, and R. J. Sinclair. 2006. Reading embossed
capital letters: A fMRI study in blind and sighted individuals. Human
Brain Mapping 27: 325–39.
Burton, H., A. Z. Snyder, T. E. Conturo, E. Akbudak, J. M. Ollinger, and
M. E. Raichle. 2002a. Adaptive changes in early and late blind: A fMRI
study of braille reading. Journal of Neurophysiology 87: 589–611.
Burton, H., A. Z. Snyder, J. Diamond, and M. E. Raichle. 2002b. Adaptive
changes in early and late blind: A fMRI study of verb generation to
heard nouns. Journal of Neurophysiology 88: 3359–71.
Davidson, P. W., S. Appelle, and R. N. Haber. 1992. Haptic scanning of
braille cells by low- and high-proficiency blind readers. Research in
Developmental Disabilities 13: 99–111.
Hamilton, R., J. P. Keenan, M. Catala, and A. Pascual-Leone. 2000. Alexia
for braille following bilateral occipital stroke in an early blind woman.
Neuroreport 11: 237–40.
Hamilton, R., and A. Pascual-Leone. 1998. Cortical plasticity associated
with braille learning. Trends in Cognitive Sciences 2: 168–74.
Johnson, K. O., and G. D. Lamb. 1981. Neural mechanisms of spatial tactile discrimination: Neural patterns evoked by braille-like dot patterns
in the monkey. Journal of Physiology (London) 310: 117–44.
Merabet, L., G. Thut, B. Murray, J. Andrews, S. Hsiao, and A. Pascual-Leone. 2004. Feeling by sight or seeing by touch? Neuron 42: 173–9.
Millar, S. 1984. Is there a best hand for braille? Cortex 20: 75–87.
———. 1985. The perception of complex patterns by touch. Perception
14: 293–303.
———. 1997. Reading by Touch. London: Routledge.
Noppeney, U., K. J. Friston, and C. J. Price. 2003. Effects of visual deprivation on the organization of the semantic system. Brain 126: 1620–7.
Pascual-Leone, A., A. Amedi, F. Fregni, and L. B. Merabet. 2005.
The plastic human brain cortex. Annual Review of Neuroscience
28: 377–401.
Pascual-Leone, A., and R. Hamilton. 2001. The metamodal organization
of the brain. Progress in Brain Research 134: 427–45.

Phillips, J., R. Johansson, and K. Johnson. 1990. Representation of
braille characters in human nerve fibers. Experimental Brain Research
81: 589–92.
Phillips, J. R., K. O. Johnson, and S. S. Hsiao. 1988. Spatial pattern representation and transformation in monkey somatosensory cortex.
Proceedings of the National Academy of Sciences (USA) 85: 1317–21.
Raz, N., A. Amedi, and E. Zohary. 2005. V1 activation in congenitally
blind humans is associated with episodic retrieval. Cerebral Cortex
15: 1459–68.
Röder, B., O. Stock, S. Bien, H. Neville, and F. Rösler. 2002. Speech processing activates visual cortex in congenitally blind humans. European
Journal of Neuroscience 16: 930–6.
Van Essen, D. C. 2004. Organization of visual areas in macaque and
human cerebral cortex. In The Visual Neurosciences, ed. L. Chalupa
and J. S. Werner, 507–21. Cambridge, MA: MIT Press.

BOUNDING

For all its modernity and insights into the fundamental workings
of language, Noam Chomsky's early writings (1955, 1957) contain a curious gap: They do not contain any explicit discussion of
locality. One does not even find extensive discussion of the fact
that movement appears to be potentially unbounded. This gap is
all the more curious from our current perspective, where locality
and long-distance dependencies are arguably the major area of
study in theoretical syntax.
We owe our modern interest in locality to John R. Ross's
([1967] 1986) seminal work, in which the concept of island was
introduced. Ross's thesis is full of examples of long-distance
dependencies like (1a and b).

(1) a. Handsome though Dick is, I'm still going to marry Herman.
    b. Handsome though everyone expects me to try to force Bill to make Mom agree that Dick is, I'm still going to marry Herman.

Ross systematically investigated the fact that seemingly minute manipulations dramatically affected the acceptability of sentences. Witness (2a and b).

(2) a. Handsome though I believe that Dick is, I'm still going to marry Herman.
    b. *Handsome though I believe the claim that Dick is, I'm still going to marry Herman.

Ross's thesis contains a list of contexts, technically known as
islands, which disallow certain types of dependencies.
Chomsky (1973) set out to investigate what the various
domains identified by Ross as islands have in common. Thus
began the modern study of locality and, in many ways, the nature
of current linguistic theorizing.
Chomsky's central insight in 1973 is that movement is subject
to the subjacency condition, a condition that forbids movement from being too long. Specifically, his notion of subjacency
prevented movement from crossing two bounding nodes. For
Chomsky, the bounding nodes were the top clausal node (S for
sentence; our modern inflectional phrase [IP]) and NP (noun
phrase; our modern DP [determiner phrase]). The condition correctly captured the unacceptability of (2b), but wrongly predicted
(2a) to be out (see [3] and [4]).

(3) *[Handsomei though [S I believe [NP the claim that [S Dick is ti]]]], I'm still going to marry Herman.
(4) [Handsomei though [S I believe that [S Dick is ti]]], I'm still going to marry Herman.

To correct this undesirable effect of subjacency, Chomsky
hypothesized that long-distance movement proceeds in short
steps, passing through successive cycles. In particular, he postulated that movement can stop off at the edge of the clause (S′ or
COMP; the modern complementizer phrase [CP] area). In other
words, instead of moving long distance in one fell swoop, movement first targets the closest clausal edge and from there proceeds from clausal edge to clausal edge, typically crossing only
one S(/IP)-node at a time:

(5) [Handsomei though [S I believe [S ti that [S Dick is ti]]]], I'm still going to marry Herman.
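The counting logic behind subjacency and successive cyclicity can be sketched in a few lines of Python. This is a toy illustration, not any published algorithm, under the simplifying assumption that a movement step crosses every bounding node dominating its launch site but not its landing site; the node labels S1, S2, and NP are hypothetical stand-ins for the trees above:

```python
# Toy model of Chomsky's (1973) subjacency condition: a movement step
# crosses every bounding node (S or NP) that dominates its launch site
# but not its landing site; crossing two or more is barred.

def crossed(launch_dominators, landing_dominators):
    """Bounding nodes dominating the launch site but not the landing site."""
    return [n for n in launch_dominators if n not in landing_dominators]

def subjacent(launch_dominators, landing_dominators):
    return len(crossed(launch_dominators, landing_dominators)) < 2

# (2a): the trace sits inside two S nodes (matrix S1, embedded S2).
# One-fell-swoop movement crosses both, so it is wrongly barred ...
assert not subjacent(["S1", "S2"], [])
# ... but successive-cyclic steps each cross only one S node:
assert subjacent(["S1", "S2"], ["S1"])   # embedded edge -> matrix edge
assert subjacent(["S1"], [])             # matrix edge -> sentence front
# (2b): the NP "the claim" intervenes, so even the stepwise route
# must cross NP and S2 together, giving the island effect:
assert not subjacent(["S1", "NP", "S2"], ["S1"])
```

The asserts mirror the text: one-swoop movement out of (2a) violates subjacency, the successive-cyclic derivation in (5) does not, and the extra NP node in (2b) blocks even the stepwise route.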

Successive cyclicity may at first seem like a patch, an exemption
granted to fix a bad problem (without it, the theory would wrongly
rule out acceptable constructions). But subsequent research has
uncovered a wealth of data, reviewed in Boeckx (2007), that converge and lend credence to the successive cyclic movement
hypothesis, making it one of the great success stories of modern
generative grammar. It appears to be the case that long-distance, unbounded dependencies are the result of the conjunction of small, strictly bounded steps.
Currently, our most principled explanation for the phenomenon of successive cyclicity is that it is the result of some economy
condition that requires movement steps to be kept as short as
possible (see Chomsky and Lasnik 1993; Takahashi 1994; Boeckx
2003; Bošković 2002).
As Željko Bošković (1994) originally observed, some additional condition is needed to prevent this economy condition
requiring movement steps to be kept as short as possible from
forcing an element that has taken its first movement step to be
stuck in creating infinitesimally short steps. Put differently, some
condition is needed to prevent chain links from being too short.
The idea that movement that is too short or superfluous
ought to be banned has been appealed to in a variety of works
in recent years, under the rubric of anti-locality (see Grohmann
2003 for the most systematic investigation of anti-locality; see
also Boeckx 2007 and references therein).
The anti-locality hypothesis is very desirable conceptually. It
places a lower bound on movement, just as Chomsky's subjacency
condition places an upper bound on movement. Since it blocks
vacuous movement, it is also an economy condition (don't do
anything that is not necessary), on a par with the underlying
force behind subjacency.
We thus arrive at a beautifully symmetric situation, of the kind
promoted by the recently formulated minimalist program for
linguistic theory (Chomsky 1995 and Boeckx 2006, among others): Long-distance dependencies, which are pervasive in natural
languages, are not taken in one fell swoop. The patterns observed
in the data result from the conjunction of two economy conditions: movement must be kept as short as possible, but not too
short.
Cedric Boeckx

WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Boeckx, Cedric. 2003. Islands and Chains. Amsterdam: John
Benjamins.
———. 2006. Linguistic Minimalism: Origins, Methods, Concepts, and
Aims. Oxford: Oxford University Press.
———. 2007. Understanding Minimalist Syntax: Lessons from Locality in
Long-Distance Dependencies. Oxford: Blackwell.
Bošković, Željko. 1994. D-structure, θ-criterion, and movement into
θ-positions. Linguistic Analysis 24: 247–86.
———. 2002. A-movement and the EPP. Syntax 5: 167–218.
Chomsky, Noam. 1955. The logical structure of linguistic theory.
Manuscript, Harvard/MIT. Published in part in 1975 by Plenum, New
York.
———. 1957. Syntactic Structures. The Hague: Mouton.
———. 1973. Conditions on transformations. In A Festschrift for Morris
Halle, ed. S. Anderson and P. Kiparsky, 232–86. New York: Holt,
Rinehart, and Winston.
———. 1995. The Minimalist Program. Cambridge, MA: MIT Press.
Chomsky, Noam, and Howard Lasnik. 1993. Principles and parameters theory. In Syntax: An International Handbook of Contemporary
Research, ed. J. Jacobs, A. von Stechow, W. Sternefeld, and T.
Vennemann, 506–69. Berlin: de Gruyter.
Grohmann, Kleanthes K. 2003. Prolific Domains. Amsterdam: John
Benjamins.
Ross, John R. [1967] 1986. Constraints on variables in syntax. Ph.D.
diss., MIT. Published as Infinite Syntax! Norwood, NJ: Ablex.
Takahashi, Daiko. 1994. Minimality of movement. Ph.D. diss.,
University of Connecticut.

BRAIN AND LANGUAGE


The brains of humans have developed to control our articulators
and our sensory systems in ways that permit human language.
Our knowledge of how the brain subserves language was cursory in the early millennia of recorded history; in the late 1800s,
developments in neurology in Europe gave us the tools to form
a more precise understanding of how brains support language.
Subsequent advances in neurolinguistic knowledge arose when
clinician-scientists abstracted patterns and other researchers
developed technical tools (of neuropathology, linguistics, psychology, psycholinguistics, and brain localization via imaging)
that permitted both groups to understand the complexity of
brain-behavior relationships at ever-finer-grained levels.

Clinical Observation: The Behavioral Sequelae of Brain Damage and the Brain Structures Underlying Them
As neurology developed in France, psychiatric patients were distinguished during their lives from aphasics (those with language
problems but not emotional ones), and post mortem dissection
and advances in staining techniques permitted localization of
the brain areas that could be linked to the language behaviors
recorded prior to the patient's death. The developing understanding of French neurologist Paul Broca that not only the
frontal lobe (1861) but also the dominant, left, hemisphere
(1865) was linked to articulated language was extended by
German neurologist Carl Wernicke (1874). Wernicke suggested
an additional region, farther back in the brain, that was responsible not for articulated speech but, rather, for comprehension of
it. In his paper, moreover, Wernicke proposed a model of centers for language that predicted another type of aphasia that could be, and was, found: conduction aphasia. In conduction aphasia, it was not centers but connections between them that were
impaired: Neither brocas area of the brain nor wernickes
area was itself damaged, but the link between them was; the production of speech and comprehension of it were not impaired,
but, rather, repetition of auditory input became problematic.
The model postulated by Wernicke in his paper showed centers for speech, comprehension, and ideas overlaid on the image
of a right [sic] hemisphere, and his colleague Ludwig Lichtheim
abstracted this localizationist model away from a picture of underlying brain, expanding it to include reference to centers for reading
and writing. In England, John Hughlings Jackson took exception to
this localizationist/connectionist approach, taking the holist position. He pointed out that even in patients with extensive damage to
the dominant hemisphere, some language remained (e.g., a subset of emotional words, often curse words, as had been the case
with Broca's first patient), suggesting, by way of the subtractionist
logic these researchers employed, that the nondominant hemisphere also participated in language in the healthy individual.
In France, the debate between those who believed in multiple
types of aphasia, each associated with brain damage in a different area, and those who believed in a unitary aphasia associated
with a single location continued in a series of debates (analyzed in
English by Lecours et al. 1992). Neuropathologist Augusta Déjerine and her neurologist husband Jules led proponents for the multiple-connected-centers position, whereas Pierre Marie argued for a unitarist one. He asserted that what we now call Marie's quadrilateral, a region near the insula that has only recently been seriously implicated in language again, was the seat of all language.
In addition to discussions of localization, bilingualism
earned a place in explanations of neuropsychological phenomena among students of the neurologist Jean-Martin Charcot in
the later nineteenth century, as Sigmund Freud, in his 1891 book
On Aphasia, and Albert Pitres, in his 1895 article on bilingual
aphasia, respectively championed the first-learned versus the
best-known language in predicting patterns of differential recovery from aphasia.
Into the early twentieth century, European neurologists continued developing their careful clinical examination of patients,
which they then followed with an examination of their brains,
via advanced staining techniques, post mortem. In 1906, the
German neurologist Alois Alzheimer isolated a type of disease
among those theretofore housed in psychiatric institutions
when he discovered distinctive cellular changes in specific
levels of cortex associated with what we now call Alzheimer's
dementia. His extended descriptions of the communication
problems of his patients are models of the careful observation
of the semantic and conversational breakdown associated with
this disease.

Rehabilitation
The next major step forward in neurolinguistics lay in developing
rehabilitation techniques for those with language impairment.
One important group had been identified by the ophthalmologist James Hinshelwood (1902), who described the case of a child
who had particular difficulty learning to read despite normal



intelligence and vision. The American neurologist Samuel Orton,
who examined an increasing number of such children through
referrals, published his 1937 book Reading, Writing and Speech
Problems in Children and Selected Papers [sic], classifying their
problems. He worked with Anna Gillingham to develop a multisensory, systematic, structured system for training which, like others that have been derived from it, enables children whose brains do not naturally pick up reading (today termed dyslexics or developmental dyslexics; see dyslexia) to learn to do so.
The recognition that dyslexics might have normal (or better than
normal) intelligence but have substantial difficulty learning to
read confirmed a second point that Wernicke had included in his
1874 article, that language and intelligence were dissociable. It
may also be seen as the earliest vision of individual differences in brain organization that went beyond the donation of their brains by members of the Paris Anthropology Society to determine whose was bigger and phrenology's assertion that differing sizes of cortical regions, as evidenced by differences in skull configuration, explained personality differences.
This focus on rehabilitation resulted in the initiation of the
field of speech therapy, today speech-language pathology in North
America and logopedics elsewhere. The seminal work of Hildred
Schuell, James Jenkins, and E. Jimenez-Pabon (1964) classified
the language disorders resulting from injury to the brain in adulthood according to the primary impairment of either comprehension or production. A more holist approach developed alongside
this one, that of Bruce Porch, whose system of classification
showed a set of language abilities clustering together.

Lateral Dominance
In the 1950s and 1960s, American psychology was also developing
more rigorous methods of studying behavior, and brain, though
not necessarily linking them yet. Brocas late-nineteenth-century
observation that aphasia tended to arise primarily from damage
to the left hemisphere of the brain, rather than the right, took on
a new life as the techniques of dichotic listening and tachistoscopic presentation evolved to study lateral dominance in non-brain-damaged individuals. In dichotic listening, series of three or so pairs of words are presented simultaneously, one to each ear, and participants are asked to recall as many words as they can. Because the primary connections between ear and brain are to the brain hemisphere opposite a given ear (the contralateral one), participants are better able to recall words from the ear contralateral to their language-dominant hemisphere, that is, the right ear for language stimuli.
Tachistoscopic presentation permitted a visual analogue to
dichotic listening: When visual information is flashed so that it is
visible only to a single visual field, that information is processed first
by the brain hemisphere contralateral to the visual field. The eyes
cannot turn quickly enough to see the stimulus in the central visual
field, which projects to both hemispheres. Thus written language,
but not non-language visual information such as pictures, is processed faster and better by the language-dominant hemisphere.
From such laterality studies we came to understand the dominant importance of the left hemisphere for processing auditory
and written language for most humans, and the link between this
lateralized brain dominance and handedness. For a period, such
techniques were used as well to determine if bilinguals' brains
employed relatively more right hemisphere in language processing than monolinguals did, following up the suggestions of
a number of researchers that bilingualism might be more bilaterally organized, or that early less-proficient language abilities
might rely relatively more on right hemisphere contributions.
They hypothesized this possibility because they thought they
saw a disproportionately large number of instances of crossed
aphasia, that is, aphasia resulting from right hemisphere damage rather than the more usual left hemisphere damage. Today
it appears, instead, that crossed aphasia is no more frequent
among bilinguals than among monolinguals.
During this same midcentury period, behavioral neurologist
Norman Geschwind and neuropathologist Marjorie LeMay undertook post mortem studies of sizable numbers of brains, demonstrating that the apparent symmetry of the hemispheres is misleading.
Rather, they demonstrated precisely that the cortical region around
the Sylvian fissure (the perisylvian region; see Figure 1) that was
understood to be crucial for language differed markedly, with a
steeper rise of the Sylvian fissure in most right hemispheres corresponding to more temporal lobe cortex available for language on
the left. (In a small percentage of brains, presumably those from
left-handers for the most part, the two hemispheres were indeed
identical in this regard; in another small set the cortical asymmetries were reversed.)

Figure 1. Brain and language (cortical structures).



The Return of Localization


Geschwind brought a localizationist approach back to aphasiology
in his work at the Boston Veterans Administration Hospital during
the mid-1960s. He anonymously translated Wernicke's 1874 article into English, and himself published a seminal work (1965) on disconnection syndromes, reminding readers of the particular pair of brain-damaged sites required for alexia without agraphia, that is, a difficulty in reading but not in writing resulting from brain damage in adults who had previously been literate. With his colleagues
at the Aphasia Research Center of the Boston VA Hospital, Edith
Kaplan and Harold Goodglass, he developed this approach into
the aphasia classification schema behind the Boston Diagnostic
Aphasia Exam (Goodglass and Kaplan 1972), which includes categories for Wernicke's and Broca's aphasias, as well as conduction aphasia, anomia, and the transcortical aphasias. (This test and classification system is quite similar to that of their student Andrew Kertesz, the Western Aphasia Battery.) In the 1970s, aphasia grand rounds at the Boston VA Hospital were structured as a localizationist quiz: Neurologists, neuropsychologists, and speech-language pathologists who had tested a patient would first present their findings; then Geschwind or Goodglass would test the patient in the front of the room. After the patient had left, those gathered would guess what the angiography and, later, CT scan results would demonstrate; then Margaret Naeser would report the lesion location.
Perhaps precisely because CT scans permitted inspection of brain damage beneath the cortex, an understanding of the subcortical aphasias was developed from the study of patients who were neither fluent (as Wernicke's aphasics are) nor nonfluent (as Broca's aphasics are in that the speech they produce consists largely of substantives, with functor grammatical words and affixes omitted or substituted for). Early work distinguished the cortical aphasias from the subcortical ones (e.g., Alexander and Naeser 1988), while more recent work has distinguished aphasias in ever more discrete subcortical regions (e.g., thalamus and globus pallidus; see basal ganglia and Figure 2).
With further advances in neuroimaging technology such as magnetic resonance imaging (MRI) to study brain regions and, more recently, diffusion tensor imaging (DTI) to study pathways, more precision has become available to distinguish areas of the brain that are damaged when language is impaired.

Figure 2. Schematic representation of the chief ganglionic categories. Adapted from Gray, Henry. Anatomy of the Human Body. Edited by Warren H. Lewis. Philadelphia: Lea & Febiger, 1918.

In addition,
as functional tools are developed (e.g., functional MRI, also called
fMRI), it is no longer necessary to rely on the traditional neurolinguistic premise (if area X is damaged and language problem Y
results, then X must be where Y is localized, or, at least, X is crucial
for normal performance of Y). It is interesting to note that these advances in technology point to a number of regions outside the traditional perisylvian language area of cortex that appear linked to language, areas such as prefrontal cortex, the supplementary motor area, and the like. J. Sidtis (2007) points out the logical problem that arises as one would want to reconcile the aphasia data,
which suggest a relatively delimited area of the dominant hemisphere that serves language, with the imaging data that indicate
areas beyond the perisylvian region in the dominant hemisphere,
as well as many subcortical and non-dominant-hemisphere
regions (usually, counterparts to those of the dominant hemisphere) that appear to be involved in language processing.

Linguistic Phenomena Driving Neurolinguistic Study


Alongside developments in tools for measuring brain regions
involved in language during the last quarter of the twentieth
century and the beginning of the twenty-first, developments in
linguistics both within and beyond Chomsky's (itself protean)
school have permitted refinements of the questions asked in
neurolinguistics and, thus, the answers received.
Sheila Blumstein opened up the study of phonology within
aphasiology, researching, first, the regression hypothesis of her
mentor Roman Jakobson and demonstrating, consistent with
the literature on speech errors in non-brain-damaged individuals, that errors tended to differ from their targets by fewer,
rather than more, distinctive features. She and her students have studied suprasegmental phonology as well as segmental phonology, demonstrating differences between intonation patterns used linguistically and those that are not. Indeed, other
speech scientists have turned to tone languages, such as Thai
and Chinese, to demonstrate the parallels there: Brain areas
of the dominant hemisphere process those features that one's



language treats as linguistic, even if one's language is a signed
one (sign language), where phonemic elements are not,
strictly speaking, phonemic.
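Blumstein's finding that sound errors tend to differ from their targets by only a few distinctive features can be illustrated by counting feature mismatches between phonemes. The feature matrix below is a simplified, hypothetical fragment for a few English consonants, not a complete phonological analysis:

```python
# Illustration of "distance" between phonemes as a count of differing
# distinctive features. The feature assignments below are simplified,
# textbook-style values, not a full feature system.

FEATURES = {           # (voiced, nasal, labial, continuant)
    "p": (0, 0, 1, 0),
    "b": (1, 0, 1, 0),
    "m": (1, 1, 1, 0),
    "t": (0, 0, 0, 0),
    "s": (0, 0, 0, 1),
}

def feature_distance(a, b):
    """Number of distinctive features on which two phonemes differ."""
    return sum(x != y for x, y in zip(FEATURES[a], FEATURES[b]))

# A /p/ -> /b/ substitution changes only voicing (distance 1), whereas
# /p/ -> /s/ changes both place and continuancy (distance 2). Aphasic
# error corpora of the kind Blumstein studied favor the smaller jumps.
print(feature_distance("p", "b"))  # 1
print(feature_distance("p", "s"))  # 2
```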
Phonology of lexical items is treated, in some classificatory schemata, as a part of semantics. Lexical word-shape, however, can be divorced from word-meaning, as patients with the syndrome of word-deafness (the ability to recognize that they know a lexical item without knowing what it means) demonstrate. Studies of aphasic errors have demonstrated the psychological reality of such concepts as phoneme, syllable, word
stress, and the like. Moreover, priming studies show that lexical
items are organizationally linked to others in their phonological
(or spelling or both) neighborhoods.
Morphology has been studied via aphasiology with particular reference to agrammatism, that syndrome associated with Broca's aphasia in which the language that patients produce is telegraphic, consisting largely of substantive nouns and, to a lesser extent, verbs, and relatively devoid of functor grammatical elements, including inflectional and derivational affixes. A number of theories have been developed to account for this interesting phenomenon, and it is clear that the salience of the omissible elements plays a role, as their production can be induced in a number of ways, suggesting that they are not lost but, rather, costly to produce in Broca's aphasia. Evidence for salience varying across different languages can be found in the reports on agrammatism in
14 languages in Menn and Obler (1990) and in the work of E. Bates
and her colleagues (see Bates and Wulfeck 1989). Compounding,
too, has recently gained attention in neurolinguistics, and can be
seen to pose problems for agrammatic patients.
Agrammatic patients are particularly pertinent in studying syntax as well, since not only does their production suggest that they minimize syntactic load, but their comprehension is arguably impaired syntactically as well. Whether this is because traces are nonexistent in such individuals or because of more general processing deficits associated with brain damage, perhaps half of the patients with agrammatism have difficulty in processing passive sentences, suggesting that the brain areas impaired in Broca's aphasia are required for full syntactic function. In non-brain-damaged individuals, fMRI studies suggest that substantial regions beyond the traditional perisylvian language areas (regions such as prefrontal cortex, linked to general control functions of brain activity) subserve comprehension (e.g., Caplan et al. 2007).
Semantics is better studied in patients with dementing diseases such as Alzheimer's disease, in whom it breaks down, than in patients with aphasia, in whom it appears better spared, at least for nonfluent (Broca's) and anomic aphasics.
Nevertheless, there have been indications that patients with
severe, global aphasia have difficulty with aspects of semantic
processing.
Psycholinguistic studies of lexical priming are also useful in
studying semantics in non-brain-damaged individuals. When they have to judge whether nurse is a word or not in English, they are faster if they have previously seen doctor than if they have seen the unrelated word horse, suggesting semantic networks in our lexica. Event-related potential (ERP) measures, moreover, demonstrate that we process top-down for semantic consistency. When we are presented with a sentence that includes a word that is semantically anomalous, a characteristic electrical response around 400 msec after that word indexes our surprise.
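The response described here (the N400) is typically quantified as the mean voltage in a window around 400 msec after the word, compared between anomalous and congruent sentence endings. A toy sketch with invented waveform values (the window bounds and data are illustrative):

```python
# Sketch of how an N400 effect is quantified: average the ERP voltage in
# a window around 400 ms post-word and compare anomalous vs. congruent
# sentence endings. The waveforms below are invented toy values.

def mean_amplitude(erp, times_ms, lo=300, hi=500):
    """Mean voltage (microvolts) of samples whose latency falls in [lo, hi] ms."""
    window = [v for t, v in zip(times_ms, erp) if lo <= t <= hi]
    return sum(window) / len(window)

times = list(range(0, 800, 100))           # one sample every 100 ms
congruent = [0, 1, 2, 1, 0, 1, 0, 0]       # e.g., "...spread the bread with butter"
anomalous = [0, 1, 2, -4, -6, -3, 0, 0]    # e.g., "...spread the bread with socks"

n400_effect = mean_amplitude(anomalous, times) - mean_amplitude(congruent, times)
print(f"N400 effect = {n400_effect:.2f} microvolts")  # negative-going deflection
```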
Study of pragmatics in brain-damaged individuals rarely
focuses on aphasics, as their pragmatic abilities tend to be
remarkably spared. Rather, it is patients with damage to the
right hemisphere who evidence problems with pragmatic abilities, such as verbal humor appreciation, inferencing, conversational coherence, and the like. Patients with the dementias
show an interesting combination of the sparing of some basic
pragmatic abilities (e.g., eye contact during communication, use
of formulaic language) and deficits in higher-level pragmatic
behaviors, including, among those who are bilingual, the appropriate choice of whether to address their interlocutors in one or
the other, or both, of their two languages.
Written language, of course, is not studied differently from
auditory language by linguists, and rarely even by psycholinguists, but the literature on brain-damaged individuals offers
numerous examples of selective impairment of one of these
modes of input and/or output. Historically, such studies focused
on alexia and agraphia, that is, disturbances of reading and/or
writing in previously literate individuals who had difficulties with
these skills as the result of adult-onset brain damage. Currently,
information about brain–language links for reading and writing
comes from the study of dyslexia, which, of course, is not linked
to frank brain damage but has been shown to co-occur with
unusual distribution of certain cellular types in language-related
brain areas (e.g., Galaburda and Kemper 1979). Psycholinguistic
and brain-imaging studies of both groups of individuals have
shown differences, as well as similarities, between the processing of written and spoken language. The same can be said for
signed languages, as evident from those who are bilingual
speakers of a signed and a spoken language.

Conclusion: Language in Humans' Brains


In many branches of science, pendulum swings are evident
between a focus on overarching patterns achieved by ignoring
details of individual differences and a focus on those individual differences. In neurolinguistics, the latter can show the full
range of human brains' substrates for human language abilities. We assume that all individuals (except those with specific
language impairment) learn their first language in pretty much
similar fashion, though we are well aware that even in the first
year of life, some of us start talking sooner and others later. In
adulthood, too, we acknowledge certain individual differences
that are linked to language performance: Some of us are detail
oriented and others more big picture in cognitive style; some
of us are good second-language learners postpubertally and others less so, some of us naturally good spellers and others not, and
some of us slow readers and others of us fast. Indeed, we can
assume that at many levels, from the cellular to brain regional,
from the electrophysiological to fiber connectivity, differences
subserve our human ability to communicate via language. From
the first century of work primarily in Europe to the late-twentieth-century addition of North American contributions, centers
worldwide now participate in moving the field of neurolinguistics forward.
Loraine K. Obler


WORKS CITED AND SUGGESTIONS FOR FURTHER READING

Alexander, M. P., and M. A. Naeser. 1988. Cortical-subcortical differences in aphasia. In Language, Communication and the Brain. Research Publications: Association for Research in Nervous and Mental Disorders. Vol. 66. Ed. F. Plum, 215–28. New York: Raven Press.
Bates, E., and B. Wulfeck. 1989. Comparative aphasiology: A crosslinguistic approach to language breakdown. Aphasiology 3: 111–42.
Caplan, D., G. Waters, D. Kennedy, N. Alpert, N. Makris, G. DeDe, J. Michaud, and A. Reddy. 2007. A study of syntactic processing in aphasia II: Neurological aspects. Brain and Language 101: 151–77.
Freud, S. [1891] 1953. Zur Auffassung der Aphasien. Trans. E. Stengel as On Aphasia. New York: International University Press.
Galaburda, A., and T. Kemper. 1979. Cytoarchitectonic abnormalities in developmental dyslexia: A case study. Annals of Neurology 6: 94–100.
Geschwind, N. 1965. Disconnexion syndromes in animals and man. Brain 88: 585–644.
Goodglass, H., and E. Kaplan. 1972. The Assessment of Aphasia and Related Disorders. Philadelphia: Lea & Febiger.
Goodglass, H., and A. Wingfield. 1997. Anomia: Neuroanatomical and Cognitive Correlates. San Diego, CA: Academic Press.
Lecours, A. R., F. Chain, M. Poncet, J.-L. Nespoulous, and Y. Joanette. 1992. Paris 1908: The hot summer of aphasiology or a season in the life of a chair. Brain and Language 42: 105–52.
Menn, L., and L. K. Obler, eds. 1990. Agrammatic Aphasia: A Cross-Language Narrative Sourcebook. Vol. 3. Amsterdam: John Benjamins.
Obler, L. K., and K. Gjerlow. 1999. Language and the Brain. Cambridge: Cambridge University Press.
Orton, S. [1937] 1989. Reading, Writing and Speech Problems in Children and Selected Papers. Repr. Austin, TX: International Dyslexia Association.
Pitres, A. 1895. Étude sur l'aphasie des polyglottes. Revue de Médecine 15: 873–99.
Schuell, H., J. Jenkins, and E. Jimenez-Pabon. 1964. Aphasia in Adults. New York: Harper & Row.
Sidtis, J. 2007. Some problems for representations of brain organization based on activation in functional imaging. Brain and Language 102: 130–40.

BROCA'S AREA

In 1861, Pierre Paul Broca presented findings from two patients who had difficulty speaking but relatively good comprehension (Broca 1861a, 1861b, 1861c). At autopsy, he determined that both
of these patients, Leborgne and Lelong, suffered from injury to
the inferolateral frontal cortex. He concluded, "The integrity of the third frontal convolution (and perhaps of the second) seems indispensable to the exercise of the faculty of articulate language" (1861a, 406). Four years later, Broca realized that these and subsequent cases all had lesions to the left inferior frontal gyrus, thus making the association between language and the left hemisphere (Broca 1865). This assertion proved to be a landmark discovery that laid the groundwork not only for the study of language but also for modern neuropsychology.
The region of left hemisphere cortex described by Broca subsequently came to be known as Broca's area and the speech disorder, Broca's aphasia. Today, Broca's area is generally defined as Brodmann's cytoarchitectonic areas (BA) 44 and 45, corresponding to the pars opercularis and pars triangularis, respectively. These regions make up the posterior part of the inferior
frontal gyrus (see Figure 1). Recent investigations have suggested
important differences between BA 44 and 45, both in anatomical asymmetries and in function. However, Broca himself never
designated the region so specifically, considering the posterior
half of the inferior frontal gyrus to be most crucial for the speech
disturbance he described (Dronkers et al. 2007).
Although Broca's area is widely described as a critical speech and language center, its precise role is still debated. Broca originally thought of this region as important for the articulation of speech (Broca 1864). More recently, a large body of research has discussed its potential role in processing syntax. This premise first arose from behavioral studies of patients with Broca's aphasia in the early part of the twentieth century. It was noted that patients with Broca's aphasia produced agrammatic speech,
often omitting functor words (e.g., a, the) and morphological markers (e.g., -s, -ed). The following is an example of such telegraphic speech in a patient with Broca's aphasia, describing a drawing of a picnic scene by a lake:

O, yeah. Det's a boy an' a girl an' a car house light po' (pole). Dog an' a boat. 'N det's a mm a coffee, an' reading. Det's a mm a det's a boy fishin'.

Journals worth checking: Behavioral and Brain Sciences, Brain, Brain and Language, Journal of Cognitive Neuroscience, Journal of Neurolinguistics, Nature Reviews Neuroscience, and NeuroReport.

Figure 1. Three-dimensional MRI reconstruction of the lateral left hemisphere of a normal brain in vivo, showing the pars opercularis (Brodmann's area 44, anterior to the precentral sulcus) and the pars triangularis (Brodmann's area 45, between the ascending and horizontal limbs of the Sylvian fissure; see perisylvian cortex). Reprinted with permission from Brain (2007), 130, p. 1433, Oxford University Press.

Figure 2. Lesion reconstruction of a patient with Broca's aphasia who does not have a Broca's area lesion (left) and a patient with a Broca's area lesion without Broca's aphasia (right).

Figure 3. Photographs of the brains of Leborgne (A) and Lelong (C), Paul Broca's first two aphasic patients, with close-ups of the lesion in each brain (B and D). Reprinted from N. F. Dronkers, O. Plaisant, M. T. Iba-Zizen, E. A. Cabanis (2007), Paul Broca's historic cases: High resolution MR imaging of the brains of Leborgne and Lelong, Brain, 130.5: 1436, by permission of Oxford University Press.

During the 1970s, a number of studies reported that comprehension of complex syntactic forms was also disrupted in this patient group. A seminal study by A. Caramazza and E. B. Zurif (1976) reported that Broca's aphasics had particular difficulty understanding semantically reversible versus irreversible sentences (e.g., The cat that the dog is biting is black vs. The apple that the boy is eating is red). This study concluded that Broca's area mediated syntactic processes critical to both production and comprehension. By extension, Broca's area came to be associated with syntactic processing.
Although many subsequent studies supported this general
notion, several others pointed out the need for caution. For example, M. C. Linebarger, M. Schwartz, and E. Saffran (1983) showed
that agrammatic patients could make accurate grammaticality judgments, which would seem to challenge the notion
of Broca's area being broadly involved in syntactic processing.
Second, many patients with fluent aphasia and lesions outside of
Broca's area have been found to exhibit grammatical deficits that

overlap with those of Broca's aphasics (Caplan, Hildebrandt, and Makris 1996). In addition, individual patients with Broca's aphasia do not always show the same pattern of grammatical deficit but, rather, vary in the types of errors they make (Caramazza et al. 2001). Cross-linguistic studies also put a damper on the agrammatic theory of Broca's aphasia. E. Bates and colleagues have shown that in other languages such as German, where grammatical markers are critical for conveying meaning and semantic content, patients with Broca's aphasia do not omit morphemes as they do in English (Bates, Wulfeck, and MacWhinney 1991). Finally, recent studies have shown that grammatical errors can be induced in normal participants through the use of degraded stimuli or stressors, such as an additional working memory load (Dick et al. 2001). These findings would argue against a grammar center but, rather, suggest that competition for resources could also underlie deficits observed in Broca's aphasia.

Lesion Studies of Broca's Area


As with the previous claims, it is often assumed that all patients with Broca's aphasia have lesions in Broca's area, and thus that the deficits in Broca's aphasia equate to dysfunction in Broca's area. However, many studies making claims about Broca's area and its functions did not actually verify lesion site. In fact, lesion studies have shown that Broca's aphasia typically results from a large left hemisphere lesion that extends beyond Broca's area to include underlying white matter, adjacent frontal cortex, and insular cortex (Alexander, Naeser, and Palumbo 1990; Mohr et al. 1978). Color Plate 3 shows a lesion overlay map of 36 patients with chronic Broca's aphasia persisting more than one year. As can be seen, the region of common overlap (shown in dark red) is not Broca's area but rather medial regions (i.e., more central), namely, insular cortex and key white matter tracts. Indeed, only 50–60 percent of patients with lesions extending into Broca's area have a persistent Broca's aphasia. Lesions restricted to Broca's area tend to cause a transient mutism followed by altered speech output, but not a chronic Broca's aphasia (Mohr et al. 1978; Penfield and Roberts 1959). These findings would suggest that Broca's area proper might be more involved in later stages of speech production (e.g., articulation).
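Lesion overlay maps of the kind shown in Color Plate 3 are constructed by coding each patient's lesion as a binary mask on a common brain template and summing the masks voxelwise; the highest counts mark the region most often damaged in the syndrome. A toy one-dimensional sketch with invented masks:

```python
# Sketch of how a lesion overlay map is constructed: each patient's lesion
# is a binary mask over a common template, and the masks are summed
# voxelwise. Regions with the highest counts are those most often damaged
# in the syndrome. Toy 1-D "brains" with 8 voxels; all data are invented.

patients = [
    [0, 1, 1, 1, 0, 0, 0, 0],   # lesion over voxels 1-3
    [0, 0, 1, 1, 1, 0, 0, 0],   # lesion over voxels 2-4
    [0, 1, 1, 0, 0, 0, 0, 0],   # lesion over voxels 1-2
]

overlap = [sum(mask[v] for mask in patients) for v in range(8)]
peak = max(overlap)
core = [v for v, count in enumerate(overlap) if count == peak]

print("overlap counts:", overlap)   # voxel 2 is damaged in all three patients
print("common core voxels:", core)
```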
Even early studies reported contradictory cases, namely, patients with Broca's area affected but no Broca's aphasia or patients with Broca's aphasia but no lesion in Broca's area (e.g., Marie 1906; Moutier 1908). Figure 2 shows an example of an individual with Broca's aphasia whose lesion spares Broca's area (left) and one with word-finding problems but no Broca's aphasia after a lesion to Broca's area (right).
Although Broca deduced that the critical region for his patients' articulation disturbance was the inferior frontal gyrus, he realized that his patients' lesions most likely extended more medially. However, he wanted to maintain the brains for posterity and chose not to dissect them (see Figure 3). Recently, N. F. Dronkers and colleagues (2007) had the opportunity to acquire three-dimensional MRI images of the brains of Broca's two original patients (Leborgne and Lelong), which are kept in a Paris museum. They found that the lesions in both patients extended quite medially, involving underlying white matter, including the superior longitudinal fasciculus. Moreover, one of the patients' brains (Leborgne's) had additional damage to the insula, basal ganglia, and internal and external capsules. With respect to the extent of frontal involvement, Leborgne's lesion affected the middle third of the inferior frontal gyrus to the greatest extent, with only some atrophy in the posterior third. In Broca's second patient, Lelong, the lesion spared the pars triangularis, affecting only the posterior portion of the pars opercularis. Thus, even what is commonly referred to as Broca's area (BA 44, 45) is not exactly the region affected in Broca's original patients.

Functional Neuroimaging of Broca's Area


More recently, functional neuroimaging techniques, such
as functional magnetic resonance imaging (fMRI) and positron
emission tomography (PET), have opened up new avenues for
the study of brain areas involved in language and cognition.
Consistent with the behavioral studies described previously, a
number of functional neuroimaging studies with normal participants have suggested a link between Brocas area and syntactic processing (e.g., Caplan et al. 2000; Friederici, Meyer, and
von Cramon 2000). However, Brocas area has also been linked
to a number of other nonsyntactic cognitive processes in the
left hemisphere, such as verbal working memory, semantics,
frequency discrimination, imitation/mirror neurons (see mirror systems, imitation, and language ), tone perception (in tonal languages), and phonological processing. It is

possible that subregions of what is now interpreted as Broca's
area (i.e., BA 44, 45) may be functionally distinct, which could
explain the heterogeneity of functions associated with this
area. An alternative explanation is that functional activations of
Broca's area may be due to task demands involving articulation
and/or subvocal rehearsal. These ideas remain to be explored.
A number of theories have arisen to try to reconcile the early
lesion studies of agrammatism and newer functional neuroimaging findings. For example, it has been suggested that Broca's area
is crucial not for syntactic processing but for aspects of on-line
storage (i.e., verbal working memory) that in turn may underlie the ability to process complex grammatical forms (Stowe,
Haverkort, and Zwarts 2005). C. J. Fiebach and colleagues (2005)
showed that BA 44 was active when participants processed sentences with a large working-memory load, and many neuroimaging studies have suggested that Broca's area is involved in verbal
working memory, in particular (e.g., Awh, Smith, and Jonides
1995). Such findings would suggest that Broca's area may be crucial for the understanding of complex syntactic forms due to a
basic role in subvocal rehearsal (but see Caplan et al. 2000).
In sum, though the term Broca's area has persisted for more
than a hundred years, the precise anatomical demarcation of
this brain region, along with its exact role in speech and language
processing, is still being debated. At a minimum, Broca's area
plays a role in end-stage speech production but is unquestionably a significant part of a larger network that supports speech
and language functions in the left hemisphere. Further work is
needed to determine whether it plays a direct or indirect role in
a number of other cognitive processes that have been suggested
and whether these relate to neighboring regions within the inferior frontal gyrus or distinct functional subregions within the territory known as Broca's area.
Nina F. Dronkers and Juliana V. Baldo
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Alexander, M., M. Naeser, and C. Palumbo. 1990. Broca's area aphasias: Aphasia after lesions including the frontal operculum. Neurology
40: 353–62.
Awh, E., E. Smith, and J. Jonides. 1995. Human rehearsal processes and
the frontal lobes: PET evidence. Annals of the New York Academy of
Sciences 769: 97–117.
Bates, E., B. Wulfeck, and B. MacWhinney. 1991. Crosslinguistic research
in aphasia: An overview. Brain and Language 41: 123–48.
Broca, P. 1861a. Nouvelle observation d'aphémie produite par une
lésion de la troisième circonvolution frontale. Bulletins de la Société
d'anatomie (Paris), 2e série, 6: 398–407.
Broca, P. 1861b. Perte de la parole: Ramollissement chronique
et destruction partielle du lobe antérieur gauche du cerveau. Bulletins
de la Société d'anthropologie, 1re série, 2: 235–8.
———. 1861c. Remarques sur le siège de la faculté du langage articulé,
suivies d'une observation d'aphémie (perte de la parole). Bulletins de
la Société d'anatomie (Paris), 2e série, 6: 330–57.
———. 1864. Sur les mots aphémie, aphasie et aphrasie; Lettre à M. le
Professeur Trousseau. Gazette des hôpitaux 23.
———. 1865. Sur le siège de la faculté du langage articulé. Bulletin de la
Société d'Anthropologie 6: 337–93.
Caplan, D., N. Alpert, G. Waters, and A. Olivieri. 2000. Activation of
Broca's area by syntactic processing under conditions of concurrent
articulation. Human Brain Mapping 9: 65–71.

Caplan, D., N. Hildebrandt, and N. Makris. 1996. Location of lesions in
stroke patients with deficits in syntactic processing in sentence comprehension. Brain 119: 933–49.
Caramazza, A., E. Capitani, A. Rey, and R. S. Berndt. 2001. Agrammatic
Broca's aphasia is not associated with a single pattern of comprehension performance. Brain and Language 76: 158–84.
Caramazza, A., and E. B. Zurif. 1976. Dissociation of algorithmic and
heuristic processes in language comprehension: Evidence from aphasia. Brain and Language 3: 572–82.
Dick, F., E. Bates, B. Wulfeck, M. Gernsbacher, J. A. Utman, and
N. Dronkers. 2001. Language deficits, localization, and grammar: Evidence for a distributive model of language breakdown in
aphasics and normals. Psychological Review 108: 759–88.
Dronkers, N. F., O. Plaisant, M. T. Iba-Zizen, and E. A. Cabanis. 2007.
Paul Broca's historic cases: High resolution MR imaging of the brains
of Leborgne and Lelong. Brain 130.5: 1432–41.
Fiebach, C. J., M. Schlesewsky, G. Lohmann, D. Y. von Cramon, and A.
D. Friederici. 2005. Revisiting the role of Broca's area in sentence
processing: Syntactic integration versus syntactic working memory.
Human Brain Mapping 24: 79–91.
Friederici, A. D., M. Meyer, and D. Y. von Cramon. 2000. Auditory
language comprehension: An event-related fMRI study on the processing of syntactic and lexical information. Brain and Language
74: 289–300.
Linebarger, M. C., M. Schwartz, and E. Saffran. 1983. Sensitivity to
grammatical structure in so-called agrammatic aphasics. Cognition
13: 361–93.
Marie, P. 1906. Révision de la question de l'aphasie: La troisième circonvolution frontale gauche ne joue aucun rôle spécial dans la fonction du
langage. Semaine Médicale 26: 241–7.
Mohr, J., M. Pessin, S. Finkelstein, H. H. Funkenstein, G. W. Duncan, and
K. R. Davis. 1978. Broca aphasia: Pathologic and clinical. Neurology
28: 311–24.
Moutier, F. 1908. L'aphasie de Broca. Paris: Steinheil.
Penfield, W., and L. Roberts. 1959. Speech and Brain Mechanisms.
Princeton, NJ: Princeton University Press.
Stowe, L. A., M. Haverkort, and F. Zwarts. 2005. Rethinking the neurological basis of language. Lingua 115: 997–1042.

C
CARTESIAN LINGUISTICS
This term began as the title of a 1966 monograph by Noam
Chomsky. It has become the name of a research strategy for the
scientific study of language and mind that Chomsky in other
works calls "rationalist" or "biolinguistic," which he contrasts
to an "empiricist" strategy (see biolinguistics). Cartesian
Linguistics illuminates these strategies by focusing on contrasting
assumptions concerning mind and language and their study that
are found in the writings of a selection of philosophers and linguists from the late sixteenth century through 1966. The rationalists include Descartes, the Port-Royal Grammarians, Humboldt,
Cudworth, and clearly Chomsky himself in 1966 and now.
The empiricists include Harris, Herder, the modern linguists
(L. Bloomfield, M. Joos, etc.), and again clearly behaviorists,
connectionists, and others attracted to the idea that children
learn languages rather than growing them.

Rationalists adopt a nativist (see innateness and innatism)
and internalist approach to the study of language. Support for
nativism is found in poverty-of-the-stimulus observations. To take
these observations seriously, rationalists believe, someone constructing a theory of language is advised to assume that much of
linguistic structure and content is somehow latent in the infant's
mind: Experience serves to trigger or occasion structure and
content, not form and constitute them. Descartes himself was a
rationalist and appealed to poverty facts to support his views of
innate and adventitious (but not made up) concepts/ideas.
Until the topic was taken up by his Port-Royal followers, however, little attention was paid to the innateness of the structure of language itself.
Descartes's greater contribution to the strategy bearing his name
lies in less discussed but equally important observations concerning the creative aspect of language use (see creativity in
language use). Encapsulated in Descartes's Discourse V, these
note that speakers can on occasion produce any of an unbounded
set of expressions (unboundedness), without regard to external
and internal stimulus conditions (stimulus freedom), sentences
that nevertheless are appropriate and coherent with respect to
discourse context (appropriateness and coherence). Taking
these seriously suggests a scientific research strategy that focuses
not on linguistic action/behavior (language use) itself, for that
is in the domain of free human action, but on an internal system,
the language faculty. A science of linguistic action would not
only have to take into account the speaker's understanding of reasons for speaking and the job that an utterance is understood to
perform but would also have to say what a person will utter on a
specific occasion. No extant science can do that, and likely none
ever will. Given stimulus freedom, independently specifiable
conditions for utterance are unavailable, and there is no upper
bound on sentences appropriate to a speaker's understanding of
discourse circumstance. The scientist of language should focus
on language as an internal system, on competence, not on what
people do by using the tools their systems offer. Science can
say what a language can yield; a theory of competence does that.
But it likely cannot say what will be said, when it will be said, or
whether it is appropriate and, if so, why.
Generally speaking, the rationalist strategy treats languages
as natural (native) systems in the head, not artifacts. The empiricist strategy (not empirical science) treats languages as artifacts
in the head, as socially constituted sets of practices or behaviors,
mastery of which (learning) requires training and negative evidence. Cartesian Linguistics emphasizes that there are few, if
any, linguistic practices and that children grow languages.
James McGilvray
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Chomsky, Noam. 1966. Cartesian Linguistics. New York: Harper and
Row.
———. 2003. Cartesian Linguistics. 3rd ed. Ed. and introd. J. McGilvray.
Cambridge: Cambridge University Press.

CASE
This term has been traditionally employed to designate the type of
morphological ending that indicates the syntactic function of the

noun phrase that bears it. In Latin, for instance, the word meaning
"girl" may surface as puella (nominative case), puellam (accusative
case), or puellae (genitive case), depending on whether it is the subject of the sentence, is the object of a transitive verb, or stands in a
possessor relation with respect to another noun. Languages vary with
respect to the morphological case distinctions they make. Languages
such as Latin have six case distinctions (nominative, genitive, dative,
accusative, vocative, and ablative) whereas languages such as Chinese
have none. One may also find languages like English in which only a
subset of nominal elements display case distinctions as in he (nominative), him (accusative), and his (genitive).
On the basis of a suggestion by Jean-Roger Vergnaud (Rouveret
and Vergnaud 1980), Noam Chomsky (1981) developed a theory
of abstract case (annotated as Case). According to this theory, it is
a property of all languages that noun phrases can only be licensed
in a sentence if associated with Case. Whether or not the abstract
Cases get morphologically realized as specific markings on (some)
noun phrases is a language-specific property. Research in the last
two decades has been devoted to identifying i) which elements
are Case-licensers, ii) which structural configurations allow Case
licensing, and iii) what the precise nature of such licensing is. The
contrast between John/he/*him/*his sang and *John/*he/*him/*his
to sing, for example, has led to the conclusion that in English, the
past tense may license nominative Case, but the infinitival to is not
a Case-licenser. In turn, the contrast between John/he/*him/*his
was greeted and was greeted *John/*he/*him/*his indicates that
nominative is licensed by the past tense if the relevant noun phrase
occupies the subject, but not the object, position.
When a given Case only encodes syntactic information, it is
referred to as structural Case. Nominative and accusative cases
in English are prototypical examples of structural Case. In John
greeted her and she was greeted by John, for example, the pronoun
bearing the thematic role of patient has accusative Case in
the first sentence, but nominative in the second. On the other
hand, when a given Case also encodes thematic information,
it is referred to as inherent Case (Chomsky 1986; Belletti 1988).
The preposition of in English, for example, has been analyzed
as a marker of inherent Case, for it only licenses a noun phrase
that is the complement of the preceding noun: It may license that
country in the invasion of that country but not in *the belief of that
country to be progressive because that country is the complement
of invasion but not of belief.
Jairo Nunes
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Belletti, A. 1988. The Case of unaccusatives. Linguistic Inquiry 19: 1–34.
Chomsky, N. 1981. Lectures on Government and Binding.
Dordrecht: Foris.
———. 1986. Knowledge of Language: Its Nature, Origin and Use.
New York: Praeger.
Rouveret, A., and J. R. Vergnaud. 1980. Specifying reference to the subject: French causatives and conditions on representations. Linguistic
Inquiry 11: 97–202.

CATEGORIAL GRAMMAR
Categorial grammar (CG) is a family of formalisms that model syntax and semantics by assigning rich lexical categories that project word order and build logical forms via general rules of grammar. First proposed by Kazimierz Ajdukiewicz (1935), it is arguably the oldest lexicalized formalism. CG is not itself a theory of natural language grammar, but its use has many linguistically relevant ramifications: lexicalization, flexible constituency, semantic transparency, control over generative capacity, and computational tractability.

Basic Principles of Categorial Grammar

In Ajdukiewicz's system, words are assigned categories, which are atomic types like np (noun phrase) and s (sentence) or complex types like s\np, which indicate the arguments that a function (such as the category for an intransitive verb) subcategorizes for. Words and phrases combine with others by cancellation of subcategorized arguments through a general operation akin to multiplication. That is, just as 4 · (3/4) = 3, there is a grammatical correlate: np · (s\np) = s. Given that the word Olivia has the type np and that sleeps has the type s\np, their concatenation Olivia sleeps has the type s via cancellation of the np argument of sleeps.

CG assumes compositionality: The global properties associated with a linguistic expression are determined entirely by the properties of its component parts. Linguistic expressions are multidimensional structured signs containing phonological/orthographic, syntactic, and semantic specifications for the expression. CG is distinguished in that it uses categories as syntactic types, such as those mentioned. Complex categories encode both subcategorization and linear order constraints, using the leftward slash \ and the rightward slash /. Some (simplified) lexical entries are given in the format "phonological form := syntactic category : semantic term," for example, likes := (s\np)/np : λx.λy.like(y,x).

The transitive verb category (s\np)/np seeks an object noun phrase to its right and then a subject noun phrase to its left; after these arguments are consumed, the result is a sentence. Semantics are given as λ-calculus expressions that reduce to predicate and argument structures after syntactic combination. The λ-calculus is a standard system used in CG (and many other frameworks) for representing semantics derived by syntax, where variables in the λ-terms are bound to corresponding syntactic arguments. For the category for likes, the x variable is bound to the object (the /np argument), and the y is bound to the subject (the \np argument); an example of how this works in a derivation follows. These semantics (and the categories themselves) are obviously simplified and are intended here only to demonstrate how the correct dependencies are established.

Leftward and rightward slashes project directionality via two order-sensitive, universal rules of function application:

X/Y : f    Y : a    ⇒    X : f a    (>)
Y : a    X\Y : f    ⇒    X : f a    (<)

Figure 1.

In words, the forward rule states that a category of type X/Y


can combine with one of type Y found to its right to produce
a category of type X. The symbol > is an abbreviation used in
derivations (as in the next example). The function is applied
similarly for the backward rule. When these rules are used to
combine two syntactic categories, their semantic components
are also combined via function application in the λ-calculus
(indicated in the rules as f a). For example, the result of
applying the function λx.λy.like(y,x) to the argument chocolate
is λy.like(y,chocolate). This lock-step syntactic-semantic combination underlies the transparent syntax-semantics interface
offered by CG.
With these rules and lexicon, the derivation can be given in
Figure 1.
The subcategorized arguments of the verb are consumed one
after the other, and the semantic reflexes of the syntactic rules
are carried out in parallel. This derivation is isomorphic to a
standard phrase structure grammar (PSG) analysis of such
sentences. Derivational steps can be viewed as instantiations of
rules of a PSG written in the accepting, rather than producing,
direction (e.g., np s\np → s instead of s → np vp).
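The application mechanics described above can be sketched in a few lines of code. The following is an illustrative toy, not a real CG implementation: the encoding (atoms or (result, slash, argument) triples for categories, ordinary Python functions for semantic terms) is invented here so that the Olivia likes chocolate derivation can be replayed step by step.

```python
# Toy sketch of CG function application. The data encoding is invented
# for illustration and is not part of any standard CG implementation.

NP, S = "np", "s"
IV = (S, "\\", NP)   # s\np: intransitive verb category
TV = (IV, "/", NP)   # (s\np)/np: transitive verb category

def forward(fn, arg):
    """Forward application (>): X/Y : f with Y : a to its right yields X : f(a)."""
    (fcat, fsem), (acat, asem) = fn, arg
    assert fcat[1] == "/" and fcat[2] == acat, "argument category mismatch"
    return fcat[0], fsem(asem)

def backward(arg, fn):
    """Backward application (<): Y : a with X\\Y : f to its right yields X : f(a)."""
    (acat, asem), (fcat, fsem) = arg, fn
    assert fcat[1] == "\\" and fcat[2] == acat, "argument category mismatch"
    return fcat[0], fsem(asem)

# Lexicon: likes := (s\np)/np : lambda x. lambda y. like(y, x)
likes = (TV, lambda x: lambda y: ("like", y, x))
olivia = (NP, "olivia")
chocolate = (NP, "chocolate")

# Derivation of "Olivia likes chocolate":
vp = forward(likes, chocolate)   # s\np : lambda y. like(y, chocolate)
sentence = backward(olivia, vp)  # s : like(olivia, chocolate)
print(sentence)                  # ('s', ('like', 'olivia', 'chocolate'))
```

The two rules consume the verb's arguments in turn, and the semantic term is assembled in lock step with the syntactic combination, mirroring the transparency noted above.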

Type Dependency
CG with just function application is weakly equivalent to standard context-free phrase structure grammar. Nonetheless, the
approach is radically different: Syntactic well-formedness in CG
is type dependent rather than structure dependent, and derivation is an artifact rather than a representational level. Also, categories labeling the nodes of categorial derivations are much
more informative than the atomic symbols of constituent
structure produced by PSGs. Subcategorization is directly
encoded in categories like s\np, (s\np)/np, and ((s\np)/np)/
np, rather than with stipulated nonterminal symbols such as
V-intrans, V-trans, and V-ditrans that have no transparent connection to their semantic types. Furthermore, there is a systematic correspondence between notions such as intransitive and
transitive: After the transitive category (s\np)/np consumes its
object argument, the resulting category s\np is that of an intransitive verb.
More importantly, type dependency shifts the perspective
on grammar (one shared with tree-adjoining grammar and head-driven phrase structure grammar) away from a top-down one, in which phrase structure rules dictate constituent
structure, to a bottom-up one, in which lexical items project
structure through the non-language-specific rules (i.e., CG's
universal grammar). Recent developments in the transformational grammar tradition, such as minimalism, have
also incorporated such a lexically driven perspective.

Combinatory Categorial Grammar and Categorial Type Logics

CG moves further from PSGs and other frameworks by incorporating other rules that provide new kinds of inference over categories. These rules are responsible for the type-driven flexible
constituency for which CG is well known. The two main branches
of CG can be broadly construed as rule based, exemplified by
combinatory categorial grammar (CCG) (Steedman 2000), and
deductive, exemplified by categorial type logics (CTL) (Moortgat
1997; Oehrle in press). We discuss both of these briefly.
CCG adds a small set of syntactic rules that are linear counterparts of the combinators from combinatory logic. Two combinators, composition (B) and type-raising (T), lead to the following
rules, among others:

X/Y : f    Y/Z : g    ⇒    X/Z : λx.f (g x)    (>B)
X : a    ⇒    T/(T\X) : λf.f a    (>T)

The rules are guaranteed to be semantically consistent.


Composition of categories leads to composition of the semantic
functions in the λ-calculus. Type-raising turns an argument into
a function over functions that seek that argument. See the following for an example of both of these rules in a derivation.
CTL is a family of resource-sensitive linear logics, complete
with hypothetical reasoning. This approach began with Joachim
Lambek (1958) recasting basic CG as a logical calculus in which
slashes are directionally sensitive implications; for example, the
English transitive category (s\np)/np is (np → s) ← np. As such,
categories are provided sound and complete model theoretic
interpretations. The application rules given earlier are then just
leftward and rightward variants of modus ponens. Additional
abstract rules may be defined that allow structured sequents of
proof terms to be reconfigured to allow associativity and permutativity. One result is that many rules can be derived as theorems
of a given CTL system. For example, any expression with the category np can be shown to also have the category s/(s\np), among
others. This is an instance of type-raising; similarly, CCG's composition rules (as well as others) can be shown to follow from CTL
systems that allow associativity. With CTL, such rules follow from
the logic, whereas rule-based systems like CCG tend to incorporate a subset of such abstract rules explicitly based on empirical
evidence.
As an example of how CCG's rules engender flexible constituency, the sentence Olivia likes chocolate in Figure 2 has an
alternative derivation using composition and type-raising. This
derivation involves a nontraditional constituent with category
s/np for the string Olivia likes, which then combines with chocolate to produce the same result as the previous derivation. A
similar analysis can be given with CTL.
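The composition and type-raising rules can be sketched in the same toy style (categories as atoms or (result, slash, argument) triples, semantics as Python functions; the encoding is invented for illustration), replaying the alternative derivation through the nontraditional constituent Olivia likes : s/np.

```python
# Toy sketch of forward composition (>B) and subject type-raising (>T).
# The tuple encoding and helper names are invented for this illustration.

NP, S = "np", "s"
TV = ((S, "\\", NP), "/", NP)   # (s\np)/np

def type_raise(np_sign):
    """>T: np : a becomes s/(s\\np) : lambda f. f(a)."""
    cat, sem = np_sign
    assert cat == NP
    return (S, "/", (S, "\\", NP)), lambda f: f(sem)

def compose(left, right):
    """>B: X/Y : f and Y/Z : g combine to X/Z : lambda x. f(g(x))."""
    (fcat, fsem), (gcat, gsem) = left, right
    assert fcat[1] == "/" and gcat[1] == "/" and fcat[2] == gcat[0]
    return (fcat[0], "/", gcat[2]), lambda x: fsem(gsem(x))

def forward(fn, arg):
    """Forward application (>)."""
    (fcat, fsem), (acat, asem) = fn, arg
    assert fcat[1] == "/" and fcat[2] == acat
    return fcat[0], fsem(asem)

likes = (TV, lambda x: lambda y: ("like", y, x))
olivia_T = type_raise((NP, "olivia"))    # s/(s\np)
olivia_likes = compose(olivia_T, likes)  # s/np: a nontraditional constituent
sent = forward(olivia_likes, (NP, "chocolate"))
print(sent)  # ('s', ('like', 'olivia', 'chocolate'))
```

The s/np result for Olivia likes is exactly the type sought by object relativizers and coordinated in right-node raising, and the semantics come out identically to the application-only derivation, illustrating the semantic consistency of the rules.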
Whether through CCG's rules or through CTL proofs, semantically coherent interpretations for a wide variety of non-traditional
constituents can be created. This forms the core of accounts of
extraction and coordination, as well as intonation and information structure and incremental processing in CG. For example,
subject relative pronouns like who have the category (n\n)/(s\np),
which seeks an intransitive verb type to produce a post-nominal
modifier, while object relativizers like whom have the category


Figure 2.
(n\n)/(s/np), which seeks types which are missing objects such
as Olivia likes and Olivia gave Finn, which both have the type
s/np. Extraction is thus handled without appeal to movement or
traces. Long-distance dependencies in object extraction are
captured because forward composition allows the unsaturated
argument to be successively passed up until it is revealed to the
relativizer, as in Figure 3. Under the standard assumption that
coordination combines constituents of like types, then right-node
raising is simply constituent coordination:
[Kestrel heard]s/np and [Finn thinks Olivia saw]s/np the plane
flying overhead.

The compositional semantic terms, omitted here for brevity,


are guaranteed to be consistent with the semantics projected
by the lexical entries because the composition and type-raising
rules themselves are semantically consistent.
These processes conspire in other contexts to create constituents for argument clusters that allow similarly straightforward
analyses for sentences like Kestrel gave Finn comics and Olivia
books.
This phenomenon has been called nonconstituent coordination, reflecting the difficulty in assigning a meaningful phrase
structure that groups indirect objects with direct objects. From
the CG perspective, it is simply treated as type-driven constituent coordination.
One of the key innovations CTL has brought to CG is the incorporation of a multimodal system of logical reasoning (including
unary modes) that allows selective access to rules that permit
associativity and permutation. It is thus possible for a grammar
to allow powerful operations (like permutative ones needed for
scrambling) without losing discrimination (e.g., engendering a
collapse of word order throughout the grammar), while enjoying
a (quite small) universal rule component.
Other rules, be they CCG-style rules or CTL's structural
rules, support analyses for phenomena such as nonperipheral
extraction, heavy-NP shift, parasitic gaps, scrambling, ellipsis, and others. The commitment to semantic transparency
and compositionality remains strong throughout; for example,
Pauline Jacobson (2008) tackles antecedent-contained deletion
in a directly compositional manner with CG and variable-free
semantics.
See Wood (1993) for a balanced overview of many CG
approaches and analyses. Steedman and Baldridge (in press)
gives a more recent introduction to and overview of work in CCG.
Vermaat (2005) provides a clear and concise introduction to CTL
(including pointers to connections between CTL and minimalism), as well as an extensive cross-linguistic account of wh-questions using a very small set of universal structural rules.

Figure 3.

Current Applications and Developments


While CG is well known but not widely practiced in mainstream
linguistics, it has considerable uptake in both mathematical
logic and computational linguistics. Computational
implementations of CCG are used for parsing and generation
for dialog systems. Current probabilistic CCG parsers, trained
on the CCGbank corpus of CCG derivations for newspaper texts,
are among the fastest and most accurate available for identifying deep syntactic dependencies. A useful aspect of CG for such
implementations that also has implications for its relevance for
psycholinguistics is that the competence grammar is used
directly in performance.
Despite a long divergence between the rule-based and
deductive approaches to CG, recent work has brought them into
greater alignment. CCG has adopted CTL's multimodal perspective. This connection allows efficient rule-based parsing systems
to be generated from CTL grammars; the lexicon remains the
same regardless of the approach. CCG itself can be viewed as the
cashing out of an underlying definition given in CTL; as such,
CTL can be seen as providing metatheories for rule-based CGs
like CCG. Work in CTL explores fine-grained control over grammatical processes within the space of sound and complete logics;
work in CCG focuses on cross-linguistic, wide-coverage parsing
and computational grammar acquisition. Researchers in both of
these traditions continue to expand the range of languages and
syntactic phenomena receiving categorial treatments.
Jason Baldridge
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Ajdukiewicz, Kazimierz. 1935. Die syntaktische Konnexität. In Polish
Logic 1920–1939, ed. Storrs McCall, 207–31. Oxford: Oxford University
Press. Translated from Studia Philosophica 1: 1–27.
Jacobson, Pauline. 2008. Direct compositionality and variable-free
semantics: The case of antecedent contained deletion. In Topics in
Ellipsis, ed. Kyle Johnson, 33–68. Oxford: Oxford University Press.
Lambek, Joachim. 1958. The mathematics of sentence structure.
American Mathematical Monthly 65: 154–70.
Moortgat, Michael. 1997. Categorial type logics. In Handbook of Logic
and Linguistics, ed. Johan van Benthem and Alice ter Meulen, 99–177.
Amsterdam: Elsevier; Cambridge, MA: MIT Press.
Oehrle, Richard. Multi-modal type-logical grammar. In Non-Transformational Syntax: A Guide to Current Models, ed. Robert
Borsley and Kersti Börjars. Malden, MA: Blackwell. In press.
Steedman, Mark. 2000. The Syntactic Process. Cambridge, MA: MIT
Press.

Steedman, Mark, and Jason Baldridge. Combinatory categorial grammar. In Non-Transformational Syntax: A Guide to Current Models, ed.
Robert Borsley and Kersti Börjars. Malden, MA: Blackwell. In press.
Vermaat, Willemijn. 2005. The logic of variation. A cross-linguistic
account of wh-question formation. Ph.D. diss., Utrecht University.
Wood, Mary McGee. 1993. Categorial Grammar. London: Routledge.

CATEGORIZATION
William Labov (1973, 342) stated, "If linguistics can be said to be
any one thing, it is the study of categories: that is, the study of how
language translates meaning into sound through the categorization of reality into units and sets of units." Labov is here addressing the relation between linguistic expressions and the things
and situations to which the expressions are used to refer. The
circumstances of the world are limitless in their variety; linguistic resources are finite. Since it is not possible to have a unique
name for every entity that we encounter or a special expression
for every event that happens, we need to categorize the world
in order to speak about it. We need to regard some entities, and
some events, as being the same as others.
The relation between a word and a referent is not direct but
is mediated by the word's meaning. It is in virtue of its meaning that a word can be used to refer. A word's meaning can be
thought of as a concept, and a concept, in turn, can be thought of
as a principle of categorization. To know the word mug (to take
one of Labov's examples) is to have the concept of a mug, which
in turn means being able to use the word appropriately, namely,
for things that are called mugs. This goes not only for names of
concrete things like mugs but also for names of abstract entities
and for words of other syntactic categories. To state that X is on
Y is to categorize the relation between X and Y as an on-relation
rather than an in- or an at-relation. We can make similar claims
for other elements in a language, such as markers of tense and
aspect. To describe an event in the present perfect as opposed to
the past simple or to use progressive as opposed to non-progressive aspect is to categorize the event in a manner consistent with
the concept designated by the morpho-syntactic elements.
On many counts, therefore, linguists need a theory of categorization. The theory must provide answers to two related questions. On what basis are entities assigned to a category? And why
do we categorize the world in just the way that we do?
According to what has come to be known as the classical
or Aristotelian theory, a category is defined in terms of a set of
necessary and sufficient conditions; it follows that
things belong in a category because they exhibit each of the
defining features. There are many problems associated with
this view. First, it often is just not possible to list the defining
features. What, for example, are the defining features of mug as
opposed to cup? Then there is the question of the features themselves. Each feature will itself define a category, which in turn
must be defined in terms of its necessary and sufficient features.
Unless we are prepared to postulate a set of primitive features
out of which all possible categories are constructed, we are
faced with an infinite regress. Finally, the classical theory makes
no predictions about why we should have the categories that
we do. Any conceivable combination of features could constitute a valid category.


A major landmark in the development of a nonclassical theory of categorization was the work of psychologist Eleanor Rosch
(1978). She argued that categories have a prototype structure, that is, are centered around good examples, and that things
belong to the category in virtue of their exhibiting some similarities with the prototype. The members of a category, therefore,
do not need to share the same set of features. Moreover, some
members can be better or more representative examples of the
category than others.
Rosch addressed not only the internal structure of categories
but also the question of what makes a good category. Good categories the ones that people operate with, and which are likely
to be encoded in human languages are those that deliver maximum information to the user with minimal cognitive effort. We
can approach this matter in terms of the interplay of cue validity
and category validity. Cue validity means that having observed
that an entity exhibits a certain feature, you can assign the entity,
with a fair degree of confidence, to a certain category. Category
validity means that having learned that an entity belongs to a certain category, you have expectations about the likely properties
of the entity. In this way, we can infer quite a lot about the things
that we encounter, on the basis of minimal information about
them.
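Cue validity and category validity can be read as conditional probabilities: cue validity as P(category | feature) and category validity as P(feature | category). The sketch below, using invented toy counts, is one common way of making these notions concrete; it is an illustration, not an implementation from the literature.

```python
# Illustrative sketch of cue validity and category validity estimated
# from co-occurrence counts. The toy data are invented for this example.

counts = {  # counts[category][feature] = observed members with that feature
    "bird":   {"flies": 90, "has_feathers": 99},
    "mammal": {"flies": 10, "has_feathers": 0},
}
sizes = {"bird": 100, "mammal": 100}  # observed members per category

def cue_validity(feature, category):
    """P(category | feature): having seen the feature, how confidently
    can we assign the entity to the category?"""
    with_feature = sum(c.get(feature, 0) for c in counts.values())
    return counts[category].get(feature, 0) / with_feature

def category_validity(feature, category):
    """P(feature | category): knowing the category, how strongly do we
    expect the feature?"""
    return counts[category].get(feature, 0) / sizes[category]

print(cue_validity("has_feathers", "bird"))  # 1.0: a perfect cue
print(category_validity("flies", "bird"))    # 0.9: a strong expectation
```

On this reading, a good category is one whose features are both diagnostic (high cue validity) and reliably present (high category validity), so that minimal observation licenses rich inference.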
Categories also need to be studied against broader conceptual and cultural knowledge having to do with human intentions and purposes, presumed causal relations between things
and events, and beliefs about how the world is structured. We
can imagine all kinds of hypothetical categories: say, a category comprising things that are yellow, weigh under five kilograms, and were manufactured in 1980. Such a category is unlikely to
be lexicalized in any human language. It displays very low cue
and category validity and would, therefore, not be useful to its
users. It would also have no role to play in any broader knowledge system.
As already recognized by Labov, the issue of categorization
applies not only to the categories we use in talking about the
world but also to the analysis of language itself. The very terminology of linguistic description is replete with names of categories, such as phoneme, noun, direct object, word, dialect,
and so on, and practical linguistic description involves assigning
linguistic phenomena to these various categories. Although the
classical approach to categorization is still very strong among
linguistic theoreticians, it is increasingly recognized that the categories of linguistics may have a prototype structure and are to
be understood, in the first instance, in terms of good examples.
As Labov remarked, linguistics is indeed the study of
categories!
John R. Taylor
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Labov, William. 1973. The boundaries of words and their meanings. In
New Ways of Analyzing Variation in English, ed. C. J. Bailey and R. W.
Shuy, 340–72. Washington, DC: Georgetown University Press.
Lakoff, George. 1987. Women, Fire, and Dangerous Things: What
Categories Reveal About the Mind. Chicago: University of Chicago
Press.
Murphy, Gregory. 2002. The Big Book of Concepts. Cambridge, MA: MIT
Press.

Rosch, Eleanor. 1978. Principles of categorization. In Cognition
and Categorization, ed. E. Rosch and B. Lloyd, 27–48. Hillsdale,
NJ: Lawrence Erlbaum.
Taylor, John R. 2003. Linguistic Categorization. Oxford: Oxford University
Press.

CAUSATIVE CONSTRUCTIONS
A causative construction (CC) is defined as a form-meaning mapping that encodes a causative situation (CS) in which an entity (typically human), or causer, acts upon another entity (typically human), or causee, to induce some action. Cross-linguistically, this mapping is known to operate by means of a periphrastic syntactic construction, valence-increasing morphology, or a lexical verb. The three structural types of CCs thus identified are a) the syntactic causative, which employs a periphrastic causative verb like make in English (1); b) the morphological causative, which employs a causative affix like -(s)ase- in Japanese (2); and c) the lexical causative, wherein a transitive verb with inherently causative meaning is employed (3), e.g., Japanese kiseru 'to put (clothes) on (someone)'.
(1) Mary made him read the book. (Syntactic)

(2) Mary-ga  musuko-ni  huku-o       ki-sase-ta. (Morphological)
    nom      son-dat    clothes-acc  wear-caus-past
    "Mary made her son put on clothes."

(3) Mary-ga  musuko-ni  huku-o       kise-ta. (Lexical)
    nom      son-dat    clothes-acc  put (clothes) on (someone)-past
    "Mary put clothes on her son."

Grammatical and Semantic Hierarchies of the Causee


Nominal Case Marking
CCs can involve the adjustment of the case marking of a causee NP, accompanied by the increased valence of a causer NP. As shown in the Japanese examples (2)–(3), with the causer/subject NP Mary assigned the nominative case marker -ga, the causee NP musuko ('son') is demoted, or deranked, to lower case marking, in this case the dative -ni, since the accusative -o is already assumed by another NP, the direct object huku ('clothes'). The deranking order of a causee NP reflects a hierarchy of grammatical relations established cross-linguistically (4):
(4) Subject > Direct Object > Indirect Object > Oblique (Whaley 1997, 193)

Functional-typological studies (Shibatani 1976a; Givón 1980; Cole 1983; see functional linguistics, typology) have noted that differential case marking indexes the differing degrees of control that a causee can exercise over his or her action relative to the causer. Consider the Japanese example (5).
(5) Mary-ga  John-{o/ni}  Tokyo-e  ik-ase-ta.
    nom      acc/dat      to       go-caus-past
    "Mary {made/let} John go to Tokyo."

The accusative case marker -o indexes a lesser degree of control retained by the causee than does the dative case marker -ni. This semantic difference between the accusative (patient-marking) case and the dative (experiencer-marking) case is captured by the semantic hierarchy (6) proposed by Peter Cole (1983), which reflects the greater-to-lesser degree of control retained by a causee NP.
(6) Agent > Experiencer > Patient

Further Semantic/Pragmatic Dimensions of CCs


It is not unusual for a language to have more than one type of CC; Japanese, for example, has both morphological (2) and lexical (3) causatives. In the lexical causative (3), the causative and causativized verbs are completely fused, while in the morphological causative (2) they are separated by a morpheme boundary. Crucially, as demonstrated by John Haiman (1983), the differential degrees of fusion correlate semantically with the differing degrees of directness involved in causing an event. For instance, while a lexical causative (3) encodes the causer's nonmediated action of putting clothes on the causee, a morphological causative (2) can express a situation in which the causee put on his clothes upon the causer's request (e.g., a verbal command).
Languages can employ two causatives in a single sentence, as in the Korean example (7), to encode a sequence of CSs.
(7) John-i  Tom-eykey  Mary-lul  cwuk-i-key ha-ess-ta.
    nom     dat        acc       die-caus-caus-past-decl
    "John made Tom kill Mary."
    (Ishihara, Horie, and Pardeshi 2006, 323)

Double CCs do not always encode a sequence of CSs and can serve some pragmatic function instead. For instance, as observed by J. Okada (2003), the Japanese double causative occurs most frequently in highly conventionalized benefactive expressions indexing a speaker's humbleness toward his or her own action, as well as politeness toward the addressee, such as -(s)ase-sase-te itadaku ('to have someone allow one to do something'), as in (8).
(8) Otayori  yom-as-ase-te        itadaki-masu.
    letter   read-caus-caus-conj  humbly.receive-pol:nonpast
    "Allow me to read this letter."
    (Okada 2003, 29)

In this instance, as contrasted with its single causative counterpart yom-ase-te itadaku (read-caus-humbly receive), the double causative serves to reinforce the speaker's expression of humbleness and politeness.
CCs have also been productively investigated by more formally oriented linguists (e.g., Kuroda 1993, Miyagawa 1998).
Kaoru Horie
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Cole, Peter. 1983. The grammatical role of the causee in universal grammar. International Journal of American Linguistics 49: 115–33.
Comrie, Bernard. 1976. The syntax of causative constructions: Cross-language similarities and divergences. In Shibatani 1976b, 261–312.
Comrie, Bernard. 1985. Causative verb formation and other verb-deriving morphology. In Language Typology and Syntactic Description. Vol. 3: Grammatical Categories and the Lexicon, ed. Timothy Shopen, 309–48. Cambridge: Cambridge University Press.
Givón, Talmy. 1980. The binding hierarchy and the typology of complements. Studies in Language 4: 333–77.
Haiman, John. 1983. Iconic and economic motivation. Language 59: 781–819.


Haspelmath, Martin. 1993. More on the typology of inchoative/causative verb alternations. In Causatives and Transitivity, ed. Bernard Comrie and Maria Polinsky, 87–120. Amsterdam: John Benjamins.
Ishihara, Tsuneyoshi, Kaoru Horie, and Prashant Pardeshi. 2006. What does the Korean double causative reveal about causation and Korean? A corpus-based contrastive study with Japanese. In Japanese/Korean Linguistics. Vol. 14, ed. Timothy Vance, 321–30. Stanford, CA: CSLI.
Kemmer, Suzanne, and Arie Verhagen. 1994. The grammar of causatives and the conceptual structure of events. Cognitive Linguistics 5: 115–56.
Kuroda, Shige-Yuki. 1993. Lexical and productive causatives in Japanese: An examination of the theory of paradigmatic structure. Journal of Japanese Linguistics 15: 1–81.
Miyagawa, S. 1998. (S)ase as an elsewhere causative and the syntactic nature of words. Journal of Japanese Linguistics 16: 67–110.
Okada, J. 2003. Recent trends in Japanese causatives: The sa-insertion phenomenon. In Japanese/Korean Linguistics. Vol. 12, ed. William McClure, 28–39. Stanford, CA: CSLI.
Shibatani, Masayoshi. 1976a. The grammar of causative constructions: A conspectus. In Shibatani 1976b, 1–40.
Shibatani, Masayoshi, ed. 1976b. Syntax and Semantics. Vol. 6: The Grammar of Causative Constructions. New York: Academic Press.
Shibatani, Masayoshi, ed. 2002. The Grammar of Causation and Interpersonal Manipulation. Amsterdam: John Benjamins.
Shibatani, Masayoshi, and Prashant Pardeshi. 2002. The causative continuum. In Shibatani 2002, 85–126.
Song, Jae Jung. 1996. Causatives and Causation. London: Longman.
Whaley, Lindsay. 1997. Introduction to Typology: The Unity and Diversity of Language. Thousand Oaks, CA: Sage Publications.

C-COMMAND
An enduring and fundamental hypothesis within syntactic theory is that the establishment of most, if not all, syntactic relations (agreement, binding, case, control structures,
movement, etc.) requires c-command.
Tanya Reinhart (1979) provides the following definition of
c-command (see also Edward Klima's [1964] "in construction with"):
(1) α c-commands β if and only if
    a. the first branching node dominating α dominates β,
    b. α does not dominate β, and
    c. α does not equal β.

To illustrate, consider (2).


(2)  Sentence
     ├── Noun Phrase
     │    ├── Noun Phrase: Mary's
     │    └── Noun: mother
     └── Verb Phrase
          ├── Verb: criticizes
          └── Noun Phrase: herself

Does the noun phrase Mary's mother c-command herself in (2)? The first branching node dominating Mary's mother is Sentence, which dominates herself. Also, Mary's mother does not dominate or equal herself. Since c-command obtains, in this sentence Mary's mother corefers with herself (i.e., herself must mean Mary's mother). By contrast, since Mary's fails to c-command herself, Mary's and herself are unable to enter into such a relation (see anaphora and binding).
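Reinhart's clauses are explicit enough to be checked mechanically over a tree. The following sketch is a Python illustration (the Node class, helper names, and the encoding of the tree in (2) are invented here, not part of the entry); it tests each clause of the definition directly.

```python
# Sketch of Reinhart's (1979) definition of c-command as a procedure
# over phrase-structure trees. The tree below encodes example (2),
# "Mary's mother criticizes herself"; the modeling choices are illustrative.

class Node:
    def __init__(self, label, *children):
        self.label = label
        self.children = list(children)

    def dominates(self, other):
        """Proper dominance: self is an ancestor of other."""
        return any(c is other or c.dominates(other) for c in self.children)

def path_to(root, target):
    """The list of nodes from root down to target (assumes target is in the tree)."""
    if root is target:
        return [root]
    for child in root.children:
        tail = path_to(child, target)
        if tail:
            return [root] + tail
    return None

def c_commands(a, b, root):
    """Clause (a): the first branching node dominating a dominates b;
    clauses (b), (c): a neither dominates nor equals b."""
    if a is b or a.dominates(b):
        return False
    ancestors = path_to(root, a)[:-1]          # nodes properly dominating a
    branching = next((n for n in reversed(ancestors)
                      if len(n.children) > 1), None)
    return branching is not None and branching.dominates(b)

marys = Node("NP Mary's")
mother = Node("N mother")
subject = Node("NP", marys, mother)
herself = Node("NP herself")
vp = Node("VP", Node("V criticizes"), herself)
s = Node("S", subject, vp)

print(c_commands(subject, herself, s))  # True: "Mary's mother" c-commands "herself"
print(c_commands(marys, herself, s))    # False: "Mary's" does not
```

The first branching node is computed from the root-to-node path, so the check mirrors clause (a) directly rather than hard-coding the geometry of any particular tree.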
Although Reinhart's pioneering definition is formally explicit and strongly supported empirically, questions arise regarding explanatory depth, as with any definition. In this respect, S. Epstein and colleagues (1998) ask:

(i) Why should this formal relation, and not any definable other, constrain syntactic relations?
(ii) Why is (first) branching relevant?
(iii) Why must α not dominate or equal β?
Epstein and colleagues argue that (i)–(iii) receive natural answers
under a bottom-up derivational approach with recursive application of the binary operation merge, as independently motivated in minimalism (Chomsky 1995). C-command is then
arguably an emergent property of this structure-building process
and is expressible in terms of merge, as in (3):
(3) α c-commands all and only the terms of the category with which α was merged in the course of the derivation.

Under (3), c-command is not defined on assembled trees but emerges as a consequence of the merger process by which trees are built. It then follows that only the first branching node is relevant for computing what α c-commands, since this branching node is precisely the syntactic object resulting from merging α with another syntactic category. It also follows that α must not dominate β, since dominance entails non-merger. Finally, because a category cannot merge with itself, α does not c-command itself.
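The derivational formulation in (3) can also be run directly: each application of binary merge records that the merged object c-commands the terms of its sister. The modeling of syntactic objects as strings (heads) and tuples (merged categories) below is an illustrative assumption, not notation from the entry.

```python
# Sketch of c-command as emergent from bottom-up merge, per (3):
# alpha c-commands all and only the terms of the category with which
# alpha was merged. Syntactic objects are strings or tuples (illustrative).

def terms(x):
    """A syntactic object together with all of its subparts."""
    yield x
    if isinstance(x, tuple):
        for part in x:
            yield from terms(part)

def merge(a, b, ccommand):
    """Binary merge: build {a, b} and record the c-command pairs it creates."""
    for t in terms(b):
        ccommand.add((a, t))   # a c-commands every term of its sister b
    for t in terms(a):
        ccommand.add((b, t))   # and vice versa
    return (a, b)

cc = set()
subject = merge("Mary's", "mother", cc)   # the subject NP
vp = merge("criticizes", "herself", cc)   # the VP
s = merge(subject, vp, cc)                # the full clause

print((subject, "herself") in cc)   # True: the subject NP c-commands "herself"
print(("Mary's", "herself") in cc)  # False: "Mary's" merged only with "mother"
```

No tree is ever inspected after the fact: the c-command pairs fall out of the record of merger steps, which is the point of the derivational account.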
Gerardo Fernández-Salgueiro and Samuel David Epstein
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Aoun, Joseph, and Dominique Sportiche. 1983. On the formal theory of
government. Linguistic Review 2: 211–35.
Brody, Michael. 2000. Mirror theory: Syntactic representation in perfect
syntax. Linguistic Inquiry 31.1: 29–56.
Chomsky, Noam. 1995. The Minimalist Program. Cambridge, MA: MIT
Press.
Epstein, Samuel David, Erich M. Groat, Ruriko Kawashima, and Hisatsugu
Kitahara. 1998. A Derivational Approach to Syntactic Relations.
Oxford: Oxford University Press.
Kayne, Richard. 1984. Connectedness and Binary Branching.
Dordrecht: Foris.
Klima, Edward. 1964. Negation in English. In The Structure of
Language, ed. Jerry Fodor and Jerrold Katz, 246–323. Englewood Cliffs,
NJ: Prentice-Hall.
Langacker, Ronald. 1969. On pronominalization and the chain of command. In Modern Studies in English, ed. D. Reibel and S. Schane,
160–86. Englewood Cliffs, NJ: Prentice-Hall.
Lasnik, Howard. 1976. Remarks on coreference. Linguistic Analysis
2: 1–22.
Reinhart, Tanya. 1979. Syntactic domains for semantic rules. In Formal
Semantics and Pragmatics for Natural Languages, ed. F. Guenthner
and S. Schmidt, 107–30. Dordrecht: D. Reidel Publishing Company.

CEREBELLUM
The cerebellum is a brain structure located underneath the posterior part of the cerebral hemispheres. Three anatomical loops connect the cerebellum to various parts of the nervous system
(Ramnani 2006). The cerebro-cerebellar loop has been of particular interest to language researchers because this pathway supports anatomical connections between the cerebellum and the
cortex, potentially including language-related cortical regions in
the contralateral cerebral hemisphere. Recent advances in neuroimaging techniques, clinical testing, and anatomical methods
provide evidence that strongly implicates the cerebellum in a
broad range of language-related tasks, including those involving speech production and perception, single-word reading, and
higher-level language processing.

Historical Perspectives
Historically, the functions of the cerebellum were thought to be
limited to motor processes, such as motor control, performance,
and skill acquisition. Basic neuroscience research has led to different proposals regarding its role in motor processes, such as error-driven learning (Marr 1969) and internal timing (Ivry 1996). In the
late 1980s, H. C. Leiner, A. L. Leiner, and R. S. Dow (1986) proposed
that the cerebellum is not exclusively involved in motor functions
but that it also contributes to cognitive processes. Specifically, they
argued for a putative role in language because the evolutionary
development of the cerebellum paralleled a similar evolution of
cortical areas associated with linguistic functions (e.g., Broca's
area) (Leiner, Leiner, and Dow 1993). Based on the homogeneity
of cerebellar cellular organization, a similar role was attributed to
the cerebellum across both motor and non-motor domains.
Empirical work providing support for the claims proposed by
Leiner, Leiner, and Dow began to emerge in the late 1980s and
1990s (Desmond and Fiez 1998). A positron emission tomography (PET) study conducted by S. E. Petersen and J. A. Fiez (1993)
showed increases in cerebellar activity during a verb generation
task. This neuroimaging finding was consistent with a follow-up
case study of a patient with a lateral cerebellum lesion (Fiez et
al. 1992). The patient showed particularly poor performance on
verb generation despite the fact that other neuropsychological
assessments were within the normal range.

Current Perspectives
The cerebellum has been implicated in a broad range of language-related tasks. The majority of the work, however, can be related
to one of three domains: 1) speech production and perception,
2) reading, and 3) higher-level word processing. In order to
account for the cerebellum's function in language, investigators
have made reference to the timing and error correction functions
that have been attributed to the cerebellum in the motor literature. For a review on how these may be general mechanisms that
contribute to both motor and non-motor processes, see Ivry and
Spencer (2004) and Doya (2000).
SPEECH PRODUCTION AND PERCEPTION. During speech production, the control, coordination, and timing of movements are
essential. Not surprisingly, clinical findings demonstrate profound speech and motor deficits associated with lesions to the
cerebellum. One common speech disorder resulting from damage to the cerebellum is dysarthria, which is characterized by distorted and slurred speech that is often monotonic and of a slower
rate (Duffy 1995). Neuroimaging studies provide further

evidence for the involvement of the cerebellum in speech production. In a recent study, participants performed a syllable repetition task in which speech rate was varied. The results showed
increases in cerebellar activity that corresponded with increases
in speech rate (Riecker et al. 2006).
Cerebellar contributions to speech extend to the domain of
perception. Lesions to the cerebellum produce deficits in temporal duration discrimination and impair categorical perception
for consonants that differ in the onset of voicing (Ackermann et
al. 1997). This clinical evidence is consistent with neuroimaging data that show increases in right cerebellar activity during a
duration discrimination task for linguistic items (Mathiak et al.
2002). Other neuroimaging results suggest that the cerebellum
is also involved in learning new perceptual distinctions, such
as the non-native /r/-/l/ phonetic contrast for Japanese speakers (Callan et al. 2003). As in the motor literature, many of these
studies provide evidence that the cerebellum may be important
for coordination and timing in the production as well as the perception of speech (for a discussion, see Ivry and Spencer 2004).
Other research also draws upon knowledge from the motor
literature to emphasize a potential role of the cerebellum in error
correction. The fluency of normal speech has led many models
of speech production to incorporate a mechanism for monitoring and correcting speech errors (Postma 2000). More detailed
computational work has mapped certain processes in speech
production to specific brain regions and defined the cerebellum
as an important component in monitoring (Guenther, Ghosh,
and Tourville 2006). Similar ideas have emerged in models of
verbal working memory. Specifically, J. E. Desmond and colleagues (1997) suggest that a rehearsal process that relies on
inner speech to maintain verbal items in working memory may
also implement an error correction process. In their model,
inputs from frontal and parietal cortex into superior and inferior regions of the cerebellum are used to calculate and correct
discrepancies between phonological and articulatory codes in
order to improve memory performance.
READING. Data from neuroimaging studies consistently show
cerebellar activation during single-word reading tasks (Fiez and
Petersen 1998; Turkeltaub et al. 2002). In addition, individuals with
developmental reading disorders show some of the same symptoms that are often seen in patients with cerebellar damage, such
as poor duration discrimination and impaired gross motor functions (Nicolson, Fawcett, and Dean 1995). These observations led
R. Nicolson, A. Fawcett, and P. Dean (2001) to propose a relationship between cerebellar deficits and developmental reading disorders. Consistent with this idea, anatomical findings have reported
smaller right anterior lobes of the cerebellum in children diagnosed
with developmental dyslexia (Eckert et al. 2003). This work in
developmental reading disorders has focused on the importance of
cerebellar involvement in coordination and timing. Integrating the
neuroanatomical findings with behavioral work on dyslexia will be
key for establishing a specific role for the cerebellum in reading.
Recent neuropsychological research provides mixed findings on the causal relationship between lesions to the cerebellum in adult skilled readers and reading difficulties. One study
found that patients with lesions to the cerebellar vermis had
more errors in single-word reading when compared to controls

(Moretti et al. 2002). On the other hand, a study of native English
speakers with lesions to the lateral cerebellar hemispheres did
not find any reading difficulties at the level of single words or
text (Ben-Yehudah and Fiez 2008). These seemingly inconsistent
findings may be due to differences in the site of the cerebellar
lesions in the two patient groups.
HIGHER-LEVEL LANGUAGE. There is accumulating evidence that
higher-level language processes may also involve the cerebellum,
although this level has received less attention. A meta-analysis of
the neuroimaging literature conducted by P. Indefrey and W. J. M.
Levelt (2004) reveals increased activity in the cerebellum for tasks
that require higher-level word processing; such tasks include picture
naming and verb generation (Indefrey and Levelt 2004) or internal
generation of semantic word associations (Gebhart, Petersen, and
Thach 2002). It is important to note that higher-level language processes seem to recruit more lateral areas, often in the contralateral
right hemisphere of the cerebellum (Indefrey and Levelt 2004).
Neuropsychological studies have observed impairments in higher-level language processing (Silveri, Leggio, and Molinari 1994; Riva
and Giorgi 2000), including poor performance on a grammaticality
judgment task relative to controls (Justus 2004).

Summary
In summary, these data collectively provide strong support for
cerebellar involvement in many aspects of language, including
speech processing, reading, and higher-level language processing. They also suggest that there may be different regions of the
cerebellum that are involved in different types of language tasks.
This observation is consistent with an emerging concept that
distinct cerebro-cerebellar loops support cerebellar interactions
with cortex, thus potentially enabling the cerebellum to apply
one or more of its suggested functions (e.g., error correction) to
separate input-output loops (Kelly and Strick 2003). According
to this view, language tasks that rely on different cortical regions
would engage distinct cerebro-cerebellar loops that recruit specific cerebellar regions.
Sara Guediche, Gal Ben-Yehudah, and Julie A. Fiez
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Ackermann, H., S. Graber, I. Hertrich, and I. Daum. 1997. Categorical speech perception in cerebellar disorders. Brain and Language 60: 323–31.
Ben-Yehudah, G., and J. Fiez. 2008. Impact of cerebellar lesions on reading and phonological processing. Annals of the New York Academy of Sciences 1145: 260–74.
Callan, D. E., K. Tajima, A. M. Callan, R. Kubo, S. Masaki, and R. Akahane-Yamada. 2003. Learning-induced neural plasticity associated with improved identification performance after training of a difficult second-language phonetic contrast. NeuroImage 19: 113–24.
Desmond, J. E., J. Gabrieli, A. Wagner, B. Ginier, and G. Glover. 1997. Lobular patterns of cerebellar activation in verbal working memory and finger-tapping tasks as revealed by functional MRI. Journal of Neuroscience 17.24: 9675–85.
Desmond, J. E., and J. A. Fiez. 1998. Neuroimaging studies of the cerebellum: Language, learning and memory. Trends in Cognitive Sciences 2.9: 355–62. This article reviews neuroimaging evidence suggesting that the cerebellum is involved in cognitive tasks, including those that involve learning, memory, and language.


Doya, K. 2000. Complementary roles of basal ganglia and cerebellum in learning and motor control. Current Opinion in Neurobiology 10: 732–9. This paper suggests that the cerebellum is part of a more general supervised learning system that is guided by error signals.
Duffy, J. R. 1995. Motor Speech Disorders. St. Louis, MO: Mosby.
Eckert, M., C. Leonard, T. Richards, E. Aylward, J. Thomson, and V. Berninger. 2003. Anatomical correlates of dyslexia: Frontal and cerebellar findings. Brain 126 (Part 2): 481–94.
Fiez, J. A., and S. E. Petersen. 1998. Neuroimaging studies of word reading. Proc Nat Acad Sci USA 95: 914–21.
Fiez, J. A., S. E. Petersen, M. K. Cheney, and M. E. Raichle. 1992. Impaired non-motor learning and error detection associated with cerebellar damage. Brain 115: 155–78.
Gebhart, A. L., S. E. Petersen, and W. T. Thach. 2002. Role of the posterolateral cerebellum in language. Annals of the New York Academy of Sciences 978: 318–33.
Guenther, F. H., S. S. Ghosh, and J. A. Tourville. 2006. Neural modeling and imaging of the cortical interactions underlying syllable production. Brain and Language 96: 280–301.
Indefrey, P., and W. J. M. Levelt. 2004. The spatial and temporal signatures of word production components. Cognition 92: 101–44.
Ivry, R. B. 1996. The representation of temporal information in perception and motor control. Current Opinion in Neurobiology 6: 851–7.
Ivry, R. B., and R. M. Spencer. 2004. The neural representation of time. Current Opinion in Neurobiology 14: 225–32. This article discusses timing processes and their potential neural correlates. A review of the evidence from many different methods is provided, with an emphasis on the potential contributions made by the cerebellum and basal ganglia.
Justus, T. 2004. The cerebellum and English grammatical morphology: Evidence from production, comprehension, and grammaticality judgments. Journal of Cognitive Neuroscience 16.7: 1115–30.
Kelly, R. M., and P. L. Strick. 2003. Cerebellar loops with motor cortex and prefrontal cortex of a nonhuman primate. Journal of Neuroscience 23.23: 8432–44. This article shows cerebellar regions that receive input from the same cerebral cortical regions they project to. The authors hypothesize closed cerebro-cerebellar loops as the basis of the interactions between cerebellum and cortex.
Leiner, H. C., A. L. Leiner, and R. S. Dow. 1986. Does the cerebellum contribute to mental skills? Behavioral Neuroscience 100.4: 443–54.
Leiner, H. C., A. L. Leiner, and R. S. Dow. 1993. Cognitive and language functions of the human cerebellum. TINS 16.11: 444–7.
Marr, D. 1969. A theory of cerebellar cortex. J Physiol 202.2: 437–70.
Mathiak, K., I. Hertrich, W. Grodd, and H. Ackermann. 2002. Cerebellum and speech perception: A functional magnetic resonance imaging study. Journal of Cognitive Neuroscience 14.6: 902–12.
Moretti, R., A. Bava, P. Torre, R. M. Antonello, and G. Cazzato. 2002. Reading errors in patients with cerebellar vermis lesions. Journal of Neurology 249: 461–8.
Nicolson, R., A. Fawcett, and P. Dean. 1995. Time estimation deficits in developmental dyslexia: Evidence of cerebellar involvement. Proc Biol Sci 259.1354: 43–7.
Nicolson, R., A. Fawcett, and P. Dean. 2001. Developmental dyslexia: The cerebellar deficit hypothesis. Trends in Neurosciences 24.9: 508–11.
Petersen, S. E., and J. A. Fiez. 1993. The processing of single words studied with positron emission tomography. Annual Review of Neuroscience 16: 509–30.
Postma, A. 2000. Detection of errors during speech production: A review of speech monitoring models. Cognition 77: 97–131.
Ramnani, N. 2006. The primate cortico-cerebellar system: Anatomy and function. Nature Reviews Neuroscience 7: 511–22. This review provides a brief description of cerebellar anatomy, and stresses integrating anatomical, computational, and experimental knowledge.

Riecker, A., J. Kassubek, K. Groschel, W. Grodd, and H. Ackermann. 2006. The cerebral control of speech tempo: Opposite relationship between speaking rate and BOLD signal changes at striatal and cerebellar structures. NeuroImage 29: 46–53.
Riva, D., and C. Giorgi. 2000. The cerebellum contributes to higher functions during development: Evidence from a series of children surgically treated for posterior fossa tumours. Brain 123: 1051–61.
Silveri, M., M. Leggio, and M. Molinari. 1994. The cerebellum contributes to linguistic production: A case of agrammatic speech following a right cerebellar lesion. Neurology 44.11: 2047–50.
Turkeltaub, P., G. Eden, K. Jones, and T. Zeffiro. 2002. Meta-analysis of the functional neuroanatomy of single-word reading: Method and validation. NeuroImage 16.3 (Part 1): 765–80.

CHARITY, PRINCIPLE OF
A charity principle is a principle governing the interpretation of
the speech and thought of others. It says that the correct interpretation of certain kinds of expressions, areas of discourse,
or whole languages maximizes truth and rationality across the
(relevant) beliefs of its subject. According to Donald Davidson,
the main defender of a principle of charity, its validity derives
from the essentially rational and veridical nature of belief and
thought.
Principles of charity are of central importance in discussions
of radical interpretation or radical translation. In W. V.
O. Quine's version, charity governs the translation of the logical
constants (cf. Quine 1960, 59). According to Donald Davidson,
charity governs the radical interpretation of all expressions of a
language. In an early formulation, it tells the radical interpreter
to "optimize agreement" by assigning truth conditions to alien sentences that make native speakers "right when plausibly possible" (Davidson [1973] 1984, 137). To make native speakers "right" is to interpret them as having beliefs that are largely true
and coherent with each other. Later, Davidson distinguished
explicitly between these two aspects of charity:
The Principle of Coherence prompts the interpreter to discover
a degree of logical consistency in the thought of the speaker; the
Principle of Correspondence prompts the interpreter to take
the speaker to be responding to the same features of the world
that he (the interpreter) would be responding to under similar
circumstances. Both principles can be (and have been) called
principles of charity: One principle endows the speaker with a
modicum of logical truth, the other endows him with a degree of
true belief about the world. Successful interpretation necessarily
invests the person interpreted with basic rationality. (Davidson
[1991] 2001, 211)

Coherence restricts belief ascription in terms of the logical relations among the beliefs of a speaker. Correspondence restricts
the ascription of empirical beliefs to a speaker in terms of
their truth. Since this can only be done according to the interpreter's own view of what is true, following the principle of correspondence amounts to agreement maximization between speaker and interpreter. Here, Davidson increasingly emphasized a causal element; in the most basic perceptual cases, the principle of correspondence calls for the ascription of beliefs shared by speaker and interpreter. The objects of
these beliefs are determined as the shared, external causes of

these beliefs: "Communication begins where causes converge" (Davidson [1983] 2001, 151). In later years, Davidson liked to use
the metaphor of triangulation for this three-way interaction
among speaker, interpreter, and external object (cf. Davidson
[1991] 2001).
The principle of charity does not exclude the possibility of error; speakers are to be "right when plausibly possible"
(Davidson [1973] 1984, 137). Charity, thus, in certain situations
actually prevents the interpreter from ascribing beliefs of his or
her own to the speaker, for instance, perceptual beliefs about
objects the speaker cannot perceive from his or her position in
space, or beliefs it would be irrational for the speaker to hold on
the basis of other beliefs. If something false follows rather directly
from other beliefs the speaker holds, charity might even call for
ascribing outright mistakes. The rationality induced by the principle is of a minimal, subject-internal character.
For Davidson, the principle of charity plays a double role: On
the one hand, it provides the method for the radical interpreter,
but it does so because it, on the other hand, is the principle metaphysically determining meaning (and belief content): "What a fully informed interpreter could learn about what a speaker means is all there is to learn; the same goes for what the speaker believes" (Davidson [1983] 2001, 148). This is a kind of supervenience: According to Davidson, meaning (and content) supervene on (dispositions to) observable behavior in observable
circumstances. That is, there cannot be a difference in meaning
(or content) without a (potential) difference in behavior. This
can be called a weak semantic behaviorism, but according to
Davidson, meaning (and content) cannot be reduced to behavior. That meaning is determined by charity leaves room for a certain indeterminacy, according to Davidson, but does not lead
to antirealism or skepticism about meaning or thought content.
Because of the role that external objects, as shared causes, play
in the determination of content for basic perceptual beliefs, he
thought of his own position as a kind of externalism (cf. Davidson
2001; see meaning externalism and internalism).
The principle of charity has been widely discussed. Not only
have questions of its exact formulation and of its truth or validity been raised but also the question of what kind of a truth it
is, if any. What is its epistemic status: a priori or a posteriori?
And what is its metaphysical status: necessary or contingent?
Davidson mostly thought of charity as a principle constitutive of
thought and meaning, an a priori truth of conceptual necessity.
Many commentators have claimed that radical interpretation is
supposed to provide an (a priori) argument for charity: If radical interpretation is possible, charity is valid (see, for example,
Lepore and Ludwig 2005, 204 ff). But according to others, the
direction of argument can only be the opposite: If charity holds,
radical interpretation is possible (Davidson 1994, 122; Glüer
2006, 344). Then, Davidson would be seen as arguing for charity
from considerations regarding the nature of thought content, its
holism and externalist determination (cf. Davidson [1991] 2001;
1999, 343; 2001). Partly against Davidson, it has been argued that
charity can only be an a posteriori necessity (cf. Føllesdal 1982;
Glüer 2006) and that it, like other nomological principles, can be
justified by the principles of empirical science (cf. Pagin 2006).
Kathrin Glüer



WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Davidson, Donald. [1973] 1984. Radical interpretation. In Inquiries into Truth and Interpretation, 125-39. Oxford: Clarendon Press.
———. [1983] 2001. A coherence theory of truth and knowledge. In Subjective, Intersubjective, Objective, 137-53. Oxford: Clarendon Press.
———. [1991] 2001. Three varieties of knowledge. In Subjective, Intersubjective, Objective, 205-20. Oxford: Clarendon Press.
———. 1994. Radical interpretation interpreted. Philosophical Perspectives 8: 121-28.
———. 1999. Reply to Andrew Cutrofello. In The Philosophy of Donald Davidson, ed. L. Hahn, 342-44. Chicago: Open Court.
———. 2001. Externalisms. In Interpreting Davidson, ed. P. Kotatko, P. Pagin, and G. Segal, 1-16. Stanford, CA: CSLI.
Føllesdal, Dagfinn. 1982. The status of rationality assumptions in interpretation and action explanation. Dialectica 36: 301-16.
Glüer, Kathrin. 2006. The status of charity I: Conceptual truth or a posteriori necessity? International Journal of Philosophical Studies 14: 337-60.
Lepore, Ernest, and K. Ludwig. 2005. Donald Davidson: Meaning, Truth, Language, and Reality. Oxford: Clarendon Press.
Pagin, Peter. 2006. The status of charity II: Charity, probability, and simplicity. International Journal of Philosophical Studies 14: 361-84.
Quine, Willard V. O. 1960. Word and Object. Cambridge, MA: MIT Press.

CHILDREN'S GRAMMATICAL ERRORS


Language learners make errors. This observation, easily verified,
is not confined to children in thrall to their first encounter with
language. It applies equally well to adults acquiring a further language (see second language acquisition). And it applies
to cases of both typical and atypical language development (see
specific language impairment). There is, then, nothing
abnormal about speech errors. They are an intrinsic feature of
language acquisition and do not mark out special cases of learning but, rather, constitute the norm. In this vein, one might argue
that the very notion of language development almost inevitably
implies the occurrence of errors. Perfect speech could not readily be ascribed to a language learner, after all. It is not surprising,
therefore, to find that linguistic errors have featured prominently
in research on language development. As a universal feature of
language acquisition, errors provide not only evidence that
learning is taking place but also, in some cases, evidence of how
that learning occurs.
Childrens speech errors range over every level of language: phonological (see phonology, acquisition of);
lexical (see lexical acquisition); morphological (see morphology, acquisition of); and grammatical (see syntax,
acquisition of). Grammatical errors are of special interest
because they are germane, in the field of language acquisition,
to the nature-nurture controversy. Barbara C. Scholz and G. K.
Pullum (2006, 60) usefully encapsulate the nativist credo: "[M]ost of the acquisition of natural languages by human beings depends on unacquired (or acquired but unlearned) linguistic knowledge or language-specialized cognitive mechanisms." Child
grammatical errors present a problem, therefore. If the bulk of
what is acquired is unlearned, why is there a protracted period
in a young child's life (several years) during which language is
manifestly imperfect? At the very least, grammatical errors throw
into sharp relief the messiness of the data which nativists must


grapple with. Howsoever powerful the child's innate mechanisms might be, they do not equate to an attribute (language)
that comes into the world fully formed at birth. Instead, there is a
bridge to be crossed from what Noam Chomsky (1980) has called
the child's "initial state" (the genetic endowment for language,
present at birth) to the "steady state" (the mature knowledge of
grammar finally attained). Several explanations are available to
deal with this problem (see innateness and innatism). Of
note here is the simple point that such explanations are required
by nativists and, inevitably, muddy the waters both theoretically
and empirically.
On the nurture side of the nature-nurture fence, speech errors
(grammatical and otherwise) again present a vexing issue that
needs to be addressed. In particular, the behaviorist approach to
language acquisition has been castigated for an excessive reliance
on operant conditioning as a mechanism of language learning.
B. F. Skinner (1957) argued that one of the key processes in language development was the shaping of the child's verbal behavior through reward. On this view, child utterances are rewarded
according to their proximity to the adult models provided. But
this is problematic. In a celebrated demolition of Skinner's thesis, Chomsky (1959, 42) pointed out that operant conditioning cannot tell the whole story, since a child will be able to "construct and understand utterances which are quite new, and are, at the same time, acceptable sentences in his language." Thus, operant
conditioning cannot account for novelty.
Similarly, imitation cannot account for the child's speech,
particularly errors. Although not mentioned by Chomsky (1959),
grammatical errors do, in fact, present the most striking demonstration that language acquisition is not largely based on
imitation. The reason is that children are exposed to very few
grammatical errors in the input they receive from parents.
Hence, there are very few faulty models for children to copy. For
example, Elissa Newport, H. Gleitman, and L. R. Gleitman (1977)
report just one instance of parental ungrammaticality in a corpus of 1,500 utterances directed toward young children. In consequence, one cannot easily blame the parents for child errors.
A further critical point is that the child cannot imitate grammar,
only the products of grammar (sentences). Perhaps not surprisingly, since the demise of behaviorism, several other theories of
language acquisition have been promulgated that do not rely on
operant conditioning or imitation as their mainstay.
Beyond their relevance for the nature-nurture issue, child
errors have been studied because of the insights they furnish
about the processes of language acquisition. For example, an
error like "I thought they were all womans" reveals that the
child has extracted the regular suffixation rule for forming plurals
of nouns in English. That is, the child knows to add -s to singular
nouns in order to mark plurality. The child's error lies in mistaking woman for a regular noun. This kind of error is commonly described as an overregularization, since the plural rule for regular forms (add -s) has been applied beyond its conventional
confines to an irregular noun. Thus, errors of this kind illuminate
the child's ability to extract and generalize a morphological rule.
But how do we know that the child is indeed adding -s to make plurals, even in the case of regular plurals like coconuts? It is conceivable, after all, that the child has simply heard the form coconuts in the input and



stored it whole, entirely unaware that the word can be parsed
into coconut and the plural marker -s. Jean Berko Gleason (1958)
invented nonsense words (including "wug") to denote birdlike
creatures (also invented), in one of the first experiments in the
field of child language. Children were shown a picture of one of
these creatures and heard "This is a wug." They were then shown a picture with two of these creatures and heard "Now there are two of them. There are two ..." The pronunciation of "two" is left
hanging in the air, inviting the child to complete the sentence.
And, indeed, children four years of age will often declare "There are two wugs." Since the child has never before encountered the
word form wugs in the input, we can be sure that this word has
been assembled on-line using the new word form wug and prior
knowledge of the plural suffix -s.
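The rule-and-exception logic illustrated by errors like womans and by the wug test can be sketched as a toy program; the function, its fallback rule, and the word lists below are my own illustrative simplifications, not part of the studies cited here:

```python
# Toy model of the regular plural rule the child has extracted:
# use a stored irregular form when one is known; otherwise fall
# back on the default add -s rule. Illustrative only.

def pluralize(noun: str, irregulars: dict[str, str]) -> str:
    if noun in irregulars:
        return irregulars[noun]       # stored irregular form wins
    return noun + "s"                 # default rule, even for novel words

# A child who has not yet stored "women" overregularizes:
print(pluralize("woman", {}))                   # -> womans
# A novel word ("wug") gets the rule, as in Berko Gleason's test:
print(pluralize("wug", {}))                     # -> wugs
# Once the irregular form is stored, it takes priority:
print(pluralize("woman", {"woman": "women"}))   # -> women
```

The sketch makes the evidential point concrete: only a productive rule, not rote storage, can yield a plural for a word form the child has never heard.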
What is almost always overlooked, or possibly just taken for
granted, in research on children's errors is the fact that error
is an intrinsically relative concept. In the case of language learners (young and old), utterances can be judged against the standard of an expert, typically a parent or other native speaker.
When my four-year-old son said, "What's the man who the forest doing?" I registered an error, based on the dictates of how I would have said the same sentence (possibly, "What's the man who is in the forest doing?"). But the intuitions of a parent are
not sufficient proof that a given child sentence is ungrammatical.
Parental intuitions, as a form of evidence, are neither objective
nor decisive. Nevertheless, linguists continue to rely on intuitions
as a primary source of data on grammaticality (Smith 2004).
It is argued that the intuitions of an adult native speaker constitute direct evidence for mental grammar, that is, the knowledge of grammar residing in the head of an individual human
being. However, the judgment of what is and is not grammatical
is embedded in social convention. Whatever rule happens to be
mentally represented by a given individual, and whatever intuitions that rule gives rise to, its acceptance as part of the grammar
for a given language is judged in comparison with the intuitions
of other persons. Thus, the grammaticality of a child utterance
will be judged against the intuitions of the parent or, in some
cases, a passing child language researcher. The social nature of
this process is rooted in the appointment (or self-appointment)
of one or more people as arbiters over the grammaticality of any
given utterance.
Evidently, decisions about when an error really is an error
are not entirely straightforward. And even when one has made
that judgment (on whatever basis), one is then faced with a further difficult issue that has, hitherto, received scant attention. In
short, does a given error arise from an immature or incomplete
knowledge of grammar? Or is the underlying knowledge base
entirely adultlike, but somehow an error has slipped out, owing
to a technical hitch in production? In this vein, Chomsky (1965)
distinguished between competence and performance.
Competence refers to the speaker-hearer's tacit knowledge of
his or her language. Performance, on the other hand, comprises
the use of this knowledge in producing speech. The utterances
we produce arise from both our linguistic competence and other
intervening performance factors, including the limitations of
short-term memory, motor control over the execution of speech
plans, and even the effects of anxiety or alcohol. Cognitive factors of this kind can cause errors to creep into our speech output

despite the fact that our linguistic knowledge (competence) may be flawless.
Adult speech (in particular, speech directed toward other
adults) may be laden with false starts, hesitations, unnecessary
repetitions, and slips of the tongue. The default assumption about
adult errors is that they are the product of faulty performance.
Child speech errors, on the other hand, are more likely ascribed
to an immature competence. However, all the factors that apply
to adults as causes of performance errors apply equally well to
children. At the same time, the task of distinguishing errors of
competence from errors of performance is empirically fraught.
And tellingly, it is a task that researchers have not even begun
to tackle with any serious purpose (though, see Jaeger 2005 for
work on children's slips of the tongue). With regard to adult
errors, there is also a scarcity of evidence to support the assumption that they are, unfailingly, the product of performance factors. It may well turn out, on closer inspection, that adults vary in
terms of their grammatical competence.
As noted, theories of grammar acquisition tend to assume
that immature competence lies at the root of grammatical errors.
A notable exception is found in the study of children's past tense
errors. The so-called words and rules theory suggests that when
children learn an irregular past tense form (e.g., broke), it automatically blocks the application of the regular suffixation process
(break + -ed → breaked). In this way, errors are avoided (Marcus et
al. 1992). Of course, young children do produce errors from time
to time. To explain these errors, Gary F. Marcus and colleagues
(1992) suggest that young children's memory retrieval system is
immature and sometimes lets them down. In consequence, the
child may occasionally fail in an attempt to retrieve an irregular form like broke. This failure then triggers the default regular
process to produce breaked. Hence, the explanation for child
errors is based on limitations in performance, not competence.
In support of this idea, it is argued that overregularization rates
are generally very low, something like 4 percent (Marcus et al.
1992). This rarity lends itself to a performance-based explanation for what prompts the child's errors. In the event, error rates
may be considerably higher than initial estimates might indicate
(Maslen et al. 2004). Sampling limitations may mask brief periods of very high error rates, especially for high-frequency verbs.
A further problem is that there is no empirical support for the
speculation that errors are caused by failures in memory retrieval.
Very little is known about retrieval processes in young children,
especially in connection with language. Whatever the merits of
the words-and-rules account of past tense errors, it does at least
raise awareness that children's grammatical errors may not necessarily stem solely from an immature competence.
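The retrieval-failure account described above can be sketched as a simple procedure; the particular retrieval probability, verb list, and function names below are invented for illustration and are not drawn from Marcus et al.:

```python
import random

# Sketch of the words-and-rules account of past tense errors:
# a stored irregular form blocks the regular -ed rule, but an
# immature memory sometimes fails to retrieve it, letting the
# default rule through. The 0.96 retrieval rate is an arbitrary
# illustrative value, not an empirical estimate.

IRREGULAR_PAST = {"break": "broke", "think": "thought", "go": "went"}

def past_tense(verb: str, retrieval_rate: float = 0.96) -> str:
    stored = IRREGULAR_PAST.get(verb)
    if stored is not None and random.random() < retrieval_rate:
        return stored       # retrieval succeeds: "broke" blocks "breaked"
    return verb + "ed"      # retrieval fails, or verb is regular: add -ed

random.seed(0)
forms = [past_tense("break") for _ in range(1000)]
error_rate = forms.count("breaked") / len(forms)
print(f"overregularization rate: {error_rate:.1%}")
```

On this toy model, overregularizations are rare and intermittent, mirroring the low rates the performance-based account predicts, even though the "grammar" (the default rule plus the stored form) never changes.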
As noted, the fact that children produce errors in the course
of language acquisition is uncontroversial. Where controversy
does arise is in the attempt to explain how children expunge
errors and move toward a more adultlike system of grammar.
The obvious solution to the child's problem is for parents and
others to supply corrections. Corrections for grammatical errors
are often referred to as negative evidence, that is, evidence that
some structure is not permitted by the target grammar. However,
opinions differ sharply as to whether negative evidence is available to children. Roger Brown and C. Hanlon (1970) demonstrated that parents do not overtly disapprove of their children's




grammatical errors. Thus, they do not reliably mark grammatical errors with injunctions like "Don't say that" or "No, that's wrong." This finding has exerted an enormous influence in the
field of child language, being hailed by Steven Pinker as "one of the most important discoveries in the history of psychology" (1988, 104). Undoubtedly, Pinker overstates the case. But his
enthusiasm stems from the perception that a crucial aspect of
linguistic knowledge could not have arisen in the child's mind
through the mediation of the environment. That is, if children
receive no help or information in the input from parents concerning what is or is not grammatical, then one must conclude
that the child's knowledge in this respect is innate. Observe that
the reach of this conclusion is extensive, since it could, conceivably, encompass each and every rule or principle of grammar
in a language.
Nativist enthusiasm for what is known as the "no negative evidence" assumption is tempered by numerous empirical studies that challenge this assumption. Beginning with Kathy Hirsh-Pasek, R. Treiman, and M. Schneiderman (1984), researchers
have noted that the markers of disapproval examined by Brown
and Hanlon (1970) do not constitute the only possible form of
corrective input. More recent research has focused on the frequent contrasts between erroneous child usage and correct
adult models that figure in child-adult discourse (Chouinard
and Clark 2003; Saxton, Backley, and Gallaway 2005). The following example is an exchange between my four-year-old son and
myself (emphases highlight the contrast in linguistic forms, not
pronunciation stress).
Child: I thinked about it with my brain.
Adult: You thought about it.

To function as a form of corrective input, contrasts of this kind would have to be interpreted by the child as not simply modeling a correct form. The child would also have to regard them as
signals that their own previous usage was ungrammatical (for
evidence consistent with this view, see Saxton 1997 and Strapp
and Federico 2000). Curiously, Brown and Hanlon themselves
remarked on the corrective potential of contrastive discourse,
observing that repeats of ill-formed utterances usually "contained corrections and so could be instructive" (1970, 43).
However, this observation was entirely overlooked for many
years, leading to a considerable distortion of the empirical facts.
At the same time, though, and as noted previously, the fact that
contrastive discourse is abundantly available to children does
not entirely resolve the matter. It still remains to be demonstrated decisively that children actually perceive such contrasts
as a form of negative evidence and that they exploit that information in shedding errors and arriving at a mature system of
grammar.
To conclude, children's grammatical errors demand the
attention of language scientists for two reasons. First, and most
obvious, errors stand out. They attract our attention like brightly
colored flags, flapping above the parapet. And, second, the
investigation of errors reveals much about the processes of language acquisition. They provide the paradigm demonstration
that language develops. The fact that errors occur at every level of
language, both in abundance and for extended periods of time,

provides a strong stimulus for language scientists to seek explanations for how and why language learners differ from fully competent native speakers.
Matthew Saxton
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Berko Gleason, Jean. 1958. The child's learning of English morphology. Word 14: 150-77.
Brown, Roger, and C. Hanlon. 1970. Derivational complexity and order of acquisition in child speech. In Cognition and the Development of Language, ed. J. Hayes, 11-53. New York: John Wiley.
Chomsky, Noam. 1959. Review of B. F. Skinner's Verbal Behavior. Language 35: 26-58.
———. 1965. Aspects of the Theory of Syntax. Cambridge, MA: MIT Press.
———. 1980. Rules and Representations. New York: Columbia University Press.
Chouinard, Michelle M., and E. V. Clark. 2003. Adult reformulations of child errors as negative evidence. Journal of Child Language 30: 637-69.
Hirsh-Pasek, Kathy, R. Treiman, and M. Schneiderman. 1984. Brown & Hanlon revisited: Mothers' sensitivity to ungrammatical forms. Journal of Child Language 11: 81-88.
Jaeger, Jeri J. 2005. Kids' Slips: What Young Children's Slips of the Tongue Reveal about Language Development. Mahwah, NJ: Lawrence Erlbaum.
Marcus, Gary F., S. Pinker, M. Ullman, M. Hollander, T. J. Rosen, and F. Xu. 1992. Overregularization in Language Acquisition. Monographs of the Society for Research in Child Development, serial no. 228.
Maslen, Robert J. C., A. L. Theakston, E. M. V. Lieven, and M. Tomasello. 2004. A dense corpus study of past tense and plural overregularization in English. Journal of Speech, Language, and Hearing Research 47: 1319-33.
Newport, Elissa, H. Gleitman, and L. R. Gleitman. 1977. Mother, I'd rather do it myself: Some effects and non-effects of maternal speech style. In Talking to Children: Language Input and Acquisition, ed. C. Snow and C. Ferguson, 109-49. Cambridge: Cambridge University Press.
Pinker, Steven. 1988. Learnability theory and the acquisition of a first language. In The Development of Language and Language Researchers: Essays in Honor of Roger Brown, ed. F. Kessel, 97-119. Hillsdale, NJ: Lawrence Erlbaum.
Saxton, Matthew. 1997. The contrast theory of negative input. Journal of Child Language 24: 139-61.
Saxton, Matthew, P. Backley, and C. Gallaway. 2005. Negative input for grammatical errors: Effects after a lag of 12 weeks. Journal of Child Language 32: 643-72.
Scholz, Barbara C., and G. K. Pullum. 2006. Irrational nativist exuberance. In Contemporary Debates in Cognitive Science, ed. R. Stainton, 59-80. Oxford: Basil Blackwell.
Skinner, Burrhus F. 1957. Verbal Behavior. New York: Appleton-Century-Crofts.
Smith, Neil. 2004. Chomsky: Ideas and Ideals. 2d ed. Cambridge: Cambridge University Press.
Strapp, Chehalis M., and A. Federico. 2000. Imitations and repetitions: What do children say following recasts? First Language 20.3: 273-90.

CHIROGRAPHIC CULTURE
Writing and script are systems of graphic marks that represent words, syllables, or individual sounds (phonemes)

of a language. Chirography shares the same definition with
the added meaning of "writing by hand." The term thus applies
to all the writing systems or scripts that followed the first
invention of writing in Mesopotamia, circa 3200 B.C., and before Gutenberg's invention of the printing press about 1437. Chirography is generally viewed as the gateway to complex literate societies, leaving behind the archaic oral cultures.
The nature and extent of the divide between oral and chirographic cultures has long been a matter of debate. In the fifth
century B.C., in Phaedrus and Letter VII, the Greek philosopher
Plato expressed his concerns with the impact of chirography on
human cognition. He warned that writing would weaken human
memory and threaten scholarship by allowing the ignorant to
fake omniscience. As discussed in Khosrow Jahandarie's volume
Spoken and Written Discourse (1999), the present consensus is
less critical. Wherever chirography emerged (in Mesopotamia about 3200 B.C., in China about 1250 B.C., and in Mesoamerica circa 650 B.C.), it is held as a productive supplement of speech.
This is based on the facts that, first, the human voice can be
heard only by a small audience but written documents can be
sent to any destination, and, second, speech disappears instantaneously, leaving no trace, while texts can be preserved over
time. It is, therefore, generally agreed that chirography extends
the network of human communication from culture to culture and makes it possible to trace the roots of cultures and their
evolution in history. Moreover, by reducing to order and clarity
a myriad of details, chirography is credited with revolutionizing
record keeping. Registers allow for administering communities and keeping track of entries and expenditures, profits, and
losses of businesses. Finally, writing is recognized for creating
data banks far beyond the power of human memory, resulting
in turn in the accumulation and articulation of an unlimited
quantity of complex data regarded as instrumental to the formulation of significant syntheses and the creation of new cognitive skills. In other words, literacy is viewed as enhancing the
possibilities for socially distributed cognition, allowing
civilization to grow more complex with the administration of
organizations of greater dimensions and larger political units,
the creation of more extensive economies, the development of
complex sciences and technologies, and a more accurate knowledge of the past.
The major priority of the twentieth century, however, has
been to investigate the impact of chirography on the human
mind. With his famous adage "the medium is the message,"
Marshall McLuhan emerged as a popular champion of literacy.
In his books The Gutenberg Galaxy (1962) and Understanding
Media (1964), the Canadian media critic advocated that media
were not passive conduits of information but, rather, vortices of
power restructuring human perceptions. McLuhan argued that
by translating sounds into a visual code, writing exchanged "an eye for an ear" and that "[p]honetic culture endows men with the means of repressing their feelings and emotions when engaged in action" (1964, 84, 88); he therefore claimed that literate humans
develop the power of acting with detachment from emotional
involvement. He further emphasized the impact of the linear
format of writing, pointing out that civilized societies acquire a

lineal conception of time, that they view events linearly, with a beginning, a middle, and an end. And, in his view, the resulting systematic sequential presentation of arguments translated into a more rigorous logic.

Figure 1. Cuneiform tablet featuring a list of goods. Courtesy of the Texas Memorial Museum, The University of Texas at Austin.
Also in the 1960s, Eric A. Havelock analyzed how the adoption of the Semitic alphabet in Greece influenced the organization of ideas, abstraction, and consciousness. Havelock, as
well as McLuhan, dealt primarily with alphabetic scripts. In
contrast, Walter J. Ong, S.J., university professor of humanities at St. Louis University, and Jack Goody, anthropologist at
Cambridge University, included in their analyses prealphabetic
chirographic systems, such as the Mesopotamian cuneiform
script and non-alphabetic oriental writing systems. Among its
many important contributions, Ong's book Orality and Literacy
([1982] 1983) makes the case that the nonliterate relates to the
world in a concrete, situational way, downplaying generalization
and abstraction. Relying on Aleksandr R. Luria's psychological
field work, Ong argued that the illiterate does not name geometric figures abstractly, as circles or squares, but as "moons" or "plates," and "doors" or "mirrors." Furthermore, the nonliterate
avoids self-analysis, which requires abstracting the self from the
surrounding world.
Goody, the author of The Domestication of the Savage Mind
(1977) and The Interface Between the Written and the Oral (1987),
investigated how the series of shifts involved in the development of writing restructured human thought. In particular,
he analyzed how the first Mesopotamian texts that consisted
exclusively of lists changed thought processes. He suggested
that the lists of goods generated by the Mesopotamian administration (Figure 1) or the sign lists compiled by scribes encouraged scrutiny by selecting which items to include and which to

exclude. Moreover, he argued that the lists segmented reality by breaking down the perceptual world. For example, a list of tree signs abstracted the trees from the forests to which they belong. In other words, according to Goody, lists decontextualize data but also regroup elements, ordering them by type, shape, size, number, and so on. Consequently, lists reorganize the world, transforming it into an ideal form and creating a new reality upon which the literate is forced to reflect at a new level of generality.

Figure 2. Tokens from Uruk, Iraq, ca. 3300 B.C. Courtesy Vorderasiatisches Museum Berlin, Germany.
Among other twentieth-century authors who considered
that writing affected the human mind, David R. Olson, professor of applied cognitive science at the Ontario Institute
for Studies in Education, emphasized in The World on Paper
(1994) the importance of writing for reflecting upon ourselves. On the other hand, Bruno Latour, an anthropologist,
is among the scholars who disagree with the proposition that
writing created new cognitive skills. In an article in 1990, he
proposed that it is the combination of images and writing in
maps, charts, graphs, photos, and diagrams that create better
tools to allow scientists to foray into new ideas. Others credit
schooling, rather than writing, for increasing rationality and
abstract thought.
These seminal studies of the 1960s-80s must now be updated
by taking into account the archaeological discovery that
the cuneiform script, the earliest chirographic system, was
derived from an earlier visual code. As described by Denise

Schmandt-Besserat in Before Writing (1992) and How Writing Came About (1996), a system of tokens was used to keep track of goods in the Near East, starting about 7500 B.C. The tokens
were modeled in clay in multiple shapes, such as cones,
spheres, disks, and cylinders. Each token shape stood for a
unit of an agricultural product: A cone was a small measure
of grain, a sphere stood for a large measure of grain, and a cylinder for an animal. Four thousand years later, in the urban
period circa 3300 B.C., the tokens had evolved into a complex
accounting device with a repertory of about 300 shapes, some
including additional incised or punched markings, to record
the various units of goods manufactured in workshops (Figure
2), such as wool, textiles, and garments.
The fact that, in the Near East, the first script was preceded
by a visual code (a system of arbitrary symbols to represent words) sheds new light on chirography's contribution to society. In particular, some of the merits formerly attributed to
writing have to be credited to the token system. For example,
tokens, not chirography, shifted "an eye for an ear." Like texts,
the tokens were permanent and, therefore, could be transported
or stored. The clay artifacts symbolized units of real goods and
as such handled data in abstraction. Finally, like written lists,
tokens could organize information in successive lines in the
most concise way, allowing scanning, evaluating, scrutinizing,
and analyzing a budget. As a result, the token system stretched
human cognition by making it possible for the neolithic oral
cultures of the Near East to handle large amounts of complex
information.
Once appropriate recognition is given to the token system,
the revolutionary contributions of chirography become very
clear. First, chirography abstracted numbers (Figure 3). It
should be well understood that the tokens were used in one-to-one correspondence, which means that one, two, or three small measures of grain were shown by one, two, or three cones. But numerals (signs that represent numbers abstractly) appeared about 3100-3000 B.C., after the three-dimensional tokens were replaced by their images, pictographs traced onto clay tablets. At this point, 10 jars of oil and 10 large units of grain were
no longer shown by 10 ovoid tokens and 10 spherical tokens
but by one sign standing for ten, followed by a sign for the
goods in question. Second, chirography abstracted the sounds
of speech. Whereas the tokens merely acted as logograms, or signs standing for a concept, chirography created phonetic
syllabic signs to write personal names (see Figure 3). In sum,
compared to tokens that stood for concrete merchandise, the
chirographic numerals symbolized oneness and twoness, abstract constructs of the mind, and the chirographic phonetic
signs symbolized the immaterial and evanescent sounds of
the voice. By creating numerals and phonetic signs, chirography, therefore, raised human cognition to far higher levels of
abstraction.
In the twentieth century, research on the impact of chirography on cognition was confined to issues of interest to
the humanities. In the twenty-first century, however, the debate was
extended to the field of art. In When Writing Met Art (2007),
Schmandt-Besserat argued that ancient Near Eastern art compositions (the way images are organized) changed with the
advent of chirography in 3100–3000 b.c. She showed that preliterate painted or carved compositions consisted of the mere
repetition of one motif as many times as necessary to cover a
surface. For instance, the same triangle or ibex was replicated
around the body of a vessel. In contrast, by borrowing the
strategies of writing, compositions of the chirographic cultures were able to organize multiple figures into a narrative. To
illustrate this concept: just as large and small signs of writing denoted
greater or lesser units of goods, the size of images
indicated status. Gods were represented as larger than kings,
and the images of kings were shown larger than those of their
fellow citizens. Just as signs changed value by being placed to
the right or the left, above or below other signs, the heroes'
actions and interactions were indicated by their orientation,
order, and direction. For instance, one figure standing in front
of another was understood as being more important than
one standing behind it. From writing, art also acquired ways
of loading images with information by using symbols akin to
determinatives, signs denoting a general class. For instance,
the horned tiara indicated the divine status of a figure in the
same way the star-shaped sign dingir did in Sumerian cuneiform texts. As a result, reading art became akin to reading
a text.
In sum, art, which is, in at least certain respects, a mirror
of culture, signals a conceptual change in design compositions that coincides with the advent of chirography in the
ancient Near East. The preliterate lines of a repeated motif
were apprehended at a glance, but the narrative compositions of chirographic cultures were read analytically, sequentially. It is generally assumed that the Neolithic motifs, such
as triangles or ibexes, were symbols, much as the dove is a symbol
of peace in our culture. Thus, the preliterate Near Eastern art
probably evoked ideas, perhaps profound ideas, but only the
art compositions of chirographic cultures could tell complex
stories.
On the basis of these recent findings, the immense legacy
of the first handwritten texts can now be assessed with greater
clarity. Art demonstrates that chirography created a paradigm
that can be successfully implemented in other communication
systems. Archaeology shows that, compared to its archaic token
precursor, chirography meant leaps in abstraction with the creation of numerals and phonetic signs. By inventing symbols
to express such numbers as 1, 10, and 60, chirography laid the
foundation for the development of mathematics. By representing the sounds of the voice by phonetic signs, chirography set
the stage for writing to become a universal system of communication emulating speech.

Figure 3. Pictographic tablet from Uruk, Iraq, ca. 3000 B.C. Courtesy
Deutsches Archaeologisches Institut, Berlin, Germany. The tablet features
a list of goods. In the upper cases, the units of merchandise are indicated by pictographs, images of tokens traced in clay. Numerals are
shown with impressed signs. In the lower case, phonetic signs indicate
the name of the recipient or donor.

Denise Schmandt-Besserat

WORKS CITED AND SUGGESTIONS FOR FURTHER READING

Coulmas, F., ed. 1999. The Blackwell Encyclopedia of Writing Systems.
Oxford: Blackwell.
. 2003. Writing Systems: An Introduction to Their Linguistic Analysis.
Cambridge: Cambridge University Press.
Goody, Jack. 1977. The Domestication of the Savage Mind.
Cambridge: Cambridge University Press.
. 1987. The Interface between the Written and the Oral.
Cambridge: Cambridge University Press.
Havelock, Eric A. 1963. Preface to Plato. Cambridge, MA: Belknap Press of
Harvard University Press.
Houston, Stephen D., ed. 2004. The First Writing. Cambridge: Cambridge
University Press.
Jahandarie, Khosrow. 1999. Spoken and Written Discourse: A Multidisciplinary Perspective. Stamford, CT: Ablex Publishing.
Latour, Bruno. 1990. Drawing things together. In Representation in Scientific Practice,
ed. M. Lynch and S. Woolgar, 19–68. Cambridge, MA: MIT Press.
McLuhan, Marshall. 1962. The Gutenberg Galaxy: The Making of
Typographic Man. Toronto: University of Toronto Press.
. 1964. Understanding Media: The Extensions of Man. New York: New
American Library.
Niditch, S. 1996. Oral World and Written Word. Louisville, KY:
Westminster John Knox Press.
Olson, David R. 1994. The World on Paper. Cambridge: Cambridge
University Press.
Ong, Walter J., S. J. [1982] 1983. Orality and Literacy. London: Methuen.
Plato. 1973. Phaedrus and Letters VII and VIII. Trans. Walter Hamilton.
Harmondsworth: Penguin Books.
Schmandt-Besserat, Denise. 1992. Before Writing. Austin: University of
Texas Press.
. 1996. How Writing Came About. Austin: University of Texas
Press.
. 2007. When Writing Met Art. Austin: University of Texas Press.


CLITICS AND CLITICIZATION


The unusual properties of little words have attracted the attention of generations of linguists. This is especially true of the items
known to traditional grammar as enclitics and proclitics and to
modern linguistics simply as clitics. Different theoretical frameworks have highlighted different characteristics of what has often
been seen as a unitary class of elements, with the result that two
quite distinct sorts of unusual behavior have not always been
carefully distinguished.

Two Senses of Clitic


The etymology of the word clitic (from Greek kli:no, "lean")
brings out what seemed most distinctive to an earlier generation of scholars, their tendency to lean on or form part of a
prosodic unit with a preceding or following word, linked to
their typical lack of autonomous accent. Such attachment
may give rise to phonological words containing syntactically
unrelated material as a result of linear adjacency. In addition to standard cases in Greek, Latin, and Sanskrit, a well-known example of this is furnished by the Wakashan language
Kwakwala, where determiner elements at the left edge of
nominal expressions (among other clitics) attach phonologically to the rightmost word of a preceding constituent. In
a sentence like mxid ida bgWanma-xa gnanma-sa
kWixayu (hit-Det man-Det child-Det club) "The man hit the
child with a club," the words bgWanma-xa "man-Obj" and
gnanma-sa "child-Inst" end in determiner elements that are
syntactically linked not to the phrase of which they are phonologically a part but, rather, to the argument that follows in the
sentence.
In this case, the anomaly is a phonological grouping that does
not appear to reflect syntactic organization, and this can plausibly be attributed to the prosodic weakness of the clitic (here the
determiners). In other instances, however, something else must
be at work. The pronominal object clitics in a French sentence
like Je le lui donne "I give it to him" appear preceding the main
verb, a position in which objects cannot otherwise occur and
which requires reference to ordering principles outside of the
language's normal syntax.
Elements appearing in unusual positions in this way are generally also phonologically weak, and this has led to a tendency
to conflate the two sorts of exceptionality, taking prosodic weakness (including lack of stress) as diagnostic of clitics and then
proposing distinctive ordering for the items so identified. In fact,
however, some prosodically weak elements appear only in positions that are quite normal syntactically (e.g., the reduced forms
of is and has in Fred's sleeping and Fred's lost his dog), while some
elements that are positioned unusually along with other clitics
are nonetheless prosodically full words, such as Tagalog tayo/natin "we (incl.)" and a number of other pronouns and particles
in this language.
Such facts suggest that we should recognize two distinct
dimensions of unusual behavior of clitics, one phonological
(associated with prosodic weakness) and the other syntactic
(associated with unusual positioning of a restricted sort; elements
showing the latter are commonly called special clitics). The two often, but not always,


coincide (as with French object le, which is both unaccented and
distinctively positioned).
An association between them has been noted at least since
the classic work of Jakob Wackernagel (1892), who pointed out
that phonetically weak elements in the ancient Indo-European
languages tended to cluster in second position within the sentence, a position that was quite anomalous from the point of view
of the rest of the syntax. Much later literature has treated the connection between phonological weakness (and especially a lack
of autonomous accent) and unusual positioning as essential,
although the two turn out to be quite separate characteristics,
not only logically but also empirically. The essential distinction
between them was pointed out by Arnold Zwicky (1977) and
further developed in much later literature, including Anderson
(2005).

Accounts of Clitic Behavior


Once we realize that it follows from the prosodically impoverished nature of the elements concerned, the phonological
dimension of clitic behavior finds a natural account in more
general theories of prosody and stress, as demonstrated in
work such as that of Elizabeth Selkirk (1995). The (morpho-)
syntactic dimension is somewhat more controversial, however. Special clitics appear in a limited range of positions. Any
given clitic can be associated with some syntactic domain, and
within this domain it may appear initially, finally, postinitially
(in second position), prefinally, or adjacent to the head of the
domain.
Syntacticians have tended to see special clitics as filling
normal syntactic positions and then being displaced to their surface position under the influence of distinctive movement
principles within the syntax. One difficulty with this is the
fact that in some languages (of which certain forms of Serbo-Croatian are the most discussed), the element defining
second position for the location of clitics is a unit in phonological terms (a prosodic word) but not necessarily a unit that
ought to be accessible to the syntax. An alternative that avoids
this difficulty is to note the close parallel between possible
positions for clitics and for affixes and to treat special clitics
as a class of affixes introduced directly into the surface form
of phrases by principles closer to those of morphology than
to syntax.

Conclusion
The analysis of clitics and the principles by which they find their
place in sentence structure and prosody involve an intricate
interplay among all of the major components of grammatical
structure, including syntax, phonology, morphology, and
even semantics. These elements have been invoked by scholars in all of these areas as evidence for fundamental claims about
linguistic structure, and an assessment of those claims is only
possible on the basis of a clearer understanding of the subdivisions among clitics, and the appropriate mechanisms for accommodating their specific properties, than is often found in the
existing literature.
Stephen R. Anderson

WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Anderson, Stephen R. 2005. Aspects of the Theory of Clitics. Oxford: Oxford
University Press.
Selkirk, Elizabeth. 1995. The prosodic structure of function words.
In Papers in Optimality Theory, 439–70. University of Massachusetts
Occasional Papers in Linguistics 18.
Wackernagel, Jakob. 1892. Über ein Gesetz der indogermanischen
Wortstellung. Indogermanische Forschungen 1: 333–436.
Zwicky, Arnold. 1977. On Clitics. Bloomington: Indiana University
Linguistics Club.

CODESWITCHING
Introduction
Codeswitching (CS) is defined as the use of two or more language
varieties in the same conversation, not counting established borrowed words or phrases. Two general types of structural configuration
occur: 1) intersentential CS, switching for one sentence
or many, is generally studied for its social implications, as in example (1); 2)
intrasentential or intraclausal CS is studied more for its grammatical configurations, as in examples (2)–(4).
(1)

(Policeman to heckler in Nairobi crowd, switching from English to Swahili)
How else can we restrain people from stealing except by
punishment?
Wewe si mtu ku-tu-ambia vile tu-ta-fanya kazi tu-na sheria
yetu.
Swahili translation: "You aren't a person to tell us how to do
our work; we have our laws."
(Myers-Scotton 1993, 77)

(2)

(A clause in French embedded in a Brussels Dutch clause)
[t is dat] [que j'ai dit madame].
"That is what that I told the lady."
(Treffers-Daller 1994, 30)

(3)

(Single English content morpheme in a Hungarian-framed clause)
játsz-ok school-ot
play-s.pres school-acc
"I'm playing school."
(Bolonyai 1998, 34)

(4)

(English verb stem with Swahili affixes in a Swahili frame)
father a-li-m-buy-i-a vi-tabu a-ka-potez-a vy-ote
s-past-obj-buy-appl-fv cl-book s-consec-lose-fv cl-all
"Father bought for him books and he lost all [of them]."
(Myers-Scotton 1997, 87)

CS and Its Social Meanings


CS is a means of presenting a particular persona or negotiating
interpersonal relationships in a given interaction, making it a major
research topic for some sociolinguists and linguistic anthropologists. A starting point is John J. Gumperz's (1982) notion that CS is
one of the possible contextualization cues of the speaker's pragmatic intentions. Researchers also often mention E. Goffman's
concept of "footing" and M. Bakhtin's concept of speakers' multiple "voices" that are echoes of earlier utterances.

Many studies remain at the descriptive level, but at least two
models offer explanations for why CS occurs within a discourse.
Conversation analysis (CA) analysts emphasize a switch's
sequential positioning in conversation, claiming that it provides
vital information about its sociopragmatic message (Auer 1998
inter alia; Li 2005). In contrast, the markedness model emphasizes that speakers use CS as a tool to present a certain persona;
they exploit participants' sense of the indexicality of each code
(see indexicals) and of the contrast between the social import
of codes in a given context (Myers-Scotton 1993 inter alia).
Some analysts, such as B. Rampton, C. Stroud, and J. Gafaranga,
emphasize CS as exemplifying the speaker's creative agency.
CS researchers agree on two points: 1) To engage in CS is
largely an unconscious move, and 2) speakers seldom intend
a single, specific meaning; potentially ambiguous or multiple
meanings are part of the pragmatic message.
Two overlapping generalizations capture differences in the various approaches. First, the meaning of "strategy," with its implication that CS carries messages of intentionality, divides analysts.
Second, analysts differ on the role of community values and
participants' own sociolinguistic profiles, as well as a variety's
multiple associations, as they relate to a speaker's motivation for
making a switch.

CS and Its Grammatical Structure


Most analysts agree that CS has a principled grammatical structure, but the principles they propose to constrain sentence/clause
structure vary. Many early studies employed a linear-based
framework; for example, Shana Poplack (1980) argues that possible switching depends on surface-level syntactic equivalences
across participating languages. Some place importance on distinguishing borrowing from CS through quantitative analyses (e.g.,
Budzhak-Jones 1998). In contrast, the matrix language frame
model links CS at abstract levels to psycholinguistic models
of language production (Myers-Scotton 1997, 2002). Asymmetries
between the structuring roles of the participating languages are
stressed. Also, languages do not supply morpheme types equally.
The 4-M model and a uniform structure principle explain different
morpheme distributions with more precision (cf. Myers-Scotton
and Jake 2009). Still other researchers argue that current syntactic
theory of mainstream generative grammar, though intended
for monolingual data, can explain CS parsimoniously (MacSwan
2000). Although CS involves bilingual data (see bilingualism
and multilingualism), these researchers claim that no dominant
or matrix language is needed. This conclusion is debated (cf.
MacSwan 2005a, 2005b and Jake, Myers-Scotton, and Gross 2002,
2005). CS as a vehicle of convergence in grammatical patterns is
also studied (e.g., Muysken 2000; Clyne 2003; Backus 2005).
Carol Myers-Scotton
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Auer, Peter, ed. 1998. Code-switching in Conversation: Language,
Interaction and Identity. London: Routledge.
Backus, Ad. 2005. Codeswitching and language change: One thing leads
to another? International Journal of Bilingualism 9: 337–40.
Bolonyai, Agnes. 1998. In-between languages: Language shift/maintenance in childhood bilingualism. International Journal of Bilingualism
2: 21–43.

Budzhak-Jones, Svitlana. 1998. Against word-internal codeswitching: Evidence from Ukrainian-English bilingualism. International
Journal of Bilingualism 2: 161–82.
Clyne, Michael. 2003. Dynamics of Language Contact.
Cambridge: Cambridge University Press.
Gumperz, John J. 1982. Discourse Strategies. Cambridge: Cambridge
University Press.
Jake, Janice, Carol Myers-Scotton, and Steven Gross. 2002. Making a
minimalist approach to codeswitching work: Adding the matrix language. Bilingualism: Language and Cognition 5: 69–91.
. 2005. A response to MacSwan (2005): Keeping the matrix language. Bilingualism: Language and Cognition 8: 271–6.
Li, Wei. 2005. How can you tell?: Towards a common sense explanation of conversational code-switching. Journal of Pragmatics
37: 375–89.
MacSwan, Jeff. 2000. The architecture of the bilingual language faculty: Evidence from intrasentential code switching. Bilingualism,
Language and Cognition 3: 37–54.
. 2005a. Codeswitching and generative grammar: A critique of the
MLF model and some remarks on "modified minimalism." Bilingualism:
Language and Cognition 8: 1–22.
. 2005b. Remarks on Jake, Myers-Scotton and Gross's
response: There is no matrix language. Bilingualism: Language and
Cognition 8: 277–84.
Muysken, Pieter. 2000. Bilingual Speech: A Typology of Code-mixing.
Cambridge: Cambridge University Press.
Myers-Scotton, Carol. 1993. Social Motivations for Codeswitching: Evidence
from Africa. Oxford: Oxford University Press.
. 1997. Duelling Languages: Grammatical Structure in
Codeswitching. 2nd ed. Oxford: Oxford University Press.
. 2002. Contact Linguistics: Bilingual Encounters and Grammatical
Outcomes. Oxford: Oxford University Press.
. 2006. Multiple Voices: An Introduction to Bilingualism.
Oxford: Blackwell. Chapters 6 and 9 deal with codeswitching for
advanced undergraduates.
Myers-Scotton, Carol, and Janice Jake. 2009. A universal model of codeswitching and bilingual language processing and production. In The
Cambridge Handbook of Linguistic Code-Switching, ed. B. Bullock and
A. Toribio, 33657. Cambridge: Cambridge University Press.
Poplack, Shana. 1980. Sometimes I'll start a sentence in English Y
TERMINO EN ESPAÑOL: Toward a typology of code-switching.
Linguistics 18: 581–618.
Treffers-Daller, Jeanine. 1994. Mixing Two Languages: French-Dutch
Contact in a Comparative Perspective. Berlin: Mouton de Gruyter.
Winford, Donald. 2003. An Introduction to Contact Linguistics.
Cambridge: Cambridge University Press. Winford provides a comprehensive overview of codeswitching designed for beginning graduate
students.

COGNITIVE ARCHITECTURE

The Architecture of a Computer

The cognitive sciences have developed in large part from the
application of concepts that initially arose in computer science.
One such concept is that of architecture.
The term computer architecture was originated by Frederick P.
Brooks, Jr., a pioneering force in the creation of IBM's early computers. For Brooks, computer architecture, like other architecture, "is the art of determining the needs of the user of a structure
and then designing to meet those needs as effectively as possible
within economic and technological constraints" (1962, 5).
As digital computers evolved, so, too, did the notion of computer architecture. Computer designers had to pay attention not
only to the needs of the user but also to additional constraints
that arose with the development of high-level programming
languages and with the invention of new hardware technologies. Brooks's more modern definition of architecture reflects
these developments: "The architecture of a computer system we
define as the minimal set of properties that determine what programs will run and what results they will produce. The architecture is thus the system's functional appearance to its immediate
user" (Blaauw and Brooks 1997, 3). The key element here is that
a computer's architecture describes the information-processing
capacities of a device without appealing to its specific hardware
properties. In short, a computer's architecture is a description
of its logical and abstract information-processing properties
(Dasgupta 1989).

The Cognitive Architecture

The concept of cognitive architecture is the direct result of applying the notion of computer architecture to human cognition.
Cognitive scientists assume that cognition is information processing (e.g., Dawson 1998). Cognition as information processing must therefore be characterized by a fundamental set of
logical and abstract properties (e.g., a primitive set of symbols
and operations). By identifying this set of properties for human
cognition, one specifies the cognitive architecture.
For example, Z. W. Pylyshyn isolates "the basic operations for
storing and retrieving symbols, comparing them, treating them
differently as a function of how they are stored (hence, as a function of whether they represent beliefs or goals), and so on, as well
as such basic resources and constraints of the system as a limited memory. It also includes what computer scientists refer to as
the control structure, which selects which rules to apply at various times" (1984, 30). It is no accident that this account emphasizes symbols and primitive operations for their manipulation.
This is because Pylyshyn wants to ensure that the architecture
is indeed cognitive, which for him means that it must be representational: "It [the cognitive architecture] is the level at which
the system is representational, and where the representations
correspond to the objects of thought" (1991, 191). There may be
other levels of system organization above and below the cognitive architecture, but researchers like Pylyshyn would argue that
these levels are not cognitive.

Architecture and Explanation


Brooks's (1962) original notion of computer architecture was
driven by the goals of computer design: The architecture served
as a functional account of capabilities to be used as a blueprint by
hardware engineers in order to bring a computer into being. In
the study of cognition and language, the information processor
already exists. Why, then, is there a need for the cognitive architecture? The answer is that an architectural account converts a
cognitive description into a cognitive explanation.
Architectural components convert descriptions into explanations by providing a bridge between functional and physical
accounts. They can do this because components of the cognitive architecture must be both cognitive and physical (e.g.,
Haugeland 1985, 100).
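The idea that one and the same architecture (a functional specification of what results are produced) can lead two lives, realized by quite different physical mechanisms, can be sketched in code. This is a toy illustration only; all class and function names are invented for the example and are not part of any cited account:

```python
# A minimal sketch: one functional specification, two "physical"
# realizations. The architecture fixes WHAT is computed, not HOW.
from abc import ABC, abstractmethod


class AdderArchitecture(ABC):
    """The architectural level: the results a device must produce."""

    @abstractmethod
    def add(self, a: int, b: int) -> int: ...


class RippleCarryAdder(AdderArchitecture):
    """One realization: bitwise addition with explicit carries."""

    def add(self, a: int, b: int) -> int:
        while b:
            carry = (a & b) << 1  # positions where a carry is generated
            a, b = a ^ b, carry   # sum without carry, then propagate
        return a


class LookupAdder(AdderArchitecture):
    """A very different realization: a precomputed table of results."""

    def __init__(self, limit: int = 16):
        self.table = {(x, y): x + y
                      for x in range(limit) for y in range(limit)}

    def add(self, a: int, b: int) -> int:
        return self.table[(a, b)]


# Both devices share one architecture: the same "programs" run on
# them and yield the same results, despite unlike internals.
for adder in (RippleCarryAdder(), LookupAdder()):
    assert adder.add(7, 5) == 12
```

In Brooks's terms, a description of `AdderArchitecture` alone suffices to predict what results either device will produce, without any appeal to carries or tables.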

These two "lives" are important because the predominant
research methodology used by cognitive scientists is functional
analysis (Cummins 1983). In functional analysis, a researcher
attempts to account for some complex function by decomposing it into an organized system of subfunctions. Each subfunction often becomes the subject of its own functional analysis;
this methodology is intrinsically iterative. However, if this were
all that there was to this methodology, functional analysis would
fall victim to an infinite regress and generate an infinite variety of
unexplained functions (Ryle 1949).
To avoid Ryle's regress, functional analysis also attempts to
progressively simplify the proposed subfunctions: "The highest
level design breaks the computer down into a committee or army
of intelligent homunculi with purposes, information and strategies. Each homunculus in turn is analyzed into smaller homunculi, but, more important, into less clever homunculi" (Dennett
1978, 80). At some point, the less clever homunculi become so
simple that they can be replaced by physical devices that carry
out the desired function. At this point, the functional description is physically subsumed (Cummins 1983) and becomes
explanatory.
From this perspective, the cognitive architecture can be
described as the set of primitive functions that have been subsumed in a functional analysis. Their functional or cognitive role defines how these components mediate complex
information processing. Their physical role defines how such
processing can be physically implemented and explained.
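This iterative decomposition into ever less clever subfunctions can be given a minimal sketch. The function names here are invented for illustration, and `increment` merely stands in for a primitive that would, on this view, be physically subsumed:

```python
# Functional analysis in miniature: each level is defined in terms of
# a less clever one, and the regress stops at a primitive.

def increment(n: int) -> int:
    """Stand-in for an architectural primitive: assumed to be
    directly realized 'in hardware', so it needs no further analysis."""
    return n + 1


def add(a: int, b: int) -> int:
    """A less clever homunculus: addition as repeated incrementing."""
    for _ in range(b):
        a = increment(a)
    return a


def multiply(a: int, b: int) -> int:
    """A cleverer homunculus: multiplication as repeated addition."""
    total = 0
    for _ in range(b):
        total = add(total, a)
    return total


assert multiply(6, 7) == 42  # the analysis bottoms out at increment()
```

Replacing `increment` with a physical device would, in Cummins's sense, subsume the whole hierarchy and render the functional description explanatory.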

Architecture and Language


To the extent that language is mediated by information processing, an explanation of language must be grounded in a cognitive
architecture. In many respects, linguistics provides prototypical
examples of architectural accounts of language. However, it can
also be argued that the cognitive architecture holds an uneasy
position within some linguistic theories.
On the one hand, dominant theories in linguistics appear
to provide architectural accounts of language. We have already
seen that a cognitive architecture requires a set of functions
to be established as primitives by subsuming them as neural
implementations. From its inception, the standard Chomskyan
approach to language appears to strive toward this kind of architectural account.
First, the specification of a generative grammar provides
a detailed account of a set of complex tokens, and the rules for
their manipulation, that are required to assign structural descriptions to sentences. Furthermore, this grammar is intended to
describe (at least in part) cognitive processing: every speaker has
"mastered and internalized a generative grammar that expresses
his knowledge of his language" (Chomsky 1965, 8). In short, one
purpose of a generative grammar is to describe the functional
properties of an internalized set of symbols and rules.
Second, the Chomskyan tradition presumes a strong link
between generative grammar and the brain. This link is included
in the general view that the language faculty is a biological organ
(Hauser, Chomsky, and Fitch 2002). According to Chomsky,
"The human brain provides an array of capacities that enter
into the use and understanding of language (the language faculty); these seem to be in good part specialized for that function
and a common human endowment" (1995, 167). Complete


accounts of human language must appeal to these biological
underpinnings.
Thus, in the Chomskyan tradition, an architectural account
would include the specification of a generative grammar, as
well as additional processes that are necessary and sufficient for
mediating language. Furthermore, this account would be biologically grounded.
On the other hand, the Chomskyan tradition takes strong
positions that conflict with the general notion of cognitive architecture sketched earlier. Two of these positions require special
mention here. The first is that a theory in linguistics should focus
on competence, and not performance. The second is that
the language faculty is modular in the sense of J. A. Fodor
(1983).
These positions are important because both have been used
to exclude certain concepts from linguistic study that are critical
components of the cognitive architecture. For example, memory
has not been deemed to be properly part of the linguistic domain.
That is, while memory might impact language production or
comprehension (e.g., by limiting the number of embedded
clauses that can be processed), some researchers would argue
that memory is not part of the language faculty proper. For some,
memory limitations are viewed as being important to a theory
of language performance, but not to a theory of language competence (e.g., Chomsky 1965). Furthermore, memory limitations
are related to a general cognitive resource, which by definition
therefore cannot be solely part of a language faculty (Hauser,
Chomsky, and Fitch 2002).
More recent variations of the Chomskyan approach provide
a more flexible view of the competence/performance distinction and, as a result, lead to theories that make strong proposals
about cognitive architecture as construed earlier: "A theory that
allows us to readily relate competence to performance ought to
be favored over one that creates hard boundaries between the
two" (Jackendoff 2002, 197).
Jackendoff's parallel architecture (2002) is one such theory.
He assumes that syntax is not the only system responsible for
the combinatorial nature of language. Instead, he proposes a parallel architecture in which three separate levels (phonology,
syntax, and semantics) are independent, each having its
own primitive operations and combinatorial principles. Though
independent, the three levels are linked by interface constraints.
Thus, all three levels cooperate to produce the generative structure of language. Furthermore, this theory of linguistic structure
can be mapped to a parallel architectural theory in which each
level is a modular processor, but there are interfaces between the
three processors that share a common linguistic memory.
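A rough rendering of this organization can be sketched in code. This is a toy sketch under stated assumptions, not Jackendoff's formalism: all names and the miniature representations are invented, and each "level" is reduced to a single combinatorial rule linked by an interface rather than derived from the others:

```python
# Toy parallel architecture: three independent combinatorial systems
# (phonology, syntax, semantics) kept in correspondence by an
# interface, rather than one level being generated from another.
from dataclasses import dataclass


@dataclass
class Sign:
    phonology: str   # invented stand-in for a phonological structure
    syntax: str      # invented stand-in for a syntactic structure
    semantics: str   # invented stand-in for a conceptual structure


# Each level has its own primitive combinatorial principle ...
def combine_phonology(a: str, b: str) -> str:
    return a + " " + b            # linear concatenation


def combine_syntax(a: str, b: str) -> str:
    return f"[{a} {b}]"           # hierarchical bracketing


def combine_semantics(a: str, b: str) -> str:
    return f"{b}({a})"            # predicate applied to argument


# ... and the interface constraint links the three structures.
def interface(w1: Sign, w2: Sign) -> Sign:
    return Sign(
        phonology=combine_phonology(w1.phonology, w2.phonology),
        syntax=combine_syntax(w1.syntax, w2.syntax),
        semantics=combine_semantics(w1.semantics, w2.semantics),
    )


cat = Sign("kaet", "N", "CAT")
sleeps = Sign("sli:ps", "V", "SLEEP")
sentence = interface(cat, sleeps)
assert sentence.syntax == "[N V]"
assert sentence.semantics == "SLEEP(CAT)"
```

The point of the sketch is only structural: no level's output is computed from another level's output; the interface builds all three in parallel, which is the design choice that distinguishes the parallel architecture from syntax-centered models.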
The preceding discussion of language and the cognitive architecture has emphasized theories that adopt a so-called classical
perspective that emphasizes the rule-governed manipulation of
symbols. It is important to realize that alternative architectures
for language have also been explored. For instance, classical
researchers have argued that artificial neural networks are not
capable of modeling the generative properties of language (Fodor
and Pylyshyn 1988). However, empirical and theoretical analyses indicate that this criticism is not valid (Hadley and
Hayward 1997; Siegelmann 1999). As a result, many examples

exist in which neural networks have been used to model a variety
of language phenomena (Mammone 1993; Sharkey 1992).
Michael R. W. Dawson
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Blaauw, G. A., and F. P. Brooks. 1997. Computer Architecture: Concepts
and Evolution. Reading, MA: Addison-Wesley.
Brooks, F. P. 1962. Architectural philosophy. In Planning a Computer
System: Project Stretch, ed. W. Buchholz, 5–16. New York: McGraw-Hill.
Chomsky, N. 1965. Aspects of the Theory of Syntax. Cambridge, MA: MIT
Press.
. 1995. The Minimalist Program. Cambridge, MA: MIT Press.
Cummins, R. 1983. The Nature of Psychological Explanation. Cambridge,
MA: MIT Press. Explores the role of the architecture in providing
explanatory power to functional analyses.
Dasgupta, S. 1989. Computer Architecture: A Modern Synthesis. New
York: Wiley.
Dawson, M. R. W. 1998. Understanding Cognitive Science.
Oxford: Blackwell.
Dennett, D. 1978. Brainstorms. Cambridge, MA: MIT Press.
Fodor, J. A. 1983. The Modularity of Mind. Cambridge, MA: MIT Press.
Fodor, J. A., and Z. W. Pylyshyn. 1988. Connectionism and cognitive
architecture: A critical analysis. Cognition 28: 3–71.
Hadley, R. F., and M. B. Hayward. 1997. Strong semantic systematicity from Hebbian connectionist learning. Minds and Machines
7: 1–37.
Haugeland, J. 1985. Artificial Intelligence: The Very Idea. Cambridge,
MA: MIT Press.
Hauser, M. D., N. Chomsky, and W. T. Fitch. 2002. The faculty of
language: What is it, who has it, and how did it evolve? Science
298: 1569–79.
Jackendoff, R. 2002. Foundations of Language: Brain, Meaning,
Grammar, Evolution. Oxford: Oxford University Press. Provides an
overview of the state of linguistics that pays special attention to architectural issues.
Mammone, R. J. 1993. Artificial Neural Networks for Speech and Vision.
New York: Chapman and Hall. Contains many examples of artificial
neural networks applied to specific areas of speech and language.
Pylyshyn, Z. W. 1984. Computation and Cognition. Cambridge, MA: MIT
Press. A detailed examination of the role of cognitive architecture in
cognitive science.
Pylyshyn, Z. W. 1991. The role of cognitive architectures in theories of cognition.
In Architectures for Intelligence, ed. K. VanLehn, 189–223. Hillsdale,
NJ: Lawrence Erlbaum.
Ryle, G. 1949. The Concept of Mind. London: Hutchinson and Company.
Sharkey, N. E. 1992. Connectionist Natural Language Processing.
Dordrecht: Kluwer Academic Publishers.
Siegelmann, H. T. 1999. Neural Networks and Analog Computation: Beyond
the Turing Limit. Boston: Birkhäuser.

COGNITIVE GRAMMAR
Cognitive Grammar (CG) refers to the theory of language articulated most comprehensively in Ronald W. Langacker (1987,
1991), two mutually dependent volumes that are best read
together. Langacker (1988) provides a succinct chapter-length
overview of his theory, while Taylor (2002) and Evans and Green
(2006, 553–640) are highly recommended as student-oriented
introductions to the theory. CG is wide ranging in its scope and
provocative in its approach to an understanding of linguistic structure. It has played a key role in the history of cognitive linguistics.
Fundamental to CG is the idea that language is an integral
part of human cognition and cannot be properly understood
without reference to cognitive abilities. A pervasive feature of
CG is the determination to reconcile accounts of linguistic structure with what is known about cognitive processing in domains
other than language. CG contrasts in this respect with models
that insist upon a discrete, autonomous grammar module and
the autonomy of syntax. The cognitive orientation of CG
is apparent from a reliance on notions such as sensory imagery, perspective, mental scanning, attention, and figure versus
ground asymmetry in accounting for linguistic phenomena. In
broad terms, grammatical structure is explained as conventional
imagery, with alternate structures reflecting alternate construals
of the conceived situation.
Not surprisingly, the cognitive notions underlying CG assume
a relatively abstract interpretation when applied to some aspects
of linguistic structure. For example, cognitive processes such as
registration of contrast, scanning of a field, and perception of a
boundary are all deemed relevant for explicating the notion of a
count noun, understood as a bounded region in some domain
in Langacker (1987, 189–203). Such processes may be obvious factors in the conceptualization of nouns with clear spatial
boundaries (e.g., cup, pencil), but a more abstract interpretation
of these processes is clearly required in other domains. Body-part
nouns (e.g., waist, shoulder, side) must be explicated in terms of a
virtual boundary that does not correspond to any visible, objectively identifiable demarcation. Likewise, the notions of figure
and ground familiar from the study of perception are seen as
underpinning various relational asymmetries in language. These
notions have most obvious relevance in the case of words relating to the spatial domain, such as the contrasting pair above and
below, where there is a figure-ground reversal of the figure and
the conceptual reference point. The terms trajector (an adaptation of the notion of figure) and landmark (an adaptation of the
notion of ground) are used to refer to the specifically linguistic
manifestation of the perceptual notions of figure and ground,
such that the book is the trajector and the table is the landmark
in the book under the table. Conversely, the table is the trajector
and the book is the landmark in the table over the book. More
abstractly still, the traditional syntactic contrast between subject
and object is construed in terms of relative salience, such that the
subject is a primary clausal figure, or trajector, and the object is a
secondary clausal figure, or landmark.
At the heart of CG is the concept of a symbolic unit, consisting of semantic structure standing in correspondence to a
phonological structure. Consistent with the idea that language
is part of conceptual structure, semantic structure is understood
as conceptualization tailored to the specifications of linguistic
convention (Langacker 1987, 99; see Talmy 2000, 4 for a similar
view of semantic structure). CG takes the notion of symbolic unit
(similar to, but not to be equated simply with, the Saussurean
sign) as fundamental and applicable at all levels of representation, including lexical items, grammatical classes, and grammatical constructions. The lexical item tree, for example, consists
of a semantic unit [tree] and a corresponding phonological unit
[tri], which combine to form the symbol for tree, [[tree]/[tri]].

The same apparatus is applicable for defining a word class such
as a noun, abbreviated by Langacker as [[thing]/[]], indicating a schematic semantic specification of a thing but without
any specific content phonologically. A morphologically more
complex lexical item such as trees is represented as a composite
structure integrating two symbolic units representing the noun
tree and the plural [z]: [[[tree]/[tri]]-[[pl]/[z]]]. Grammatical
constructions are in principle no different from a lexical item like
trees in terms of the descriptive apparatus required to capture all
the relevant detail, with each of the component structures of a
construction represented by a symbolic unit. Grammatical morphemes appearing in a construction, such as of, are treated as
symbolic units in their own right, with semantic structure (of, for
example, specifies a part-whole relation). The integration of any
two symbolic units goes hand in hand with distinguishing the
dependent and autonomous parts of the composite structure.
As far as semantic structure is concerned, [tree] is autonomous,
while [pl] is dependent, requiring an elaboration by a noun to
complete the structure. In terms of phonological structure, [tri] is
pronounceable as a whole syllable and can be considered autonomous, while the single consonant [z] is dependent.
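The compositional apparatus just described can be mimicked in a few lines of code. The following Python sketch is purely illustrative (the class and function names are invented here and belong neither to CG nor to Langacker's notation): it pairs a semantic pole with a phonological pole and integrates two symbolic units into a composite structure.

```python
# Illustrative sketch only: symbolic units as semantic/phonological
# pairings, composed into larger symbolic structures.
from dataclasses import dataclass

@dataclass
class SymbolicUnit:
    semantic: str       # semantic pole, e.g. "tree"
    phonological: str   # phonological pole, e.g. "tri"

    def notation(self) -> str:
        # Render the unit in the bracket notation used in the text.
        return f"[[{self.semantic}]/[{self.phonological}]]"

def compose(autonomous: SymbolicUnit, dependent: SymbolicUnit) -> str:
    """Integrate two symbolic units into a composite structure,
    e.g. tree + plural -> [[[tree]/[tri]]-[[pl]/[z]]]."""
    return f"[{autonomous.notation()}-{dependent.notation()}]"

tree = SymbolicUnit("tree", "tri")
plural = SymbolicUnit("pl", "z")
print(compose(tree, plural))  # [[[tree]/[tri]]-[[pl]/[z]]]
```

The same pairing-and-composition pattern extends, in principle, from morphemes to whole grammatical constructions, which is precisely the uniformity CG claims for its descriptive apparatus.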
A striking feature of CG is the detail provided for in the integration of structures into larger composite structures. The analysis of the English passive construction in Langacker ([1991] 2001,
101–47) illustrates the theoretical notions relevant to a detailed
grammatical description and is recommended as a prime example of a full-blown CG account of a construction type. Briefly, and
consistent with the foregoing remarks, each morpheme in the
passive (including by and the auxiliary verbs) has its own symbolic representation, giving rise to the overall semantic structure,
just as the active counterpart has its own compositional structure and resulting semantic structure. Passive clauses do not
derive from active clauses in this view, nor do they derive from
some abstract structure underlying actives and passives. Rather,
passive clauses exist in their own right as instantiations of a construction type with its own distinctive way of integrating symbolic units, reflecting a particular construal of the event.
While phonological structure can be fruitfully explored
within CG (see Langacker 1987, 328–48, 388–400; Taylor 2002,
78–95), it is semantic structure that has received most attention
and for which most theoretical apparatus has been developed.
Fundamental to semantic structure is the idea of a network that
is employed to represent polysemy relationships and to provide
motivation for conventional and novel extensions. Each node of
the semantic network, together with the associated phonological
structure, represents a semantic variant of the lexical item.
Two types of relationships figure prominently in these networks: schematicity, whereby one node of the network expresses
a meaning fully contained in another node (see schema), and
extension, understood as a relationship between semantic nodes
of a lexical item involving a conflict in semantic specifications.
The word head, for example, can be assigned a sense [part of a
whole which controls the behavior of the whole] that is
schematic relative to finer-grained elaborations, such as [part
of the human body where thinking is located] and [person
who manages an administrative unit]. In some cases, a highest-level node or superschema can be proposed, encompassing
all lower-level senses in the network, though such superschemas

are not feasible for every network. The extensive polysemy of head, for example, makes one single superschema covering such
diverse senses as head of a lettuce, head of a bed, head of
a university department, and so on unlikely. Semantic extension holds between the more basic sense of human head and
the sense of head of an administrative unit. The node that is
the source of the extension constitutes a local prototype (with
respect to the extended sense); where one node is experienced
as representative of the whole category, as is likely in the case of
the human head sense of head, we speak of a global prototype.
There is clearly variation among speakers in their judgments
about nodes and relationships within the network, including
their ability to identify relatedness of senses and to extract schematic meanings. This variation poses challenges for description
but does not negate the need to acknowledge the reality of such
networks.
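The network structure just outlined can be made concrete with a small data-structure sketch. The Python fragment below is an expository device, not CG notation: the node labels paraphrase the senses of head discussed above, and each edge is labeled either schematicity or extension.

```python
# Expository sketch: a tiny semantic network for "head".
# Edges: ("schematicity", schema, elaboration) or
#        ("extension", prototype_sense, extended_sense).
network = {
    "nodes": [
        "part of a whole which controls the whole",
        "part of the human body where thinking is located",
        "person who manages an administrative unit",
    ],
    "edges": [
        ("schematicity", "part of a whole which controls the whole",
         "part of the human body where thinking is located"),
        ("schematicity", "part of a whole which controls the whole",
         "person who manages an administrative unit"),
        ("extension", "part of the human body where thinking is located",
         "person who manages an administrative unit"),
    ],
}

def elaborations(net, schema):
    """Senses that elaborate a schematic node
    (each fully contains the schema's meaning)."""
    return [dst for kind, src, dst in net["edges"]
            if kind == "schematicity" and src == schema]

print(len(elaborations(network, "part of a whole which controls the whole")))  # 2
```

A node that is the source of many extension edges would, on this picture, correspond to a local or global prototype, depending on how much of the category it is taken to represent.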
CG adopts a nonreductionist or maximalist stance in its analysis of linguistic structure, contrasting with prevailing reductionist, minimalist approaches in contemporary linguistics. The
nonreductionist approach of CG explicitly provides for the listing of highly specific patterns alongside the statement of more
general patterns, rather than recognizing only the most general
rules and schemas. The existence of a general rule of plural formation in English, suffixing /s/~/z/~/ɪz/ to a noun, for example,
does not mean that certain instantiations of the rule, such as cats,
dogs, horses, and so on, have no place in a grammatical description. On the contrary, where such instantiations have gained unit
status and are activated directly by the speakers, it is appropriate to recognize them alongside other symbolic units, grammar
and lexicon forming a continuum of types of symbolic elements.
Even when particular instantiations conform to a general rule,
they may acquire unit status in their own right, for example,
through high frequency of use. Acknowledging low-level, highly
specific instantiations runs counter to deeply entrenched practices in contemporary linguistics, which has been preoccupied
with higher-level generalizations and the principle of economy
in description.
Langacker has repeatedly emphasized the desirability of both
general and particular statements in linguistic description, referring to the assumption that a phenomenon is to be accounted
for in a mutually exclusive way as either a rule or a list as the
rule/list fallacy (Langacker 1987, 402). Grammar, in CG terms,
amounts to a structured inventory of conventional linguistic
units (Langacker 1987, 73). The units, so conceived, may be
semantic or phonological; they range from the symbolic units
consisting of a single morpheme to larger composite symbolic
units at the clause level, and they include highly specific, as well
as highly schematic, units. This conception of grammar makes
CG comparable to construction grammars, which are also
inventory-based (cf. Evans and Green 2006, 475–83), particularly
radical construction grammar (Croft 2001).
By including quite specific syntagmatic patterns within a
grammatical description, CG is able to comfortably accommodate phenomena that have been largely neglected in linguistic theorizing, for example, the collocational patterning of
great idea, absolutely fabulous, and so on involving combinations of particular words. The greater emphasis on specific patterning makes CG highly compatible with the methodology of corpus linguistics and other approaches that focus on language in use, whereby actual usage, including frequency of occurrence and patterns of co-occurrence, can be observed and used
as a basis for extracting patterns of varying generality (see also
the entries treating connectionism). Fully general, exceptionless rules are seen as atypical, and while it is valid to seek out
such rules, it would be misguided in this approach to attend only
to the most general patterns.
Finally, a word on notation employed in CG. There is an array
of notational devices used by Langacker, who employs a distinctive and highly original geometric style of representation (in his
earlier publications, he used the term "Space Grammar" to refer
to his approach). To some extent, the notation is intuitive: A
circle is used to denote a [thing] entity; thicker, darker lines
represent the profile, that is, the designated thing or relation in
the semantic structure of a morpheme. A full appreciation of the
notation, however, requires careful study. Of course, not all the
detail needs to be represented all the time, and CG ideas can be
effectively incorporated into linguistic analyses simply in prose
or with a minimum of notation (as in Taylor 2002).
John Newman
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Croft, William. 2001. Radical Construction Grammar: Syntactic Theory in
Typological Perspective. Oxford: Oxford University Press.
Evans, Vyvyan, and Melanie Green. 2006. Cognitive Linguistics: An
Introduction. Edinburgh: Edinburgh University Press.
Langacker, Ronald W. 1987. Foundations of Cognitive Grammar. Vol. 1.
Theoretical Prerequisites. Stanford, CA: Stanford University Press.
Langacker, Ronald W. 1988. An overview of cognitive grammar. In Topics in Cognitive
Linguistics, ed. Brygida Rudzka-Ostyn, 3–48. Amsterdam and
Philadelphia: John Benjamins.
Langacker, Ronald W. 1991. Foundations of Cognitive Grammar. Vol. 2. Descriptive
Application. Stanford, CA: Stanford University Press.
Langacker, Ronald W. [1991] 2001. Concept, Image, and Symbol: The Cognitive Basis of
Grammar. Berlin and New York: Mouton de Gruyter. The chapters in
this volume cover key areas of grammar (grammatical valence, case,
passive, etc.) and can be read more or less independently of one
another, offering a good balance between CG theory and application to data.
Newman, John. 2004. The quiet revolution: Ron Langacker's fall quarter 1977 lectures. In Imagery in Language: Festschrift in Honour
of Professor Ronald W. Langacker, ed. Barbara Lewandowska-Tomaszczyk and Alina Kwiatkowska, 43–60. This chapter gives a firsthand account of an early presentation of the material that eventually
became Langacker (1987).
Talmy, Leonard. 2000. Toward a Cognitive Semantics. Vol. 1. Concept
Structuring Systems. Cambridge, MA: MIT Press.
Taylor, John. 2002. Cognitive Grammar. Oxford: Oxford University Press.

COGNITIVE LINGUISTICS AND LANGUAGE LEARNING


A fundamental challenge for any theory of language is to provide
a convincing account of how the prelinguistic child becomes a
competent member of his or her linguistic community. To be
convincing, the developmental account should be consistent
with the general model of language posited for the adult system. Michael Tomasello (2003, 45) calls this the "how do we get there (to the adult language system) from here (the pre-linguistic infant)" problem. A cognitive linguistic approach to the adult language system holds that language is a reflection of human
cognition and that language can be accounted for by the interaction of the complex set of cognitive capabilities with which
humans are endowed. Language is understood to emerge from
contextualized use; that is, it is usage based. The language
system to be learned is an inventory of linguistic constructions
(units with a phonological pole, i.e., the construction's form,
and a semantic pole, i.e., the construction's meaning), which
range in size from lexical items and morphemes to syntactic
and even discourse (see discourse analysis [linguistic])
patterns. The challenge for language acquisitionists working
within a cognitive linguistic framework is to account for the rapid
learning of this vast array of linguistic patterns, drawing solely on
general cognitive processes. Over the past 25 years, developmental psychologists have developed a large body of observational
and experimental evidence that begins to do just that.
Elizabeth Bates famously said, "Language is a tool. We use it
to do things" (1976, 1). Cognitive linguists hold that the primary
thing for which we use language is communication. Humans
are highly social by nature. We also have the unique ability to
think about entities and events that are not in the immediate environment. Since we cannot communicate telepathically,
we have to somehow externalize our internal conceptualizations
to make them available to others. Language is the major tool we
have developed to accomplish this task. Cognitive linguists hold
that when children learn a language, what they are learning is
constructions, of varying sizes and degrees of abstractness, as
they engage in using language in context. This learning process
takes place over a rather extended period of time, with most of
the pieces in place by the time the child is six or seven.
According to the developmental psychologist Jean Mandler
(2004), children begin forming rudimentary concepts, many
of which form the basic semantic frames from which language is
constructed, in early infancy. Very early on, the infant begins a
cognitive process of reformatting raw perceptual information
into simple, schematized spatial representations that express
fundamental experiences, such as self-motion, caused motion,
and containment (see schema). Experimental evidence shows
that by three months, the infant can distinguish typical human
motion from mechanical motion. The infant learns that certain
entities start up under their own power while others do not. The
same entities that generate self-motion can also cause other
entities to move or change; the entities that do not generate
self-motion tend to be acted upon, and so on. The constellation
of these perceptually grounded generalizations forms the basis
for fundamental concepts, such as animacy, inanimacy, and
caused motion. Such categories, in turn, represent the semantic frames for basic syntactic patterns, such as intransitive and
transitive constructions, and participant roles, such as agent
and patient (see thematic roles). A wide range of infant
studies provides rather clear evidence that infants are actively
observing and exploring their world, forming generalizations
across events and entities, and in the process developing concepts and rudimentary syntactic-semantic frames that are the
foundation of language. These concepts are largely in place by
nine months.
Other researchers provide evidence that prelinguistic children generalize over units of spoken language and find linguistic patterns. For instance, several studies show that children as
young as eight months are sensitive to repeated patterns of syllables; this particular pattern finding forms the basis for recognizing words in the auditory stream (e.g., Saffran, Aslin,
and Newport 1996, 1926–8; see word recognition, auditory and speech perception in infants). In general, this
approach to language learning argues that language is extracted
from the patterns of usage events experienced by the child; the
system is derived from and grounded in contextualized utterances. For instance, the evidence suggests that the children's
early word forms are shaped by the salience of particular types of
words in the adults speech. English children first produce relational words such as more, up, off, which seem to have particular salience in adult speech directed to children, and later fuller
verb forms, such as take off. In contrast, Korean children's first
verb forms are full forms that reflect the salient forms in adult
speech (Choi and Gopnik 1996). In line with the commitment
to cognitive generalization, pattern finding is not limited to
language; it is essential to all types of category formation (see
categorization).
Tomasello argues that intention-reading skills are also necessary to account for language learning. At around 9 to 12 months,
the young child begins to engage in a number of activities in
which he or she actively attends to and participates in communicative interaction. Around this age, children engage in joint
attentional frames in which the child coordinates and shares
attention with another participant around a third entity, for
instance, when the infant and parent attend to the same object
or when the infant follows the eye gaze or gesture of an adult
in order to attend to a distal object or event. These are activities
that create common shared ground for communication that
involves intentional communication about something outside
the dyad. The communicative events that take place within joint
attentional frames have the quality of focusing on a goal-directed
activity in which both the child and the adult are participating.
Within the context of the joint attentional frame, the infant can
begin to understand the adults use of pieces of language in coordination with communicative intent. This is the grounding
for young children to recognize that those around them are
intentional agents, like themselves, and further, that language is
used intentionally to manipulate the attention, mental state, or
even actions of the other person.
At this age, children also begin using verbal cues in order to
perform intentional actions; for instance, for varying purposes,
the child begins to use linguistic symbols to direct the adult's
attention to something outside the immediately shared frame.
In order to do this successfully, the child must engage in what
Tomasello terms role-reversal imitation. It is not enough to simply repeat the adult's language; the child must learn to use a symbol toward the adult in the same intentional way the adult uses it
towards him or her. Tomasello argues that it is not a coincidence
that shortly after young children begin to engage in joint attention sharing, they also begin to produce their first truly linguistic
symbols. Although these early utterances are one-word phrases
or unanalyzed chunks, such as what's-that (holophrases; see
holophrastic stage), they have a range of functions, including imperative, declarative, and interrogative, that are typically
distinguished by distinct intonation contours. Intention-reading skills are general cognitive skills that are fundamental to
a number of nonlinguistic human activities, such as tool use and
play.
Young children tend to be conservative in their language
use, apparently learning language item by item, as the items
are used in context. For instance, if they have heard a verb
used transitively, they are unlikely to use it in an intransitive
construction until they hear it used intransitively. Eventually,
children begin to form generalizations or more abstract representations of the patterns. It is only when the child's syntactic
constructions become more abstract that creative language
begins to emerge, sometime between two and three years.
Tomasello argues that the creative use of language represents
the child putting together utterances out of already well-entrenched pieces of language.
Adele Goldberg (2006) specifically argues that syntactic patterns are best understood as meaningful constructions (see construction grammars) that reflect recurrent, humanly salient
scenes, such as an agent engaging in an activity that results in
force being applied to another entity (transitive) or the agent
causing someone to receive something (cause to receive). To
the set of pattern finding and intention-reading skills, Goldberg
adds key frequency-based constraints to account for the way in
which the child learns to limit his or her use of abstract constructions, using language creatively but in a way that is attuned to the
conventional restrictions of the ambient discourse community.
Two of the most important constraining elements identified
by Goldberg are skewed input and preemption. Corpus studies
show that in the speech directed at children, a single verb, or
small set of verbs, tends to be disproportionately used with particular syntactic constructions. For instance, in the cause-to-receive
construction (e.g., Ellie _____ Jerry the teddy bear), dozens of different verbs occur, but the verb give accounts for over
40 percent of the instances. Goldberg points out that the semantics of give closely match those of the cause-to-receive construction; thus, there is a reinforcing match between the semantics of
the syntactic pattern and the central verb that occurs with the
construction. She argues that such skewed input is a key aid in
helping children both learn the semantics of the syntactic construction and the conventionally appropriate matching between
verbs and constructions.
A second important constraint provided by the input is
preemption. This is the notion that if two forms seem equally
appropriate for a particular context, but one consistently occurs
in the input while the other does not, the child learns that the
form that occurs in the input preempts the other, seemingly
appropriate, form. As a simple example, at some point the child
forms the generalization that -ed is used to represent an action
or event that took place in the past and creates the form goed.
However, in the input, where the child expects to hear goed, he
or she consistently hears went. After a short period of overgeneralization (see overregularizations), the child will learn that
went preempts goed, and stop producing goed. Goldberg argues
that exactly the same process accounts for children learning
the match between specific verbs and argument structure, for
instance, that told occurs in the cause-to-receive construction,
as Mommy told Isabela the story, but say does not. N. Ellis (2002)
reviews a plethora of psycholinguistic studies showing that humans are highly sensitive to frequency of linguistic input, thus
providing support for Goldberg's fundamental claims regarding
the importance of skewed input and preemption.
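Goldberg's preemption constraint can be caricatured with a toy frequency model. In the Python sketch below, the input counts are invented and the winner-take-all decision rule is a deliberate oversimplification of the statistical learning actually involved:

```python
# Toy illustration of preemption with invented input counts:
# "went" consistently occurs where the regularized "goed" would be
# expected, so the attested form preempts the overgeneralization.
from collections import Counter

child_directed_input = Counter({"went": 50, "goed": 0})

def preferred_past_of_go(counts, candidates=("goed", "went")):
    """Prefer whichever candidate is attested in the input;
    fall back to the regularized form if neither is attested."""
    attested = [c for c in candidates if counts[c] > 0]
    return max(attested, key=lambda c: counts[c]) if attested else candidates[0]

print(preferred_past_of_go(child_directed_input))  # went
```

On this sketch, input containing no attestation of either form would leave the child's regularized goed in place, which loosely matches the observed period of overgeneralization before went wins out.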
Much of the research on child language learning focuses on
universal human capabilities and universal stages that all children go through. A theory that argues language is a reflection of
general human cognitive processes would predict many universals. As a usage-based theory, cognitive linguistics also focuses
attention on language-specific learning and predicts a wide
range of variation. Melissa Bowerman and Sonja Choi (2001)
studied the acquisition of spatial language by young English- and
Korean-speaking children, an area in which one might expect to
find general human perception reflected rather uniformly across
languages. Relatively speaking, however, English has a rather
general system for expressing the notion of containment with
the preposition in and that of support with the preposition on.
In contrast, Korean makes a number of finer distinctions in these
categories with separate verbs of containment that express tight
fit versus loose fit and verbs of support expressing more information about the supporting surface, such as horizontal surface and
juxtaposing surfaces. Bowerman and Choi found that despite the
seeming differences in complexity, both systems were learned at
about the same age. Korean children were quite sensitive to the
fine-grained spatial distinctions lexicalized in their language.
Such findings raise issues about how language might influence the speaker's mental representation of phenomena in
the world. Dan Slobin (1985) describes language-directed attention as "thinking for speaking." He argues that the language makes
salient, and focuses the speaker's attention on, different aspects
of a scene in order to encode it in language. However, the claim
is not that language somehow causes people to experience spatial
scenes or activities differently (strong Sapir-Whorf hypothesis),
but rather that speakers of different languages have the capacity
to categorize objectively similar experiences in different ways.
Andrea Tyler
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Bates, Elizabeth. 1976. Language and Context: The Acquisition of
Pragmatics. New York: Academic Press.
Bowerman, Melissa, and Sonja Choi. 2001. Shaping meanings for language: Universal and language-specific in the acquisition of spatial semantic categories. In Language Acquisition and Conceptual
Development, ed. Melissa Bowerman and S. Levinson, 158–91.
Cambridge: Cambridge University Press.
Choi, Sonja, and A. Gopnik. 1996. Early acquisition of verbs in English: A
cross-linguistic study. Journal of Child Language 22: 497–530.
Ellis, N. 2002. Frequency effects in language processing: A review with
implications for theories of implicit and explicit language acquisition.
Studies in Second Language Acquisition 24.2: 143–88.
Goldberg, Adele. 2006. Constructions at Work: The Nature of Generalization
in Language. Oxford: Oxford University Press.
Mandler, Jean. 2004. The Foundations of Mind: Origins of Conceptual
Thought. Oxford: Oxford University Press.
Saffran, J. R., R. N. Aslin, and E. L. Newport. 1996. Statistical learning by
8-month-old infants. Science 274: 1926–8.
Slobin, Dan. 1985. The Crosslinguistic Study of Language Acquisition.
Hillsdale, NJ: Lawrence Erlbaum Associates.
Tomasello, Michael. 2003. Constructing a Language: A Usage-Based
Theory of Language Acquisition. Cambridge, MA: Harvard University Press.


COGNITIVE LINGUISTICS, LANGUAGE SCIENCE, AND METATHEORY
Cognitive linguistics is probably best understood not as a theory
but as a theoretical orientation. A theoretical orientation is a
broader category that encompasses a number of particular theories that share presuppositions, attitudes, interests, methods,
and so on. generative grammar is a theoretical orientation
in this sense, as is connectionism. In the past, behaviorism
and structuralism were important theoretical orientations.
Different theories within an orientation need not be mutually compatible. Moreover, theories in different orientations
need not be mutually exclusive, and are certainly not mutually
exclusive on all points. However, orientations often operate as
identity categories, ways of defining affiliations, dividing in-groups from out-groups. This has a number of intellectual and
practical consequences, including our tendency to exaggerate
differences between in-groups and out-groups, formulating the
views of both in extreme ways. For example, Noam Chomsky's
view of the autonomy of syntax is complex. However, both
Chomskyans and anti-Chomskyans may absolutize the distinction between syntax and semantics, setting aside the nuances
of Chomsky's own formulations (see autonomy of syntax).
The dichotomizing of in-group/out-group differences is a
matter for social psychology. However, the tendency it represents is not unique to group relations. There are analogues for
this sharpening of differences at virtually every level of human
cognition, extending down to the neuronal level in perception.
I mention this because the continuity of cognitive functions
is arguably the fundamental principle of cognitive linguistics.
Moreover, that cognitive continuity is embodied (see embodiment), thus ultimately founded on bodily experience. Cognitive
linguists are, of course, concerned with the ways in which this
embodied continuity bears on language. Additionally, such a
continuity helps us to understand the development of theoretical
orientations. In this way, cognitive linguistic ideas are significant
for language science at two levels. First, they are important at the
level of guiding a set of research programs in language study.
Second, they have metatheoretical value in suggesting ways
we might think about the relations among different theoretical
orientations.
In this entry, I do not discuss specific cognitive linguistic
theories, which are treated in other entries: theories such as
cognitive grammar, frame semantics, conceptual metaphor, and conceptual blending (see also construction
grammars, exemplar theory, functional linguistics,
usage-based theory, and cognitive linguistics and
language acquisition, as well as the more specific entries
on basic level concepts, blended space, conduit metaphor, framing effects, generic- and specific-level
metaphors, image schema, mental space, metonymy,
parable, and projection [blending theory]; in addition to
overviews of key topics, these entries provide key bibliographical
items for further reading in cognitive linguistics). Rather, I consider three general characteristics of the cognitive linguistic orientation, characteristics drawn from Croft and Cruse (2004). In
discussing these ideas, I simultaneously consider what they reveal
cognitively about the pretheoretical attitudes that ground cognitive linguistics, and how these might differ from the pretheoretical attitudes that ground other approaches to language science, particularly generative grammar.
On the first page of their important introduction to cognitive
linguistics, William Croft and D. Alan Cruse (2004) present the
following three major hypotheses: First, "language is not an autonomous cognitive faculty"; second, "grammar is conceptualization"; and third, "knowledge of language emerges from language use." Despite their phrasing, it does not seem quite right to
refer to these as hypotheses. Like the fundamental ideas in any
theoretical orientation, these are more like guiding principles,
assumptions for research programs that tie together the different theorists with their different theories. Cognitive linguists do
not set out to falsify or even corroborate, say, the nonautonomy
of grammar. Rather, they put forth specific hypotheses regarding how grammatical patterns can be explained by reference to
general cognitive structures, processes, and contents. (I should
emphasize that this is in no way a criticism of cognitive linguists.
Everyone does this in all theoretical orientations. That is part of
having a theoretical orientation and, therefore, is not something
that merits blame, or praise.) I would like to consider each of
these orienting principles in turn.
The first principle is a reformulation of the fundamental idea
of cognitive linguistics that language (structure, production,
reception, acquisition, and so forth) is continuous with other
aspects of human cognition. However, Croft and Cruse put the
statement negatively. Along one axis, the opposite of continuity would be discontinuity, a separation between language and
the rest of cognition. This is the position associated with generative grammar. So, in framing their statement negatively, Croft
and Cruse are making clear just where the identity division falls
here: between those who see linguistic cognition as continuous with other forms of cognition and those who make language
a separate faculty. I say that this is an opposition along one
axis because, along another axis, this is not an opposition. For
example, part of the identity definition of generative grammar
involved demarcating its mentalistic view of language (subsequently shared with cognitive linguistics) from the nonmentalistic view of behaviorism.
The second orienting principle of cognitive linguistics is that
"grammar is conceptualization." This extends the continuity
assumption, but it begins to structure that continuity as well.
If cognition is continuous in its various operations, there are
several ways in which particular cognitive functions might be
organized to perform tasks in different domains. For example,
it seems possible that syntax and semantics would be separate,
even if neither is autonomous with respect to other cognitive
processes. Syntax might follow some sort of sequential ordering
process also found in bodily movement, while semantics might
follow some other set of principles found in inferential thought.
However, cognitive linguists have tended to see different components of language itself as continuous with one another.
But this, too, leaves open several options. Just how are we to
understand the continuity of language? Are there neutral principles that apply equally to syntax and semantics, for example?
Here, cognitive linguists have tended to opt for a stratified
view of cognitive operation in language. Semantics is primary;
syntax is secondary. It is worth considering the form of Croft and Cruse's statement, "grammar is conceptualization." Given that conceptual metaphor theory is a paradigmatic case of cognitive linguistics, it is difficult not to read this statement as manifesting the canonical form of a metaphor, with the structure "Target is source" (e.g., "Juliet is the sun"; see source and target). Thus conceptualization (more generally, semantics) provides the mental model for understanding syntax. This yields an orienting principle for language study. Faced with a grammatical phenomenon, this tells us, look for an explanation in the semantic
function of the grammatical phenomenon.
It is worth contrasting, and comparing, generative grammar
on this score. In both cognitive linguistics and in generative
grammar there is, in effect, a privileged level of language study.
While the privileged level in cognitive linguistics is semantics,
that in generative grammar is syntax. The privileging is somewhat different in the two cases. In generative grammar, syntax
does not provide an explanation for semantics. However, in each
case, language is seen as first and most importantly defined by
one level (semantics or syntax). This has consequences for the
ways in which all theories in the orientation are formulated.
For example, literary theories influenced by generative grammar took syntax as a model (see generative poetics), while
literary theories influenced by cognitive linguistics often take
semantics as a model (see cognitive poetics). Perhaps more
importantly, this privileging of a single level of language has consequences for empirical study and the evaluation of theories and
evidence. Every theory of any value has things that it can explain
and things that it cannot explain. Part of a theoretical orientation
involves distinguishing central cases, the things that really need
explaining, from peripheral cases, the things that it is less crucial
to explain. The privileging of one level of language contributes
to that division. Cognitive linguists are likely to find it scandalous that generative grammarians do not have any cogent way
of explaining the complexities of metaphor. Generative grammarians are likely to find objections on these grounds to be trivial. However, they are likely to find it scandalous that cognitive
linguists explain certain complex syntactic patterns in a somewhat loose, nonalgorithmic way. Cognitive linguists are likely to
respond to the generative arguments with the same indifference
that generativists show toward metaphor, seeing those intricacies of grammar as contrived artifacts of the generative method.
In each case, some concerns deemed central by one orientation
are deemed marginal by the other.
The privileging of one level of language is related to something
else: one's determination of what a paradigmatic case of language is. As usual, the point is bound up with general cognition,
in fact general semantic cognition. When one thinks of a category
(see concepts and categorization), certain sorts of things
come to mind: a prototype or standard case, perhaps salient
instances; depending on the category, some mode of actional
engagement may arise as well (e.g., a bodily orientation). Take a
category such as minority group. Each of us has an idea of what
a prototypical minority is, and each of us has some exemplars
or instances. While there are certainly similarities across our
prototypes and exemplars, there are differences as well. These
differences will affect how we respond to such ideas as minority
rights. The same point holds for a category such as sports, though
in this case, particular sports may partially arouse action tendencies as well (e.g., think of batting in baseball; when doing so, at least many people imaginatively orient toward a batting stance).
This is the sort of anticipatory actional engagement that is well
known in emotion theory (see, for example, Frijda 1986, 69 ff.). It
is bound up with research linking meaning with action (see, for
example, Pulvermüller 2002, 56–62), research often stressed by
cognitive linguists treating embodiment.
Again, all these points apply directly to language study, for
language, too, is a semantic category. Our prototypes and exemplars of language certainly have things in common, enough in
common that we identify roughly the same sorts of things as
language. But they differ also. Indeed, it seems that they differ
systematically. Specifically, our pretheoretical ideas about language seem to cluster in ways that are roughly coordinated with
theoretical orientations. For some people, language is first of all
a matter of words; for others, sentences; for others, larger discourses. For some people, actional engagement with language
prototypically involves looking up words in a dictionary or learning vocabulary; for others, it may involve language instruction,
language therapy, or computer programming; for others, it may
be a matter of interior monologue or personal writing; for others,
animated conversation. If language brings to mind complexes
of related and opposed words, one will be inclined to view language differently than if it brings to mind sentences, or if it brings
to mind dialogues (see, for example, dialogism and heteroglossia) or chains of reasoning or poetry.
Prototypes, exemplars, and actional orientations are, of
course, somewhat different among ordinary folk, on the one
hand, and professional language scientists, on the other. But
the general pattern is the same. Indeed, one might argue that
the situation with language scientists is more extreme. On the
one hand, they have greater exposure to a wider range of understandings of language. This should, to some extent, loosen the
constraints imposed by their pretheoretical attitudes. On the
other hand, their specialized engagement with particular aspects
of language (not to mention their emotional and career investment in the success of certain theoretical orientations) tends
to entrench those attitudes more firmly. Thus, structuralism
and related developments, such as deconstruction, tend to
begin with a pretheoretical view of language as primarily a matter of words, specified professionally in terms of phonemes and
morphemes. In contrast, generative grammar began with a pretheoretical view of language as primarily a matter of sentences.
This was entrenched through research focusing on syntax.
Cognitive linguists actually seem to have somewhat different
prototypes for language in terms of language units, contributing
to their division into slightly different theoretical groups. (One
could undoubtedly make related points about structuralism and
generative grammar.) They do focus on meaning. But meaning
occurs in different bundles: words, sentences, larger discourses.
Some cognitive linguists seem to focus on words, others on sentences, and others on larger discourses. However, in each case,
their prototype involves some person: a person thinking the
words, writing the sentences, or arguing with someone else in
the larger discourse. Generative grammarians commonly wish
to abstract away from persons, seeing persons as introducing
performance errors into language. In contrast, cognitive linguists
stress the necessary involvement of persons in language. In some
versions of cognitive linguistics, this becomes embodiment. In others, it is related to usage-based theory, which leads us to the third of Croft and Cruse's criteria, that "knowledge of language emerges from language use."
The first thing to note about this orienting principle is that,
as phrased, it refers, most basically, to language acquisition.
As such, it is opposed (once again) to generative grammar.
Generative grammarians often see language as growing out of
an innate language faculty (see innateness and innatism).
Alternatively, they see a language acquisition device as
taking in very fragmentary, inadequate data and using that data
to set parameters for principles that are already given genetically (see principles and parameters theory). One might
say that, in this view, knowledge of language is largely given
innately and, thus, language use arises from prior knowledge.
Cognitive linguists reverse this explanatory sequence as one
would expect, given the apparent difference in pretheoretical
prototypes.
Croft and Cruse's statement also alludes to Chomsky's division between competence and performance. Competence is
the inner grammar that one has developed (or grown) in learning a language. Performance is any act of speaking, writing, or
signing. In generative grammar, competence explains performance, or partially does, since performance is also affected
by many nonlinguistic factors. (That is why generative grammarians often wish to abstract away from persons in treating
language.) This leads us away from narrow concerns with acquisition. Specifically, Croft and Cruses statement suggests that, in
Chomsky's terms, performance does not arise from competence
(plus other things), but the reverse. Competence or knowledge
of language, they suggest, is a sort of artifact of the actional practice of language use: once more, just what one would expect,
given different pretheoretical prototypes and associated theoretical orientations. This has consequences for acquisition as
well, for it suggests that the child is not a passive recipient of
language but is actively engaged in language practice while
learning.
Finally, it is worth remarking on the use of metaphors here.
Chomsky sees speech as "performance." I do not at all believe
that we are determined by the metaphorical associations of our
speech. However, certain word choices may reflect certain prior
attitudes, and they may serve to prime other, related ideas (see
priming, semantic), making it easier for us to choose those
primed options (other things being equal). In relation to drama,
film, or music, "performance" may suggest a secondary activity,
a more or less effective, successful, or accurate instantiation of
some prior, correct language: the play text, the screenplay, or
the musical score. For example, an actor may flub his lines and
a director may cut scenes. If I want to study Shakespeare's King
Lear, I do not confine myself to a particular film or stage production, a performance. Rather, I look at the text. The analogue in
language leads me to study competence, what lies behind performance, rather than performance itself. In contrast, "use" suggests
that language is a tool. The prior form is not the real thing (like
the play text). Rather, the crucial matter is the current action,
what is done with the tools. In relation to language, this suggests
the importance of focusing on speech practice.
The use of metaphor is related to a series of other general semantic/cognitive processes that bear on our theoretical orientations. For example, it is related to our organization of
semantic space, thus what we consider most similar to human
language and what we consider most different from it. For example, we will understand and investigate language in certain ways
if we view it as most similar to mathematics; we will understand
and investigate it in other ways if we view it as closest to gesture
or cooperative labor. The same point holds if we see human language as most different from animal communication, from
human silence, or from machine code. As with other phenomena we have been considering, none of this is something that we
examine empirically and seek to falsify or corroborate. Rather,
it is pretheoretical background that inclines us toward one or
another theoretical orientation.
Individual readers no doubt have different views as to which
pretheoretical ideas are right and which are wrong. I certainly
do. One thing I hope to have suggested, however, is that such
preferences are primarily a matter of one's own pretheoretical
attitudes. Does it seem obvious that language is really embodied and that abstracting from that embodiment is wrong?
Certainly, speech is embodied and it would be misleading to
ignore that. But it also seems clear that there are many speech
glitches and that at least some degree of abstraction is necessary. What about metaphors or the structure of semantic space?
Language is in some ways very much like mathematics, but in
other ways like shared work. It is in some ways very different
from animal communication or machine code, but in other
ways very similar to them. This is, of course, not to say that there
are no facts about language, about whether language is best
understood in one way or another. There are such facts in particular cases, and there are theories that are better at explaining
those facts. However, it is to say that the broad principles that
define our theoretical orientations are almost always partially
correct and partially incorrect. It is probably productive to pursue one set of orienting principles rigorously in a research program. However, it is probably not productive to dismiss other
theoretical orientations that are pursuing research programs
based on different principles (themselves derived from different pretheoretical attitudes).
Cognitive linguistics is a vibrant research program, or set of
research programs, in language science, as attested by a range
of entries in this volume. But it also has something to teach us
at a metatheoretical level. Specifically, it suggests that our theorization should be pursued vigorously within theoretical orientations, but that our subsequent evaluation of theories should
be less insular and, perhaps more importantly, less combative.
One of the early discoveries of conceptual metaphor theory was
that we commonly model intellectual dispute on war (see Lakoff
and Johnson 1980, 4–5). This is not simply a matter of metaphors.
It is the result of the ways in which we form in-groups and out-groups based, in this case, on theoretical orientations (as well as
personal affinities, attitudes toward the public political stances
of leading theorists in each orientation, and other factors). In any
event, it is unfortunate. Of course, cognitive linguists are people
with brains like anyone else. All of us tend to slide into the model
of warfare when engaging in intellectual dispute. But, as a theoretical orientation, cognitive linguistics helps us to see what we
are doing when we enter into combat mode and how intellectually deleterious it is. That metatheoretical point is potentially one of the most important contributions of cognitive linguistics
to language science.
Patrick Colm Hogan
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Croft, William, and D. Alan Cruse. 2004. Cognitive Linguistics.
Cambridge: Cambridge University Press.
Feyerabend, Paul. 1975. Against Method: Outline of an Anarchistic Theory
of Knowledge. London: Verso.
Frijda, Nico. 1986. The Emotions. Cambridge: Cambridge University
Press.
Lakoff, George, and Mark Johnson. 1980. Metaphors We Live By.
Chicago: University of Chicago Press.
Pulvermüller, Friedemann. 2002. The Neuroscience of Language: On Brain
Circuits of Words and Serial Order. Cambridge: Cambridge University
Press.

COGNITIVE POETICS
Cognitive poetics is the study of literary reading that draws on the
principles of cognitive science. In its early phase, the discipline
drew mainly on cognitive linguistics in focusing on the
textual cues for literary reading; in its more recent phase, it has
drawn more readily from cognitive psychology in order to explore
issues of readerly effects and aesthetics. Throughout its short
history, however, practitioners in the field have shown a willingness and propensity for genuinely multidisciplinary study. Work
in cognitive poetics is often characterized by an awareness that
the task of understanding literary reading holistically involves
a serious engagement with several disciplines: Linguistics and
psychology are central, but they are also often enriched by literary scholarship and critical theory, discourse analysis and
social theory, anthropology and historical study, neuroscience
and medical research, aesthetics, ethics, and philosophy. Most
people working in cognitive poetics believe that the systematic
analysis of literary reading is also essential within both linguistics
and psychology.
The field of cognitive poetics coalesced as an identifiable
movement in the mid-1990s, though it is possible to classify retrospectively several areas of work that can be seen as precursors.
The term itself was coined by Reuven Tsur in the 1970s (see Tsur
1992) to refer specifically to his exploration of literary aesthetics
through neuroscience and cognitive psychology. Since then, the
term has been taken up and broadened in scope to include a wide
range of research questions, frameworks for analysis, and areas
for exploration. Various alternative names for the enterprise have
been used during its brief history, each indicating the slightly different emphases of its users. For example, "cognitive rhetoric" has been used in North America to point to the connections with classical rhetoric in uniting form and effect in language study; "cognitive stylistics" indicates a focus on detailed and rigorous textual analysis in the European tradition of stylistics, or literary linguistics; other, more neutral terms, such as "cognition and literature," have also been preferred on occasion.
It is apparent, too, that there are cultural contexts underlying
this nomenclature, much of which has to do with the branding
of the new discipline in institutional settings and the intellectual
marketplace. In the United States, the generative paradigm in linguistics, on the one hand, and poststructuralist critical theory
in literary study, on the other, have meant that cognitive poetics
has been fighting for recognition on two fronts. Though the distinction was important earlier, a West Coast psychological tradition of metaphor study and an East Coast tradition of linguistic
textual analysis have now largely been merged. Similarly elsewhere, a continental European empiricist focus and a British and
East Asian stylistic emphasis have joined together more recently.
If there is any division remaining, it is a tendency for American
scholars to emphasize macrocognitive concerns and for Euro-Asian researchers to emphasize the more micrological effects of
stylistic texture, though even this division is fast disappearing as
cognitive poetics develops globally. (See Lakoff and Turner 1989;
Turner 1991; Stockwell 2002; Gavins and Steen 2002; Semino and
Culpeper 2002).
For the majority of writers in cognitive poetics, the basic
common principles of their work involve a rejection of the
distinctions between text and context, form and meaning, abstraction and specification, and literal and metaphorical expression. Fundamental to this position is, most often, the
cognitive linguistic claim that language use is embodied and
that mind and body cannot be separated (see embodiment).
In other words, the linguistic expressions used in all languages
are elaborations of basic physical circumstances of the human
condition. To give an example from early cognitive linguistics,
there are fundamental conceptual metaphors (such as life
is a journey, ideas are objects, theories are buildings, and
so on) that tend to figure abstraction and complexity in familiar
and concrete terms. Such idealized figurative habits condition
our thinking and are manifest in linguistic expressions. These
conceptual metaphors are maintained and exploited in literary
texts just as in any form of language, and a thorough cognitive
poetic analysis is interested in both the conceptual significance
of the underlying scheme and the textual pattern through which
it is expressed (see Johnson 1987).
The principle of linguistic embodiment resolves the key issue
for a theory of interpretation, which is how a single explanation can account for the fact that individual readings are possible
but communal readings are in practice very common. Our basic
human condition creates figurative linguistic commonalities,
and our personal, social, and cultural tracks through these idealizations create individual, group, and cultural distinctiveness.
The literary scholar coming to cognitive poetics thus has a systematic and principled means of exploring individual expression
and sociohistorical patterns in culture.
Cognitive poetics sees reading as a natural evolutionary
process, rather than as an artifice that is distinct from other
human capacities. So, it is particularly interested in the ways
everyday (that is, nonprofessional, nonacademic) readers read,
and it draws connections between the natural creativity of imagination in everyday language (see creativity in language
use) and the particular ways in which linguistic creativity is
manipulated in the literary setting. Meaningfulness is regarded
as a readerly process, rather than a final classification, and so
cognitive poetics researchers have investigated how meanings
are constructed and resolved in the process of literary reading.
The personal experience and social circumstances of the reader
are at least as important as the textual organization of the literary work. Cognitive poetic analyses thus have benefits for the study
of human language processing and the study of mind, as well as
for literary and artistic scholarship.
Within cognitive poetics, several different dimensions of
investigation have emerged. Some of the foremost and earliest
work centers on the human conceptual and linguistic capacity for metaphor or conceptual integration (see conceptual
blending). When cognitive linguists explored the workings of
conceptual metaphors of the sort mentioned earlier, it became
apparent that some expressive, poetic, or innovative metaphors
used in literary settings were causing problems with the basic
theory. Some creative literary language was more concerned with
interesting deviance and unsettling defamiliarization than with
resolving unfamiliar concepts in familiar terms. Furthermore,
some artistic metaphors seemed to affect those very same familiar domains in ways that persisted in the continuing life of the
reader, and most perplexingly of all, some literary metaphors
seemed to take on a life of their own that went beyond the basic
explanatory mappings of their source domains.
The cognitive poetic theory of conceptual integration or
blending has developed as a good account of these and other
features. Briefly, the theory proposes that a set of inputs are generalized to produce a blended space, a mental representation
that the reader uses to develop the emergent logic, texture, and
consequences arising from an engagement with the metaphor.
For example, at a microstylistic level, in Theodore Roethke's phrase "I have known the inexorable sadness of pencils," three
main input spaces (unstoppable motion, human emotion, and
the tool of the writer) are blended to produce a richly integrated
sense of emotional significance that is difficult to express in
any other way. At a more macrological level, the allegorical and
analogical significations of, for example, Margaret Atwood's novel The Handmaid's Tale amount to more than the sum of a
political manifesto, on the one hand, or a dark fantastic narrative, on the other (see Fauconnier and Turner 2002).
Another cognitive poetic dimension with a long history is
comprised of the various frameworks for the consideration of
worlds in literary works. Traditional possible worlds logic
from pragmatics and the philosophy of language has been
augmented as a means of understanding the richness of fictional
projection. (Indeed, the notion of mental spaces in conceptual integration theory owes something to the notion of worlds in
this sense). Such work has been especially fruitful in dealing with
extended prose fiction in which the divergence from reality (the
actual possible world) is most striking or thematized: science
fiction, fantasy, magical realism, dream visions, allegories, and
fairy tales (see Ryan 1991; Ronen 1994; Semino 1997).
Also aiming to integrate text and context in this way is schema
poetics, which draws on the notion of psychological schemas
from both Kantian philosophy and artificial intelligence work.
Here, culturally shared knowledge frames provide a rich context
for linguistic input that always underdetermines the affective
outcome in the reader. In literary research, the notion has been
used to explain mismatches in reader and character knowledge
and as a means of exploring the notion of literariness itself
(see Cook 1994).
Most recently developing out of the worlds tradition is text
world theory, a cognitive poetic approach to discourse processing that has been fruitfully applied to many texts, including literary ones. Text world theory offers a means of understanding how only
certain parts of readerly schematic knowledge are activated in a
reading of the literary text. It seems particularly useful for tracking readerly comprehension and understanding involvement
and empathy (see Emmott 1997; Werth 1999; Gavins 2007).
Other work in cognitive poetics has dealt with detailed matters of stylistic texture, drawing on cognitive grammar and
the psycholinguistics of figure and ground (Langacker
1987, 1991; and see van Peer 1986), on image schemas, and
on the psychology of deictic (see deixis) maintenance and shift
(Duchan, Bruder, and Hewitt 1995). If there is a major criticism
of cognitive poetics in addressing the nature of the connection
between cognition and stylistic realization, this work seems
to offer a direct response. Certainly the linguistic realization
of conceptual patterns is the most urgent area for new work.
Currently, the interest of cognitive poetics in readerly empathy,
emotion, and aesthetics is being developed through the notion
of texture (see Stockwell 2009), and connections are being forged
with related fields in political ethics and critical discourse
analysis (see Lakoff 2002; O'Halloran 2003). As a young and
expanding field that has not yet established its paradigms, cognitive poetics looks set to continue this initial expansive phase for
the foreseeable future.
Peter Stockwell
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Cook, Guy. 1994. Discourse and Literature. Oxford: Oxford University
Press.
Duchan, J. F., G. A. Bruder, and L. E. Hewitt, eds. 1995. Deixis in
Narrative: A Cognitive Science Perspective. Hillsdale, NJ: Lawrence
Erlbaum.
Emmott, Catherine. 1997. Narrative Comprehension: A Discourse
Perspective. Oxford: Clarendon Press.
Fauconnier, Gilles, and Mark Turner. 2002. The Way We Think. New
York: Basic Books.
Gavins, Joanna. 2007. Text World Theory: An Introduction. Edinburgh
and New York: Edinburgh University Press and Columbia University
Press.
Gavins, Joanna, and Gerard Steen, eds. 2002. Cognitive Poetics in
Practice. London and New York: Routledge. This book is the companion volume to Stockwell (2002) and contains good examples of cognitive poetics.
Johnson, Mark. 1987. The Body in the Mind: The Bodily Basis of Meaning,
Imagination, and Reason. Chicago: University of Chicago Press.
Lakoff, George. 2002. Moral Politics. 2d ed. Chicago: University of Chicago
Press.
Lakoff, George, and Mark Turner. 1989. More Than Cool Reason: A Field
Guide to Poetic Metaphor. Chicago: University of Chicago Press.
Langacker, Ronald. 1987, 1991. Foundations of Cognitive Grammar. Vols.
1 and 2. Stanford, CA: Stanford University Press.
O'Halloran, Kieran. 2003. Critical Discourse Analysis and Language
Cognition. Edinburgh: Edinburgh University Press.
Ronen, Ruth. 1994. Possible Worlds in Literary Theory.
Cambridge: Cambridge University Press.
Ryan, Marie-Laure. 1991. Possible Worlds: Artificial Intelligence and
Narrative Theory. Bloomington and Indianapolis: Indiana University
Press.
Semino, Elena. 1997. Language and World Creation in Poems and Other
Texts. London: Longman.

Semino, Elena, and Jonathan Culpeper, eds. 2002. Cognitive Stylistics.
Amsterdam and Philadelphia: Benjamins. A collection of articles in
cognitive poetics.
Stockwell, Peter. 2002. Cognitive Poetics: An Introduction. London and
New York: Routledge. The standard and comprehensive textbook of
the field.
———. 2009. Texture: A Cognitive Aesthetics of Reading. Edinburgh and
New York: Edinburgh University Press and Columbia University Press.
Tsur, Reuven. 1992. Toward a Theory of Cognitive Poetics.
Amsterdam: North-Holland.
Turner, Mark. 1991. Reading Minds: The Study of English in the Age of
Cognitive Science. Princeton, NJ: Princeton University Press.
———. 1996. The Literary Mind: The Origins of Thought and Language.
Oxford and New York: Oxford University Press. An excellent polemic
and exemplification of cognitive poetics.
van Peer, Willie. 1986. Stylistics and Psychology: Investigations of
Foregrounding. New York: Croom Helm.
Werth, Paul. 1999. Text Worlds: Representing Conceptual Space in
Discourse. London: Longman.

COHERENCE, DISCOURSE
What is a discourse? What makes a discourse coherent or incoherent? Investigation into these difficult questions has yielded so
many sophisticated proposals that a short, comprehensive survey is well out of reach.
With regard to the first question, it is fair to say that there is
widespread disagreement. Some researchers think of discourses
as texts, which raises questions about how texts are to be identified and individuated. Is sameness of spelling necessary and/
or sufficient to textual identity? Could there be significant variations, such as differences of spelling, among the tokens of a single text type or discourse? Under what conditions are sentence
tokens grouped as parts of a single text? Some investigators
question the very choice of texts as the basic unit of analysis: A
discourse is not a text (type), they say, but a string of spoken or
written sentences in a language. Yet disagreement again resurfaces when we ask how these strings are to be picked out. Some
authors tend to think of a discourse as an utterance (construed
loosely along Gricean lines as anything that is a candidate for
non-natural meaning and produced with communicative
intention), whereas others think that speech-acts, or groups
thereof, are the relevant discursive units. Yet even this is not sufficiently holistic for some discourse analysts, who want to focus
on larger sociocultural patterns, such as the discourse of racism
(see discourse analysis [foucaultian] and discourse
analysis [linguistic]). Some researchers argue for the primacy of face-to-face conversational interactions and contend
that the analysis of discourse coherence should find its point of
departure in specific sequences of immediate communicative
interaction (Schegloff 2001). Another area of divergence concerns the nature and number of the participants or producers of
a single discourse. Is a conversation between two or more parties
one or several discourses?
Investigators who disagree over the very outlines of the
concept of discourse can often still agree that certain kinds of
examples ought to be counted as discourses, and so can meaningfully debate logically independent questions about discourse
coherence. One such question is whether the coherence or

incoherence of a discourse is a matter of degree and, if so, what
sort of vagueness or lack of specification explains this fact. Is
it just ignorance of the real coherence conditions that leads us
to judge that a given discourse is more or less coherent or that
another discourse is a borderline case? Is discourse coherence
by its very nature a genuinely scalar concept? To what extent are
judgments of coherence and incoherence relative to the contexts,
categories, and genres of discourse? One may be tempted to conclude that the coherence of a poem is something quite different
from that of an argumentative essay, but perhaps the subtending
semantic relations are similar or even identical.
The coherence of discourse is not just a matter of logic. The
absence of logical contradictions, or the presence of logical
coherence (see coherence, logical), is hardly sufficient to
establish discursive coherence more generally, as the following
example is designed to show:
(1) The sap dripped. Mike pondered the cogito. "Battology" means tiresome repetition.

It is logically possible that what the speaker of (1) has said is
entirely true, but as the speaker flits from topic to topic, the
utterance has a kind of incoherence, though not of the logical kind.
The example may be used to suggest that coreference or, more
broadly, sameness of subject or topic, is a necessary condition on
a discourse's coherence.
Compare (1) to (2):
(2) I am fed up with the telephone always ringing. Don't
hesitate to call me!
The speaker has conjoined two sentences that both have something to do with the topic of telephoning, yet this two-part discourse seems incoherent unless the utterance was meant to
convey the idea that in spite of the irritation over phone calls in
general, an exception is being made with regard to the addressee.
And indeed, an intended, implicit contrast or counterpoint can
contribute to discursive coherence. What seems to be missing in
(2) is an explicit, metadiscursive marker, such as "but," indicative
of the intended link. (An informative survey of work on discourse
markers, a growth market in linguistics, is offered in Schiffrin
2001.)
Investigators have described many other relations held to be
constitutive of discursive coherence or cohesion. For example,
consider the following coherent minidiscourse schemata, with
some of the proposed names for the illustrated coherence-constitutive link:
(3) I am fed up with the telephone always ringing. A call woke
me up last night. (elaboration, illustration, specification)
(4) I am fed up with the telephone always ringing. You are
tired of always hearing me complain. (parallelism)
(5) I am fed up with the telephone always ringing. People are
always bothering me and asking for something. (amplification, generalization)

(6) I am fed up with the telephone always ringing. I'm going
to disconnect my phone. (causation, explication)
(7) I was fed up with the telephone always ringing. Then I fell
ill. (temporal ordering, narration)
(8) I am fed up with the telephone always ringing. Yet I do
get some important calls every now and then. (concession,
contrast, qualification)
The appearance of divergent and at times highly elaborate lists of
coherence-constitutive relations has occasioned comparisons among
such lists, as well as theoretical reflections on their status (Hovy
and Maier n.d.; Redeker 2000; Kehler 2002; Sanders 1997). Is the
list of coherence-constitutive relations open-ended or finite? Are
there deeper-level relations to which various other relations may
be reduced? What would count as a successful reduction? Should
the list of relations include recordation, or the loosest memorial
associations? How much of a real constraint is there?
At least some disagreements about the lists of relations stem
from divergent assumptions about the nature of the relata. Are
the actual intentions of the utterer a component of coherence, or
only those thoughts or moves expressed by the spoken or written phrases? Some analysts (e.g., Grosz and Sidner 1986) bring
in the speaker's plans, while others want to leave them out and
contend that coherence is determined by rhetorical structures
that are like the crystallized form of diverse speech-acts (Asher
and Lascarides 2003). According to these authors, a discourse
is coherent to the extent that anaphoric expressions can be
resolved and the propositions introduced can be rhetorically
connected to other propositions in the discourse. The recognition of coherence requires the drawing of inferences about
rhetorical relations on the basis of semantic content; these rhetorical relations then serve as the basis for inferences about other
aspects of content. This approach is contrasted to one in which
conversational implicature and other implicit, coherence-relevant content are recognized by inferring the speaker's
intentions directly from conventional linguistic meaning and
contextual factors such as salient aspects of the conversational
situation. Another issue is whether the coherence of a conversational exchange depends finally on complex relations between
the participants' cooperative activities and intentions. Can a
participant's deliberate silence contribute to the coherence or
incoherence of the overall discourse? Could the logical incoherence of one participants contribution contribute to the coherence of the conversation of which it is a part?
Questions about the psychological status or reality of some
of the complex coherence relations figuring in the literature lead
to a more general issue pertaining to the explanatory merits of
discursivity as such. Although we may sometimes attend specifically to coherence or incoherence as such, especially in scholarly
and argumentative contexts, in many other cases the production
and processing of words and related deeds proceed smoothly in
the absence of any such focus. Specific relations such as parallelism, contrast, cause–effect, and so on play their part, but not
a disjunctive property called "coherence" that can only be theoretically cobbled together. Consequently, the analysis of discourse
coherence runs the risk of reifying underconstrained theoretic
constructions.
Paisley Livingston

WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Asher, Nicholas, and Alex Lascarides. 2003. Logics of Conversation.
Cambridge: Cambridge University Press.

Clark, Herbert. 1996. Language Use. Cambridge: Cambridge University
Press.
Cohen, Philip, Jerry Morgan, and Martha Pollack, eds. 1990. Intentions in
Communication. Cambridge, MA: MIT Press.
Gernsbacher, Morton Ann, and Talmy Givón, eds. 1995. Coherence in
Spontaneous Text. Amsterdam and Philadelphia: John Benjamins.
Grimes, J. 1975. The Thread of Discourse. The Hague: Mouton.
Grosz, Barbara J., and Candace L. Sidner. 1986. Attention, intentions, and
the structure of discourse. Computational Linguistics 12: 175–204.
Halliday, M. A. K., and Ruqaiya Hasan. 1976. Cohesion in English.
London: Longman.
Hobbs, Jerry R. 1985. On the Coherence and Structure of Discourse.
Stanford, CA: Center for the Study of Language and Information.
Hovy, Eduard H., and Elisabeth Maier. N.d. Parsimonious or profligate: How many and which discourse structure relations? Available
online at: http://www.isi.edu/natural-language/people/hovy/papers/
93discproc.pdf (accessed February 7, 2009).
Kehler, Andrew. 2002. Coherence, Reference, and the Theory of Grammar.
Stanford, CA: CSLI Publications.
Mann, William C., and Sandra A. Thompson. 1988. Rhetorical structure theory: Toward a functional theory of text organization. Text
8.3: 243–81.
Redeker, Gisela. 2000. Coherence and structure in text and discourse.
In Abduction, Belief and Context in Dialogue: Studies in Computational
Pragmatics, ed. H. Bunt and W. Black, 233–63. Philadelphia: John
Benjamins.
Sanders, Ted. 1997. Semantic and pragmatic sources of coherence: On
the categorization of coherence relations in context. Discourse
Processes 24: 119–47.
Sanders, Ted, and Wilbert Spooren. 1999. Communicative intentions and
coherence relations. In Coherence in Spoken and Written Discourse,
ed. W. Bublitz, U. Lenk, and E. Ventola, 235–50. Amsterdam: John
Benjamins.
Schegloff, Emanuel A. 2001. Discourse as an interactional achievement: III: The omnirelevance of action. In Handbook of Discourse
Analysis, ed. Deborah Schiffrin, D. Tannen, and H. Hamilton, 229–49.
Oxford: Blackwell.
Schiffrin, Deborah. 2001. Discourse markers: Language, meaning, and
context. In Handbook of Discourse Analysis, ed. Deborah Schiffrin, D.
Tannen, and H. Hamilton, 54–75. Oxford: Blackwell.

COHERENCE, LOGICAL
Logicians generally employ "coherence" and "consistency" as
synonyms naming the absence of contradictions in a group of
sentences, propositions, or beliefs, where a contradiction is
the conjunction of a proposition and its negation. In metaphysical terms, logical incoherence or contradiction is the impossible
instantiation of a property and some other, incompatible property, as in "the circle was square." Epistemically, a contradiction
is an irrational belief in both a proposition and its denial.
Logical consistency is not a necessary feature of what people
say, write, or think. Nor is the absence of contradictions a sufficient condition on discourse coherence (see coherence,
discourse), as a collection of logically consistent yet unrelated
sentences does not constitute a coherent discourse. In many
contexts, however, logical consistency is a regulative norm for
speakers and interpreters. According to classical logic, a set of
propositions is either coherent or contradictory and trivial (in
the sense that it entails all propositions or explodes). In classical
logic, the ex falso quodlibet argument was held to establish that,
given a single contradiction, every arbitrarily chosen proposition
follows validly. Yet it is now often denied that this is a good principle of reasoning, and some philosophers contend that there are
paraconsistent yet nonexplosive systems. That some proposition
and its negation are part of the same belief set does not imply
that all other propositions belong to that set, and different levels
of logical coherence can be delineated in semantic representations of inconsistent sets (Jennings and Schotch 1984).
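The explosion argument at issue here can be made explicit with a standard textbook derivation (a reconstruction, not part of the entry itself): given both P and not-P, any proposition Q whatsoever follows.

```latex
% Ex falso quodlibet: from a contradiction, any Q follows (classically).
\begin{align*}
&1.\quad P && \text{premise} \\
&2.\quad \neg P && \text{premise} \\
&3.\quad P \lor Q && \text{from 1, by disjunction introduction} \\
&4.\quad Q && \text{from 2 and 3, by disjunctive syllogism}
\end{align*}
```

Paraconsistent systems block this explosion, typically by refusing to treat disjunctive syllogism as unrestrictedly valid, which is why a contradictory belief set need not entail every proposition.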
Some philosophers have gone so far as to contend that there
are true contradictions, such as the conclusions yielded by liar
and sorites paradoxes. This dialetheist stance is contested
by many logicians, however, who have sought to establish that
all paradoxical arguments are invalid or unsound. For example,
in an updating of the medieval cassatio account of logical paradoxes, Laurence Goldstein (2000) argues that while liar sentences
are meaningful, they lack content in the sense of failing to specify
truth conditions and, therefore, are neither true nor false.
Logical coherence or consistency is not equivalent to logical validity, which is often defined as a basic constraint on the
relations between the premises and conclusions of an argument: Valid arguments are those in which truth is preserved, in
the sense that whenever all of the premises of the argument are
true, its conclusion is necessarily true. (In classical logic, validity requires the preservation of falsehood as well; necessarily,
if the conclusion is false, at least one of the premises is false.)
Attempts to provide a conceptual analysis of the notion of logical
consequence include syntactical, model-theoretic, and prooftheoretic approaches.
Paisley Livingston
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Etchemendy, John. 1990. The Concept of Logical Consequence.
Cambridge: Harvard University Press.
Goldstein, Laurence. 2000. A unified solution to some paradoxes.
Proceedings of the Aristotelian Society 100: 53–74.
Jennings, R. E., and P. K. Schotch. 1984. The preservation of coherence.
Studia Logica 53: 89–106.
Priest, Graham. 2004. In Contradiction: A Study of the Transconsistent.
The Hague: Martinus Nijhoff.
Priest, Graham, J. C. Beall, and Bradley Armour-Garb, eds. 2004. The Law
of Non-Contradiction: New Philosophical Essays. Oxford: Clarendon.
Sainsbury, R. M. 1995. Paradoxes. 2d ed. Cambridge: Cambridge
University Press.

COLONIALISM AND LANGUAGE


One of the complications of writing on the centrality of language
to imperialism and colonialism is that even if the account were
limited to the policies and practices of the European nation-states in the last 500 years, it would still leave out major historical
events and processes. For example, it would not cover the effects
of Roman imperial linguistic strategy, one of which, at a deep
historical level, was the appearance of a number of the modern
European vernacular languages that were, in turn, to become
vehicles of imperial and colonial rule. Nor would it include the
linguistic impact of earlier empires, for example, the Aztecs
in Mexico or the Incas in Peru. And it would not address non-European modern imperialism and colonialism, for example,
the consequences of the imposition of Japanese language and
culture on its Asian neighbors in the late nineteenth and early to
middle twentieth centuries.
Given the complexity of this larger history or set of histories,
it would be impossible to provide any sort of sensible rendition
of it or them in a short entry. It is proposed, therefore, to trace
the development of one major form of linguistic colonialism in
order to demonstrate the general ideology that lay behind the
practice, and to show how, even in this single example, it worked
differently in distinct locations and points in history. Though this
sacrifices historical specificity in one sense, it is intended that
the example chosen, the uses of English in the British imperial
and colonial project, will demonstrate the particularity and variability of the process. Antonio de Nebrija made an important
point when he asserted in his Gramática Castellana, published
in the fateful year 1492, that "siempre la lengua fue compañera
del imperio" ("language was always the companion of empire")
(de Nebrija [1492] 2006, 13), but it is necessary to pay attention to
the ways in which this relationship was constituted within different forms of colonialism.
An account of English (later British) colonialism in Ireland
might start by noting that when the English first invaded Ireland
in 1169, they took their language with them and imposed it on the
native population. But such a narrative would involve an anachronistic oversimplification both in terms of the national identity of
the invaders and the languages that they spoke. It is open to question, for example, whether the leaders of the invasion thought of
themselves as English at all (barely a century after the Norman
Conquest, they were more likely to have considered themselves
Norman or Anglo-Norman), and the languages of their mercenary
soldiery included Flemish, Welsh, Anglo-Norman, and, of course,
what passed for English. Indeed, the first colonial legislation on
language in Ireland, The Statute of Kilkenny (1366), was notable
for two reasons. First, it was directed against the colonists, rather
than the colonized, and had the aim of preventing the colonizers
from adopting the native Gaelic language and culture. The indigenous Irish were not included in the scope of the law since they
could speak their own language if they wanted; the point was to
stop the colonizers from going native (a process of cultural assimilation that had been occurring since the first invasion). Second,
despite proclaiming that English should be the language of the
colonists, the statutes were in fact written in Norman-French,
one of the languages of law in England at the time. The point here
is that although the general outline of the history of linguistic
colonialism in any given case can be traced relatively easily, the
debates and practices pertaining to specific historical conjunctures are often difficult and complex to understand.
In the sixteenth century, some 400 years after the first invasion of Ireland, the centralizing English state determined upon a
policy of linguistic colonialism as part of its attempt to bring the
whole island under crown rule. The legislation that marked the
implementation of the strategy, Henry VIII's Act for the English
Order, Habit and Language (1537), revealed the belief that underpinned it. The law ordered that all of the kings subjects conform
to English culture, especially language, on the basis that
there is again nothing which doth more contain and keep many
of [the king's] subjects of [Ireland], in a certain savage and wild
kind and manner of living, than the diversity that is betwixt them
in tongue, language, order and habit, which by the eye deceives
the multitude, and persuades unto them, that they should be as it
were of sundry sorts, or rather of sundry countries. (Statutes 1786,
28 H 8. c.xv.)

The corollary to this belief that cultural, and specifically linguistic,
difference created division and prevented political and religious
unity was the idea that a common language would forge common political allegiance and identity. The logical consequence,
therefore, was that linguistic difference had to be extirpated and
Ireland Anglicized. Edmund Spenser, poet and colonial servant,
noted in 1596 that "it hath ever been the use of the Conqueror,
to despise the language of the conquered and to force him by all
means to learn his" (Spenser [1596] 1633, 47). He argued for the
eradication of Gaelic on the supposition that "the speech being
Irish, the heart must needs be Irish: for out of the abundance of
the heart the tongue speaks" (ibid., 48).
Yet if it was the aim of linguistic colonialism in Ireland to
Anglicize the country in order to bring it completely under political control, then it was a goal that was not achieved until the late
nineteenth century (by which point Irish linguistic nationalism,
the binary opposite of the colonial policy, had already started to
inspire the revolutionary movement that overthrew British rule).
It has been calculated that in the 1830s, for example, half the
native Irish population spoke Irish, and half of that group spoke
only Irish. Some 80 years later, just prior to Irish independence,
less than 14 percent of the population spoke any Irish, and no
more than 2 percent were Irish monoglots. How was this linguistic shift brought about? In this specific case there were a number
of factors: the incorporation of the country by military force into
the imperial political and economic order and the consequent
introduction of the socially centralizing processes of industrialism and urbanization; the massive emigration that followed upon
the widespread poverty among the rural population; the imposition of an educational system that rejected the native language in
favor of English; the spread of the bureaucratic state into everyday life; the choice of English as the language of religion by the
Irish Catholic Church; and the death of large numbers of Irish
speakers in the Great Famine.
Although a number of these causes were particular to
Ireland, others were repeated in a pattern that occurred across the
British Empire, though with differences. Indeed, if there is a
key to understanding how and why linguistic colonialism of the
modern European type operated, it lies in this variable combination of economic, cultural, educational, and religious factors
and their effects upon the lived experience of colonial subjects.
The nature, practices, and functions of colonial language policy
changed throughout time, were altered to suit the differing purposes of the colonizers, and were adapted when the colonized
responded in various ways. The sole aim was, to coin an oxymoron, the ruthlessly pragmatic use of language to achieve, consolidate, and prolong colonial rule.
In this regard, it should also be remembered that the discourse
deployed around the languages of colonialism also formed part
of the colonial project. For example, Edwin Guest noted in 1838
that English is rapidly becoming "the great medium of civilisation,
the language and law to the Hindoo, of commerce to the African,
of religion to the scattered islands of the Pacific"; its range, he
observed, is "greater than ever was that of the Greek, the Latin, or
the Arabic; and the circle widens daily" (1838, 703). And in 1850,
T. Watts argued in the Proceedings of the Philological Society that
"it will be a splendid and novel experiment in modern society, if
a language becomes so predominant over all others as to reduce
them in comparison to the proportion of provincial dialects." He
had one language in mind, of course: "[A]t present the prospects
of the English language are the most splendid the world has ever
seen. It is spreading in each of the quarters of the globe by fashion, by emigration, and by conquest" (Watts 1850, 214).
The imperial vision that both Guest and Watts articulated in
the mid-nineteenth century was already one that held English
to be a global language transmitted by means of economic and
military conquest; by the emigration of English speakers as both
proponents and victims of British power; by cultural influence
(not least through education and the fashionability that economic success brings with it); and by the imposition of the civilizing influence of religion. But it is important to note that both
in the nineteenth century and today, the phrase "global language"
is significant in its reference to the use of English in contexts
across the world, but also highly misleading in its suggestion that
it is the same form of the language used throughout the world.
English, as the vehicle of imperialism and colonialism (primarily
British, more recently American), was and is used in enormously
wide-ranging situations, but it isn't a world language, either in
the sense of a single form reproduced globally or in the sense that
it is used by even a majority of human beings. Given the diversity of human experience, the complexity of our history, and the
nature of human language, it is highly implausible that a particular language (English, Chinese, Arabic, or any other) will
become a true global language. Indeed, as has been seen in the
history of colonialism and postcolonialism, what in fact happens
when it is imposed in different places across the world is that the
language itself changes and develops. This process of the emergence of variant forms, sometimes recognized as new languages
in their own right, is often described as the price that imperial
languages have to pay for their historical role.
If the functions of language in imperialism and colonialism are historically, spatially, and contextually variable, then
the responses made by those who were subjected to these languages also differ accordingly in the colonial and postcolonial
periods. To take the example of English again, it is possible to
point to the distinct roles of the language in India both before
and after national independence. Although English was clearly
used under colonialism to produce domination and to exercise
power, it is nonetheless the case, as B. J. Kachru has shown,
that it was used as a language of Indian nationalism in the independence struggle and now functions in complicated ways as a
vehicle of control, authority and administrative cohesion, not
least in the way in which it can operate as a neutral medium in
particular contexts. This is not, however, to say that English is not
still perceived by some as a language of oppression in India, as it
is in other postcolonial locations. In the debate about the proper
medium for African literature, for example, the Kenyan writer
Ngũgĩ wa Thiong'o identified English as a significant cause of
colonial alienation and thus argued for his native Gĩkũyũ as
the language best suited to express his African experience. On
the other hand, Chinua Achebe, another major writer, rejected
this position and opted instead to use English, but a new form
of English, linked to its national home but altered to conform to
African realities.
The range of views on this and related issues and the vehemence with which they are expressed testify to the ongoing
complexity and significance of the debates surrounding the legacy of linguistic colonialism, many of which are treated as questions of language policy (see language policy).
Tony Crowley
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Achebe, Chinua. 1975. Morning Yet on Creation Day. London:
Heinemann.
Calvet, Louis-Jean. 1998. Language Wars and Linguistic Politics. Oxford:
Oxford University Press.
Crowley, Tony. 2005. Wars of Words: The Politics of Language in Ireland
1537–2004. Oxford: Oxford University Press.
de Nebrija, Antonio. [1492] 2006. Gramática Castellana. Barcelona:
Lingua.
Guest, Edwin. 1838. A History of English Rhythms. London: Bell.
Kachru, B. J. 1986. The Alchemy of English: The Spread, Function and
Models of Non-Native Englishes. Oxford: Pergamon.
Ngũgĩ wa Thiong'o. 1986. Decolonising the Mind: The Politics of Language
in African Literature. London: James Currey.
Pennycook, Alastair. 1998. English and the Discourses of Colonialism.
London: Routledge.
Spenser, Edmund. [1596] 1633. View of the present state of Ireland. In
The Historie of Ireland Collected by Three Learned Authors, ed. Sir James
Ware, 1–119. Dublin.
Statutes at Large Passed in the Parliaments Held in Ireland, The. 1786–1801. 20 vols. Dublin.
Watts, T. 1850. On the probable future position of the English language.
Proceedings of the Philological Society 4: 207–14.

COLOR CLASSIFICATION
Color terms label categories of the hue, saturation, and brightness of light reflected from surfaces. Because colors vary from
one another continuously and independently on these three
dimensions, there is no apparent intrinsic structure to the color
space that would prevent speakers of different languages from
cutting up the continuum in different ways. In the absence of
empirical studies, psychologists and anthropologists expected
color classification to be an example of extreme cultural relativism and expected that the spectrum would be segmented into categories
by different languages in arbitrarily different ways (e.g., Brown
1965, 315–16).
B. Berlin and P. Kay (1969) refuted this relativist assumption.
They asked native speakers of 20 different languages to identify
the best examples (foci) and the boundaries of basic color terms
on a Munsell color chart (a grid of 320 color chips with 40 hues
and 8 levels of brightness, plus a 10-chip gray scale). Although
informants varied enormously in their placement of boundaries
of color categories, they agreed considerably more on the choices
of the foci of the categories. Berlin and Kay found that there were
only 11 basic color categories in their sample of languages, with
foci in black, white, red, yellow, green, blue, brown, gray, pink,
orange, and purple. More surprisingly, they found that the color
categories came in a limited number of combinations. If there


were two categories, their foci were in black and white; if three,
they focused in black, white, and red; if four, black, white, red,
and yellow or black, white, red, and green; if five, black, white,
red, yellow, and green; if six, blue was added; if seven, brown
was added; and if there were eight or more categories, gray, pink,
orange, and purple were added, in no particular order.
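The implicational sequence just described can be stated compactly. The following sketch (illustrative only: the function name possible_foci and the data layout are my own, and it encodes Berlin and Kay's original 1969 sequence rather than the revised WCS trajectories) enumerates the focus inventories the sequence permits for a language with a given number of basic color terms:

```python
# Sketch of Berlin and Kay's (1969) implicational sequence: given the
# number of basic color terms a language has, list the permitted sets
# of category foci.
from itertools import combinations

STAGES = {
    2: [{"black", "white"}],
    3: [{"black", "white", "red"}],
    4: [{"black", "white", "red", "yellow"},
        {"black", "white", "red", "green"}],
    5: [{"black", "white", "red", "yellow", "green"}],
    6: [{"black", "white", "red", "yellow", "green", "blue"}],
    7: [{"black", "white", "red", "yellow", "green", "blue", "brown"}],
}

# Categories added after stage 7, in no particular order.
LATE = ["gray", "pink", "orange", "purple"]

def possible_foci(n):
    """Permitted focus sets for a language with n basic color terms."""
    if n in STAGES:
        return STAGES[n]
    if 8 <= n <= 11:
        # Stage-7 inventory plus any n - 7 of the four late categories.
        base = STAGES[7][0]
        return [base | set(extra) for extra in combinations(LATE, n - 7)]
    raise ValueError("n must be between 2 and 11")
```

For example, possible_foci(4) returns the two permitted four-term inventories (black, white, and red, plus either yellow or green), while any inventory of eight or more terms must contain all seven stage-7 foci.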
In the roughly 40 years since Berlin and Kay's initial description of the universals and evolution of color classification, this
picture has been enriched and complicated by further research,
but not radically changed. The most thorough revision was
prompted by the results of the World Color Survey (WCS) (Kay,
Berlin, and Merrifield 1991; Kay et al. 1997; Kay and Maffi 1999;
Cook, Kay, and Regier 2005). The WCS investigated color naming in 110 languages, with roughly 25 native speakers of each
language interviewed about their color names for each of 330
color chips, and their choices of the best examples of each basic
color term of their language. The WCS represented an enormous
improvement in both the methods and the quantity of data over
Berlin and Kay (1969): The WCS interviewers questioned many
more speakers of each language, surveyed nearly six times as
many languages, and focused on languages spoken by indigenous groups in Africa, Papua New Guinea, and Central and
South America, as opposed to predominantly Indo-European
languages.
The revised sequence (Kay et al. 1997; Kay and Maffi 1999)
recognizes five evolutionary pathways (shown in Figure 1). The
five trajectories were interpreted as generated by four principles: partition (lexicons tend to partition items into exhaustive
and mutually exclusive categories); black and white (distinguish
black and white); warm and cool (distinguish warm primaries
from the cool primaries); and red (distinguish red from other
colors).
One open question is where the universals come from. Kay
and C. McDaniel (1978) argued that the six unique hue points of
white, red, yellow, green, blue, and black are given to us by the
neurophysiology of color vision. They based their interpretation
on R. L. DeValois, I. Abramov, and G. H. Jacobs's (1966) research
on the lateral geniculate nuclei (LGN) in the thalamus of
rhesus macaques, which had reported three families of neurons.

Figure 1. Five trajectories of color term evolution, running from a two-term system (white/red/yellow vs. green/blue/black) to the full six-term system distinguishing white, red, yellow, green, blue, and black.

Two families were opponent processes: a red-green channel (excited by red light and inhibited by green light, or vice versa)
and a yellow-blue channel (excited by yellow light and inhibited
by blue light, or vice versa). The third family was a white-black
channel that responds to brightness levels independently of the
other two channels. This physiological account helped explain
why there might be universals in the classification of what
seemed a structureless domain: The universal structure is one
imposed by the neurophysiology of color vision.
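The arithmetic of the three channel families can be caricatured in a few lines (a schematic with invented weights, not DeValois et al.'s measured responses; as the next paragraph notes, real cells do not line up this neatly):

```python
def opponent_channels(long_cone, mid_cone, short_cone):
    """Schematic opponent coding of three receptor responses.

    The weights are illustrative placeholders only, chosen to show the
    sign structure of the three channel families described in the text.
    """
    red_green = long_cone - mid_cone                         # + reddish, - greenish
    yellow_blue = (long_cone + mid_cone) / 2.0 - short_cone  # + yellowish, - bluish
    white_black = long_cone + mid_cone + short_cone          # brightness only
    return red_green, yellow_blue, white_black

# A long-wavelength-dominated input excites the red pole of one channel
# and the yellow pole of the other:
rg, yb, wb = opponent_channels(1.0, 0.2, 0.1)
assert rg > 0 and yb > 0
```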
Unfortunately, subsequent research has revealed that the
neurophysiological opponent process system would actually put
the unique hue points in the wrong places. The true axes of the
system are closer to cherry-teal and chartreuse-violet than they
are to red-green and yellow-blue (Jameson and D'Andrade 1997).
Sadly, this leaves the universals in color classification without a
clear neurophysiological explanation.
James Boster
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Berlin, B., and P. Kay. 1969. Basic Color Terms: Their Universality and
Evolution. Berkeley: University of California Press.
Brown, R. 1965. Social Psychology. New York: Free Press.
Cook, R. S., P. Kay, and T. Regier. 2005. The world color survey database: History and use. In Handbook of Categorization in Cognitive Science, ed. H. Cohen and C. Lefebvre, 223–42. New York: Elsevier.
DeValois, R. L., I. Abramov, and G. H. Jacobs. 1966. Analysis of the response patterns of LGN cells. Journal of the Optical Society of America 56: 966–77.
Jameson, K., and R. G. D'Andrade. 1997. It's not really red, green, yellow, blue: An inquiry into perceptual color space. In Color Categories in Thought and Language, ed. C. L. Hardin and L. Maffi, 295–319. Cambridge: Cambridge University Press.
Kay, P., B. Berlin, L. Maffi, and W. Merrifield. 1997. Color naming across languages. In Color Categories in Thought and Language, ed. C. L. Hardin and L. Maffi, 21–56. Cambridge: Cambridge University Press.
Kay, P., B. Berlin, and W. Merrifield. 1991. Biocultural implications of systems of color naming. Journal of Linguistic Anthropology 1: 12–25.
Kay, P., and L. Maffi. 1999. Color appearance and the emergence and evolution of basic color lexicons. American Anthropologist 101: 743–60.
Kay, P., and C. McDaniel. 1978. The linguistic significance of the meanings of basic color terms. Language 54: 610–46.


COMMUNICATION
Explicit models of what communication is are not prominent in
cognitive science. Research under the explicit banner of communication does flourish in sociology. There, the emphasis is
most often on phatic rather than ideational communication, a
distinction due to B. Malinowski (1923). The former creates,
maintains, or dissolves communities of communicators. The
latter communicates ideas. Generally, these two aspects of communication are pursued independently, however unjustifiably.
The explicit and general models of language structure that dominate linguistics are sometimes tacitly thought of as providing,
inter alia, models of ideational communication, though the gap
between structure and function is underestimated.
Of course, this avoidance of explicit study might be an analogue of the lack of discussion of life in biology. Life is the subject
matter of biology and, therefore, the word is little heard therein.
Communication is at least a sizable part of what the social sciences are about and, so the argument goes, it is not surprising
that the word is little heard. Unfortunately, this analogy does not
fit unproblematically or absolve us very far.
There is, of course, a huge amount of work on communication phenomena. Linguistics is largely about the structure of
natural languages, and at least one of their functions is communication. psycholinguistics studies the interpretation of
natural language discourses by people. Social psychology has
much to say about both linguistic and nonlinguistic communication, from tone of voice to body language. Sociology provides
extensive studies of communication in all sorts of guises, from
microdialogue to mass media. The humanities likewise. The
issue here is not lack of study, or even lack of study in cognitive
science, but lack of theoretical frameworks for conceptualizing
what human communication is. One might be happy to have lots
of models, better still competing ones, but to have none smacks
of carelessness.
One significant event in the recent cognitive history of the concept of communication was Noam Chomsky's demolition of the pretense of behaviorist psychologists and linguists to analyze language (and communication) in terms of finite state machine models. This computational model is closely related to C. Shannon and W. Weaver's model of communication, one of the few influential abstract models of communication, in which a sender issues signals from a finite code book through a channel to a receiver who decodes the messages from an identical code book. The amount of information transmitted is a function of the probability of the occurrence of these signals. Information is measured by the decrease of uncertainty. The less predictable a signal, the more we learn from its occurrence. Of course, Shannon and Weaver were not behaviorists. They assumed that the senders' and receivers' minds have general capacities for assimilating messages, though any such assimilation lay outside their model. But the behaviorists' finite state machine can be construed as a particular application of Shannon and Weaver's model, and the behaviorists' claim was that it was a general theory not just of human communication but of human behavior in general. There was no mind to assimilate the finite code of messages.
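The measure itself is easy to state: the information carried by a signal of probability p is its surprisal, -log2 p bits, and the average over a code book is its entropy (a generic numerical sketch, not tied to any particular code):

```python
import math

def surprisal(p):
    """Bits learned from observing a signal that occurs with probability p."""
    return -math.log2(p)

def entropy(probabilities):
    """Expected surprisal over a code book of signal probabilities."""
    return sum(p * surprisal(p) for p in probabilities if p > 0)

# The less predictable the signal, the more we learn from its occurrence:
assert surprisal(0.125) > surprisal(0.5)
# A uniform four-signal code book transmits 2 bits per signal on average:
assert entropy([0.25] * 4) == 2.0
```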
Chomsky's demolition of the finite state model as a model of human language was deservedly famous, thoroughgoing, and liberating. Human language was first proposed by behaviorists to be analyzable in terms of a finite state model, and this analysis of human language was then shown by Chomsky, from some elementary considerations about the structure of sentences, to be evidently defective. The conclusion drawn was that the structure of human language was not to be understood on this model, but no objection was raised to it as a model of communication. Language was saved but communication was fed to the behaviorists. The conclusion from Chomsky's demolition might just as well have been that the behaviorist model was a bad model of human communication and that better was deserved.
This skepticism about addressing communication is widely, though not universally, shared by cognitive scientists; it is not just an isolated aberration of Chomsky's. Those who have considered communication worth discussion have generally pursued it within the framework of pragmatic theories, such as Paul Grice's, which address communication as an add-on to a logical theory of sentence meaning (Sperber and Wilson 1986; Levinson 2000). These authors appreciate the gap between structure and function. We communicate more than the literal meanings of the sentences we utter (e.g., the implicatures we thereby make; see conversational implicature), but this assumes, rather than explains, what it is to communicate a sentence's meaning.
Functional studies of information structure, many descended
from Halliday (2004), are further oblique contributions to an
understanding of communication in terms of the tailoring of
linguistic message to audience, but again they eschew a direct
account of what communication is.
Perhaps it is fitting that the nearest thing we have to a frontal
approach to communication is derived from philosophical and
logical approaches to language, and specifically to the semantics of discourse (see discourse analysis [linguistic]). The
logical tradition can be seen as defining communication as the
achievement of mutual interpretation for discourses. Traditional
logic had no explicit account of the process of interpretation, but logic was implicitly about the criteria that had to be
met to achieve mutually shared interpretation between two
participants in an argument or proof. If the parties shared all
assumptions, then they should also share the deductive closure
of the assumptions and conclusions. So if they differed on conclusions (such as P vs. not P) then that must be because there
was some divergence of assumptions or interpretation of the language fragment that appeared in the argument. For example, two
important cases were equivocation and enthymeme.
In equivocation, a party might draw a conclusion that relied on slippage in their interpretation. "Socrates" might be interpreted as, say, referring to the Greek philosopher on one occurrence and the Brazilian footballer on a second. Or "democracy" might, at one point in the argument, admit of a political system in which unlimited funds could be used to campaign, while excluding such systems at another point. Logic from the earliest times distinguished content from form, and avoidance of equivocation was the main constraint on content in the process of interpretation. No constraint was placed on the content attached to X, other than that it must be the same content that attached to every occurrence of X in the argument.
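This sameness-of-content constraint is easy to state as a check, provided the referent of each occurrence is supplied by hand (a toy sketch; the referent labels are invented):

```python
def equivocates(referents_of_occurrences):
    """A term is used equivocally in an argument if its occurrences
    are assigned more than one content (referent)."""
    return len(set(referents_of_occurrences)) > 1

# "Socrates" resolving to two different individuals is equivocation:
assert equivocates(["Greek philosopher", "Brazilian footballer"])
# The same content at every occurrence satisfies the constraint:
assert not equivocates(["Greek philosopher", "Greek philosopher"])
```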
Enthymeme is an inexplicitness of assumption. Since natural argument rarely spells out all its assumptions, enthymeme is a prevalent cause of misalignment. Of course, there is a fine line between equivocation and enthymeme. If by "dog" you mean all members of Canis canis in the current domain, and I mean only the ones who aren't bitches, this could be described as equivocation in the interpretation of a term (here across parties) or enthymematic suppression of a premise by one party (i.e., all relevant dogs are male). Modern developments of logic have provided formal theories of this process of reasoning from general knowledge and contextual specificities to mutual interpretations. But in logic's twentieth-century detour into the foundations of mathematics and the possibilities of knowledge engineering, this model of communication has largely been lost.
Discourses come sentence by sentence, and each sentence
has complicated and generally nonmonotonic effects on the
structure of the context in which subsequent sentences will be
interpreted. Entities (people, objects, states, processes, events)
get added to, but also subtracted from, the current model of
the discourse. These effects are functions of both linguistic and
nonlinguistic long-term knowledge, current perceptual circumstance, and much else besides. So the meaning contributed by
each sentence to the discourse is a complex function of more
than its own or any other sentences structure. The general conception of discourse semantics has been developed with great
sophistication by H. Kamp and U. Reyle (1993) and their colleagues under the banner of discourse representation theory and
indeed by other approaches to discourse semantics, though still
with very little explicit connection to communication.
Whereas the classical logic of proof is monotonic in the sense that adding more assumptions never removes valid conclusions, defeasible logics of interpretation are nonmonotonic: as new assumptions are added, earlier conclusions may be subtracted. These defeasible logics for reasoning to interpretations model the process of discourse interpretation. In a monologue, hearers attempt to construct an interpretation of a speaker's utterances that makes the statements true, bringing with them all
available general knowledge and contextual information. When
things go smoothly, this returns a unique minimal model at
every stage (van Lambalgen and Hamm 2004; Stenning and van
Lambalgen 2007). The reason these defeasible logics can yield
unique intended models for discourses against a background of
large bodies of general knowledge is the extensive deployment of
closed-world reasoning. If there is no evidence for the relevance
of facts in the model, then we can conclude that they are not relevant, or at least not yet there. There are many subtleties about
how we close the world in constructing the intended model, but
they are variants on this same idea. Whereas in classical logic
we have to search for counterexamples in usually infinite sets of
logically possible models, closed-world reasoning gets us down
to single, small, intended models of the discourse at each point
in its development.
To adapt an example from an experiment designed to invoke
defeasible reasoning to an interpretation, when we are presented
with the following discourse,
She has an essay.
If she has an essay, she is in the library.

we duly conclude that she is in the library. But when we then encounter, as the next sentence,

If the library is open, she is in the library.

we may use our general knowledge and may withdraw the inference we made before, constructing a model that can be summarized as:
If she has an essay, and the library is open, she is in the library.
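The withdrawal of the earlier conclusion can be sketched as closed-world reasoning over a small fact base (a toy illustration of the idea, not the formal systems cited above; the proposition names are invented):

```python
def in_library(facts):
    """Defeasibly infer 'she is in the library' from the discourse so far.

    Closed-world assumption: a condition the discourse has not raised
    (such as whether the library is open) is treated as unproblematic.
    """
    if not facts.get("has_essay", False):
        return False
    # Once the conditional about opening hours is heard, openness becomes
    # relevant; an explicitly closed library defeats the earlier inference.
    if facts.get("library_open") is False:
        return False
    return True

# After the first two sentences we conclude she is in the library:
assert in_library({"has_essay": True})
# Adding the assumption that the library is closed subtracts that conclusion:
assert not in_library({"has_essay": True, "library_open": False})
```

The nonmonotonicity is visible in the two assertions: enlarging the fact base removes a previously valid conclusion, which no classical consequence relation allows.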

The specification of the models that are the objects of communication on this theory is an achievement of a successful discourse. Each specification of a situation determines a set of situations differing from it by permutations of the specification: she does/doesn't have an essay, and the library is/isn't open. And because of closed-world reasoning, there are no other students or libraries or essays, nor indeed much else, in the current model until we hear about them. It is these models that can be thought of as Shannon and Weaver's code books. Far from getting the book down from the shelf at the outset of the discourse, it is only at each stage of development of the discourse that we can see what code book has been specified. But note that this was never Chomsky's complaint about Shannon and Weaver's model. The complaint was always about the infinity of the code, not about its on-line construction.
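The "code book" reading can be made concrete: the two propositions the discourse has raised so far generate a space of four candidate situations, from which closed-world reasoning selects a minimal model (a toy sketch, with invented proposition names):

```python
from itertools import product

# Only propositions the discourse has actually raised enter the code book.
propositions = ["has_essay", "library_open"]
code_book = [dict(zip(propositions, values))
             for values in product([True, False], repeat=len(propositions))]

assert len(code_book) == 4   # does/doesn't have an essay x library open/closed
assert {"has_essay": True, "library_open": True} in code_book
```

Each new sentence can enlarge the proposition list, doubling the space of candidate situations; the code book is thus built on-line as the discourse develops, not fetched in advance.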
Such defeasible logics formalize the implicit model of communication dominant in psycholinguistics, the field that has most concerned itself with the empirical study of the process of interpretation. The intended models correspond to what in psycholinguistics is known as the gist of the discourse, or what W. Kintsch (1988), for example, calls the situation model. It is this gist that was shown to be rapidly extracted from discourses in classical studies of text comprehension.
Notice that the most plausible kinds of Shannon and Weaver signaling within the context of a fully specified code book have to do with temporal changes of state: She now has an essay to write and is in the library. She now doesn't have an essay and she is in the library (she's finished, perhaps). She now has no essay and is not in the library (having left?): what one might call the monologue of the surveillance camera. There are kinds of human communication that are like this (e.g., the stock ticker, perhaps), but one only has to consider such examples to realize what a minor part they play. Creating local mutual interpretations is not signaling within their possibilities.
Here are the beginnings of a general abstract model of human
communication based on logical theories of discourse processing. In this model, communication is the construction of mutual
interpretations for discourses. It requires an attendant theory
of the structure of a language and of the organization of general
knowledge databases, which might be fully consonant with linguistic theories. But it is not to be confused with such theories.
Its objects of communication are models of discourses (interpretations that make them true), not sentences or meanings.
The contrast between this and other sentence-based theories of
communication can be well illustrated by considering the case of
soliloquy. We do indeed talk to ourselves, either audibly or not,
and intuitively we talk to ourselves for some of the same reasons
we talk to other people, including to help understand what we
believe or want, to formulate a course of action, to persuade ourselves to follow resolutions, to understand what someone said to
us, or to weigh up pros and cons. Needless to say, this process is extremely important in learning, as is well testified in the empirical literature (e.g., Chi et al. 1989). To adapt an old saying, talking
to oneself is the first sign of sanity, or at least the search for it.
There is little temptation to understand soliloquy in terms
of the reduction of uncertainty, but we can apply the same logical model of communication as we use for public discourse. We
would not need to talk to ourselves if our knowledge and belief
were a transparent, homogeneous, consistent database of facts
and principles driven by unconflicted motivations. We equivocate and suppress our assumptions in internal argument just
as well as in public, and successful argument with ourselves
can lead to the same kinds of revision in order to gain coherent
interpretations.
Certainly there are differences between soliloquy and public
dialogue, but there are also enormous overlaps. In pursuing our
goals, it may often be a pragmatic matter of convenience whether
we choose to talk to ourselves or a conversational partner. These
functions of soliloquy are functions shared with dialogical communication, and they are functions that have been neglected in
our thinking about communication.
One could, of course, reject the notion that soliloquy is communication and define away these barriers to the consignment
of communication to Shannon and Weaver, but deeper considerations indicate that to do so is to miss much of what is crucial
about public communication. One should observe that this model is not so incompatible with Chomsky's deeper views as it might at first appear. For example, his objection to functional linguists who would see language shaped only by public communication is that language evolution may have been driven as much by the advantages of an internal medium for representation and reasoning as by one for public communication. But with a more adequate theory of what communication is, and by dropping the idea that communication is automatically public, this view is entirely consistent with our claim that communication is about achievement of coherent interpretation, whether by public utterance or internal soliloquy.
Lastly, the model may help to reconnect the cognitive and
the affective perspectives on phatic and ideational communication alluded to at the outset. One observation is that the fundamental basis of ideational communication is the achievement of
mutually aligned interpretations. The process of getting to these
is, by definition, phatic communication: it creates community
through shared interpretation of language. We perhaps forget how disturbing are our rare experiences of complete failure
to achieve this happy state and, in so doing, fail to see that our
cognitive theories of ideational communication contain within
them abstract specifications of just what has to be achieved and
maintained phatically, along with abstract accounts of some of
the processes by which this might be done.
Keith Stenning
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Chi, M., M. Bassok, M. Lewis, P. Reimann, and R. Glaser. 1989. Self-explanations: How students study and use examples in learning to solve problems. Cognitive Science 13: 145–82.
Halliday, M. A. K. 2004. An Introduction to Functional Grammar. London: Arnold.
Kamp, H., and U. Reyle. 1993. From Discourse to Logic: Introduction to Model Theoretic Semantics of Natural Language, Formal Logic and Discourse Representation Theory. Part II, Vol. 42, of Studies in Linguistics and Philosophy. Dordrecht, the Netherlands: Kluwer Academic Publishers.
Kintsch, W. 1988. The role of knowledge in discourse comprehension: A construction-integration model. Psychological Review 95: 163–82.
Levinson, S. C. 2000. Presumptive Meanings. Cambridge, MA: MIT Press.
Malinowski, B. 1923. The problem of meaning in primitive languages. Supplement to The Meaning of Meaning: A Study of the Influence of Language upon Thought and of the Science of Symbolism, ed. C. K. Ogden and I. A. Richards, 451–510. London: Routledge and Kegan Paul.
Sperber, D., and D. Wilson. 1986. Relevance: Communication and Cognition. Oxford: Blackwell.
Stenning, K., and M. van Lambalgen. 2007. Human Reasoning and Cognitive Science. Cambridge, MA: MIT Press.
van Lambalgen, M., and F. Hamm. 2004. The Proper Treatment of Events. Oxford and Boston: Blackwell.

COMMUNICATION, PRELINGUISTIC
Over the last 30 years, the field of developmental psychology
has devoted considerable research to prelinguistic communication, defined, most generally, as the sharing of information
prior to the onset of language. Because language onset is usually identified with the first spoken words, the prelinguistic
period encompasses roughly the first 12 to 18 months. Research
reviews have been framed both in terms of age-related changes
in infants' interest in, and behavior during, social interactions (e.g., Reddy 1999) and in terms of milestones for behaviors specifically related to communication, such as visual regard, turn-taking, and gesture (Dromi 1993). These two reviews point to a
general agreement about the infant behaviors that are relevant to
prelinguistic communication (e.g., gesturing and visual regard)
and about the ways in which those behaviors change during the
first year of life. The major controversies in this area concern how
active a role infants play in structuring early episodes of communication and how changes in cognitive functioning relate to
changes in prelinguistic communication.
Shortly after birth, infants recognize familiar people, and
they can recognize their mothers by voice, face, and even by
smell (e.g., DeCasper and Fifer 1980). These perceptual capacities set the stage for infants' social interactions to play a special role in prelinguistic communication; infants are interested
in and responsive to their caregivers, who interpret their early
social behaviors as communicative. Even if infants are not yet
aware that others have emotions or ideas to share or that they
themselves might have the same, the fact that the social world
treats them as communicative partners is viewed as a critical feature of social-pragmatic theories of language acquisition (e.g.,
Tomasello 2006).
Social smiling emerges at roughly six to eight weeks of age
and helps to mark the beginning of face-to-face, or en face,
interactions with caregivers, which are characterized by vocal
turn-taking and by the sharing of affect (see review by Adamson
2003). Although the role of the infant in holding up the structure
of these early en face interactions is controversial, it is clear that
infants take an even greater role in initiating social interactions
and maintaining their structure during the middle of the first year. This three-to-eight-month age range has been characterized as a time when infants become increasingly interested in
regularity and surprise. Conventional games such as peek-a-boo
become prominent, and infants begin to take the lead in initiating these games as well as their turns in them.
Infants' interest in the attention of others during the last quarter of the first year has been viewed as an important milestone
in prelinguistic communication. For the first time, there is joint
attention, which refers to episodes when infants and partners are
both engaged with the same object-in-the-world. Infants readily
follow the gaze of their partner and attempt to engage them in
attending to the object of interest. Infants use expanded means
to garner the attention of others, including giving objects, showing, and pointing. Gestures, especially pointing, have been the
subjects of intense study in the prelinguistic period because gestures may be used to share one's focus of attention with another
or to direct the attention of another (e.g., Bates 1979).
Also of great interest in late infancy is how new achievements in cognition might relate to changes in communication.
According to Jean Piaget's theory of cognitive development (e.g.,
1983), infants begin, in the latter half of their first year, to understand that objects exist when out of sight and that objects exist
independent of our actions on them. This achievement, termed
object permanence, is theorized to be a critical part of the development of symbolic functioning. The words of language are
symbols, that is, arbitrarily spoken or written units that stand for
other objects and events. Thus, the achievement of object permanence and symbolic functioning are important milestones
setting the stage for the onset of formal language.
It is probably important, however, to view the transition
between prelinguistic communication and formal language as
neither abrupt nor all-or-none. Clearly, there continue to be
important relations between cognitive development and language after infants speak their first words; just as clearly, nonsymbolic forms of communication continue throughout the life
span (as, for example, in communication via physical actions or
emotional expressions).
Current research in developmental psychology is focusing on
processes that might be specific to language learning, as well as
on more general cognitive processes (such as categorization)
that might be involved in early word learning (see review by
Hollich, Hirsh-Pasek, and Golinkoff 2000). For example, infants' abilities to find patterns in auditory stimuli or to group objects
together if they share similar attributes are general cognitive processes that relate to the problem of learning a language (Hollich,
Hirsh-Pasek, and Golinkoff 2000; Tomasello 2006). Delays in
these milestones of prelinguistic communication, and in the others
noted here, have been the subject of early intervention programs,
and deficits in prelinguistic communication skills have even been
linked to specific developmental disorders, such as autism.
James A. Green and Gwen E. Gustafson
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Adamson, Lauren B. 2003. The still face: A history of a shared experimental paradigm. Infancy 4: 451–73.
Bates, Elizabeth. 1979. The Emergence of Symbols: Cognition and Communication in Infancy. New York: Academic.
DeCasper, Anthony, and W. Fifer. 1980. Of human bonding: Newborns prefer their mothers' voices. Science 12: 305–17.
Dromi, Ester. 1993. The development of prelinguistic communication. In At-Risk Infants: Interventions, Families, and Research, ed. N. Anastasiow, 19–26. Baltimore: Brookes Publishing.
Hollich, George J., K. Hirsh-Pasek, and R. M. Golinkoff. 2000. Breaking the language barrier: An emergentist coalition model of the origins of word learning. Monographs of the Society for Research in Child Development 65.3: 1–137.
Piaget, Jean. 1983. Piaget's theory. In Handbook of Child Psychology. Vol. 1: History, Theory, and Methods, ed. P. Mussen, 103–26. New York: Wiley.
Reddy, Vasudevi. 1999. Prelinguistic communication. In The Development of Language, ed. M. Barrett, 25–50. Hove, East Sussex, UK: Psychology Press.
Tomasello, Michael. 2006. Acquiring linguistic constructions. In Handbook of Child Psychology. Vol. 2: Cognition, Perception, and Language, 6th ed. Ed. D. Kuhn and R. Siegler, 255–98. New York: Wiley.

COMMUNICATIVE ACTION
Communicative action is a term introduced by Jürgen Habermas
as part of his attempt to develop a general theory of action for
the social sciences. Among social theorists, it is widely believed
that a purely instrumental (or economic) model of rational action
is unable to account for the orderliness and stability of human
social interaction (Parsons 1968). Classical sociological theorists,
from Max Weber (1978) to Talcott Parsons (1951), tried to remedy
this defect by positing some additional category of value-oriented
or norm-governed action that imposed constraints on the range
of strategically optimizing behavior. Absent from this analysis,
however, was any precise specification of the role that language
played in mediating social interaction. Indeed, in many cases, it
was unclear how speech was supposed to fit into the theory of
action at all (Cicourel 1973, 21).
Habermas took as his point of departure the observation
that not only was a purely instrumental model of rational action
unable to explain the orderliness of social interaction, but it was
also unable to supply an adequate pragmatics for a theory of
meaning. So instead of looking to values or norms for a specification of the structure of noninstrumental rational action, he
turned to speech-act theory. In particular, he looked to the
notion of illocutionary force, as developed by J. L. Austin
and John Searle (Habermas 1984, I: 293; Austin 1975; Searle 1969).
His central intuition was that the limitations of Gricean (Grice
1989), or intentionalist (see communicative intention)
semantics might both reveal the limitations of a strictly instrumental approach to understanding the illocutionary dimension
of speech-acts and provide some indication of the structural
features that a noninstrumental theory of rational action should
exhibit. Once an account of the rationality of speech acts was
developed, his thought was that this could be extended to provide a more general account of the rationality of linguistically
mediated interactions. It is the latter category of action that he
refers to as communicative action.
Although this theory is of primary relevance to social scientists, it is also important to the study of language. Because of the
constraints imposed by the compositionality requirement,


action. On the contrary, in producing an utterance, Habermas argues, speakers always associate a validity claim with its content, essentially extending a warrant to the effect that the relevant norms governing its production have been satisfied. In the
case of assertions, this takes the form of a truth claim (which is
why, Habermas claims, to assert something is to assert it as true).
He then generalizes this analysis to suggest that imperatives are
produced with an associated rightness claim, which warrants
that the action mandated is in fact the correct one to perform. He
argues also that expressives are produced with an associated sincerity claim, which he analyzes in an analogous manner (Heath
1998). Appealing to Michael Dummett's (1993) assertability-conditional semantics, he then argues that grasping the conditions under which these validity claims are satisfied constitutes an understanding of the meaning of an utterance. Thus speech-acts, insofar as they are meaningful, are necessarily governed by
a noninstrumental pragmatics.

Standard noncooperative game theory (or rational choice theory;


see games and language), which provides the canonical
modern formulation of the instrumental conception of rational
action, is the most widely adopted model of rational action in the
social sciences. It is, however, not a candidate for adoption as a
general theory of rational action because the model explicitly
excludes any communication between the parties to an interaction (Nash 1951) and prohibits any action from having semantic
content (see Farrell 1993). Furthermore, when these restrictions
are lifted, the standard equilibrium solution concepts are no longer valid, and no theorist has yet succeeded in developing correlates that exhibit the same stability properties (Farrell 1993;
Heath 1996). In other words, linguistic communication so far
does not fit into the model of action favored by rational choice
theorists.
It is against this background that Habermass theory of
communicative action must be assessed. The central question
is: What properties do speech-acts possess that make them
unsuitable for purely instrumental use? The most obvious
answer, in the case of assertions, is that they are subject to a
norm of veridicality (and produced, in the standard run of cases,
with at least the pretense of satisfying that norm). Among philosophers of language, this norm is commonly regarded not as a
convention that happens to govern the production of assertions
but as a norm that is internally connected to the meaningfulness of these expressions (see truth conditional semantics). Absent such a norm, not only would no utterance be
credible, but it is not clear that the language itself would even
be learnable. If speakers simply claimed whatever happened to
be in their interest at the time to claim, the connection between
semantic conventions and patterns of use would essentially be
scrambled.
Habermas articulates this idea by claiming that in order to
produce a meaningful utterance, speakers must adopt what he
calls the performative stance, whereby they bracket the more
mundane instrumental objectives that they may be pursuing
and adopt the standard intracommunicative objective of reaching mutual understanding. This is essentially a cooperative
undertaking, and so even though it may be pursued as a means
to securing other, extracommunicative objectives (indeed, this
is almost always the care), it is not itself a system of instrumental

Habermas concludes, on this basis, that instrumental action and


speech constitute two elementary forms of action (1998, 118).
The former is oriented toward success in the attainment of some
objective; the latter is oriented toward mutual understanding in a
process of communication. Naturally, the term elementary form
should not be taken to suggest that language is a presocial phenomenon. The point is simply to identify two orientations that
the agent is able to assume toward his or her environment before
considering the implications of introducing a second rational
agent into the frame of reference. The introduction of a second agent, in Habermass terms, generates social action. Social
action, in this view, is a complex phenomenon constructed out of
the interaction of the two elementary forms. The most immediate consequence of introducing a second agent is that it places
them both in the position that Parsons referred to as double
contingency what the first agent wants to do will depend upon
what he or she expects the second to do and vice versa (Parsons
1951, 1011). Thus, agents engaged in interaction are always in a
position where they must coordinate their action-plans, even if
this means simply developing a stable set of expectations against
which they can each proceed to pursue their private objectives
(Habermas 1998, 221).
In Habermass view, this problem of interdependent expectations can be resolved by drawing upon the resources of either
elementary action type. When instrumental action is assigned
priority, social action takes the form of strategic action, in the
standard game-theoretic sense. In this context, the resources of
language are used only to supply the content of the intentional
states beliefs and preferences that serve as parameters of the
strategic optimization problem. However, when the resources
of language are used to resolve the coordination problem, this
use generates the form of action that Habermas refers to as
communicative action. The difference is that communicative
action draws upon the commitments made, in the form of validity claims, in order to limit the range of action alternatives that
are available (thus, the consensus achieving force of linguistic
processes of reaching understanding the binding and bonding
energies of language itself becomes effective for the coordination of action [Habermas 1998, 221]).

any plausible approach to the theory of meaning must incorporate some sort of division of labor between the semantics
and pragmatics (with the former taken to have a compositional
structure, the latter typically not). Theorists of language have,
however, sometimes been naive when it comes to understanding
the constraints that the theory of action imposes upon the pragmatics. For example, it is often simply assumed that individuals
are capable of rule-following at the level of social action; yet
rule-following is, at the level of general action theory, a deeply
contested if not entirely problematic concept (e.g., see Bicchieri
1993). Habermass concept of communicative action is important for showing not only how action theorists might learn from
contemporary developments in the study of language but also
how theorists interested in language might profit from greater
attention to the structure of social action.

Communicative Action

181

Communicative Action

Communicative Intention

It is important to note that communicative action is not the same as speech. It is a form of teleological action, in the sense
that agents continue to pursue extralinguistic objectives. The
distinguishing characteristic is that they use language in order
to solve the problem of double contingency by establishing a
set of shared goals and norms (rather than simply using language to identify background beliefs and preferences). This use
of language as an explicit coordination mechanism imposes
constraints on the type of goals that agents can pursue and the
means they can employ. Thus, communicative action, despite
being teleological in form, is not merely a species of instrumental
action. At most, it represents a type of constrained instrumental
action. Thus, it is the use of language to coordinate social interaction, Habermas claims, that provides the explanation for the
orderliness and stability of human social interaction. The central
error of the classical sociological action theorists, according to
this view, rested with their focus upon a practical, rather than a
communicative, conception of rationality (Habermas 1996, 3).

Practical Discourse
Finally, it is worth mentioning a further distinction in Habermas's
work, between communicative action and practical discourse.
The orderliness of linguistically mediated interactions (i.e., communicative action) is achieved by the binding-bonding effects
of the validity claims raised within speech-acts. It is the rational
acceptance of these claims by listeners that makes it rational, in
turn, for them to accept any constraints on their conduct that
may arise as a consequence. However, this process of acceptance
is usually only tacit and in many cases relies merely upon the
speaker's warrant. This means that should the listener suddenly
experience doubts during the course of the subsequent interaction, it is always legitimate for him or her to go back and demand
further justification (i.e., request that the speaker redeem some validity claim that was associated with the speech act). As a
result, the potential for critical scrutiny of social practices is
always present, in every society and culture, even if not explicitly
institutionalized. Such a demand for justification interrupts the
sequence of communicative action and shifts the participants
into discourse, where contested validity claims are reflexively
thematized and debated. Contested rightness claims are discursively tested in a forum that Habermas refers to as practical discourse, which is governed by a set of distinctive inference rules,
in particular a universalization rule that serves as the foundation
for the theory of discourse ethics. The distinction between practical discourse and communicative action is important, in this
regard, because it is only the former that is directly governed by
the universalization constraint.
Joseph Heath
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Austin, J. L. 1975. How to Do Things with Words. 2d ed. Cambridge: Harvard University Press.
Bicchieri, Cristina. 1993. Rationality and Coordination. Cambridge: Cambridge University Press.
Cicourel, Aaron. 1973. Cognitive Sociology. Harmondsworth, UK: Penguin.
Dummett, Michael. 1993. What is a theory of meaning? (II). In The Seas of Language, 34–93. Oxford: Clarendon Press.
Farrell, Joseph. 1993. Meaning and credibility in cheap-talk games. Games and Economic Behavior 5: 514–31.
Grice, H. P. 1989. Studies in the Way of Words. Cambridge: Harvard University Press.
Habermas, Jürgen. 1984. The Theory of Communicative Action. 2 vols. Trans. Thomas McCarthy. Boston: Beacon Press.
———. 1996. Between Facts and Norms. Trans. William Rehg. Cambridge, MA: MIT Press.
———. 1998. On the Pragmatics of Communication. Ed. Maeve Cooke. Cambridge, MA: MIT Press.
———. 2001. On the Pragmatics of Social Interaction. Trans. Barbara Fultner. Cambridge, MA: MIT Press.
Heath, Joseph. 1996. Is language a game? Canadian Journal of Philosophy 26: 1–28.
———. 1998. What is a validity claim? Philosophy and Social Criticism 24: 23–41.
———. 2001. Communicative Action and Rational Choice. Cambridge, MA: MIT Press.
Nash, John. 1951. Noncooperative games. Annals of Mathematics 54: 289–95.
Parsons, Talcott. 1951. The Social System. New York: Free Press.
———. 1968. The Structure of Social Action. 2 vols. New York: Free Press.
Parsons, Talcott, and Edward Shils, eds. 1951. Towards a General Theory of Action. New York: Harper and Row.
Searle, John. 1969. Speech Acts. Cambridge: Cambridge University Press.
Weber, Max. 1978. Economy and Society. 2 vols. Ed. G. Roth and C. Wittich. Berkeley: University of California Press.

COMMUNICATIVE INTENTION
Late twentieth-century discussion of the nature of communicative intention was dominated by the theories of British
philosopher Herbert Paul Grice. Grice initially (1957) argued
that the primary intended effect of an indicative utterance
was to get the hearer to believe the proposition expressed; an
essential component of this communicative intention was the
intention to have this effect be achieved through the hearer's
recognition of that intention. He eventually acknowledged that
there were counterexamples to this analysis and subsequently
(1968; 1969, 171–2) proposed that the primary communicative intention must be that the hearer should at least come to
believe that the utterer has some particular thought or belief.
Grice also allowed that speakers need not intend to change
the attitudes of some specific, actual audience; instead, this
part of the communicative intention concerns what is meant
to happen should there be an audience having such-and-such
characteristics.
Setting aside some of the many refinements (1989, 86–116), Grice's characterization of communicative intention runs as
follows:
An utterance, U, is made with a communicative intention if and only if the utterer, S, utters U with an intention comprised of three subintentions:

(1) S's utterance U is to produce a certain response, R, should there be an audience, A, having characteristics, C;
(2) A is to recognize S's intention (1);
(3) A's recognition of S's intention (1) is to function as at least part of A's reason for having response R.

It is assumed here that S communicates a belief to some audience, A, just in case A's recognition of S's communicative intention yields R, where R is the formation of the relevant belief in A. For example, Sally said "Congratulations!" with communicative intent just in case she meant her saying to congratulate the person to whom she was speaking and meant for this intention to be recognized by that person; she also had to intend for that very recognition to be a reason for the recognition of the congratulation.
Peter F. Strawson (1964) challenged the sufficiency of the
loop or mechanism constituted by subintention (3). Suppose that
Karen thinks that if her tennis racket is lying on the kitchen table,
her friend Laura will think Karen plans to play tennis that day.
Karen knows that Laura is watching her, and she also knows that
Laura does not know that Karen knows Laura is watching. Karen
then puts the racket on the table with the intention of getting
Laura to believe Karen plans to play tennis. Karen also intends
that Lauras recognition of the latter intention will give Laura
reason to believe that Karen in fact means to play tennis. Thus,
all three clauses in Grice's definition have been satisfied. Yet in
such a situation, Strawson argues, Karen has not communicated,
at least in Grice's sense, to Laura that she plans to play tennis. So
what is missing? Karen must intend not only that Laura recognize her intention to get Laura to think she plans to play tennis
but also that Laura recognize her intention to get Laura to recognize her intention to get Laura to think so.
Grice and other philosophers (e.g., Schiffer 1972; Holdcroft 1978; Recanati 1986) have explored various responses to the problem raised by Strawson. Grice's own preferred response to the problem was to allow that meaning requires an infinite set of intentions. Yet this condition is to be understood as defining the optimal state in relation to which actual communicative states are measured. He contends, then, that strictly speaking, no speaker actually means that p in the sense of actually having the set of infinite intentions required for ideal, non-natural meaning, but he adds that the speaker "is in a situation which is such that it is legitimate, or perhaps even mandatory, for us to deem him to satisfy this unfulfillable condition" (1989, 302). Grice's justification for this move finds its roots in his views concerning the status of the normative rationality assumptions relied upon in the entrenched and self-justifying system of both everyday and philosophical psychology. He evokes the difference between the "titular" and "factual" character of an utterance, where the former is its idealized, rational character, never actually present in toto, and where the latter could be a matter of a "pre-rational counterpart of meaning" (1989, 85–6). Yet it seems unsatisfactory to conclude that Karen could never, strictly speaking, actually communicate to Laura that she wants to play tennis!
Wayne A. Davis (2003) develops an alternative approach to the relation between communication and semantic intentions. Not all instances of communication are intentional, and cases of intentionally communicating something to someone are analyzed as doing something that expresses a mental state, where this action is the basis of an audience's recognition that the mental state is expressed. What is expressed depends on what is intended, but that does not mean that the hearer has to recognize the speaker's communicative intention for intentional communication to take place. The intentional component of expression is the performance of an observable action as an indication of some attitude, where some x indicates some y whenever x provides some (possibly unreliable) evidence that y is the case.
Paisley Livingston
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Bennett, Jonathan. 1976. Linguistic Behavior. Cambridge: Cambridge University Press.
Davis, Wayne A. 2003. Meaning, Expression, and Thought. Cambridge: Cambridge University Press.
Grice, Herbert Paul. 1957. Meaning. Philosophical Review 66: 377–88.
———. 1968. Intentions and speech acts. Analysis 29: 109–12.
———. 1969. Utterer's meaning and intentions. Philosophical Review 78: 147–77.
———. 1989. Studies in the Way of Words. Cambridge: Harvard University Press.
Holdcroft, David. 1978. Words and Deeds: Problems in the Theory of Speech Acts. Oxford: Clarendon.
Recanati, François. 1986. On defining communicative intentions. Mind and Language 1: 213–42.
Schiffer, Stephen. 1972. Meaning. Oxford: Clarendon Press.
Strawson, Peter F. 1964. Intention and convention in speech acts. Philosophical Review 73: 439–60.

COMPARATIVE METHOD
Genetic Relatedness and Common History
Languages sharing a period of common history in a single ancestral language are genetically related. The comparative method's (henceforth CM) goal is to demonstrate genetic relatedness by identifying similarities attributable to retention from common history. Demonstrations of genetic relatedness are presented as reconstructions (or proto-forms) in the hypothetical ancestral system. Genetically related languages comprise a language family, relationships within which are identified through female kin terms. A form inherited in a daughter is a reflex of the mother's form. Reflexes of the same form in different languages are cognates.
Languages sharing a period of common history independent
of the rest of the family constitute a subgroup. Subgrouping is represented as a family tree. (Wave models are an alternative permitting representation of overlapping shared innovations. Textbooks
like Hock [1986, Chap. 15] compare the two.) In determining
genetic relatedness, one seeks evidence of shared retention from
a proto-language; in subgrouping, one seeks evidence of shared
innovation after the family began to diverge. In subgrouping, one
must identify unique events common to the history of the subgroup. The features defining the subgroup must demonstrably
not be retentions from an earlier period of common history. (On
subgrouping arguments, see Harrison 2003, 3.2).
Because a language is not an organism passing on genetic
material but behaviors and underlying knowledge, common
history is a property not of languages but of the constructions constituting them (see construction grammars). A borrowing
and its source also share common history, and so genetic relatedness privileges common history involving transmission between
individuals speaking the same language. For mixed languages

Table 1.

             plant     paddle   needlefish   thin             forehead   pandanus   fold
Trukese      ftuk-i    ftun     taak         mlifi-lif        chaamw     faach      n-num
Mokilese     poadok    padil    doak         manip-nip        soamw      -par       lim
Gilbertese   arok-a    arina    raku         m-manii          ramwa      ara-       num
PMC          *faSok-   *faSla   *[sS]aku     *ma-nifi(nifi)   *camwa     *faca      *lumi

Table 2. The regular correspondence sets emerging from Table 1: the Trukese, Mokilese, and Gilbertese reflexes of each of the PMC phonemes *f, *S, *k, *l, *c, *m, *mw, *a, *o, *u, and *i (the set for *a, for example, is Trukese aa, Mokilese oa, Gilbertese a; *mw is reflected as mw in all three languages).

(Thomason 2001, 70 ff.), "speaking the same language" is difficult to define. Genetic relatedness may require redefinition to incorporate language mixing.

Defining the Comparative Method


The CM emerged from nineteenth-century research largely on
Indo-European. It demonstrates genetic relatedness by distinguishing cross-linguistic similarities due to retention from those
due to chance, borrowing, or the nature of language. Four (not
necessarily mutually exclusive) approaches to identifying shared
retentions are now considered.
NEGATIVE SIEVING. Similarities due to borrowing or the nature of
language must be identified. One expects similarity in onomatopoetic and schematic constructions, where the form is to a degree
iconic of the meaning. Such natural similarities are eliminable by
restricting comparison to symbols, whose form/meaning relation
is arbitrary. (On restrictions on the CM, see Harrison 2003.) There is
less consensus regarding what is (not) likely to be borrowed. Some
have argued for a core lexicon resistant to borrowing. For an assessment, see Thomason (2001, 71 ff). Although constructions with
grammatical meaning (like conjunctions or adpositions) are less
often borrowed than those with lexical meaning (like most nouns
or verbs), Sarah Grey Thomason and Terrence Kaufman (1988)
have demonstrated that, in principle, anything can be borrowed.
REGULAR SOUND CORRESPONDENCE AND THE STANDARD CM.
Most comparativists identify demonstrations of genetic relatedness through regular sound correspondence as the standard CM.
To rule out chance and, to some extent, borrowing as accounts of
similarity, the standard CM exploits the neogrammarian movement's regularity assumption that every sound change, "inasmuch as it occurs mechanically, takes place according to laws that admit no exception" (Osthoff and Brugmann 1967, 204).
The Micronesian data and Proto-Micronesian (PMC) reconstructions (largely from Bender et al. 2003) in Table 1 exemplify
regular sound correspondence (where - indicates a morpheme
boundary) from which emerge the correspondences in Table 2.
Multiple correspondences with a single reconstructed phoneme reflect context-dependent sound changes. In Tables 1 and 2, they include the loss of short final vowels in Trukese and Mokilese, raisings conditioned by the vowel in the following PMC syllable, and lengthenings, by PMC syllable structure. Each
correspondence set, or set of conditioned sets, is reconstructed
as a phoneme of the proto-language (see historical reconstruction or textbooks like Campbell 2004, Chap. 5).
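The bookkeeping behind this step, tabulating which segment of one language recurrently lines up with which segment of another across cognate sets, can be sketched in a few lines of code. This is a toy illustration only: the cognate pairs and their segment-by-segment alignments below are invented for the example and assumed given, whereas establishing cognacy and alignment is the hard part of real comparative work.

```python
from collections import Counter

# Invented, pre-aligned (Trukese, Mokilese) cognate pairs, each a list of
# segments; real comparative work must first establish these alignments.
cognates = [
    (["f", "a", "ch"], ["p", "a", "r"]),
    (["f", "t", "u", "k"], ["p", "d", "o", "k"]),
]

def correspondence_sets(pairs):
    """Count how often each cross-language segment pairing recurs."""
    counts = Counter()
    for segs_a, segs_b in pairs:
        for a, b in zip(segs_a, segs_b):
            counts[(a, b)] += 1
    return counts

sets = correspondence_sets(cognates)
# Recurrent pairings (f : p occurs twice here) are candidates for regular
# correspondence; one-off pairings are candidates for chance or borrowing.
print(sets[("f", "p")])  # → 2
```

A comparativist would then inspect the recurrent pairings for conditioning environments before reconstructing a proto-phoneme for each set.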
The regularity assumption is vacuous unless the sound
changes yielding conditioned correspondences are constrained,
since any correspondence is regular if its conditioning environment is sufficiently narrow (for example, a single morpheme).
The neogrammarians restricted conditioning to purely phonetic
environments (see Hale 2003, 343).
Chance similarities like Mokilese padil "paddle" are identifiable because there is no regular correspondence between Mokilese /p/ and English /p/, as there is between Mokilese /p/, Trukese /f/, and Gilbertese ∅. Of course, distinguishing regular
from chance correspondences is a function of token frequency.
Lexical replacement over sufficient time or in some contact
situations may reduce the number of tokens exhibiting regular
correspondence so that regular correspondence is indistinguishable statistically from chance. (Statistical methods are seldom
used in the standard CM. Quantitative methods proposed in
subgrouping include lexicostatistics and, more recently, computational cladistics; see McMahon and McMahon 2005.)
Borrowings may be identifiable similarly if their number is
small. Large-scale borrowing is often recognizable as a parallel
set of apparently regular correspondences, as in Rotuman. Bruce
Biggs (1965) was able to associate one set of Rotuman correspondences with native vocabulary and another with Polynesian borrowings. Latinate borrowings into English might be identifiable
similarly if we did not already know their history.
The identifiability of borrowings as irregular or parallel correspondences is inhibited when:
(i) the source language cannot be identified; or
(ii) the source is a related language spoken by many target language speakers, who apply their knowledge of source/target correspondences to nativize borrowings; or
(iii) later phonological changes mask earlier borrowings.
In cases of massive borrowing across many languages, as reported in Grace (1996) for New Caledonia, the number of regular correspondences proliferates to the point that one must reconstruct a proto-language phonemic inventory far larger than that of any of its daughters and possibly larger than one would consider natural. In such cases, the standard CM fails.
It is vital in comparison that there be some measure of similarity to show that we are comparing likes with likes. The CM
says little about similarity in meaning. Comparativist practice
favors meaning identity. Since semantic similarity remains ill-defined, one is guided by experience and common sense. For
phonetic similarity, we can appeal to phonetic theories. What
is seldom appreciated is that the standard CM does not need a
theory of phonetic similarity because the regularity assumption
is a stand-in. We don't need to know that two sounds are similar,
only that there is a regular correspondence between them. Much
of the data for modern theories of phonetic similarity undoubtedly came from regular correspondences identified by the CM.
The empirical validity of the regularity assumption has been
controversial from the outset. Those opposed to the neogrammarian position asserted that "each word has its [own] history."
A current manifestation of this opposition is lexical diffusion
(Chen and Wang 1975), the view that sound changes move
through the lexicon, affecting different words at different times.
Words yet unaffected will appear to be exceptions to the change.
For example, the shortening of Early Modern English (ENE) /u:/, as in good (ENE /gu:d/, English /gʊd/), has yet to affect food, and has affected roof only in some dialects.
Examples of nonphonetic conditioning are less often cited. The
Micronesian [aa]-[oa]-[a] correspondence set in Table 2 might be
such a case. In Trukese and some other Micronesian languages,
lengthening affects only the V1 of PMC (C1)V1(C2)V2 nouns. If, as
other evidence suggests, these nouns are the residue of a process
affecting all prosodic phrases, phonetic conditioning is preserved.
Since the regularity assumption is crucial to the CM, it would be problematic were it proven false. Though William Labov
(1981) argues that the regularity assumption holds for one class
of sound changes while others diffuse through the lexicon, we
need not rely on his assessment to save regularity. Sound change
might begin variably, but given enough time, it moves toward
regularity. (See Durie and Ross 1996 for a range of perspectives
on the regularity assumption.)
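The claim that a variably actuated change can eventually look exceptionless can be illustrated with a toy lexical-diffusion simulation. The word list, per-period probability, and time span below are arbitrary assumptions for the sake of the sketch, not figures from any study of the English shortening.

```python
import random

random.seed(0)

# Toy model of lexical diffusion: a sound change (loosely modeled on the
# ENE /u:/ shortening) spreads word by word, each still-unaffected word
# undergoing the change with some probability per period.
lexicon = {word: "u:" for word in ["good", "food", "roof", "book", "moon"]}
p_change = 0.3

for period in range(20):
    for word, vowel in lexicon.items():
        if vowel == "u:" and random.random() < p_change:
            lexicon[word] = "U"  # shortened reflex

# Midway through, the change looks irregular (some words shifted, some not);
# after enough periods nearly every word has shifted, and the surviving
# pattern looks like an exceptionless "law" to a later observer.
print(sum(v == "U" for v in lexicon.values()), "of", len(lexicon), "words shifted")
```

Whether real sound changes behave this way is exactly what the lexical-diffusion debate disputes; the sketch only shows why a change that began variably could nonetheless present itself to the comparativist as regular.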
SHARED ABERRANCIES. A shared aberrancy is a correspondence between lexically or morphologically related forms so unusual as to be unattributable to chance or borrowing. An oft-cited example is the 3s/3p alternations in the present tense of "to be" in Indo-European:

                   3s     3p
Sanskrit           ásti   sánti
Latin              est    sunt
Old High German    ist    sind

For many comparativists, like Lyle Campbell (2003, 268 ff.), shared aberrancies are an alternative to regular correspondences in demonstrating genetic relatedness. Others, like Johanna Nichols (1996), insist that only shared aberrancies or other individual identifiers demonstrate genetic relatedness. She consigns regular correspondence to a subsidiary role in subgrouping. Similarities in words sufficiently long to be unattributable to chance also qualify as individual identifiers. Nichols (1996, 50) cites Proto-Indo-European *widhew-a "widow." It is crucial that her example is a reconstruction. The existence of that word may be statistically unattributable to chance, but one must have confidence that its reflexes (including Sanskrit vidhavā, Greek ēítheos, Latin vidua, Old Irish fedb, Russian vdova, Old English widuwe) are sufficiently similar to be instances of the same word. Regular sound correspondences are necessary to give that confidence.

MASS COMPARISON. The logic of mass comparison (as in Greenberg 1987) is that by identifying a very large number of similar constructions in many languages, one statistically rules out similarity due to chance or to borrowing. Most comparativists do not regard mass comparison as an instance of the CM. The volume of criticism leveled against it has been vast. (For a short review, see McMahon and McMahon 2005, 19–26.)

The fundamental objection is the same as that just raised to the independence of shared aberrancies. Mass comparison provides no measure of similarity. There is no statistical reason to consider significant the identification of numerous vague similarities in many languages unless attested in the same forms in most of the languages compared. That does not seem to be true in those cases in which mass comparison has been used.

Summary
Any method for determining genetic relatedness must provide a similarity measure and a means of distinguishing shared retentions from other sources of similarity. None is without flaws and limitations. The standard CM is unique in defining similarity through regularity. In some cases it will fail, but less often in principle than the search for shared aberrancies, which depends on the existence of data of a restricted sort.

The quantitative methods remain to be tested, but there is reason to doubt that they can replace the standard CM, or even supplement it where the latter fails. These methods have been applied to the subgrouping problem, not the genetic relatedness problem, and the two differ crucially. The subgrouping problem is the search for the best tree for a set of languages already assumed to be genetically related. These mathematical techniques can determine genetic relatedness only if they fail to incorporate unrelated languages into the trees they generate by failing to identify cognates. And as long as a measure of similarity is required to identify cognacy, something equivalent to the regularity assumption of the standard CM remains essential.

S. P. Harrison

WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Bender, Byron W., Ward H. Goodenough, Frederick H. Jackson, Jeffrey C. Marck, Kenneth L. Rehg, and Ho-min Sohn. 2003. Proto-Micronesian reconstructions 1. Oceanic Linguistics 42.1: 1–110.
Biggs, Bruce. 1965. Direct and indirect inheritance in Rotuman. Lingua 14: 383–445.
Campbell, Lyle. 2003. How to show languages are related: Methods for distant genetic relatedness. In Joseph and Janda 2003, 262–82.
———. 2004. Historical Linguistics: An Introduction. Cambridge, MA: MIT Press.
Chen, M., and W. S-Y. Wang. 1975. Sound change: Actuation and implementation. Language 51: 255–81.
Durie, Mark, and Malcolm Ross, eds. 1996. The Comparative Method Reviewed: Regularity and Irregularity in Language Change. Oxford: Oxford University Press.
Grace, G. W. 1996. Regularity of change in what? In Durie and Ross 1996, 157–79.
Greenberg, Joseph H. 1987. Language in the Americas. Stanford, CA: Stanford University Press.
Hale, Mark. 2003. Neogrammarian sound change. In Joseph and Janda 2003, 343–68.
Harrison, S. P. 2003. On the limits of the comparative method. In Joseph and Janda 2003, 213–43.
Hock, Hans Heinrich. 1986. Principles of Historical Linguistics. Berlin: Mouton de Gruyter.
Joseph, B., and R. Janda, eds. 2003. The Handbook of Historical Linguistics. Oxford: Blackwell.
Labov, William. 1981. Resolving the neogrammarian controversy. Language 57: 267–308.
McMahon, April, and Robert McMahon. 2005. Language Classification by Numbers. Oxford: Oxford University Press.
Nichols, Johanna. 1996. The comparative method as heuristic. In Durie and Ross 1996, 39–71.
Osthoff, Hermann, and Karl Brugmann. 1967. Preface to morphological investigations in the sphere of the Indo-European languages I. In A Reader in Nineteenth Century Historical Indo-European Linguistics, ed. W. P. Lehmann, 197–209. Bloomington: Indiana University Press.
Thomason, Sarah Grey. 2001. An Introduction to Language Contact. Edinburgh: Edinburgh University Press.
Thomason, Sarah Grey, and Terrence Kaufman. 1988. Language Contact, Creolization, and Genetic Linguistics. Berkeley: University of California Press.

COMPETENCE
The competence-performance dichotomy lies at the center of
transformational grammar, the linguistic theory introduced
by Noam Chomsky in the late 1950s (Chomsky 1957). Virtually all
current approaches to grammatical theory that descended from
Chomsky's original work take the dichotomy as their starting point.
In brief, competence represents the system of abstract structural
relationships that characterize grammars, and performance the
faculties involved in putting that knowledge to use. It is generally
assumed that performance is determined in part by competence,
but is also a function of physiology, the communicative and social
aspects of language, and general cognitive architecture.
Competence and performance are modern reinterpretations
of the dichotomy between language and speech, which was
bequeathed to the field about a century ago by the great Swiss
linguist Ferdinand de Saussure ([1916] 1966). The French words
that Saussure used for language and speech, langue and
parole respectively, are still encountered today: For Saussure,
langue represents the structural system at the heart of language, a system shared by all members of the speech community; parole
is the individual act of speaking. Saussure compared language
to a symphony. Langue represents the unvarying score, parole
the actual performance, no two of which are identical. Rather
than sticking with langue and parole, Chomsky coined the new terms "competence" and "performance" since he wished to underscore two important differences between competence and langue: Competence for Chomsky encompasses syntactic relationships, despite Saussure's consignment of much of syntax to parole; and competence is characterized by a set of generative


rules and principles, unlike Saussure's langue, which was essentially a taxonomic inventory of grammatical elements.
Chomsky has always considered competence a psychological
construct, defining it as "the speaker-hearer's knowledge of his language" (1965, 4). Hence, support for the notion tends to be
derived from the apparent disparity between our mental representations of grammatical patterning and the actual use of language in communication. So it is frequently pointed out that
the structural principles that characterize grammars are far from
being in a one-to-one relation with the principles and conventions governing use (Newmeyer 1998). More direct psychological
evidence for competence has been adduced from observations
about child language learning. Experimentation has shown that
even very young children exhibit subtle grammatical knowledge
that points to their possessing a cognitive system encoding strictly
grammatical facts. For example, one-word speakers between 13
and 15 months know that words presented in strings are not isolated units but are part of larger constituents; one-word speakers
between 16 and 19 months recognize the significance of word
order in the sentences that they hear; and 28-month-old children who have productive vocabularies of approximately 315
words and who are speaking in four-word sentences can use a verb's argument structure to predict verb meaning (Hirsh-Pasek
and Golinkoff 1996). There also appears to be neurological evidence for the competence-performance dichotomy. Numerous
pathological cases have been observed in which grammatical
abilities are lost while other cognitive faculties are preserved,
and vice versa (Pinker 1994).
Some linguists have applied the notion of competence to a far
broader range of abilities than the sort of grammatical knowledge
outlined here. For example, Dell Hymes coined the term "communicative competence" as "the most general term for the speaking and hearing capacities of a person" (1971, 16). A broadened
notion of competence was soon applied to such capacities as the
ability of bilinguals to switch languages appropriately (Gumperz
1972), the proper control of stylistic registers (White 1974), the
ability of readers to fathom aspects of literature properly (Culler
1975), and even the use of language by doctors in emergency
wards (Candlin, Leather, and Bruton 1976). The all-too-easy metaphorical extension of the ordinary English word competence has
led Chomsky and others to avoid use of the term in recent years.
Rather, it has become standard to use the term I-language (short for "internalized language"). In this usage, I-language contrasts not with performance but with E(xternalized)-language.
Finally, it should be mentioned that some linguists have
questioned the existence of the competence-performance
dichotomy on the basis of the belief that grammatical structure
is an emergent property of language use (see, for example,
Langacker 1987 and Bybee and Hopper 2001).
Frederick J. Newmeyer
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Bybee, Joan L., and Paul Hopper, eds. 2001. Frequency and the Emergence
of Linguistic Structure. Vol. 45. of Typological Studies in Language.
Amsterdam: John Benjamins.
Candlin, Christopher N., Jonathan H. Leather, and Clive J. Bruton. 1976. Doctors in casualty: Applying communicative competence to components of specialist course design. IRAL 14: 245–72.



Chomsky, Noam. 1957. Syntactic Structures. Vol. 4 of Janua Linguarum Series Minor. The Hague: Mouton.
———. 1965. Aspects of the Theory of Syntax. Cambridge, MA: MIT Press.
———. 1988. Language and Problems of Knowledge: The Managua Lectures. Vol. 16 of Current Studies in Linguistics. Cambridge, MA: MIT Press. Chomsky's most readable defense of the notion of competence.
Culler, Jonathan. 1975. Structuralist Poetics. Ithaca, NY: Cornell University
Press.
Gumperz, John. 1972. The communicative competence of bilinguals: Some hypotheses and suggestions for research. Language in Society 1: 143–54.
Hirsh-Pasek, Kathy, and Roberta Golinkoff. 1996. The Origins of
Grammar: Evidence from Early Language Comprehension. Cambridge,
MA: MIT Press.
Hymes, Dell. 1971. Competence and performance in linguistic theory. In Language Acquisition: Models and Methods, ed. Renira Huxley and Elisabeth Ingram, 3–24. New York: Academic Press.
Langacker, Ronald W. 1987. Foundations of Cognitive Grammar. Vol.
1: Theoretical Prerequisites. Stanford, CA: Stanford University Press.
Newmeyer, Frederick J. 1998. Language Form and Language Function.
Cambridge, MA: MIT Press.
Pinker, Steven. 1994. The Language Instinct: How the Mind Creates
Language. New York: Morrow. An entertaining, but still serious, discussion of competence and performance, bringing in evidence from
many different areas of investigation.
Saussure, Ferdinand de. [1916] 1966. Course in General Linguistics. New York: McGraw-Hill. Translation of Cours de linguistique générale. Paris: Payot.
Smith, Neil, and Ianthi-Maria Tsimpli. 1995. The Mind of a
Savant: Language Learning and Modularity. Oxford: Blackwell.
Support for competence from Christopher, a savant who is severely
impaired cognitively but can learn a language virtually overnight.
White, Ronald. 1974. Communicative competence, registers, and second language teaching. IRAL 12: 127–41.

COMPETENCE AND PERFORMANCE, LITERARY


Literary competence, by analogy with Noam Chomsky's concepts
of linguistic competence and performance, is the implicit
knowledge that enables readers to process literary works as they
do, connecting elements and deriving meaning; performance
would be their actual engagements with literary works. The concept of literary competence works to highlight the importance in
literary studies of a poetics that describes the conventions and
interpretive operations that make possible the intelligibility of
literary works, as opposed to a hermeneutics (see philology
and hermeneutics) that seeks to develop new interpretations. It is also a claim about the relation between linguistics and
literary study: Rather than apply techniques of linguistic analysis directly to the language of literary works, it is more fruitful to
attempt to take from linguistics the methodological model for
the construction of a poetics.
Chomsky makes a fundamental distinction between competence ("the speaker-hearer's knowledge of his language") and performance ("the actual use of language in concrete situations") (1965, 4). The notion of literary competence is introduced, on the analogy with linguistic competence, in Jonathan Culler's Structuralist
Poetics (1975). Rejecting corpus-based versions of descriptive
linguistics, Chomsky argues that the task of linguistics is not the
discovery of regularities in a corpus but a modeling or rendering
explicit of the speaker-hearer's implicit knowledge. Culler argues that just as the goal of the analysis of a language is not the description of a corpus of utterances but an explicit account of the linguistic competence of speakers of the language, so ought the goal
of poetics and quite possibly of literary study generally not be the
analysis and interpretation of literary works but an account of the
rules, conventions, and procedures that enable readers to make
sense of literary works as they do (1975, viii, 215, 301, 113–30).
His account stresses, for example, the shared knowledge and
processing techniques that enable readers to grasp the plot of a
narrative (a matter on which considerable agreement usually can
be reached) and to construct characters from the implicit and
explicit information scattered through a text, as well as to engage
in the thematic and symbolic interpretation that the institution
of literature encourages (ibid., 189–238). He also stresses the distinctive assumptions and operations involved in making sense of
a lyric poem, such as a presumption of significance, the relevance
of sound patterning, and so on (ibid., 131–88).
Culler presents literary competence as a revision of the
framework and goals of literary studies, an attempt to integrate
the accomplishments of structuralism and narratology
in literary studies with the program of a generative linguistics,
but others have suggested that taking the concept and the model
of generative grammar seriously would lead to a generative poetics. As a description of competence, a fully adequate
grammar must assign to each of an infinite range of sentences
a structural description indicating how the sentence is understood by the ideal speaker-hearer (Chomsky 1965, 4–5). Ellen
Schauber and Ellen Spolsky maintain that "[a] generative poetics, therefore, will need to describe the derivation of competing well-formed interpretations and to distinguish them from inadequately derived interpretations" (1981, 397). Calling Culler's conception of literary competence, focused on literary conventions and distinctive interpretive operations, "intolerably restrictive,"
Schauber and Spolsky propose that a generative poetics should
integrate three competencies: linguistic competence, communicative competence, and literary competence, on the principle
that literary competence in Culler's sense could never lead to the
derivation of well-formed interpretations (ibid., 398; 1986).
Critiques of the concept of literary competence have suggested that Chomsky's specification of the competence of an
ideal speaker-hearer makes the concept of competence inherently elitist. Joseph Dane, while disputing the parallel between
linguistic and literary competence, contrasts a technical sense of
competence as knowledge that makes any literary performance
(including interpretation) possible with the everyday sense
where competence is a matter of qualifications and credentials
(1986, 53, 59). Despite Culler's argument that literary competence
does not involve a supposition that readers will agree upon an
interpretation but only that there are literary conventions that
guide interpretation and make possible some conclusions and
not others, Dane argues that a principle of stability must remain.
"Some of us possess this competence; others of us must go to the university to learn how to be perceptive and competent" (ibid., 60); "[c]ompetence is simply that which is possessed by the most powerful leaders of the literary community" (ibid., 72).
The prestige of interpretation in literary studies, where the
task of the critic is to produce a more powerful interpretation,
has blocked the program of the study of literary competence as

something shared by readers, though it is implicit in any account
of narratology, for example, or of literary interpretation generally.
The cognitivist turn in literary studies (Turner 1996) provides an
opportunity for returning to aspects of literary competence and
the key question raised by the Chomskian model of the extent
to which such competence involves kinds of knowledge specific
to literature. If our ability to make sense of the world is defined
in terms of perceiving stories, organizing perceptions according
to metaphorical fields, and so on, it may be possible to go on to
identify interpretive moves that are specific to the reading and
appreciation of literary works.
Jonathan Culler
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Chomsky, Noam. 1965. Aspects of the Theory of Syntax. Cambridge,
MA: MIT Press. The classic theorization of a grammar as a description
of linguistic competence.
Culler, Jonathan. 1975. Structuralist Poetics: Structuralism, Linguistics,
and the Study of Literature. London: Routledge. Assesses structuralist
work and attempts to show that description of literary competence is a
fruitful program for literary studies.
Dane, Joseph. 1986. The defense of the incompetent reader.
Comparative Literature 38.1: 53–72. A critique of the analogy with linguistic competence and of an implicit elitism.
Schauber, Ellen, and Ellen Spolsky. 1981. Stalking a generative poetics.
New Literary History 12.3: 397–413. Starting with literary competence,
lays out a broader program.
———. 1986. The Bounds of Interpretation: Linguistic Theory and Literary
Text. Stanford, CA: Stanford University Press. Develops the conception
of a generative poetics at greater length.
Turner, Mark. 1996. The Literary Mind. New York: Oxford University Press.
An important instance of the cognitivist program in literary studies.

COMPOSITIONALITY
The principle of compositionality was first formulated by the
German philosopher Gottlob Frege (1892) and is also referred to as the Frege principle. It states that the meaning of a complex
expression is a function of the meaning of its parts. A mapping
from expressions to meanings that satisfies this principle is
called compositional. Frege identified compositionality as a basic
requirement for an account of the meaning of natural language
(see language, natural and symbolic), and all serious
accounts of sentence meaning are compositional. Therefore,
current research seeks to find more restrictive notions of compositionality that can be used to assign a degree of compositionality
to a semantic analysis, as discussed in this entry. The question
of compositionality has also been asked for nonlinguistic communication systems among humans and other species, which I
mention towards the end.
For a semantics of natural language, compositionality is
a basic requirement because humans can generate infinitely
many sentences (see discrete infinity) and associate them
with meanings drawn from an infinite set. Since human memory
is a finite resource, there can only be a finite set of memorized
lexical meanings (see lexical semantics). It follows that
natural language must contain nonlexical expressions and that
the meaning of such nonlexical expressions is determined by a


compositional procedure. Therefore, compositionality is a necessary property of any semantics of natural language that claims
complete coverage. The result, however, leaves open what the
lexical expressions of natural language are and how many composition principles there are. Often, words can be assigned a
compositional meaning; for example, the meaning of slept is the
result of sleep combined with past tense. In other cases, however, syntactically complex phrases seem to have a noncompositional meaning. For example, that kick the bucket is synonymous
with die does not follow naturally from the meanings of kick
and the bucket (cf. idioms). In the history of language, complex
expressions often take on a noncompositional meaning over
time (cf. grammaticalization).
The composition principles are closely tied to a particular
semantic theory. Compositionality plays a central role in formal
semantics and truth conditional semantics of natural
language, while other theories of language meaning have not
addressed compositionality (cf. construction grammars
and cognitive grammar). The textbook by Irene Heim and
Angelika Kratzer (1998) provides one influential account. This
account assumes that humans construct a syntactic representation of a sentence, the logical form, which is then mapped
at the syntax-semantics interface to a meaning. This mapping
is a recursive procedure (see recursion, iteration, and
metarepresentation).
In addition to the meanings of a finite set of lexical items,
general composition rules determine the meaning of complex
phrases. Of the meanings of lexical items, only some aspects
are important for composition. In Heim and Kratzer's analysis,
these aspects are captured by the semantic type. For example,
the meaning of proper names like Kai and Berlin have the type
of individuals, and the meanings of both to like and to hate are
of the type of two-place functions. The parts of a complex phrase
can be either lexical items or complex phrases themselves.
Therefore, only one composition rule is required: a rule that
combines the meanings of two subphrases into one. Heim and
Kratzer's analysis makes use of three composition rules: function
application, predicate modification, and predicate abstraction.
Which composition rule is applied is determined by the types of
the meanings of the two parts of the complex phrase. The simple
example Kai likes Berlin illustrates only function application. We
assume that the sentence consists of only three lexical items: Kai,
likes, and Berlin, though a full analysis would contain at least
present tense as well. The lexical meanings of Kai and Berlin
are the individual concepts kai and berlin. The lexical entry for
likes is the function like which applied to one individual yields
another function that, when applied to another individual, yields
a sentence meaning. The logical form of the sentence shown
in (1) determines the order in which like is composed with its
arguments.

(1)

              like(berlin)(kai)
              /              \
           Kai            like(berlin)
                          /          \
                       likes        Berlin


While necessarily abstract, the analysis captures two important aspects: 1) the commonality between the meaning of the sentence and that of structurally similar sentences, such as Jan hates
the capital of Germany, and 2) the incompleteness of examples
like *Kai likes.
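The bottom-up composition just illustrated can be sketched in a few lines of code. This is only an informal illustration, not Heim and Kratzer's formal system: the toy model, the constants kai and berlin, and the curried like function are stand-ins introduced here.

```python
# A toy illustration of type-driven function application, loosely
# modeled on the analysis described above. The individuals and the
# curried "like" function are illustrative stand-ins, not Heim and
# Kratzer's actual model-theoretic objects.

kai = "kai"        # type e (an individual)
berlin = "berlin"  # type e

def like(y):
    """Type <e,<e,t>>: applied to an object y, returns a one-place predicate."""
    def likes_y(x):
        # The "facts" of the toy model: only Kai likes Berlin.
        return (x, y) in {("kai", "berlin")}
    return likes_y

# Composing bottom-up, as in the logical form (1):
vp = like(berlin)    # like(berlin), type <e,t>
sentence = vp(kai)   # like(berlin)(kai), type t
print(sentence)      # True in this toy model
```

Because each application step is driven only by the semantic types, the same two lines of composition would handle structurally similar sentences with different lexical items.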
Recent work has pointed out a need to develop a stricter formal notion of compositionality. One motivation is the following
theorem of Wlodek Zadrozny (1994): If there is a function that
assigns a meaning to each complete expression of a language,
a compositional meaning function can also be given. This result
relies on an extension of function beyond its natural domain.
For example, we might construct a compositional semantics for
the idiom kick the bucket in the following way: First, stipulate that the bucket has, in addition to its ordinary meaning, the special symbol X as its meaning. Second, define the meaning of kick applied to X as the meaning of die, thereby compositionally defining the meaning of kick the bucket. However, this
analysis strikes most researchers as less desirable than a formally
noncompositional one. For this reason, current research tries
to formulate notions of compositionality stricter than Frege's (Kazmi and Pelletier 1998; Szabó 2000). In particular, Ali Kazmi
and Francis J. Pelletier suggest restricting the use of functions as
meanings, but it is still an open question how exactly to do this.
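Zadrozny's point can be made concrete with a small sketch of the stipulation just described (all names here are invented for illustration): the encoding is technically compositional, yet plainly gerrymandered.

```python
# Toy sketch of the stipulation described above: "the bucket" is
# assigned a special meaning X alongside its ordinary one, and "kick"
# applied to X is defined to yield the meaning of "die". Technically
# compositional, but only by brute stipulation. All names hypothetical.

X = object()  # the special idiom-trigger meaning of "the bucket"

def meaning_the_bucket(idiomatic=False):
    return X if idiomatic else "the-bucket-object"

def meaning_kick(obj):
    if obj is X:
        return "DIE"          # stipulated idiomatic reading
    return ("KICK", obj)      # ordinary compositional reading

print(meaning_kick(meaning_the_bucket(idiomatic=True)))   # DIE
print(meaning_kick(meaning_the_bucket()))                 # ('KICK', 'the-bucket-object')
```

The function is well-defined over both meanings of the bucket, which is exactly why a purely formal notion of compositionality fails to exclude it.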
Looking beyond human language, compositionality has
emerged as an important property to classify communication
systems. Tim Horton (2001) investigates the compositionality
of music (see music, language and). Even more interesting
is the case of animal communication and human evolution
(Bickerton 1990). Elizabeth Spelke (2003) proposes that compositional semantics is crucial for human intelligence. She argues
that humans and higher animals possess a similar ability to form
basic concepts. Only humans, however, via the compositional
semantics of language have the ability to combine these basic
concepts into an infinite array of derived meanings.
Uli Sauerland
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Bickerton, Derek. 1990. Language and Species. Chicago: University of
Chicago Press.
Fodor, Jerry, and Ernest Lepore. 2002. The Compositionality Papers.
Oxford: Oxford University Press.
Frege, Gottlob. 1892. Über Sinn und Bedeutung. Zeitschrift für Philosophie und philosophische Kritik, NF 100: 25–50.
Heim, Irene, and Angelika Kratzer. 1998. Semantics in Generative
Grammar. Oxford: Blackwell.
Horton, Tim. 2001. The compositionality of tonal structures: A generative approach to the notion of musical meaning. Musicae Scientiae 5.2: 131–56.
Kazmi, Ali, and Francis J. Pelletier. 1998. Is compositionality formally
vacuous? Linguistics and Philosophy 23: 629–33.
Partee, Barbara. 2006. Compositionality in Formal Semantics.
Oxford: Blackwell.
Spelke, Elizabeth. 2003. What makes us smart? Core knowledge and
natural language. In Language in Mind, ed. Dedre Gentner and Susan
Goldin-Meadow, 279–311. Cambridge, MA: MIT Press.
Szabó, Zoltán Gendler. 2000. Compositionality as supervenience. Linguistics and Philosophy 23: 475–505.
Zadrozny, Wlodek. 1994. From compositional to systematic semantics.
Linguistics and Philosophy 17: 329–42.

COMPUTATIONAL LINGUISTICS
Computational linguists develop working models of various
aspects of languages in the form of computer programs. These
models fall under three main headings: analysis, generation, and
learning. Analysis models take in (usually typewritten) texts and
figure out the details of their linguistic structure, possibly producing a meaning representation. Starting from an abstract representation of a meaning, generation models compose text (e.g.,
a sentence) expressing that meaning in a particular language.
Some systems combine analysis and generation with other tasks.
For example, a database enquiry system analyzes queries in order
to figure out what information is sought, retrieves the requested
information from a database, and uses a generation system to
express that information in natural-language output. Machine
translation systems analyze input in one language and generate
corresponding expressions in a different language. Most systems
rely on grammar rules and resources such as text corpora and
dictionaries. Machine learning researchers build models that
learn the relevant information from training examples to avoid
hand-crafted rules.

Grammatical and Lexical Analysis


Computational analysis of linguistic input using a parsing
algorithm quickly reveals the ambiguity of syntax, due in
part to the fact that many words have several word classes.
Parsing is greatly assisted if the lexical category can be resolved
by looking at each word's neighbors, a task performed by a
part-of-speech tagging (i.e., labeling) program. The two main
approaches are i) rule-based versus ii) probabilistic. The rule-based approach exploits rules such as "the can precede a noun or adjective but never a verb" to rule out contextually inappropriate
part-of-speech tags. These rules must be written and tested by an
expert and need to be debugged by trial and error since no one
can discover all the correct rules immediately. The probabilistic
approach determines the most likely sequence of tags, calculated
using probability theory according to the frequency with which
one tag follows another in a training corpus. For example, Daniel
Jurafsky and James H. Martin (2000, 305) estimate the probability that race is a noun as P = 0.000007 if the previous word is to; it
is more likely to be a verb (P = 0.00001) in that context.
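The probabilistic idea can be sketched with maximum-likelihood tag-bigram estimates; the tiny tagged corpus below is invented, and real taggers combine such transition probabilities with word-given-tag probabilities in a hidden Markov model.

```python
# A minimal sketch of the probabilistic approach described above:
# estimate how often each tag follows a given tag in a (tiny,
# made-up) training corpus, then prefer the likelier tag in context.
from collections import Counter

# Hypothetical tagged training data: (word, tag) pairs in sequence.
corpus = [("to", "TO"), ("race", "VB"), ("the", "DT"), ("race", "NN"),
          ("to", "TO"), ("run", "VB"), ("the", "DT"), ("dog", "NN")]

bigrams = Counter()   # counts of (previous tag, next tag)
unigrams = Counter()  # counts of each tag in "previous" position
for (w1, t1), (w2, t2) in zip(corpus, corpus[1:]):
    bigrams[(t1, t2)] += 1
    unigrams[t1] += 1

def p_tag_given_prev(tag, prev):
    """P(tag | prev) by maximum-likelihood estimation."""
    if unigrams[prev] == 0:
        return 0.0
    return bigrams[(prev, tag)] / unigrams[prev]

# After "to" (tag TO), VB is far more likely than NN in this toy corpus:
print(p_tag_given_prev("VB", "TO"))  # 1.0
print(p_tag_given_prev("NN", "TO"))  # 0.0
```

With realistic corpora the estimates would be smoothed, since a zero count for an unseen bigram should not rule a tag out entirely.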
Both approaches face difficulties with new words and new
uses of old words. Web pages, for example, contain so many
names, new technical terms (especially new compounds), and
misspellings that about 15 percent of words are not listed in
dictionaries. In such cases, morphological analysis of word
structure (perhaps using probabilistic methods) may help. In
agglutinative languages (such as Finnish) and in languages with
complex morphological patterns (such as Arabic), it is almost
essential. No dictionary contains the word Shamoization
(coined for this article), but we can infer the stem Shamo and
be fairly confident that it is a proper noun, because -ization combines with nouns and it begins with a capital letter. With productive morphemes like -ability in English, decomposition of
words into stems and affixes can reduce the dictionary size and
may also help us deal with new words (crushability, etc.).
The commonest methods for morphological analysis use
finite-state automata, corresponding to the least powerful kind

of grammar in the Chomsky hierarchy. Despite their limitations,
finite-state automata are efficient and adequate for lexical preprocessing, including decomposition of words into morphemes
(e.g., geese → goose + Nplu) and normalization of spelling (e.g., driver → [drive/V + er]Nsing). For text-to-speech conversion, interpretation of abbreviations and symbols may be necessary, too, for example, Mr. → Mister, £4.36 → four pounds thirty-six (not pound four point three six).
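The decomposition step can be sketched as naive suffix stripping with crude spelling normalization; real systems compile such rules into finite-state transducers, and the suffix inventory below is a made-up fragment.

```python
# A sketch of the suffix-stripping side of morphological analysis
# described above. Real systems use finite-state transducers; this
# naive loop only illustrates decomposition. The suffix inventory
# is a hypothetical fragment, not a real morphological lexicon.

SUFFIXES = [
    ("ization", ["-ize", "-ation"]),
    ("ability", ["-able", "-ity"]),
    ("er", ["-er"]),
]

def normalize(stem):
    # Crude spelling normalization, e.g. driv -> drive.
    return stem + "e" if stem.endswith("v") else stem

def decompose(word):
    for suffix, morphemes in SUFFIXES:
        if word.endswith(suffix) and len(word) > len(suffix):
            return [normalize(word[: -len(suffix)])] + morphemes
    return [word]

print(decompose("Shamoization"))  # ['Shamo', '-ize', '-ation']
print(decompose("crushability"))  # ['crush', '-able', '-ity']
print(decompose("driver"))        # ['drive', '-er']
```

Even this crude sketch shows how decomposition shrinks the dictionary: crushability need not be listed if crush and the productive affixes are.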

Semantic Analysis
In real-world applications of computational linguistics, syntactic and morphological analysis are merely a means to an end.
In database enquiry systems or machine translation, we may
compute a representation of the meaning of the input. One of
the difficulties facing the computational treatment of meaning
is presented by groups of words with similar or related meanings. For example, in "What's the first class fare?" and "What's the price of a first class ticket?," fare and price have almost the
same meaning (i.e., the answer would be found in the same entry
in the ticket-supplier's database). In order to recognize lexical
relations, many systems use computationally implemented
thesauruses, such as WordNet, Princeton University's lexical
database for English.
To represent sentence meanings, computational linguistics often employs formal logic. A question, for example, can be
translated into a logical proposition with some information missing. Experts in formal semantics may express the meaning of
"I would like the cheapest flight from Washington to Atlanta" as
the predicate calculus formula:
(1) ∃A. flight(A) & from(A,washington) & to(A,atlanta) & cheapest(A) & like(i,A)

that is, "There exists a flight A and A is from Washington and A is to Atlanta and A is cheapest and I like A." In order to answer the
question, an information retrieval system (e.g., a ticketing system) could search its flight information database to find one with
the desired origin, destination, and so on. Formulae like (1) can
be automatically converted to statements of a database query
language, or used in programming languages such as Prolog
(Clocksin and Mellish 2003).
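The route from a formula like (1) to a database search can be sketched by treating each conjunct as a filter over records; the flight data and field names below are invented for illustration.

```python
# A sketch of how a formula like (1) can drive a database search:
# each ordinary conjunct becomes a filter over flight records.
# The flight data and field names are invented for illustration.

flights = [
    {"id": "AA1", "from": "washington", "to": "atlanta", "price": 120},
    {"id": "DL2", "from": "washington", "to": "atlanta", "price": 95},
    {"id": "UA3", "from": "boston", "to": "atlanta", "price": 80},
]

# Conjuncts of (1) as predicates over a record A:
conjuncts = [
    lambda a: a["from"] == "washington",
    lambda a: a["to"] == "atlanta",
]

candidates = [a for a in flights if all(c(a) for c in conjuncts)]
# "cheapest(A)" is a superlative, so it applies to the candidate set:
cheapest = min(candidates, key=lambda a: a["price"])

print(cheapest["id"])  # DL2
```

In practice the formula would be translated into a query language such as SQL or into Prolog goals rather than evaluated by hand like this.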

Machine Translation
Conversion from natural languages to logical formulae and vice
versa suggests one method of machine translation:
(2) Language 1 input → Logical representation of its meaning (same for all languages) → Output in language 2.

In this scheme, logic is used as an interlingua. Rather than going directly from one language to the other, an interlingua is attractive because different languages can express similar ideas in
quite different ways. For example, the English verb like translates in Japanese as suki desu, an adjective + "to be" ("is likeable"). In Irish, "I can X" is expressed as "X is possible for me," using an
adjective. Using an intermediate meaning representation might
overcome such grammatical differences between languages.
Also, to translate between a large number of different languages
(e.g., the 21 official languages of the European Union), it seems
simpler to translate them all into an interlingua (21 language


pairs) than to develop translation rules and bilingual dictionaries for all 210 pairs. One technical problem with this approach
is that logical formulae are not unique representations of meaning: for example, "if A then B" is equivalent to "not (A and not B)." But an overseas booking clerk might be confused if your
statement "If meals aren't served in economy class, I want a first class ticket" were translated as "I don't want no meals served in economy class and a first class ticket," even though this is
logically correct. Consequently, real machine translation systems combine transfer methods, which map structures of one
language to the other, direct methods that use word and phrase
correspondences with as little linguistic manipulation as possible, and some statistics to help choose the most likely ways of
expressing the output.
Machine translation and information retrieval require
the generation of linguistic output, a task with its own particular challenges. When there are many equivalent ways of saying
the same thing (e.g., John drove the car, the car was driven by
John), the most appropriate variant must be chosen, observing
pragmatic conventions, such as putting given information
before new, the conventional order of words ("big, red bus," not "red, big bus"), and the time sequence they suggest: "The accused broke his leg and fell out of the window" does not mean the same as "the accused fell out of the window and broke his leg." Sentence generation often uses a slot-filling technique: The
agent of an action is placed in subject position; the undergoer
is the object, and so on. focus might prompt a particular sentence pattern, for example, "it was the policeman who broke his leg." Outputs are also generated in dialogue systems and in text
summarization. Dialogue systems collect information from the
user and provide information that the user requires according
to the accepted conventions of dialogue sequence. The dialogue
may be managed via a script that successively prompts the user
for gobbets of information. This is akin to form filling, as when
purchasing products on the Internet. The order in which the
user gives the information may not matter so long as all required
fields are eventually filled in. To navigate its script, the system
takes the lead in the conversation.
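The slot-filling technique described above can be sketched as template realization from semantic roles; the role names and templates are invented, and a real generator would also handle agreement, tense, and pragmatic ordering.

```python
# A minimal sketch of slot-filling sentence generation as described
# above: semantic roles are mapped into syntactic positions in a
# sentence template. Role names and templates are hypothetical.

def generate(event, focus=None):
    if focus == "agent":
        # A cleft pattern foregrounds the agent.
        return f"it was {event['agent']} who {event['verb']} {event['undergoer']}"
    # Default: agent -> subject position, undergoer -> object position.
    return f"{event['agent']} {event['verb']} {event['undergoer']}"

event = {"agent": "John", "verb": "drove", "undergoer": "the car"}
print(generate(event))                 # John drove the car
print(generate(event, focus="agent"))  # it was John who drove the car
```

Choosing between the two realizations is exactly the kind of pragmatic decision (given before new, focus marking) that the text describes.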
In text summarization, documents are analyzed in order to
extract the most important pieces of information according to
various criteria, such as discourse structure and word frequency.
This information is then used to generate a summary of the original, to a required length. Often, the summary simply consists of
the most relevant extracts of the original.

Probabilistic Methods
Computational linguists employ a wide range of probabilistic
methods that are helpful in various problems, especially sentence
and word-sense disambiguation. For example, "the girl saw the dog with the telescope" has at least two structures and meanings,
depending on whether the girl or the dog has the telescope. Both
these structures and meanings are legitimate, but in real-world
applications such as machine translation, we may need to determine which structure and meaning is intended. Parsers quickly
reveal that average-length sentences have many possible structures, some quite implausible and unwanted. It is impractical to
model a speakers world knowledge, such as the fact that dogs
cannot have telescopes. But it is feasible to use the statistics of

word combinations, such as the fact that telescope occurs with see
more often than dog, to select more likely analyses.
Probabilities can also help with word-sense disambiguation: For example, in "she joined the club," it is not hard to work out that club is more likely to be "association of persons" than "heavy staff of wood" or "suit of cards," simply on the basis of the frequency of the collocation join club. Consequently, lexical
semantic analysis often combines probabilistic methods and
symbolic resources such as thesauruses.
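The collocation-frequency heuristic can be sketched directly; the sense labels and counts below are invented stand-ins for statistics gathered from a training corpus.

```python
# A sketch of the frequency-based sense choice described above: pick
# the sense of "club" whose collocation with the verb "join" is most
# frequent. The sense labels and counts are invented stand-ins for
# corpus statistics.

collocation_counts = {
    ("join", "club/association"): 250,
    ("join", "club/weapon"): 1,
    ("join", "club/card-suit"): 0,
}

def disambiguate(verb, noun_senses):
    """Return the noun sense with the highest collocation count."""
    return max(noun_senses, key=lambda s: collocation_counts.get((verb, s), 0))

senses = ["club/association", "club/weapon", "club/card-suit"]
print(disambiguate("join", senses))  # club/association
```

A thesaurus such as WordNet would supply the sense inventory here, with the probabilities doing the selection, which is the combination of symbolic and probabilistic resources the text describes.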

Learning
In order to determine probabilities of rules, word senses, and so
on, a system must be trained on a corpus of language data, in
effect learning them. We would like computers to do more of the
hard work of finding the best grammar for a language. Consider
the pairs of rules that, according to X-BAR THEORY, define the
structure of noun phrases in all languages:
(3) a) NP → Det N1 (as in English) or b) NP → N1 Det (as in Norwegian)

(4) a) N1 → Adj N1 (as in English) or b) N1 → N1 Adj (as in French)

(5) a) N1 → PP N (as in German or Japanese) or b) N1 → N PP (as in English)

Given these predetermined possible rules, a language learner
(computer or human) only needs to count the number of times
each rule is applicable. In English, rule (4a) will be applicable
every time an adjective precedes a noun, whereas in French, rule
(4b) will be applicable where adjectives follow nouns. By counting the frequencies with which these rules apply, a learner can
soon work out that (4a) is more suitable for English than (4b) and
vice versa for French. Probabilistic methods are now so common
in computational linguistics that a detailed review is impossible
here. One framework for computation and learning, however,
connectionism, is much used in some areas and has attracted
considerable interest from psycholinguists.
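The counting procedure described for rules (4a) and (4b) can be illustrated with a short Python sketch; the miniature tagged corpora and tag names below are invented for the example.

```python
from collections import Counter

# Sketch of learning rule (4a) vs. (4b) by counting: tally how often an
# adjective precedes or follows a noun in a toy part-of-speech-tagged
# corpus, then pick the more frequent order. Data and tags are invented.
def learn_adj_order(tagged_sentences):
    counts = Counter()
    for sentence in tagged_sentences:
        for (w1, t1), (w2, t2) in zip(sentence, sentence[1:]):
            if t1 == "ADJ" and t2 == "NOUN":
                counts["Adj N"] += 1   # rule (4a): N1 -> Adj N1
            elif t1 == "NOUN" and t2 == "ADJ":
                counts["N Adj"] += 1   # rule (4b): N1 -> N1 Adj
    return counts.most_common(1)[0][0]

english = [[("the", "DET"), ("red", "ADJ"), ("car", "NOUN")],
           [("a", "DET"), ("big", "ADJ"), ("dog", "NOUN")]]
french = [[("le", "DET"), ("vin", "NOUN"), ("rouge", "ADJ")],
          [("une", "DET"), ("idée", "NOUN"), ("fixe", "ADJ")]]
print(learn_adj_order(english))  # → Adj N
print(learn_adj_order(french))   # → N Adj
```

Scaled up to a real tagged corpus, the same tallying yields the rule frequencies a probabilistic grammar needs.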
John Coleman
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Clocksin, William F., and Christopher S. Mellish. 2003. Programming in
Prolog: Using the ISO Standard. Berlin: Springer. An authoritative yet
readable textbook.
Coleman, John. 2005. Introducing Speech and Language Processing.
Cambridge: Cambridge University Press. An elementary introduction
to the field, aimed at readers with a less technical background (especially linguists).
Jurafsky, Daniel, and James H. Martin. 2000. Speech and Language
Processing. Upper Saddle River, NJ: Prentice Hall. A highly respected
and compendious textbook.
Manning, Christopher D., and Hinrich Schütze. 1999. Foundations of
Statistical Natural Language Processing. Cambridge, MA: MIT Press.
Enormous and comprehensive.

CONCEPTS
Concepts are the central constructs in most modern theories of
the mind. Humans (and arguably other organisms) are seen as living in a conceptually categorized world (see categorization).

Objects and events (from household items to emotions to gender to democracy), although unique, are acted toward as members
of classes. Without this ability to categorize, it would be impossible to learn from experience. Since at least the nineteenth
century, it has been common to refer to the mental or cognitive aspect of categories as concepts. Philosophy, psychology,
computer science, and linguistics have all made contributions
to conceptual theory and research. At present, there are seven
major views of the nature of concepts that form the basis for
inquiry and debate: 1) the classical view, 2) the prototype and
graded structure account, 3) the theory theory, 4) neoclassical combination models, 5) connectionist computer models,
6) conceptual atomism, and 7) nonrepresentational ecological
approaches.

The Classical View


The classical view is the approach to concepts derived from the
history of Western philosophy. When humans begin to look at
their experience by means of reason, questions about the reliability of the senses and the bases for knowledge arise, as do
more specific questions about how categories can have generality (called the problem of universals), how words can have
meaning, and how concepts in the mind can relate to categories
in the world. The Greeks and most Western philosophers ever
since have agreed that experience of particulars, as it comes
moment by moment through the senses, is unreliable; therefore,
only stable, abstract, logical categories can function as objects
of knowledge and objects of reference for the meaning of words.
To fulfill these functions: a) conceptual categories had to be
exact, not vague (i.e., have clearly defined boundaries), and b)
their members had to have attributes in common, which were
the necessary and sufficient conditions for membership in the category. It follows that c) all members of the conceptual category were equally good with regard to membership;
either they had the necessary common features or they didn't.
Categories were thus seen as logical sets. It is on this basis that
conceptual categories could be the basis for logical inferences
as in the familiar All men are mortal; Socrates is a man; Socrates
is mortal (see verbal reasoning). This is also the basis of the
way in which words are defined by genus and differentia in our
dictionaries.
In psychology, the first body of research on concept learning
mirrored the philosophers view of conceptual categories. Led by
the work of Jerome Bruner and his associates (Bruner, Goodnow,
and Austin 1956), subjects were asked to learn categories that
were logical sets defined by explicit attributes, such as red and
square, combined by logical rules, such as and. Theoretical interest was focused on how subjects learned the attributes that were
relevant and the rules that combined them. In developmental
psychology, the theories of Jean Piaget and Lev Vygotsky were
combined with the concept-learning paradigm to study how
children's ill-structured, often thematic, concepts developed
into the logical adult mode. Artificial stimuli were typically used
in research at all levels, structured into microworlds in which
the prevailing beliefs about the nature of categories were already
built in.
In linguistics, most mainstream twentieth-century work
in phonology, semantics, and syntax rested on the

assumptions of the classical view. phonemes were analyzed as
sets of universal, abstract, binary features (Chomsky and Halle
1968). word meaning, the province of semantics, was likewise represented by a componential analysis of features (see
feature analysis); for example, bachelor was rendered as the
features +human, +male, +adult, +never married (Katz and Postal
1964). A complex concept such as bald bachelor was considered
the unproblematic intersection of the features of bachelor with
those of bald. Synonymy, contradiction, and other relational
aspects of word meaning were accounted for in a similar fashion.
Syntax was analyzed by formal systems such as transformational grammar (Chomsky 1965) that also relied on decomposition into features (see Taylor 2003). Such an understanding
of language was adopted with enthusiasm by computer science
because meaning could be divorced from world knowledge and
readily represented by the substitutable strings of symbols on
which computers work.

Prototypes and Graded Structure


Consider the color red: Is red hair as good an example of your
idea or image of red as a red fire engine? Is a dentist's chair as
good an example of chair as a dining room chair? Are you immediately sure how to classify and name every color and object you
see? From its inception as a discipline separate from philosophy,
psychology has investigated types of learning and behavior that
show graded effects. For example, Ivan Pavlov's dogs produced
decreasing amounts of saliva as tones grew farther from the tone
originally combined with meat powder. This is called stimulus
generalization. Note how different it is from the classical view of
conceptual categories.
The first programmatic, empirically based challenge to the
classical view came from Eleanor Rosch's work on prototypes
and graded structure (Rosch 1978, 1999; Rosch and Lloyd 1978).
A wide variety of conceptual categories were shown to have
gradients of membership; that is, subjects can easily, rapidly,
and meaningfully rate how well a particular item fits their idea
or image of its category. This is true for perceptual, semantic,
social, biological, and formal concepts. In contrast, subjects cannot list criterial attributes for most categories (Rosch and Mervis
1975). More importantly, the psychological import of gradients
of membership was demonstrated by their effect in a series of
experiments on virtually every major method of study and measurement used in psychological research: learning, association,
speed of processing, expectation, inference, probability judgments, and judgments of similarity. Rosch suggested a model
in which categories formed around perceptually, imaginally, or
conceptually salient stimuli, which she called prototypes, then,
by stimulus generalization, spread to other similar stimuli without necessarily any analyzable criterial attributes, formalizable
definitions, or definite boundaries. It is the prototype that was
claimed to mentally represent the category for most purposes. A
profusion of factors have been found to create prototypes: physiological saliency, statistical frequencies (including central tendencies and family resemblances), social structure, formal
structure, extremes of attribute dimensions, cultural ideals,
causal beliefs, and particular stimuli (exemplar theories
are based on these) that are the first learned, or most recently
encountered, or the most emotionally charged, vivid, concrete,


meaningful, or interesting. The classical view of concepts cannot


deal with any of this.
The prototype view has spread beyond psychology to many
fields, including linguistics and narratology. Gradients of
exemplariness are ubiquitous in linguistic phenomena, even
in phonology where actual speech is less clear-cut than would
appear in an abstract componential analysis. In semantic and
syntactic analyses (particularly in cognitive grammar and
the understanding of metaphor), prototype effects, in addition
to providing specific case studies, are often used as evidence that
formal analysis is insufficient of itself and that world knowledge
must be part of ones theory (Lakoff 1987a, 1987b; Langacker
1990; Taylor 2003).

Theories
The theories approach to concepts takes advantage of people's
intuitions that life activities and the concepts that map them take
place in a context larger than is offered by either formal description or laboratory experiments. The basic claim is that concepts
get their meaning through mental theories. There are actually
two groups of theory theorists: cognitivist-oriented cognitive
psychologists, who primarily address categorization issues, and
developmental psychologists of the theory theory school, who
address conceptual change. The first group (Medin 1989; Medin
and Wattenmaker 1987; Murphy and Medin 1985) has used the
idea of theories primarily as criticism of previous categorization
research. These theorists point out that previous accounts of
concepts cannot properly define or explain either attributes or
similarity and that previous experiments on conceptual categories are all subject to context effects; for example, judgments of
the prototypicality of animals change if a zoo context is specified. They do not, however, themselves give an account of attributes or similarity, nor do they specify what a theory is or give
any concrete examples of a theory defining a concept.
In contrast, the theory-theory school of developmental psychology (see Gopnik and Meltzoff 1997) explicitly defines theory
as analogous to scientific theories, much like Kuhnian paradigms,
and argues that cognitive development should be viewed as the
successive replacement of one paradigm theory held by the child
by another. Interest in concepts tends to be from the point of
view of change in the child's (rather than the researcher's) theory
of what a concept is. When specific concepts are studied (such
as biological types; see Carey 1985; Keil 1979), the thrust is to show
them as parts of larger theoretical units.
In linguistics, the discussion tends to be formulated in terms
of the relation of word meaning to general knowledge in a variety of specific contexts. Such contexts have been characterized as
schemas, frames, scripts, image schemas, domains, and
perspectivization. For example, Ronald Langacker (1990) talks
of the seven-day week as the semantic domain within which
Monday is understood, and George Lakoff (1987a) points to five
frames needed to explain our use of the word mother (genetic,
birth, nurturance, genealogical, and marital). Computer science
has worked on similar formulations in the design of the type of
program known as story understanders. Such work tends to be
classified under theories despite its lack of general explanatory
hypotheses because, lacking specification of what is to count
as a theory, virtually any demonstration of the embedding of

individual concepts in larger semantic complexes or in world
knowledge has been argued as support for the theories view, a
diffuseness that has also been used as a critique of that view.

Neoclassical Combination Models


Such approaches are called neoclassical because they incorporate elements of the classical view, considering it a necessary
basis, but add elements from other approaches. Psychology
and linguistics treat the issue differently. Psychology offers dual
models. These typically begin with criticisms of prototype theory,
some of which seem based on misinterpretation (e.g., taking
graded structure as a probability distribution [Smith and Medin
1981] or limiting prototypes to only one type of prototype), but
most seem to be based on the philosophical intuition that the
real meaning of a concept, that to which the concept refers, must
be the identifiable necessary attributes of a classical definition.
Two main types of evidence are offered for this account. One is
that prototype and graded structure effects can be found for conceptual categories that have a formal classical definition, such
as odd number (Armstrong, Gleitman, and Gleitman 1983), the
other that prototypes do not form componential combinations
as do the elements of classical definitions (e.g., a good example
of pet fish is neither a prototypical pet nor a prototypical fish
[Osherson and Smith 1981]). Both findings are taken to indicate
that prototypes are something other than and irrelevant to a concepts meaning. The solution is a dual model in which prototypes
are assigned the function of rapid recognition of conceptual
referents, whereas the true meaning is provided by a classically
defined core (Osherson and Smith 1981; Smith, Shoben, and
Rips 1974).
Linguistic models are more complex. All wish to include structured real-world knowledge in some form, along with at least a
minimum of necessary defining classical attributes. Some forgo
a complete characterization of the concept to concentrate on
grammatically relevant structure (Pinker 1989). Perhaps the most
complete attempt to cover all bases is Ray Jackendoffs (1983)
account of the conditions needed to specify word meaning; these
include partial definitions (red must at least include color), gradients of relevant attributes (such as hue for colors), and sensory
specifications, such as a model of what the referent looks like.

Connectionist Computer Models


The previous accounts of concepts are all, to greater or lesser
extent, based on the idea of mental representations and are
formulated at the symbolic level of mental functioning, the theories view being the most top-down. In sharp contrast, connectionist semantics seeks to derive apparently symbolic functions
from subsymbolic mechanistic neuron-like processes, such
as weighted connections among units, the strengths of which
are gradually adjusted on the basis of feedback to the program
(Rogers and McClelland 2004). A question for this approach,
as for much present psychology, is the extent to which findings
about biological (or pseudobiological) substrates are to preclude
feedback and explanation at the higher symbolic levels.

Conceptual Atomism
All of the previous approaches are subject to philosophical criticisms, even the classical views (for reviews, see Fodor 1998 and

Laurence and Margolis 1999). Jerry Fodors (1990, 1998) conceptual atomism attempts to sidestep such issues by arguing that
the concept BIRD (Fodors notation) simply expresses the single
atomic property bird. The concept derives that meaning from its
causal history (as in Kripke [1972] 1980 and Putnam 1975; see
also essentialism and meaning). Concepts have no structure and are not decomposable into any kind of properties, internal or external. This view has not been adopted by psychologists
who presume (despite Fodor's denial) that atomic concepts
would need to be innate (how could they be learned?) and that
it seems highly unreasonable to propose biologically innate concepts for everything in the universe, including televisions, penicillin, and so on. More broadly, conceptual atomism has not so
far appeared generative of psychological empirical research.

Nonrepresentational Ecological Approaches


This is the approach that deals with the use of concepts. Ludwig
Wittgenstein (1953) is much cited in concept research for family resemblances, but that observation was a small fragment of
his argument that concepts and words should be understood
not referentially but as embedded interactive parts of forms of
life. Since forms of life show up in a succession of situations,
one way to approach this issue is to study concepts in their natural situational environments (for example, Cantor, Mischel, and
Schwartz 1982). Another way is to analyze concepts as mental
simulators of situations (Kahneman, Slovic, and Tversky 1982),
and another is to show the contextual sensitivity of concepts
(Barsalou 1987; see also ad hoc categories). Yet a different
point of entry into conceptual use may be to ask how the basic
level of abstraction, the default level at which conceptual categories appear to be named and understood (chair rather than
furniture or office chair), maps onto the forms of human activities
(Rosch et al. 1976). General theoretical ecological accounts are
provided in Rosch (1999) and Gabora, Rosch, and Aerts (2007).

Conclusion
Concepts occur in use only in particular moments in particular
situations. From the perspective of their use, one can see the
aspects of mental and interpersonal activities in which each of the
seven accounts of concepts offers insight. For example, one might
be actively seeking to find the attributes for a classical definition
by means of thoughts that use mental prototypes. This would be
done against a background of loosely organized frames, scripts,
and so on, the sort of knowledge structures toward which the
theories view points. The concepts being used could be atomic
and inherent to that moment, given that recognition of items had
already been performed and was now inherited from previous
moments. In short, each of the views maps a particular intuition
that humans seem to hold about concepts: The classical view
offers essences of a sort; the prototype view highlights concrete,
holistic mental representations; the theories view points toward
a background of conceptually structured world knowledge; connectionism points to a subsymbolic neuronal substrate; atomism
brings in history (and simplicity); combination models attempt
to bring it all together; and the ecological approach attempts to
bring it all together in terms of the ways in which concepts participate in real-world uses.
Eleanor Rosch

WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Armstrong, Sharon, Lila Gleitman, and Henry Gleitman. 1983. What
some concepts might not be. Cognition 13: 263–308.
Barsalou, Lawrence. 1987. The instability of graded structure: Implications
for the nature of concepts. In Neisser 1987, 101–40.
Bruner, Jerome, Jacqueline Goodnow, and George Austin. 1956. A Study
of Thinking. New York: Wiley.
Cantor, Nancy, Walter Mischel, and J. C. Schwartz. 1982. A prototype
analysis of psychological situations. Cognitive Psychology 14: 45–77.
Carey, Susan. 1985. Conceptual Change in Childhood. Cambridge,
MA: MIT Press.
Chomsky, Noam. 1965. Aspects of the Theory of Syntax. Cambridge, MA: MIT
Press.
Chomsky, Noam, and Morris Halle. 1968. The Sound Pattern of English.
New York: Harper and Row.
Fodor, Jerry. 1990. Information and representation. In
Information, Language, and Cognition, ed. Philip Hanson, 175–90.
Vancouver: University of British Columbia Press.
Fodor, Jerry. 1998. Concepts: Where Cognitive Science Went Wrong. New
York: Oxford University Press.
Gabora, Liane, Eleanor Rosch, and Diederik Aerts. 2007. Toward an
ecological theory of concepts. Ecological Psychology 20.1: 84–116.
Gopnik, Allison, and Andrew Meltzoff. 1997. Words, Thoughts, and
Theories. Cambridge, MA: MIT Press.
Jackendoff, Ray. 1983. Semantics and Cognition. Cambridge MA: MIT
Press.
Kahneman, Daniel, Paul Slovic, and Amos Tversky, eds. 1982. Judgment
under Uncertainty: Heuristics and Biases. New York: Cambridge
University Press.
Katz, Jerrold, and Paul Postal. 1964. An Integrated Theory of Linguistic
Descriptions. Cambridge, MA: MIT Press.
Keil, Frank. 1979. Semantic and Conceptual Development: An Ontological
Perspective. Cambridge: Harvard University Press.
Kripke, Saul. [1972] 1980. Naming and Necessity. Cambridge: Harvard
University Press.
Lakoff, George. 1987a. Cognitive models and prototype theory. In
Neisser 1987, 63–100.
Lakoff, George. 1987b. Women, Fire, and Dangerous Things: What Categories
Reveal about the Mind. Chicago: University of Chicago Press.
Langacker, Ronald. 1990. Concept, Image, and Symbol: The Cognitive
Basis of Grammar. Berlin: Mouton de Gruyter.
Laurence, Stephen, and Eric Margolis. 1999. Concepts and cognitive
science. In Concepts: Core Readings, ed. Eric Margolis and Stephen
Laurence, 3–81. Cambridge, MA: MIT Press.
Medin, Douglas. 1989. Concepts and conceptual structure. American
Psychologist 44: 1469–81.
Medin, Douglas, and William Wattenmaker. 1987. Cognitive cohesiveness, theories, and cognitive archeology. In Neisser 1987, 25–62.
Murphy, Gregory, and Douglas Medin. 1985. The role of theories in conceptual coherence. Psychological Review 92: 289–316.
Neisser, Ulric, ed. 1987. Concepts and Conceptual Development: Ecological
and Intellectual Factors in Categorization. Cambridge: Cambridge
University Press.
Osherson, Daniel, and Edward Smith. 1981. On the adequacy of prototype theory as a theory of concepts. Cognition 9: 35–58.
Pinker, Steven. 1989. Learnability and Cognition: The Acquisition of
Argument Structure. Cambridge, MA: MIT Press.
Putnam, Hilary. 1975. Mind, Language and Reality. New York: Cambridge
University Press.
Rogers, Timothy, and James McClelland. 2004. Semantic Cognition: A
Parallel Distributed Processing Approach. Cambridge, MA: MIT Press.
Rosch, Eleanor. 1973. Natural categories. Cognitive Psychology
4: 328–50.

Rosch, Eleanor. 1978. Principles of categorization. In Cognition and
Categorization, ed. Eleanor Rosch and Barbara Lloyd, 27–48. Hillsdale,
NJ: Lawrence Erlbaum.
Rosch, Eleanor. 1996. The environment of minds: Towards a noetic and hedonic
ecology. In Cognitive Ecology (Handbook of Perception and Cognition).
2d ed. Ed. Morton Friedman and Edward Carterette, 5–24. San Diego,
CA: Academic Press.
Rosch, Eleanor. 1999. Reclaiming concepts. Journal of Consciousness Studies
6.11/12: 61–77.
Rosch, Eleanor, and Barbara Lloyd, eds. 1978. Cognition and
Categorization. Hillsdale, NJ: Erlbaum.
Rosch, Eleanor, and Carolyn Mervis. 1975. Family resemblances: Studies
in the internal structure of categories. Cognitive Psychology
7: 573–605.
Rosch, Eleanor, Carolyn Mervis, Wayne Gray, David Johnson, and
Penelope Boyes-Braem. 1976. Basic objects in natural categories.
Cognitive Psychology 8: 382–439.
Smith, Edward, and Douglas Medin. 1981. Categories and Concepts.
Cambridge: Harvard University Press.
Smith, Edward, Edward Shoben, and Lance Rips. 1974. Structure and
process in semantic memory: A featural model for semantic decisions. Psychological Review 81: 214–41.
Taylor, John. 2003. Linguistic Categorization. Oxford: Oxford University
Press.
Wittgenstein, Ludwig. 1953. Philosophical Investigations. New
York: Macmillan.

CONCEPTUAL BLENDING
Conceptual blending is a basic mental operation that has been
explored as a central mechanism indispensable to grammar and
language. Conceptual blending leads to new meaning, global
insight, and conceptual compressions useful for memory and
manipulation of otherwise diffuse ranges of meaning. It plays a
fundamental role in the construction of meaning in everyday life,
in the arts and mathematics, in the natural sciences, and in the
social and behavioral sciences. The essence of the operation of
conceptual blending is to construct a partial match between two
inputs and to project (see projection [blending theory])
selectively from those inputs into a blended mental space,
which dynamically develops emergent structure.
It has been suggested that the capacity for complex conceptual blending (double-scope integration) is the crucial capacity
needed for language and for higher-order cognition of the sort
that characterizes cognitively modern human beings.
A systematic study of conceptual blending was initiated in
1993 by Gilles Fauconnier and Mark Turner, who discovered its
structural uniformity and wide range of applications. The central introductory statement of the field is Fauconnier and Turner 2002.
(See also Turner 1996 and 2001; Fauconnier 1997; Fauconnier
and Turner 1996 and 1998; and Turner and Fauconnier 1999.)
The blending Web site at http://blending.stanford.edu presents
an extensive body of work done since then by many researchers in various fields on the theory of conceptual blending and its
empirical manifestations in language and grammar, mathematics, art, natural and social science, literature, social pragmatics,
and music. Additional research considers mathematical and
computational modeling of conceptual blending and experimental investigation in the cognitive neuroscience of neural and
cognitive processes.


Some Simple Examples


RITUAL OF THE NEWBORN BABY. In a European ritual, the newborn baby is carried up the stairs of the parents' house as part of
a public event. The ritual is meant, symbolically, to promote the
child's chances of rising in life. One input is the ordinary action
of carrying a baby up the stairs. The other input is the schematic
space of life, already structured so that living a life is metaphorically moving along a path, such that good fortune is up and misfortune is down. In a partial match between these inputs, the
path up the stairs corresponds to the course of life, the baby is the
person who will live this life, the manner of motion up the stairs
corresponds to how the person goes through life, and so on. In
the symbolic ritual, the two inputs are blended, so that the ascent
of the stairs is the course of life, an easy ascent is an easy rise in
life for the person that the baby will become, and stumbling or
falling might take on extraordinary significance.
BOAT RACE. A famous example of blending is the boat race or
regatta. A modern catamaran is sailing from San Francisco to
Boston in 1993, trying to go faster than a clipper that sailed the
same course in 1853. A sailing magazine reports:
As we went to press, Rich Wilson and Bill Biewenga were barely
maintaining a 4.5 day lead over the ghost of the clipper Northern
Light, whose record run from San Francisco to Boston they're trying to beat. In 1853, the clipper made the passage in 76 days, 8
hours. (Great American II 1993, 100)

Informally, there are two distinct events in this story, the run
by the clipper in 1853 and the run by the catamaran in 1993 on
(approximately) the same course. In the magazine quotation, the
two runs are merged into a single event, a race between the catamaran and the clipper's ghost. The two distinct events correspond to two input mental spaces, which reflect salient aspects
of each event: the voyage, the departure and arrival points, the
period and time of travel, the boat, and its positions at various
times. The two events share a more schematic frame of sailing from San Francisco to Boston; this is a generic space, which
connects them. Blending consists in partially matching the two
inputs and projecting selectively from these two input spaces into
a fourth mental space, the blended space, as shown in Figure 1.
In the blended space, we have two boats on the same course
that left the starting point, San Francisco, on the same day.
Pattern completion allows us to construe this situation as a race
(by importing the familiar background frame of racing and the
emotions that go with it). This construal is emergent in the blend.
The motion of the boats is structurally constrained by the mappings. Language signals the blend explicitly in this case by using
the expression ghost-ship. By running the blend imaginatively
and dynamically, unfolding the race through time, we have
the relative positions of the boats and their dynamics.
Crucially, the blended space remains connected to the inputs
by the mappings, so that real inferences can be computed in the
inputs from the imaginary situation in the blended space. For
example, we can deduce that the catamaran is going faster overall
in 1993 than the clipper did in 1853, and, more precisely, we have
some idea (four and a half days) of their relative performances.
We can also interpret the emotions of the catamaran crew in
terms of the familiar emotions linked to the frame of racing.

[Figure 1. A conceptual integration network: a generic space linked by a cross-space mapping to Input space 1 and Input space 2, with selective projection from both inputs into the blended space.]

The boat race example is a simple case of blending. Two
inputs share structure. They get linked by a cross-space mapping and projected selectively to a blended space. The projection
allows emergent structure to develop on the basis of composition, pattern completion (based on background models), and
elaboration (running the blend).
CLINTON AND ROOSEVELT. The type of conceptual blend just discussed (technically called a mirror network because the same
frame organizes both inputs) is very general. For example, a
political comment on Bill Clinton's presidency after he had been
in office a few months might have been:
By this point, Roosevelt was far ahead of Clinton.

The two inputs are Roosevelt's and Clinton's presidencies.
They are mapped onto each other in a natural way: Starting
points, midpoints, and so on are matched. In the blend, Roosevelt
and Clinton are brought together within the same time frame so
that they are competing against each other. Blends of this sort are
routinely elaborated for reasoning purposes in political analysis.

Computer Interfaces
A nice example of conceptual blending in action and design is
the desktop interface, in which the computer user moves icons
around on a simulated desktop, gives alphanumeric commands, and makes selections by pointing at options on menus.
Users recruit from their knowledge of office work, interpersonal
commands, pointing, and choosing from lists. All of these are
inputs to the imaginative invention of a blended scenario that
serves as the basis for integrated performance. Once this blend
is achieved, it delivers an amazing number of multiple bindings across quite different elements, bindings that seem, in retrospect, entirely obvious. A configuration of continuous pixels
on the screen is bound to the concept folder, no matter where
that configuration occurs on the screen. Folders have identities,
which are preserved. The label at the bottom of the folder in one
view of the desktop corresponds to a set of words in a menu in
another view. Pushing a button twice corresponds to opening.

Pushing a button once when an arrow on the screen is superimposed on a folder corresponds to lifting into view. Of course, in
the technological device that makes the blend possible, namely,
the computer interface, there is no ordinary lifting, moving, or
opening happening at all, only variations in the illumination of a
finite and arranged number of pixels on the screen. The blend is
not the screen; the blend is an imaginative mental creation that
lets us use the computer hardware and software effectively. In
the blend, there is lifting, moving, opening, and so on happening, imported not from the technological device at hand, which
is only a medium, but from another input, namely, our mental
conception of work we do on a real desktop.

The Network Model


Conceptual blending is described and studied scientifically in
terms of integration networks. In its most basic form, a conceptual integration network consists of four connected mental
spaces: two partially matched input spaces, a generic space
constituted by structure common to the inputs, and the blended
space. The blended space is constructed through selective
projection from the inputs, pattern completion, and dynamic
elaboration. The blend has emergent dynamics. It can be run
while its connections to the other spaces remain in place.
Neurobiologically, it has been suggested that elements in mental spaces correspond to activated neural assemblies and that
linking between elements corresponds to neurobiological binding (e.g., coactivation). On this view, mental spaces are built up,
interconnected, and blended in working memory by activating structures available from long-term memory. Mental spaces
can be modified dynamically as thought and discourse unfold.
Products of integration networks can become entrenched to
constitute grammatical constructions, basic metaphors, new
frames, and other elements of the conceptual repertoire.
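The four-space network just described can be caricatured in code. A minimal Python sketch, assuming a toy representation in which a mental space is simply a named set of elements; the class names, element labels, and projection filters below are illustrative inventions, not part of the theory's formal apparatus:

```python
from dataclasses import dataclass, field
from typing import Callable, Set

@dataclass
class MentalSpace:
    """A toy mental space: a named set of elements."""
    name: str
    elements: Set[str] = field(default_factory=set)

@dataclass
class IntegrationNetwork:
    """Two partially matched input spaces; generic and blended spaces are derived."""
    input1: MentalSpace
    input2: MentalSpace

    def generic_space(self) -> MentalSpace:
        # The generic space holds structure common to both inputs.
        return MentalSpace("generic", self.input1.elements & self.input2.elements)

    def blend(self, keep1: Callable[[str], bool],
              keep2: Callable[[str], bool]) -> MentalSpace:
        # Selective projection: only some elements of each input reach the blend.
        projected = {e for e in self.input1.elements if keep1(e)}
        projected |= {e for e in self.input2.elements if keep2(e)}
        return MentalSpace("blend", projected)

# The Roosevelt/Clinton blend: two presidencies partially matched, with the
# clashing time frames left behind so that the presidents can "compete".
roosevelt = MentalSpace("FDR", {"president", "New Deal", "1933-45"})
clinton = MentalSpace("Clinton", {"president", "budget surplus", "1993-2001"})
net = IntegrationNetwork(roosevelt, clinton)
print(net.generic_space().elements)  # {'president'}
print(net.blend(lambda e: e != "1933-45", lambda e: e != "1993-2001").elements)
```

What such a static sketch leaves out is exactly the emergent dynamics: in the theory, the blend is run, while its connections to the other spaces remain in place.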
Four main types of integration networks have been distinguished: simplex, mirror, single-scope, and double-scope. In
simplexes, one input consists of a frame and the other consists of
specific elements. A frame is a conventional and schematic organization of knowledge, such as buying gasoline. In mirrors, a
common organizing frame is shared by all spaces in the network.
In single-scopes, the organizing frames of the inputs are different,
and the blend inherits only one of those frames. In double-scopes,
essential frame and identity properties are brought in from both
inputs. Double-scope blending can resolve clashes between
inputs that differ fundamentally in content and topology. This is
a powerful source of human creativity.
The main types of networks just mentioned are actually
prototypes along a continuum that anchors our intuitive everyday notions about meaning to a unified understanding of the
unconscious processes at work. Varieties of meaning traditionally
considered unequal or even incommensurable (categorizations, analogies, counterfactuals, metaphors, rituals, logical framing, grammatical constructions) can all be situated on this
continuum. Conceptual blending has been shown to operate in
the same way at the highest levels of scientific, artistic, and literary
thought and at the supposedly lower levels of elementary understanding and sentence meaning. Elaborate blending is at work in
superficially simple expressions like "safe gun" versus "safe child,"
"guilty pleasures," "caffeine headache," or "money problem."


Emergent structure arises routinely, as in "This surgeon is a lumberjack," which suggests that the surgeon is incompetent, though
incompetence is a feature of neither surgeon nor lumberjack.
There are opposing pressures within an integration network
to maximize topology matching, integration, unpacking of the
blend, Web connections, compression, and intentionality. More
complex integration networks (multiple blends) allow multiple
input spaces, and successive blending in which blends at one
level can be inputs at another.

Compression
Blending is a remarkable tool of compression over vital relations
like time, space, cause–effect, identity, and change. In the newborn ritual, time is compressed: An entire lifetime becomes, in
the blend, the short time it takes to climb the stairs; in the desktop
interface, the complex sequence of events that move the mouse
horizontally and cause an apparent vertical motion of the arrow
(and other objects) on the screen is compressed and integrated
into a single action moving the arrow. This is a compression of
space, cause–effect, and change.

Language Science
The role of conceptual blending in language has been investigated in many areas.
FICTIVE MOTION. Languages have means of describing static
scenes in terms of fictive motion:
The fence runs all the way down to the river.
This works by having an imaginary trajector move along the relevant dimension of an object, in this case the fence, or along some
imaginary path linking two objects. This is a remarkable mode
of expression: It conveys motion and immobility at the same time.
Objective immobility is expressed along with perceptual or conceptual motion. This apparent contradiction is a consequence of
conceptual blending, which allows several connected, but heterogeneous, mental spaces to be maintained simultaneously within
a single mental construction. An input space containing a static
scene of a fence and a river is blended with an input space that
contributes a moving trajector on a path with a reference point.
COUNTERFACTUALS. Human thought depends heavily on the
capacity for counterfactual thought, and counterfactuals are
complex blends. Most of us can effortlessly understand statements like "In France, Watergate would not have hurt Nixon." This
counterfactual is intended to highlight some differences between
the American and French cultural and political systems. It is a
blend that brings in aspects of the French system from one input
and the Watergate scandal and President Richard Nixon from the
other. In the blend, we have a Watergate-like situation in France.
Running this blend delivers attitudes quite different from those
in the American input, and so in the blend, the president is not
harmed. Counterfactuals can blend frames and identities in powerful ways (If I were you …). Such blends have been shown to
play a major role for reasoning in the natural and social sciences.
THE ORIGIN OF LANGUAGE. The central problem in the origins
of language is that conceptual structure is vast relative to
expressive structure. The central problem of expression is that we and perhaps other mammals have a vast, open-ended number of
frames and provisional conceptual assemblies that we manipulate. Even if we had only one word per frame, the result would be
too many words to manage. Double-scope integration permits
us to use vocabulary and grammar for one frame or domain or
conceptual assembly to say things about others. It brings a level
of efficiency and generality that suddenly makes the challenging
mental logistics of expression tractable. The forms of language
work not because we have managed to encode in them these
vast and open-ended ranges of meaning but because they make
it possible to prompt for high-level blends over conceptual arrays
we already command. Neither the conceptual operations nor the
conceptual arrays are encoded, carried, contained, or otherwise
captured by the forms of language. The forms need not and cannot carry the full construal of the specific situation but, instead,
consist of prompts for thinking about situations in the appropriate way to arrive at a construal. Blended spaces can have as
projections grammatical and lexical forms that come from the
input spaces. Accordingly, meaning that is special to the blend
can be expressed through forms that are already available from
the inputs. In virtue of double-scope blending, new or contextually dependent meaning does not require new expressive forms.
Double-scope blending is, accordingly, the indispensable operation that makes cognitively modern human language possible.
OTHER WORK. Other analyses of the role of conceptual blending in
language consider nominal compounds (Fauconnier and Turner
2002); relative clauses (Nikiforidou 2005); semantic extensions
(Coulson 2001); sign languages (Liddell 2003); discourse constructions (Tobin 2006); syntax and morphology (Mandelblit
2000b); polysemy (Fauconnier and Turner 2003); semantic
change and composition (Sweetser 1999); and many other areas.
Mark Turner
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Coulson, Seana. 2001. Semantic Leaps: Frame-shifting and
Conceptual Blending in Meaning Construction. New York and
Cambridge: Cambridge University Press.
Coulson, Seana, and Todd Oakley, eds. 2000. Cognitive Linguistics 11.3/4.
Special issue on conceptual blending.
Coulson, Seana, and Todd Oakley, eds. 2005. Journal of Pragmatics 37.10.
Special issue on conceptual blending.
Dancygier, Barbara, ed. 2006. Language and Literature 15.1. Special issue
on conceptual blending.
Fauconnier, Gilles. 1997. Mappings in Thought and Language.
Cambridge: Cambridge University Press.
Fauconnier, Gilles, and Mark Turner. 1996. Blending as a central process of grammar. In Conceptual Structure, Discourse, and Language,
ed. Adele Goldberg, 113–30. Stanford, CA: Center for the Study of
Language and Information (CSLI).
. 1998. Conceptual integration networks. Cognitive Science
22.2: 133–87.
. 2002. The Way We Think: Conceptual Blending and the Mind's
Hidden Complexities. New York: Basic Books.
. 2003. Polysemy and conceptual blending. In Polysemy: Flexible
Patterns of Meaning in Mind and Language, ed. Brigitte Nerlich,
Vimala Herman, Zazie Todd, and David Clarke, 79–94. Berlin and New
York: Mouton de Gruyter.
Great American II. 1993. Latitude 38 (April): 100.

Liddell, Scott K. 2003. Grammar, Gesture, and Meaning in American Sign Language. Cambridge: Cambridge University Press.
Mandelblit, Nili. 2000b. The grammatical marking of conceptual integration: From syntax to morphology. In Coulson and Oakley 2000, 197–252.
Nikiforidou, Kiki. 2005. Conceptual blending and the interpretation of relatives: A case study from Greek. Cognitive Linguistics
16.1: 169–206.
Sweetser, Eve. 1999. Compositionality and blending: Semantic
composition in a cognitively realistic framework. In Cognitive
Linguistics: Foundations, Scope, and Methodology, ed. Theo Janssen and
Gisela Redeker, 129–62. Berlin and New York: Mouton de Gruyter.
Tobin, Vera. 2006. Ways of reading Sherlock Holmes: The entrenchment
of discourse blends. In Dancygier 2006, 73–90.
Turner, Mark. 1996. The Literary Mind: The Origins of Thought and
Language. New York: Oxford University Press.
. 1999–2007. The Blending Web site. Available online at: http://blending.stanford.edu
. 2001. Cognitive Dimensions of Social Science: The Way We Think
About Politics, Economics, Law, and Society. New York: Oxford
University Press.
Turner, Mark, and Gilles Fauconnier. 1999. A mechanism of creativity.
Poetics Today 20.3: 397–418.

CONCEPTUAL DEVELOPMENT AND CHANGE


Most researchers allow for some kind of change in concepts
over the course of development and as adults go from lay knowledge to expert knowledge. The main exception to this view would
be those who maintain that concepts have no discernible internal mental structure and, therefore, lack sufficient substrate to
change (Fodor 1998). For present purposes, it is assumed that
concepts do have internal structure that then can be used to
characterize change. On those more minimalist accounts of
structure, conceptual change must be understood as a proxy for
changes in how stable concepts are accessed, used, and mentally
manipulated.
The study of conceptual change has often been hampered
by quite different senses of what is meant by the phrase, with
the consequence that scholars who seem to be disagreeing are
often talking past one another. It is therefore useful to consider
several distinct senses of conceptual change as well as patterns
of developmental change that do not reflect alterations in the
concepts proper.
The most minimal forms of conceptual change involve elaborations on concept structures in ways that do not cause changes
in any other concepts and that do not cause a restructuring of the
concept in which the elaboration occurs. For example, a child
might learn that chairs can be subdivided into kitchen chairs
and living room chairs. Such a subdivision might not appreciably
change the concept of the superordinate category of chair (see
categorization). One can also learn more details about members of a category by simply adding more features to all members. Thus, a child might initially know that cars have wheels and
carry people and only later add the additional features of having
windshield wipers, brakes, and seatbelts. Concepts can therefore
go from relatively feature-sparse representations of members of
a category to feature-rich representations. Some call this conceptual enrichment, arguing that it shouldn't really count as true
conceptual change at all (Carey 1991).
Some changes are in the kinds of features that make up
concepts. Others are in the ways those features are mentally
represented and used. For example, young children might
favor perceptual features over functional ones (Nelson 1973;
Tomikawa and Dodd 1980), as well as concrete features over
abstract ones (Werner and Kaplan 1963). Instances of a concept
category would then initially be picked out on the basis of one
type of feature (e.g., perceptual features of chairs) and later on
the basis of another kind of feature (e.g., functional features of
chairs). A different approach would see changes not in the feature types but rather in how those features are used to make decisions about members of a category. For example, children have
been described as moving from holistic representations to more
analytic ones (Vygotsky [1934] 1986; Kemler and Smith 1978).
Thus, they might initially use a broad, roughly equal weighting
of all features that typically occur with members of the category
and then later switch to a focus on a few critical defining or central features. A child might initially identify an uncle as a friendly
adult about the age of one's parents who is present around
holidays and has close bonds with one or both parents. Later,
that same child might focus exclusively on the features having to
do with whether a person is a male blood relative of one parent
(Keil 1989). In this characteristic-to-defining shift, feature sampling can change with development, even if feature types do not
(Keil and Batterman 1984).
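The characteristic-to-defining shift can be caricatured as a change in how features are weighted when deciding category membership. A minimal Python sketch; the feature labels and the 0.5 decision threshold are illustrative assumptions, not taken from Keil's materials:

```python
# Toy model of the characteristic-to-defining shift for "uncle".
# Hypothetical features; the defining feature is the blood-relative one.
UNCLE_FEATURES = {
    "friendly adult", "around at holidays", "close to parents",
    "male blood relative of a parent",
}
DEFINING = "male blood relative of a parent"

def is_uncle(features: set, defining_weight: float) -> bool:
    # defining_weight = 0.0: holistic; all typical features count equally (young child)
    # defining_weight = 1.0: only the defining feature matters (older child)
    has_defining = DEFINING in features
    characteristic = len(features & UNCLE_FEATURES) / len(UNCLE_FEATURES)
    score = defining_weight * has_defining + (1 - defining_weight) * characteristic
    return score >= 0.5

friendly_visitor = {"friendly adult", "around at holidays", "close to parents"}
blood_relative = {"male blood relative of a parent"}

print(is_uncle(friendly_visitor, defining_weight=0.0))  # True  ("young child")
print(is_uncle(blood_relative, defining_weight=1.0))    # True  ("older child")
print(is_uncle(friendly_visitor, defining_weight=1.0))  # False ("older child")
```

The shift, on this caricature, is a movement of `defining_weight` from near 0 toward 1, even though the feature inventory itself never changes.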
Changes in feature types or distributions often seem to happen in ways that are related across concepts, leading to the idea
that concepts often change in clusters or domains. This pattern
suggests that concepts might get their meanings not just from
their constituent features but also from the ways in which they
relate to other concepts in the same domain. Experimentally,
such effects have been shown in cases where shifts in feature
usage, such as for uncle, are closely linked in time to shifts for
other kinship terms (Keil 1989). Similarly, once a child learns how
to extend one term in a domain to a new domain, such as the texture term "rough" to personality, the child is likely to extend at the
same time all other texture terms to personality, such as "smooth"
and "slippery" (Lehrer 1978; Keil 1986). These semantic field
effects suggest that conceptual change occurs in a larger framework that then influences all concepts within that framework.
One view of concepts as parts of larger structures is known
as the theory theory, in which conceptual change is understood
as part of a process of theory change (Carey 1985; Gopnik and
Wellman 1994). Thus, having a concept such as bird involves not
only knowing the features associated with birds but also having
a sense of why those features co-occur. Birds have wings, feathers, and hollow bones because all those structures work together
causally to support flight (Murphy and Medin 1985). Several
forms of restructuring have been proposed to model theory
change: conceptual differentiation, conceptual coalescences,
addition of new deeper levels of explanation, and complete reorganizations or revolutionary changes.
Conceptual differentiation occurs when a single concept differentiates into two new concepts that make the earlier concept
obsolete. Thus, children might initially have a concept of felt weight or heaviness that later splits into two concepts of weight
as physical quantity and of density (Smith, Carey, and Wiser
1985). Conceptual differentiation is different from the case of
subdivision described earlier because it makes the original parent concept nonviable and, therefore, creates incommensurability between the differentiated conceptual structures and their
parent (Kuhn 1970; Carey 1988).
Conceptual coalescences often occur in the same system
where differentiations occur. Thus, as children split apart one
concept, they merge two other concepts together in a manner
that makes the original two incoherent. Young children may see
solids and liquids as of one kind of stuff and air as something
entirely different. Later, however, they may see all three phases
of matter as just variants on the same stuff (Smith, Solomon, and
Carey 2005). Similarly, young children may see stars and the sun
as very different kinds of things, only to realize later that they are
all of the same kind (Nussbaum 1979).
Deeper levels of explanation occur when a whole new realm
of causal regularities gets added to the theory. For example, a
child might initially understand the bodys functions in terms of
gross macroscopic events, such as the chewing of food and motor
movement, and then later sense causal patterns at work at the
microscopic level. Although such additions can occur without
changing the higher level of explanation, new insights at a lower
level can often feed back and influence the higher level.
Finally, conceptual revolutions can involve a dramatic reorganization of all the elements in a domain. For example, young
children might only understand animals, plants, and people as
either psychological entities or mechanical physical ones (Carey
1985). They would not understand plants and animals as living
things. At some point, however, in a manner analogous to conceptual revolutions in the history of science (e.g., Kuhn 1970;
Thagard 1992), a dramatic reorganization of an entire belief
system (see meaning and belief) occurs and a new category
of living things emerges (Chinn and Brewer 1993). It is unclear
how often such dramatic revolutions really occur (Inagaki and
Hatano 2002; Keil 2003).

Imposters
A major problem in the study of conceptual change concerns
cases where other patterns of cognitive developmental change
may give an appearance of conceptual change when none is
actually happening. Two common cases involve increasing
access and shifting relevance.
Increasing access refers to cases where cognitive limitations
having nothing to do with the concept per se limit its use (Rozin
1976). Thus, younger children might differ from older ones in
terms of memorial or attentional capacities that make them
unable to access a concept in a certain set of tasks. For example,
children might fail to engage in transitive reasoning in a wide
range of tasks, not because the children lack the concept of transitivity but because of the memory burdens imposed by having to
keep several inequalities in mind at the same time. When those
memory burdens are reduced by intensive practice with the
inequalities, the concept of transitivity is easily accessed, even
as the learning of the inequalities might be quite difficult (Bryant
and Trabasso 1971). One way of thinking about increasing access
can be seen in the metaphor of a young child learning to use a
heavy hammer. We might note at first that the child cannot use
the hammer at all and think that he or she has a hammer deficit, only to find out that with a much lighter hammer, the child
reveals a full understanding of hammers. In other cases, there
may be a real deficit in the form of the missing concept.
Shifting relevance refers to changes in which of several possible conceptual interpretations first comes to mind in a task. For
example, when young children are asked if worms eat, they
might initially say that worms do not because they interpret "eat"
in a psychological manner involving feelings of satiation, hunger,
and pleasant tastes. Older children who interpret "eat" in a biological sense of providing nutrition might judge that worms do it.
Such changes have been interpreted as evidence for the emergence of the new conceptual domain of biology (Carey 1985). Yet
younger children may also be able to access the biological sense
of "eat" when shown that such an interpretation is appropriate
(Gutheil, Vera, and Keil 1998).
In summary, there are several distinct varieties of conceptual
change as well as other patterns of cognitive change that can
masquerade as conceptual change. It is critical in discussions of
conceptual change and of the relations of conceptual change to
other topics, such as word meaning, to know which senses are
in play.
Frank Keil
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Bryant, Peter E., and Thomas Trabasso. 1971. Transitive inferences and
memory in young children. Nature 232: 456–8.
Carey, Susan. 1985. Conceptual Change in Childhood. Cambridge,
MA: MIT Press.
. 1988. Conceptual differences between children and adults.
Mind and Language 3: 167–81.
. 1991. Knowledge acquisition: Enrichment or conceptual
change? In The Epigenesis of Mind: Essays on Biology and Cognition,
ed. S. Carey and R. Gelman, 257–91. Hillsdale, NJ: Erlbaum.
Chinn, C., and W. Brewer. 1993. The role of anomalous data in knowledge acquisition: A theoretical framework and implications for science
instruction. Review of Educational Research 63: 1–49.
Fodor, Jerry. 1998. Concepts: Where Cognitive Science Went Wrong. The
1996 John Locke Lectures. Oxford: Oxford University Press.
Gopnik, Alison, and Henry Wellman. 1994. The theory-theory. In
Mapping the Mind: Domain Specificity in Cognition and Culture, ed.
L. Hirschfeld and S. Gelman, 257–93. New York: Cambridge University
Press.
Gutheil, G., A. Vera, and F. Keil. 1998. Houseflies don't think: Patterns of
induction and biological beliefs in development. Cognition 66: 33–49.
Inagaki, Kayoko, and Giyoo Hatano. 2002. Young Children's Naive
Thinking about the Biological World. New York: Psychology Press.
Keil, Frank C. 1986. Conceptual domains and the acquisition of metaphor. Cognitive Development 1: 73–96.
. 1989. Concepts, Kinds, and Cognitive Development. Cambridge,
MA: MIT Press.
. 2003. That's life: Coming to understand biology. Human
Development 46: 369–77.
Keil, Frank C., and Nancy Batterman. 1984. A characteristic-to-defining
shift in the acquisition of word meaning. Journal of Verbal Learning
and Verbal Behavior 23: 221–36.
Kemler, Deborah G., and Linda B. Smith. 1978. Is there a developmental trend from integrality to separability in perception? Journal of
Experimental Child Psychology 26: 498–507.

Kuhn, Thomas. 1970. The Structure of Scientific Revolutions. Chicago: University of Chicago Press.
Lehrer, A. 1978. Structures of the lexicon and transfer of meaning.
Lingua 45: 95–123.
Murphy, Gregory, and Douglas Medin. 1985. The role of theories in conceptual coherence. Psychological Review 92: 289–316.
Nakhleh, M. B., and A. Samarapungavan. 1999. Elementary school children's beliefs about matter. Journal of Research in Science Teaching
36: 777–805.
Nelson, Katherine. 1973. Some evidence for the cognitive primacy of
categorization and its functional basis. Merrill-Palmer Quarterly
19: 21–39.
Nussbaum, J. 1979. Children's conception of the earth as a cosmic
body: A cross age study. Science Education 63: 83.
Rozin, Paul. 1976. The evolution of intelligence and access to the cognitive unconscious. In Progress in Psychobiology and Physiological
Psychology. Vol. 6. Ed. J. M. Sprague and A. N. Epstein, 245–80. New
York: Academic Press.
Smith, Carol, Susan Carey, and Marianne Wiser. 1985. On differentiation: A case study of the development of size, weight, and density.
Cognition 21: 177–237.
Smith, Carol, Gregory Solomon, and Susan Carey. 2005. Never getting to
zero: Elementary school students' understanding of the infinite divisibility of matter and number. Cognitive Psychology 51: 101–40.
Thagard, Paul. 1992. Conceptual Revolutions. Princeton, NJ: Princeton
University Press.
Tomikawa, S. A., and D. H. Dodd. 1980. Early word meanings: Perceptually
or functionally based? Child Development 51: 1103–9.
Vygotsky, L. S. [1934] 1986. Thought and Language. Cambridge, MA: MIT
Press.
Werner, H., and B. Kaplan. 1963. Symbol Formation: An Organismic-Developmental Approach to Language and the Expression of Thought.
New York: Wiley.

CONCEPTUAL METAPHOR
According to proponents of conceptual metaphor theory, conceptual metaphors are metaphors that we have in our minds
that allow us to produce and understand abstract concepts. The
theory was first expounded by George Lakoff and Mark Johnson
(1980), who argued that conceptual metaphors structure how
people perceive, how they think, and what they do. According
to Lakoff (1993), conceptual metaphors represent habitual ways
of thinking, in which people metaphorically construe abstract
concepts such as time, emotions, and feelings in terms of more
concrete entities.

Some Conceptual Metaphors and Mappings


Conceptual metaphors are usually expressed in an A IS B format,
using capital letters. For example, in the conceptual metaphor
THEORIES ARE BUILDINGS, theories (an abstract concept)
are viewed metaphorically as buildings (a concrete entity).
Conceptual metaphors consist of a source domain and a target domain (see source and target ). So in the conceptual
metaphor THEORIES ARE BUILDINGS, buildings constitute
the source domain, and theories constitute the target domain.
THEORIES are thus viewed as if they were BUILDINGS (examples follow). Lakoff (1993) describes the relationship between
the two domains of a conceptual metaphor as a function,
where specific properties of the source domain are mapped
onto the target domain (see mapping).

Figure 1. The main differences between conceptual and linguistic metaphors.

Conceptual metaphors (e.g., ARGUMENT IS WARFARE):
- They involve the drawing together of incongruous domains.
- They are structures that are deeply embedded in the collective subconscious of a speech community.
- They are thought to constitute a structured system upon which much abstract thought is based.

Linguistic metaphors (e.g., "Mr. Marshall had the knives out for Mr. Manning"):
- They involve the drawing together of incongruous words.
- They are surface-level linguistic features.
- They are usually used to get a particular point across, or to perform a particular function.

So, in the conceptual
metaphor THEORIES ARE BUILDINGS, properties of the source domain, BUILDINGS, such as needing a foundation or
being built from component parts, are mapped onto the target
domain of THEORIES, allowing us to talk about theories being
built on assumptions and axioms or put together by connecting smaller ideas. The relationship is thus one-way: Theories are
treated as buildings, but buildings are not treated as theories.
Source domains are thus broad, often complex, cluster-like categories that can provide a rich source of mappings (Littlemore
and Low 2006). They are sometimes described as image schemas, as they can be represented in highly abstracted simple
diagrams.
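The one-way, source-to-target character of such a mapping can be pictured as a lookup table. A minimal Python sketch for THEORIES ARE BUILDINGS; the particular property pairs are illustrative inventions, not a canonical list from the literature:

```python
# One-way mapping: properties of the source domain (BUILDINGS)
# are carried over to the target domain (THEORIES), never the reverse.
THEORIES_ARE_BUILDINGS = {
    "has a foundation": "rests on basic assumptions and axioms",
    "is constructed from parts": "is put together by connecting smaller ideas",
    "can collapse": "can be refuted",
}

def construe(source_property: str) -> str:
    # Looking up a buildings-property yields a way of talking about theories.
    return THEORIES_ARE_BUILDINGS[source_property]

print(construe("has a foundation"))  # rests on basic assumptions and axioms
```

The dictionary's directionality mirrors the asymmetry of the mapping: theories are treated as buildings, but buildings are not treated as theories, so no reverse lookup is defined.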
Conceptual metaphors are thought to be acquired in infancy,
through our physical interaction with the world, by the way in
which we perceive the environment, move our bodies, and exert
and experience force. Other people's habitual ways of selecting
and using image schemas will also be influential.
The conceptual metaphor THEORIES ARE BUILDINGS manifests itself in expressions such as:
You have to construct your argument carefully.
but they now have a solid weight of scientific evidence.
The pecking order theory rests on sticky dividend policy.
This theory is totally without foundation.
in which case, the entire theory would have no support.
He has done his best to undermine the theory.
In an attempt to build a formal theory of underdevelopment

The value of a scholarly theory should stand or fall on the character of the evidence.
One of the most productive conceptual metaphors is the conduit metaphor, in which communication is seen as transfer from one person to another, allowing us to talk, for example,
about conveying information, and getting the message across.
Another conceptual metaphor, PROGRESS THROUGH TIME IS
FORWARD MOTION, results in expressions such as plan ahead, back in the '60s, and to move on. In the same way, ARGUMENT
is often thought of in terms of WARFARE, UNDERSTANDING is
often expressed in terms of SEEING, LOVE is often thought of in
terms of a PHYSICAL FORCE, and IDEAS are often thought of in
terms of OBJECTS. Conceptual metaphors are thought to exist
for every abstract concept that we have, although there is no one-to-one mapping: A single abstract concept can be understood
through several conceptual metaphors, and a single conceptual
metaphor can be used to explain several abstract concepts. Some
conceptual metaphors are universal, whereas others vary from
language to language (cf. metaphor, universals of).
Conceptual metaphors are often very complex, and one
conceptual metaphor will frequently give rise to a series of
mappings. For example, the conceptual metaphor THINKING
IS PERCEIVING gives rise to mappings such as IDEAS ARE
THINGS PERCEIVED (it's quite clear to me); ATTEMPTING TO
GAIN KNOWLEDGE IS SEARCHING (I'm still looking for a solution); and BEING IGNORANT IS BEING UNABLE TO SEE (you
have allowed yourself to be blinded to the truth) (Gibbs 2006).
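The fan-out from one conceptual metaphor to its family of mappings can be tabulated directly. A small sketch, pairing each sub-mapping with the linguistic expression from the examples above:

```python
# THINKING IS PERCEIVING and the sub-mappings it gives rise to,
# each paired with a linguistic expression that instantiates it.
THINKING_IS_PERCEIVING = {
    "IDEAS ARE THINGS PERCEIVED": "it's quite clear to me",
    "ATTEMPTING TO GAIN KNOWLEDGE IS SEARCHING": "I'm still looking for a solution",
    "BEING IGNORANT IS BEING UNABLE TO SEE":
        "you have allowed yourself to be blinded to the truth",
}

for sub_mapping, expression in THINKING_IS_PERCEIVING.items():
    print(f"{sub_mapping}: {expression!r}")
```

Note that the table runs from sub-mapping to expression, not the reverse: a single expression could in principle instantiate more than one mapping.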
Conceptual metaphors differ from conceptual metonymies in
that they involve mappings between different domains, whereas
in conceptual metonymies, one part of the single domain is used
to refer to another, related part of that domain.

Conceptual and Linguistic Metaphor


It is useful to distinguish between conceptual metaphor and
linguistic metaphor. Conceptual metaphors are cognitive structures that are deeply embedded in our subconscious minds,
whereas linguistic metaphors are surface-level linguistic phenomena. It is important to note that the precise words used to
describe the two domains in a conceptual metaphor (like TIME
and MONEY) are not important, or at least not crucial. This is very
different from the situation with linguistic metaphors, where it
is the exact words that constitute the metaphor (Littlemore and
Low 2006). Indeed, the whole point of a conceptual metaphor is
that it stands apart from actual exemplars. Figure 1 shows the
main differences between conceptual metaphors and linguistic
metaphors.

At times, the ability to understand linguistic metaphors
(when they are first encountered) may rely on the successful identification of a relevant conceptual metaphor; at other
times, it may not. However, the ability to identify an appropriate conceptual metaphor in itself is rarely sufficient to allow a
complete understanding of a linguistic metaphor. Additional
metaphoric thinking is usually required, which takes into
account the context in which the metaphor appears and the
function that it is intended to perform. For example, in order
to understand the metaphor "slavery was well on the road to
extinction," it may be helpful (but not necessary) to think in
terms of the conceptual metaphor PROGRESS IS FORWARD
MOTION. However, further metaphoric thinking is required to
understand that considerable progress has already been made
and that there is likely to be no turning back. Thus, conceptual metaphors sometimes help us to understand linguistic
metaphors, but they are neither a necessary prerequisite
nor a sufficient condition (see necessary and sufficient
conditions ).

Developments in Conceptual Metaphor Theory


Although conceptual metaphor theory has been hugely influential, it has come in for a certain amount of criticism, which has led
to the theory itself being developed and refined. The main criticisms of conceptual metaphor theory include the following: that
the number of conceptual metaphors has had a tendency to proliferate; that they vary significantly in the extent to which they
are employed and elaborated; and that there is a huge amount of
overlap between them. Moreover, as Graham Low (1999; 2003)
points out, although it may be tempting, for example, to identify
a single conceptual metaphor of A THEORY IS A BUILDING in a
text containing phrases such as those listed previously, the analyst has no proof that buildings were ever present in the writer's
mind when he or she was producing the text (cf. metaphor,
information transfer in). If the conceptual metaphor isn't
in the writer's mind, then where is it? Could it be that it exists
only in the analyst's mind?
In a partial response to criticisms such as these, Joseph Grady
(1997) suggests that conceptual metaphors do not in fact constitute the most basic level of mapping. Instead, he proposes the
idea of primary metaphors, which constitute a more fundamental type of metaphor (Grady and Johnson 2002). Primary metaphors arise out of our embodied functioning in the world (Gibbs
2006) and, as such, are more basic than conceptual metaphors.
They include very basic concepts, such as CHANGE IS MOTION,
HELP IS SUPPORT, and CAUSES ARE PHYSICAL SOURCES. One
primary metaphor can often underlie several conceptual metaphors. For example, the primary metaphor EXPERIENCE IS A
VALUED POSSESSION is held to underlie the conceptual metaphors DEATH IS A THIEF, A LOVED ONE IS A POSSESSION, and
OPPORTUNITIES ARE VALUABLE OBJECTS.
Primary metaphors are experiential, in that they result from
a projection of basic bodily experiences onto abstract domains.
As such, they are representative of a wider view of human cognition that gives a central role to embodiment. Proponents
of embodiment argue that we understand abstract concepts
in terms of our physical experiences with the world and that
the two are impossible to separate. This is closely linked to the

concept of synaesthesia, where one sensory stimulus evokes a


stimulus in a different sensory organ (see Ramachandran and
Hubbard 2001). For example, the fact that the color red often
denotes heat is due to our ability to make synaesthesic mappings between the senses of sight and touch. The synaesthetic
relationship between sound and vision is reflected in the fact
that dark or heavy music is likely to involve low notes and
minor keys, whereas light music is more likely to involve high
notes and major keys. Primary metaphors thus constitute a
more clearly delimited, cognitive, embodied phenomenon and
lend themselves much more readily to rigorous empirical testing than do conceptual metaphors.
Another criticism of conceptual metaphors is that they often
give only a partial explanation of more creative linguistic metaphors, and the relationship between the two is unclear. In order
to address this criticism, Andrew Goatly (1997) has extended
conceptual metaphor theory to take account of the more creative extensions of conceptual metaphors. Instead of conceptual
metaphors, he refers to root analogies. He uses this term to
reflect the fact that the original analogy often remains hidden
and its relationship to the creative expression is not always clear.
To illustrate his point, Goatly cites the expression "the algebra
was the glue they were stuck in." This novel metaphorical expression is a creative extension of the root analogy DEVELOPMENT
IS FORWARD MOVEMENT, but the relationship is complex and
not immediately apparent. The root is there, but it cannot actually be seen.
A final criticism of conceptual metaphor theory has been that
the examples used to illustrate conceptual metaphors are not
always taken from real language data. Significant efforts are now
being made to address this issue, most notably by Alice Deignan
(2005) and Anatol Stefanowitsch and Stefan Gries (2006), all of
whom have used language corpora not only to identify examples
of conceptual metaphors but also to refine and develop conceptual metaphor theory itself (see corpus linguistics). This
approach allows for a more systematic assessment of the types of
source domains that feature in different genres and of the complex interplay between conceptual and linguistic metaphor. An
interesting insight to have come from their research is the fact
that the phraseological patterns surrounding the metaphorical
senses of a word often differ from those that surround its more
literal senses. Phraseological patterning is thus likely to make an
important contribution to the creation of meaning, and this must
be taken into account when studying conceptual metaphors.
Jeannette Littlemore
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Cameron, Lynne, and Graham Low, eds. 1999. Researching and Applying
Metaphor. Cambridge: Cambridge University Press.
Deignan, Alice. 2005. Metaphor and Corpus Linguistics. London: John
Benjamins.
Gibbs, Raymond. 2006. Embodiment and Cognitive Science.
Cambridge: Cambridge University Press.
Goatly, Andrew. 1997. The Language of Metaphors. London: Routledge.
Grady, Joseph. 1997. Theories are buildings revisited. Cognitive
Linguistics 8: 267–90.
Grady, Joseph, and Christopher Johnson. 2002. Converging evidence for
the notions of subscene and primary scene. In Metaphor and Metonymy
in Comparison and Contrast, ed. René Dirven and Ralf Pörings, 533–54.
Berlin: Mouton de Gruyter.
Kövecses, Zoltán. 2002. Metaphor: A Practical Introduction. Oxford: Oxford
University Press.
Lakoff, George. 1987. Women, Fire and Dangerous Things: What
Categories Reveal About the Mind. Chicago and London: University of
Chicago Press.
———. 1993. The contemporary theory of metaphor. In Metaphor and
Thought, 2d ed., ed. Andrew Ortony, 202–51. Cambridge: Cambridge
University Press.
Lakoff, George, and Mark Johnson. 1980. Metaphors We Live By.
Chicago: University of Chicago Press.
Littlemore, Jeannette, and Graham Low. 2006. Figurative Thinking and
Foreign Language Learning. Basingstoke, UK: Palgrave MacMillan.
Low, Graham. 1999. Validating metaphor research projects. In
Researching and Applying Metaphor, ed. Lynne Cameron and Graham
Low, 48–65. Cambridge: Cambridge University Press.
———. 2003. Validating models in applied linguistics. Metaphor and
Symbol 18.4: 239–54.
Ramachandran, V. S., and E. M. Hubbard. 2001. Synaesthesia: A window into perception, thought and language. Journal of Consciousness
Studies 8.12: 3–34.
Stefanowitsch, Anatol, and Stefan Gries, eds. 2006. Corpus-based
Approaches to Metaphor and Metonymy. Berlin: Mouton de Gruyter.

CONDUIT METAPHOR
The conduit metaphor (Reddy [1979] 1993) models communication as a process in which the speaker puts information into
words and gets it across to a receiver, who tries to find the meaning in the words. Words are understood as containers, meanings as objects that can be put into words. Reddy was concerned
with the biasing influence this model has on our thinking about
communication.
Jörg Zinken
WORK CITED
Reddy, Michael J. [1979] 1993. The conduit metaphor: A case of frame
conflict in our language about language. In Metaphor and Thought,
ed. A. Ortony, 164201. Cambridge: Cambridge University Press.

CONNECTIONISM AND GRAMMAR


Connectionist approaches to language employ artificial neural
networks to model psycholinguistic phenomena (see connectionist models, language structure, and representation). Although a few connectionist models have been used to
directly implement traditional types of grammar (e.g., Fanty 1986),
most aim to offer new ways of capturing key properties of grammar,
such as constituent structure and recursion (see recursion, iteration, and metarepresentation). In particular,
the latter models seek to demonstrate how important aspects of
grammar may emerge through learning, rather than being built
into the language system. This entry, therefore, focuses on the
more radical connectionist models, as they promise new ways
of thinking about grammar and, as such, could potentially make
the most substantial contribution to the language sciences.
Words in sentences are not merely strung together as beads
on a string but are combined in a hierarchical fashion. Grammars
capture this by specifying a set of constraints on the way that
words are put together to form different types of constituents,
such as noun phrases and verb phrases, as well as the way these
phrases may be combined to produce well-formed sentences.
Connectionist models have begun to show how constituent structure may be learned from the input. J. L. Elman (1990) trained a
simple recurrent network (which has a copy-back loop providing
it with a memory for past inputs) on a small context-free grammar and was able to show that the network could acquire aspects
of constituent structure. In related work, M. H. Christiansen and
N. Chater (1994) demonstrated that this kind of model is capable
of generalizing to novel syntactic constructions involving long-distance dependencies across constituents, suggesting that it is
able to exploit linguistic regularities that are defined across constituents. A subsequent model by D. L. T. Rohde (2002) has further shown that constituent structure can be learned from more
natural language-like input than that used by previous models,
indicating that this approach may scale up well to deal with full-blown language.
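The copy-back architecture that Elman used can be sketched compactly: at each step, the hidden layer receives both the current input and a copy of its own previous state, giving the network a memory for the sequence so far. The layer sizes and random weights below are illustrative assumptions, not the parameters of Elman's or Rohde's actual models.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes: 4 input units (one per word), 3 hidden, 4 output.
n_in, n_hid, n_out = 4, 3, 4
W_in = rng.normal(size=(n_hid, n_in))    # input -> hidden
W_rec = rng.normal(size=(n_hid, n_hid))  # copied-back context -> hidden
W_out = rng.normal(size=(n_out, n_hid))  # hidden -> output

def step(x, context):
    """One tick of a simple recurrent network: the hidden state combines
    the current input with the previous hidden state (the memory)."""
    hidden = np.tanh(W_in @ x + W_rec @ context)
    output = W_out @ hidden  # e.g., a next-word prediction
    return output, hidden    # hidden is copied back as the next context

# Feed a short sequence of one-hot "words"; the context carries the past.
context = np.zeros(n_hid)
for word_index in [0, 2, 1]:
    x = np.eye(n_in)[word_index]
    output, context = step(x, context)
print(output.shape)  # (4,)
```

Training such a network on next-word prediction over a small grammar is what allows constituent structure to emerge in the hidden-state dynamics; the sketch above shows only the forward pass.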
The notion of constituency that emerges in these models is
not the same as what is found in standard models of grammar.
Rather, connectionist models suggest a more context-sensitive
notion of constituency, dividing words and phrases into clusters without categorical boundaries and treating them differently depending on the linguistic context in which they occur.
For example, Elman's (1990) model was able to learn context-sensitive animacy constraints from word co-occurrence information, thus allowing it to distinguish semantically meaningful
sentences (e.g., The boy broke the plate) from nonsensical ones
(e.g., The plate broke the boy).
The generative power of grammars derives from recursion, the
notion that constituents can be embedded within one another
and even within themselves. The model by Elman (1991) was
perhaps the first to demonstrate the acquisition of a limited ability to process recursive structure in the form of right-branching
relative clauses (e.g., The cat chased the mouse that bit the dog),
as well as center-embedded constructions (e.g., The mouse that
the cat chased bit the dog). Christiansen and Chater (1994), as
well as Rohde (2002), extended this initial work by incorporating
several additional types of recursive structure, including sentential complements (e.g., Mary thinks that John says that …),
possessive genitives (e.g., John's brother's friend …), and prepositional phrases (e.g., The house on the hill near the lake …).
Additionally, Christiansen and Chater (1999) demonstrated
that the performance of connectionist models closely matches
human data from German and Dutch relating to complex
sentences involving recursive center embeddings (with the following dependency relationship between nouns and verbs: N1 N2
N3 V3 V2 V1) and cross-serial dependencies (N1 N2 N3 V1 V2 V3),
respectively. Specifically, people find doubly center-embedded
constructions in German much harder to process than comparable levels of cross-serial dependency embedding in Dutch
(controlling for semantic factors across the two languages), and
this pattern of processing difficulty was mirrored closely by the
model. As with the connectionist notion of constituency, the
recursive abilities of connectionist models deviate from standard
conceptions of recursion. Specifically, connectionist models are
unable to accommodate unlimited recursion; it is important
to note, however, that they are able to capture recursion at the
level of human abilities, as evidenced by psycholinguistic
experimentation.
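The two dependency patterns can be made concrete in a few lines of Python: center embedding pairs each verb with its noun in nested (last-in, first-out) order, while cross-serial dependencies pair them in crossed (first-in, first-out) order. The function name and noun labels here are illustrative assumptions, not part of the models discussed.

```python
def verb_order(nouns, pattern):
    """Surface order of the verbs, where each noun Ni depends on verb Vi.
    Assumes noun labels of the form 'N1', 'N2', ... (illustrative only)."""
    verbs = ["V" + noun[1:] for noun in nouns]
    # Center embedding nests dependencies, so verbs surface in reverse
    # order of their nouns; cross-serial keeps the original order.
    return verbs[::-1] if pattern == "center" else verbs

nouns = ["N1", "N2", "N3"]
print(nouns + verb_order(nouns, "center"))        # German-style: N1 N2 N3 V3 V2 V1
print(nouns + verb_order(nouns, "cross-serial"))  # Dutch-style:  N1 N2 N3 V1 V2 V3
```

The nested pattern is what a pushdown stack produces naturally, whereas the crossed pattern is not; the interest of the human data is that people nonetheless find the crossed Dutch pattern easier.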
Connectionist approaches to grammar are still very much
in their infancy and currently do not have the kind of coverage
and grammatical sophistication seen in more traditional
computational models of syntax. Moreover, the question
remains as to whether the initial encouraging results described
here can be scaled up to deal with the full complexities of real
language in a psychologically realistic way. If successful, however, then the conception of grammar may need to be radically
rethought, including notions of constituency and recursion.
Already, connectionist models have suggested that the idea
of an infinite linguistic competence, as typically prescribed
by generative grammar, may not be required for capturing human language performance. In this regard, the kind
of grammatical framework hinted at by connectionist models
more closely resembles those of construction grammars
and the usage-based theory of language than the traditional generative grammar approaches. Whatever the future
outcome of the connectionist approach to grammar may be, it
is likely to stimulate much debate over the nature of grammar
and language itself, as it has done in the past, and this, in
the long run, may be where connectionism will have the largest impact on the way we think about grammar within the language sciences.
Morten H. Christiansen
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Christiansen, M. H., and N. Chater. 1994. Generalization and connectionist language learning. Mind and Language 9: 273–87.
———. 1999. Toward a connectionist model of recursion in human linguistic performance. Cognitive Science 23: 157–205.
Elman, J. L. 1990. Finding structure in time. Cognitive Science 14: 179–211.
———. 1991. Distributed representation, simple recurrent networks, and
grammatical structure. Machine Learning 7: 195–225.
Fanty, M. A. 1986. Context-free parsing with connectionist networks.
In Neural Networks for Computing, ed. J. S. Denker, 140–45. New
York: American Institute of Physics.
Onnis, L., M. H. Christiansen, and N. Chater. 2005. Cognitive science: Connectionist models of human language processing.
In Encyclopedia of Language and Linguistics, ed. K. Brown.
Oxford: Elsevier. This review article provides a more detailed treatment
of the issues discussed here.
Rohde, D. L. T. 2002. A connectionist model of sentence comprehension
and production. Ph.D. diss., Carnegie Mellon University, Department
of Computer Science, Pittsburgh, PA.
Rohde, D. L. T., and D. C. Plaut. 2003. Connectionist models of language
processing. Cognitive Studies 10: 10–28. Another review of connectionist models of language.

CONNECTIONISM, LANGUAGE SCIENCE, AND MEANING
Connectionism, or parallel distributed processing, is a general
term for a set of particular cognitive architectures. With some variations, these architectures model mental processes on a shared
set of constituents and operations, drawn from neurobiology.
The constituents are parallel to neurons, and the operations are
parallel to the firing of neurons. However, connectionist models are not strictly neurobiological and may be implemented in
various materials (e.g., computers). More exactly, a connectionist architecture has nodes as its basic constituents. These nodes
are linked to one another, forming circuits. The nodes may have
different degrees of activation, and they receive activation from
other nodes in the circuit. When a node is activated (in some
models, when it reaches a particular level of activation, a threshold), it fires, transmitting its activation to subsequent nodes in
the circuit.
The individual connections among nodes are commonly
understood to have different degrees of strength. Strength is typically a multiplicative relation, such that the activation of the firing or input node is multiplied by the connection strength to yield
the amount of activation transmitted to the recipient node (e.g.,
a node firing at level 1 delivers a level of activation to a second
node of .5 if the strength of the connection between the nodes
is .5). These connection strengths may be altered by activation
sequences (e.g., in many models, when nodes activate together,
the strength of their connection increases). The connections may
be excitatory or inhibitory; that is, a first node may increase or
decrease the activation of a second node. Connectionist circuits
or neural networks commonly have a set of input nodes, a set
of output nodes, and layers of hidden nodes. Connectionist
models also incorporate some way that errors may be detected
and corrected. In a connectionist model, correction is a matter of readjusting connection strengths among the nodes in the
circuit.
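The basic operations just described (multiplicative transmission, summation to a threshold, and strengthening of connections that fire together) can be sketched in a few lines of Python. This is an illustrative toy, not any published model; the function names and the learning increment are assumptions made for the example.

```python
# Toy connectionist operations: weighted transmission, thresholded
# firing, and Hebbian-style strengthening of co-active connections.

def transmit(activation, strength):
    """Activation delivered to a recipient node (multiplicative strength)."""
    return activation * strength

def fires(total_input, threshold=1.0):
    """A node fires once its summed input reaches its threshold."""
    return total_input >= threshold

def strengthen(strength, rate=0.25):
    """When sender and recipient are active together, the connection
    between them strengthens (capped at 1 here; rate is an assumption)."""
    return min(1.0, strength + rate)

# A node firing at level 1 over a connection of strength .5 delivers
# .5 to the recipient, as in the example in the text.
print(transmit(1.0, 0.5))  # 0.5
print(fires(0.5))          # False: .5 is below a threshold of 1
# Two such inputs summed together do reach the threshold.
print(fires(transmit(1.0, 0.5) + transmit(1.0, 0.5)))  # True
print(strengthen(0.5))     # 0.75
```

Error correction in a full model is just repeated application of the last operation (and its inverse) until the circuit's outputs match the target behavior.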
Connectionist modeling has two broad purposes. One relates
to artificial intelligence. The other relates to actual human cognition. Insofar as connectionist models are designed to explain
human cognition, the models are constrained by properties of
human behavior. Take, for example, a connectionist model
of plural formation in English. If a connectionist is merely setting out to create a program that generates plurals, he or she
does not need to worry about the precise sorts of errors actual
human beings make with plurals, the way plural usage develops
in childhood, and so on. However, a connectionist who is modeling actual human language will wish to design a system that
produces the same curve of correct plurals and errors that we
find among real people.

Connectionism and Neuroscience


The artificial intelligence value of connectionism seems clear.
But with respect to human language, one might ask why bother
with connectionist modeling at all? Why don't we simply do
neuroscience? After all, connectionism takes up the basic principles of neurobiology: neuronal units, firing thresholds, circuits. However, it tends to eschew the fine-grained, empirically
based assignment of specialized neuronal or regional functions.
Moreover, it simply leaves out such important components of
neurobiology as neurochemistry.
Certainly, connectionist modeling of human cognitive architecture cannot replace neuroscience. Moreover, it does seem
clear that such modeling should follow the basic principles of
neuroscience (e.g., in modeling human language, it should not
posit processes that have no correlate in the brain). However,
connectionist analyses serve two purposes beyond empirical
neuroscience. The first is the general purpose of abstract modeling. Empirical neuroscientists are rightly concerned with trying to define just what particular neurons, circuits, and regions
of the brain do. Connectionist modeling is concerned with just
what kinds of things an architecture of this general sort can do.
Clearly, the actual human brain does not do everything that a
similarly structured connectionist model can do. However, the
actual human brain does some subset of those things. Thus,
abstract modeling tells us what we should not be investigating
in terms of neural operation (those things that no system of this
sort could ever do). It also tells us what sorts of things we should
not ignore in investigating neural operation (those things that we
may have thought were impossible for such a system, but which
connectionist models indicate are not impossible). For example,
it may be argued that connectionist networks have been used
successfully to model the emergence of semantic categories
in infancy, even when they have incorporated little in the way
of innate structure (see Rogers and McClelland 2004, 121–73;
innateness and innatism). In this way, connectionist modeling may serve as an important guide for the empirical investigation of the neural substrate of language.
The second function of connectionism is related. Despite
genuine advances in neurolinguistics in recent years (see brain
and language), there is considerable divergence of opinion in
the field and relatively little is well established. Apparently firm
ideas (e.g., about the precise location and function of broca's
area and wernicke's area) are continually being revised. It
seems highly unlikely that our fundamental understanding of
neurons, neuronal circuits, and other basic functional properties
will change very much. But particular neuroscientific accounts
of language operation are far less stable. Neuroscientists rightly
tend to follow currently promising avenues of research, pursuing particular empirical hypotheses in line with recent theories.
But, as a wise man (Hilary Putnam) once remarked, one of the
few things we know about our current theories is that they are
wrong. Given this, it is valuable to have a research program that
operates on the same basic, well-established principles as neurolinguistics, but which is not closely tied to the vagaries of current theorization.

Connectionism and Symbolic Architectures


Another difference between connectionism and neurobiology is
that connectionist models move, so to speak, in the direction of
symbolic architectures (what we commonly think of as mental
architectures, rather than neurobiological ones). It is common
for connectionists to define their models in opposition to those
of symbolic architectures, such as generative grammar.
Symbolic architectures are commonly thought to operate serially, rather than in parallel, and to operate locally, rather than in
a distributed fashion. So a symbolic architecture would typically
be understood as involving some singular representation (e.g.,
a concept or a sentence) that is put through a series of processes in a certain sequence, following certain rules, with rules
sometimes also (putatively) rejected in connectionist models.
Conceived of in absolute terms, the dichotomy is a false one.
First, there is a sequential element in all connectionist models. There would be no sense in speaking of input and output
otherwise. Second, there is a parallel element in symbolic architectures, at least when they are fully elaborated. For example,
in minimalism, the production of a sentence involves at least
some phonological, logical, and syntactic parallelism. Moreover,
connectionist networks clearly involve rules (even rules that
constitute operations over variables, the key set of rules stressed
by Gary Marcus [2001; see particularly 35–83]). The rules are
embodied in the ways activation operates (e.g., through summation of inputs to thresholds).
Finally, it is not entirely clear that distributed versus unified
is really an opposition. For example, lexical items have components in symbolic accounts (e.g., semantic features). Writers in
the tradition of symbolic architecture do seem to envision those
components as occurring in one place, much like items in a dictionary entry occur on a single page. However, in terms of the
theories themselves, that only means that the components are
conceptually related, accessed together, and so on. A symbolic
architecture need not be committed to localism in the neural
substrate. The meaning of a given symbol may be realized in a
pattern of activation across different areas of the brain.
This is not to say that there are no differences between connectionist and symbolic accounts. But many of these are just the
general sort of differences that arise when one moves through
distinct levels of structure. For example, connectionist models
stress dispersal. Symbolic accounts stress unities. But this need
not constitute a contradiction any more than differences between
physical and social accounts or macroscopic and microscopic
accounts need constitute a contradiction. We don't stop speaking
of trees and discussing their biology or their gross physical properties just because we discover that they are composed of atoms.
Nor do we say that a railway system is not reasonably treated as a
single thing just because it is dispersed in space. A similar point
holds for rules. It is indeed the case that symbolic systems tend to
involve many more rules and much more specific rules than connectionist architectures. But these rules are reasonably thought
of as emerging from neural networks. The existence of a neural
substrate without, for example, specific grammatical rules does
not invalidate a linguistic discussion of grammatical principles
any more than the existence of a particle substrate invalidates an
engineer's discussion of macroscopic causal laws.
Consider, for example, the head-directionality parameter
in principles and parameters theory. As Mark Baker
explains in his entry on parameters, "Roughly put, when a
word-level category X [the head] merges with a phrase Y to create
a phrase of type X, there are two ways that the elements can be
ordered: The order can be X-Y within XP [X phrase], or it can be
Y-X." In languages throughout the world, heads tend to be added
in the same position, either first or last (not half in each, as one
might expect from random distribution). Grossly oversimplifying, we could imagine a connectionist model in which the networks for a range of processes develop some alternation between
initial position and final position. For example, it is easy to set up
a model in which some set of items triggers a directionality node,
which in turn activates either a beginning or an end node. Suppose
it activates the beginning node. Once that node is activated in
the context of a task (say, identifying a determiner), it will lead
to a behavioral output of checking the beginning. The absence
of the determiner at the beginning will initiate the correction
mechanism. Given a general relation between beginning and
end nodes within the network, that mechanism could by default
lead to the activation of the end node. Computer simulations of
connectionist models could be used to examine whether this
might lead to a coordination of different heads as first or last. A
successful connectionist model would not imply that one should
not speak of a principle here (heads are added in one direction)
with a parameter to be set (first/last). Rather, it would provide a
model for the substrate of an emergent principle.
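The thought experiment above can be turned into a toy program. Everything here (the class name, the flip-on-failure rule, the two-word phrases) is an assumption made for illustration; it models only the single shared directionality setting described in the text, not a serious grammar.

```python
# A toy version of the thought experiment: a single directionality
# setting shared across head types, flipped by the correction
# mechanism when checking the default position fails.

class DirectionalityNetwork:
    def __init__(self):
        self.position = "beginning"  # the default node activated first

    def locate(self, phrase, head):
        """Check the currently active position for the head; on failure,
        the correction mechanism activates the other position node."""
        index = 0 if self.position == "beginning" else -1
        if phrase[index] != head:
            # correction: the default relation between the beginning
            # and end nodes activates the opposite node
            self.position = "end" if self.position == "beginning" else "beginning"
        return self.position

net = DirectionalityNetwork()
# A head-final phrase: the determiner is absent at the beginning,
# so the correction mechanism flips the setting to "end" ...
print(net.locate(["noun", "det"], "det"))    # end
# ... and the same setting now coordinates other head types too.
print(net.locate(["noun", "verb"], "verb"))  # end
```

Because the one `position` setting is consulted for every head type, the toy exhibits exactly the coordination (all heads first, or all heads last) that the principle-plus-parameter description captures at the emergent level.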
This account is, as I said, grossly oversimplified. For example,
it assumes that the system collects together all heads, rather than
treating different sorts of heads separately, or organizing grammar in ways that do not even include heads. This is
a very small instance of the sort of problem Steven Pinker has in
mind when he questions whether or not connectionist models can
scale up to the level of a full grammar (2002, 79). Pinker offers
this criticism in the context of supporting a symbolic approach to
grammar. But here, too, the opposition is mistaken. If we assume
that the substrate of any grammatical operation is the human
brain, then something along the lines of a connectionist network
will have to scale up to the level of a full grammar. As Marcus puts
it, referring to a specific case, "The right question is not 'Can any
connectionist model capture the facts of inflection?' but rather
'What design features must a connectionist model that captures
the facts of inflection incorporate?'" (2001, 83).
Of course, Pinker has in mind something else here: innatism.
It is true that symbolic approaches to language have often posited
a rich innate grammar, whereas connectionist approaches have
tended to minimize innatism. In general, it is valuable to have conflicting views on this issue driving competing research programs.
However, it seems that the association of innatism with symbolism,
on the one hand, and the association of noninnatist learning with
connectionism, on the other hand, is contingent. There is no reason that one could not have specializations and biases in a connectionist network (cf. Martindale 1995, 250). These would, of course,
not be rules in the symbolic sense, but they would be equivalent
to rules at the level of the emergent structure. Conversely, there is
no reason that a symbolic account of grammar as such requires
rich innatism. It could begin from a view that grammar is learned
through experience, given general learning mechanisms (perhaps
of the sort currently associated with connectionism).

An Example from Truth Conditional Semantics

Consider a variation on a standard example: exclusive or (i.e., one of two items, but not both) embedded, in this case, in a conditional assertion. Conditionals and logical connectives, such as or, are important not only in the semantics of formal languages but also in the semantics of natural languages. For example, they are critical for truth conditional semantics.

Take a sentence such as "If x is a farm animal, then x is oviparous or viviparous." The semantics of this sentence are, in fact, enormously complex. A full connectionist model would have to account for many things. For the moment, let's assume that most of this is taken care of, focusing only on the truth condition issue. In a truth conditional account, we need to map the sentence onto two values. These are T (or true) and F (or false). There are only two conditions in which the sentence is mapped onto F: that is, when x is a farm animal and x is neither oviparous nor viviparous (e.g., it is an amoeba farm) and, more importantly, when x is a farm animal and x is both oviparous and viviparous. The former case is the obvious one, since it does not rely on the peculiar nature of exclusive or (or the existence of simultaneously oviparous and viviparous creatures). However, real people at some point come to understand the second possible falsity as well. For example, they come to draw an inference very swiftly when they learn that x is oviparous (i.e., they infer that x is not viviparous) or when they learn that x is viviparous (i.e., they infer that x is not oviparous). A connectionist model should produce this result as well.

To model exclusive or, we begin with input nodes. In a full model, we would need an array of nodes to represent farm, another array for animal, and so on. But, for simplicity, let's assume one node per word or object. Specifically, let's take the case of Fluffy. Little Buffy arrives on the farm and encounters Fluffy. She already has a node for farm animal. Farm is activated by her presence in the barn. Animal is activated by various properties of Fluffy. Together, these activate farm animal. Fluffy is assigned a new node. The activation of the (new) Fluffy node along with the farm animal node serves to link the two. Buffy already knows that animals either have eggs or have babies. Thus, part of her stable knowledge involves the relation of exclusive disjunction between oviparous and viviparous. (Obviously, it doesn't matter if she knows these particular words.)

This may be modeled in the following way (see Figure 1). Both oviparous and viviparous have their own nodes (marked o and v). There are also nodes for nonoviparous (-o) and nonviviparous (-v). Oviparous has an excitatory connection (marked by a pointed arrow) with nonviviparous, which, in turn, has an inhibitory connection with viviparous (marked by an arrow with a circular head). Similarly, viviparous has an excitatory connection with nonoviparous, which, in turn, has an inhibitory connection with oviparous. Thus, when Buffy sees Fluffy sitting on a nestful of eggs, she infers not only that Fluffy is oviparous but also that she is nonviviparous. (Moreover, due to inhibition, she does not infer that Fluffy is viviparous, even though viviparous might have received some degree of activation simply from the [unpictured] activations of mother, etc., in connection with Fluffy.)

Figure 1. A simplified model for exclusive or in the context of "oviparous or viviparous." Arrows indicate excitatory connections. Lines with solid circles indicate inhibitory connections. The symbol - stands for not; o stands for oviparous; v for viviparous. Note that this model is not intended to capture the truth conditions per se. Rather, it models the psychological process idealized in the truth conditions. Thus, a disruption in a particular connection could lead someone to hold logically contradictory beliefs.

Figure 2. A simplified model of the psychologically operative truth conditions for "If x is a farm animal, then x is oviparous or viviparous." The letter c stands for the entire conditional. T stands for true and F for false. The letters o and v stand for oviparous and viviparous; - stands for not; and -ov stands for neither oviparous nor viviparous. Arrows indicate excitatory relations. Lines with solid circles indicate inhibitory connections. Solid lines indicate a connection strength of 1. Broken lines indicate a connection strength of .5. Activation of a node occurs at 1. Thus, for example, both -o and -v would have to be activated for -ov to be activated. This is because the .5 connection strengths reduce the activation they communicate to .5. Note that this is not a model of logical/empirical truth conditions but of psychologically operative truth conditions. Thus, it seeks to capture an initial presumption about the truth of the conditional based, in this case, on authority. This is presumably more psychologically realistic than a model that yields T only in cases of inductive validity.
Going further, we might model the entire sentence. There
are many ways in which this can be done. The following is an
attempt at a version that, while extremely simple, at least points
toward the way disjunctive or might operate psychologically
(see Figure 2). Suppose that when the node for our oviparous/
viviparous conditional (c) is activated, the truth node (T) is also
activated. For instance, if Uncle Bob tells Buffy that farm animals
either lay eggs or have babies, but not both, then Buffy assumes
that Uncle Bob is right. Buffy does not require positive evidence
to judge the conditional to be true. However, Buffy may discover
that Uncle Bob is secretly farming amoeba, which are nonoviparous and nonviviparous, or that he is farming some mutant species that lays some eggs and gives birth to some babies. In the
latter case, the simultaneous activation of oviparous and viviparous could be modeled as activating F (through connections of
strength .5). In a slightly more complex sequence, positive activation of both nonoviparous and nonviviparous might activate a
hidden node (say, -ov) through .5 connection strength links; this
node could, in turn, activate F (through a connection strength of
1). Finally, in this model, F would inhibit T.
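The connection strengths and the activation threshold just described can be simulated directly. The following sketch is my own illustration, not part of the entry: a simplified localist network using the node names from Figure 2, in which the synchronous update rule and the number of sweeps are assumptions.

```python
# Toy simulation of the Figure 2 network: localist nodes, weighted
# excitatory/inhibitory links, activation threshold of 1.
def settle(active):
    """Propagate activation from an initially active (clamped) set of nodes."""
    # (source, target, weight); negative weight = inhibitory
    links = [
        ("c", "T", 1.0),                          # asserting the conditional activates T
        ("o", "F", 0.5), ("v", "F", 0.5),         # both o and v needed to reach F
        ("-o", "-ov", 0.5), ("-v", "-ov", 0.5),   # both -o and -v needed for -ov
        ("-ov", "F", 1.0),
        ("F", "T", -1.0),                         # F inhibits T
    ]
    act = {n: 0.0 for n in "c T F o v -o -v -ov".split()}
    for n in active:
        act[n] = 1.0
    for _ in range(3):  # a few synchronous update sweeps suffice here
        incoming = {n: 0.0 for n in act}
        for src, tgt, w in links:
            if act[src] >= 1.0:
                incoming[tgt] += w
        for n in active:
            incoming[n] = max(incoming[n], 1.0)   # keep clamped inputs on
        act = {n: (1.0 if incoming[n] >= 1.0 else 0.0) for n in act}
    return act

# Uncle Bob asserts the conditional: Buffy presumes it true.
print(settle({"c"})["T"])                 # 1.0
# A mutant that both lays eggs and gives birth falsifies it.
res = settle({"c", "o", "v"})
print(res["F"], res["T"])                 # 1.0 0.0
# Neither oviparous nor viviparous (the amoebas) also falsifies it.
res = settle({"c", "-o", "-v"})
print(res["F"], res["T"])                 # 1.0 0.0
```

Note how the .5 weights implement the conjunctive requirement: a single falsifying cue cannot push F past threshold on its own.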
This is, of course, an extremely simple model. In fact, connectionist models of language phenomena are highly complex,
requiring computer simulations to plot their predictions and developments (see, for example, Rogers and McClelland 2004; see also connectionist models, language structure, and representation and connectionism and grammar). Nonetheless,
this model may partially illustrate some of the relations between
connectionist models, symbolic architectures, and neurobiology
outlined previously. For example, we do not seem to gain anything by denying that this complex of parallel and distributed processes
yields principles. More importantly, this account also indicates
that the standard logical rule for exclusive or may not be our actual
psychological rule. For example, this model partially incorporates
our common bias toward confirmation, rather than falsification or
neutrality (see, for example, Nisbett and Ross 1980, 238–42). The sorts of statistical processes manifest in fully developed connectionist models would be able to suggest other deviations from the logical rule as well: the place of exceptions, the degree to which
our memories often revert to the assumption of truth even after disconfirmation, and so on. Finally, such developments may suggest
avenues of inquiry for neurological investigations (e.g., in semantic
and episodic memory), which are not initially obvious in the study
of conditionals or logical connectives.
In sum, connectionism provides us with a way of modeling
language processing, storage, and acquisition; indeed, history,
evolution, variation, and other areas of language science as well.
It is inspired by neurobiology, but departs from this source to
follow a more independent research program that may direct
us toward productive areas of neurobiological research, or away
from unproductive areas. At the same time, it mediates between
neurobiology and symbolic approaches, in some cases suggesting problems with and possibilities for the latter as well as the
former. Although connectionist and symbolic architectures are
often viewed as diametrically opposed, it may be best to see the
former as modeling neurobiological substrates of language and
the latter as treating (rule-approximating) structures that emerge
from that substrate.
Patrick Colm Hogan
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Marcus, Gary. 2001. The Algebraic Mind: Integrating Connectionism and
Cognitive Science. Cambridge, MA: MIT Press.
Martindale, Colin. 1995. Creativity and connectionism. In The Creative
Cognition Approach, ed. Steven Smith, Thomas Ward, and Ronald
Finke, 240–68. Cambridge, MA: MIT Press.
McLeod, Peter, Kim Plunkett, and Edmund Rolls. 1998. Introduction
to Connectionist Modelling of Cognitive Processes. Oxford: Oxford
University Press.
Nisbett, Richard, and Lee Ross. 1980. Human Inference: Strategies and
Shortcomings of Social Judgment. Englewood Cliffs, NJ: Prentice-Hall.
Pinker, Steven. 2002. The Blank Slate: The Modern Denial of Human
Nature. New York: Viking.
Rogers, Timothy, and James McClelland. 2004. Semantic Cognition: A
Parallel Distributed Processing Approach. Cambridge, MA: MIT Press.

CONNECTIONIST MODELS, LANGUAGE STRUCTURE, AND REPRESENTATION
Connectionist models have many desirable properties when
it comes to understanding several very basic things about
language:
Why language has the structure that it does;
How knowledge of language structure is represented so as to
flexibly encompass the sort of structure that it has;
How language can change over time in a direction that maintains and accentuates this sort of structure.



Connectionist models also have desirable properties when it
comes to understanding language acquisition.

Quasi Regularity
A key, but underrecognized, aspect of all natural languages is
the fact that they exhibit quasi regularity. That is, linguistic expressions generally have properties that are shared with many other
forms, while also having properties that are more idiosyncratic.
In general, they are neither completely regular nor completely
irregular, but instead are best understood as lying somewhere
along a continuum of regularity, with most forms somewhere
between the extremes. Languages do have a tendency to pull
novel and infrequent forms into conformity with other forms, but
they also have a tendency to maintain (even promote) forms with
item-specific idiosyncrasies and clusters of similar items that
share such idiosyncrasies. These idiosyncrasies coexist within
items that also exhibit sensitivity to more general regularities.
Let's consider a range of examples from several different subdomains of language:
Inflectional morphology
keep-kept
tell-told
say-said
have-had
All these cases exhibit the correct regular past tense inflection together with a vowel change or consonant deletion in the
stem. The pattern in keep is shared with a number of other verbs,
including sleep, creep, sweep, and somewhat more distantly
dream, kneel, and mean.
Derivational morphology
predict, prefer
dirty, rosy
idolize, replicate
As many have noted, and Joan L. Bybee (1985) and Luigi Burzio
(2002) have considered extensively, there are many derived
inflectional forms that preserve semantic characteristics of
their constituents but bring in altered or additional elements of
meaning. Treating these items as either fully compositional or
as fully opaque misses out on capturing one or the other aspects
of their meaning.
Idioms, constructions, and collocations
I want to see a doctor.
She felt the baby kick.
Hour by hour I grow more and more lonely.
Such expressions nearly always participate in general patterns
characteristic of other expressions with overlapping constituents
(I saw a doctor, I saw a movie, I want to see a baseball game) but
carry idiosyncratic meaning.
Spelling-sound correspondences
pint
clown
great
Consider PINT. This item is an exception, but three of the letters
(P, N, and T) take their regular correspondences, and there are

related cases, such as MIND, FIND, MILD, and so on, that overlap with PINT and share the atypical vowel correspondence seen
in this item. The other cases have similar properties.
The point, in all of these cases, is that an inherent feature
of language is its quasi regularity: irregular forms are nearly
always partially regular. Approaches to language that relegate
all quasi-regular forms to the status of exceptions are, in general,
eschewing an account of an important and productive aspect of
language.

Connectionist Models and Quasi Regularity


Connectionist models offer ways of representing language
knowledge and language learning that explain why it has the
quasi-regular structure that it does and why it changes in some
of the ways that it does. No extant connectionist model is perfect
in its account of any specific aspect of language, but such models
have the fundamental properties necessary for addressing this
very general property, its quasi-regular structure, justifying their
further development.
I present the principles of distributed connectionist models
within the context of a simple model of single word reading. I
use the model to underscore the point that the knowledge that
governs processing in connectionist models is fundamentally
different from the knowledge that linguists have traditionally
attributed to speakers and hearers, but it is ideally suited to
addressing the previous points, and thus is a candidate form of
knowledge for all aspects of language.
The essence of connectionist models is that they provide a
mechanism that is at once sensitive to both general and specific
information in ways that mirror human sensitivity to such information. A connectionist model can behave as though it knows
very general and completely idiosyncratic information; and it
can exploit both general and specific information in processing
novel forms in ways that appear to mirror those seen in normal
human subjects. The model I describe is a model of single word
reading (Plaut et al. 1996). It has input units representing graphemes, output units representing phonemes, and one intermediate layer of hidden units that are initially uncommitted.
In networks such as this, the knowledge that governs processing of an item is in the strengths of the connections among the
simple processing units. Knowledge enters the system gradually, through a connection adjustment process, based on experience with corresponding inputs and outputs. The experience is
assumed to mirror human experience in reading in that frequent
items (words like HAVE and TAKE) occur very frequently, while
less frequent items (words like LINT and COPE) occur far less
frequently.
Processing and learning take place as follows: The letters from
a word are presented over the input units; activation is allowed
to propagate forward to the output units; and adjustments are
then made to the connections within the network to reduce the
difference between the output produced by the network and the
pronunciation paired with the spelling. Note that this process is
not thought of as explicit correction of overt errors in the model's performance. The network's outputs can be translated into an
overt product, but learning is thought to occur from exposure
to spelling-sound pairs. (It is possible that once the process gets
started, many of the sounds are provided internally as a result of



the reader's ability to predict the correct word based on
context. Of course, at first this is unlikely, and learning depends
on there being a supportive context in which the child hears the
sound associated with each word as it is being read by a parent
or teacher).
Consider the learning that occurs as a result of processing an
ordinary pair such as MINT /mint/: It affects weights coming
from units for M, I, N, and T and going to units for /m/, /I/, /n/,
and /t/. Over the corpus, words with M in the input will overwhelmingly have /m/ in the output and little else will be common across them, and so connections will gradually arise that
map M reliably to /m/, and similarly for most of the other letters. The result is that the weights coding for the relation between
spellings that begin with M and sounds beginning with /m/ will
largely be restricted to the weights out of the M unit and into the
/m/ unit. But now consider the vowel. The situation here is far
more complex. Depending largely on aspects of the coda, it will
tend to be pronounced either as /ai/ or as /I/: The former occurs
when there is a single consonant followed by a final E, as in MINE,
FINE, DINE, and so on, while the latter tends to occur when the I
is followed only by consonants as in MINT, LINK, SILT, and so on.
But there are several exceptions, including MIND, WIND; MILD,
CHILD; PINT. The network must learn to rely on the presence of
the final E to signal the /ai/ correspondence; to generally produce
/I/ otherwise; and to overcome this tendency in the specific constellation of circumstances corresponding to the exceptions.
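The error-driven adjustment process described here can be sketched in miniature. The following toy network is my own illustration, far simpler than the Plaut et al. model: a bare delta rule with no hidden layer, a five-letter alphabet, and an invented four-word corpus (all simplifying assumptions). It shows how a reliable letter-sound connection such as M to /m/ emerges from gradual weight adjustment alone.

```python
# Toy delta-rule learner for spelling-to-sound connections.
import random

letters = ["M", "I", "N", "T", "E"]       # toy input units (graphemes)
phonemes = ["m", "I", "ai", "n", "t"]     # toy output units
# weights[g][p]: connection strength from grapheme g to phoneme p
weights = {g: {p: 0.0 for p in phonemes} for g in letters}

def forward(spelling):
    """Propagate activation from the active letter units to the phoneme units."""
    return {p: sum(weights[g][p] for g in spelling) for p in phonemes}

def learn(spelling, pronunciation, rate=0.1):
    """Adjust connections to reduce the output/pronunciation difference."""
    out = forward(spelling)
    target = {p: (1.0 if p in pronunciation else 0.0) for p in phonemes}
    for g in spelling:
        for p in phonemes:
            weights[g][p] += rate * (target[p] - out[p])

corpus = [(("M", "I", "N", "T"), {"m", "I", "n", "t"}),
          (("M", "I", "N", "E"), {"m", "ai", "n"}),   # final E signals /ai/
          (("T", "I", "N"), {"t", "I", "n"}),
          (("M", "I", "T", "E"), {"m", "ai", "t"})]
random.seed(0)
for _ in range(500):
    learn(*random.choice(corpus))

# M comes to excite /m/ far more than any other phoneme:
print(max(weights["M"], key=weights["M"].get))   # 'm'
```

Because exceptions like PINT require conjunctive sensitivity to the whole letter string, a network like this one, lacking hidden units, cannot capture them; that is precisely the work the hidden layer does in the actual model.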
As a result of the gradual learning process, the weights coding
for the pronunciation of the vowel in PINT will depend not only
on the letter I but also on all of the other letters in PINT. While
one way in which this could occur would be for the network to
carve out a separate hidden unit for it, that is not what it tends
to do; the knowledge appears to be distributed over connections
into and out of many of the hidden units. The important points
here are the following:
In general, any and all aspects of an input can be relevant to
any aspect of the output, but complex cases are harder for the
network to master and must work against the grain of the general tendencies embodied in the training corpus.
For the most part, even when some aspects of the output (such as the vowel in an exception like PINT) depend on all aspects of the input, many other aspects (in this case, the handling of the P, N, and T) are largely componentially determined.
Secondarily but still important, even the exceptional aspect of
PINT benefits to a degree from the learning that occurs with
related exceptions like MIND, FIND, WIND, and so on.
The model accounts for
main effects of regularity, frequency, and frequency by regularity interactions in human RTs and in pattern of errors
(under conditions where errors are likely);
graded effects of consistency of a word with pronunciations of
other known words;
the pattern of performance exhibited by human adults in reading pronounceable nonwords that contain bodies that vary
in the consistency of their pronunciation. Specifically: The
model reads regular forms that are consistent with all of their

neighbors (items like BOPE) with a high degree of consistency with conventional spelling-sound rules, but shows considerable inconsistency in productions of forms such as:
GROOK, PREAD, MAVE

The pattern in the model, as in human readers, is to choose


one of the correspondences of the vowel that occurs either in
forms traditionally treated as regular or irregular, but not necessarily to choose the regular form.
The model exhibits sensitivity to the following:
First-order spelling-sound regularities:
M → /m/
B → /b/
Local context-sensitive patterns:
C → /s/ when followed by {I | E}, as in cell, cent
O → "oh" when followed by L{L|D|K|T}, as in told, colt
Partially consistent clusters, such as
OO{K, sometimes T, F} → {vowel in put}
EA{often D, TH, sometimes F} → {vowel in bread}
Exceptions that weakly cluster with others, such as PINT.
Idiosyncratic high-frequency exceptions like HAVE.
The model also accounts for the near (but not total) independence of onsets and rhymes, and the much greater dependency
between vowels and codas. This independence is not complete,
as indicated by a few special forms like wash, warm, and so on.
The model captures this sensitivity in these cases.
The model produces gradient effects of the degree of consistency with properties of words with similar rhyme-spellings
and of the items own frequency. The specific pattern of these
gradient effects corresponds to key features of experimental
data: Performance on items that are highly consistent with their
neighbors is relatively insensitive to the frequency of the item
itself (leading some to assert that such items are dealt with by
algebraic rules, but fully consistent with the properties of connectionist models). Performance on items that are inconsistent with
their neighbors is highly sensitive to the items own frequency.
Sensitivity to an items own frequency is a matter of degree itself
dependent on the degree of inconsistency.
The model does these things without having an explicit representation of either any specific lexical item or of any other
subword unit (other than the grapheme and the phoneme), and
without having an explicit representation of any of the rules or
correspondences.
It also has the very important property of using general knowledge not only to process the fully regular cases but also, to the
fullest extent possible, to process those items that are partially
exceptional: It is only the idiosyncratic aspects of exceptions that
are processed differently from fully regular forms. It thus captures the tendency for exceptions to be largely regular in nature,
the key, underappreciated feature of all languages with which I
began this entry.
The same general principles have been used to capture a wide
range of phenomena in reading, language processing, and many
other domains.
Here, I briefly mention several related models, in part to indicate the generality of the approach and in part to address concerns that some readers may have with some of the apparent commitments made by the reading model of D. C. Plaut and colleagues.

Models of past tense formation, based on mapping stem->past (e.g., Rumelhart and McClelland 1986; Plunkett and Marchman 1991). These models illustrate the same principles as the reading model already discussed. Both articles discuss how such models can capture sensitivity to quasi-regular patterns, and K. Plunkett and V. Marchman explain why there are so few suppletions (stem-past pairs like go-went) and those that exist are only of high frequency. Essentially, it is very hard work for a connectionist network to learn a completely arbitrary mapping. A tendency for weights to be shared across items forces these models to exhibit sensitivity to regularities, while the fact that they are trained gradually through small adjustments of connections makes them inherently sensitive to frequency effects.

Models of past tense formation, mapping from meaning to sound and sound to meaning (Joanisse and Seidenberg 1999; Lupyan and McClelland 2003). The same general principles are at work here as well, but the approach brings out the quasi-systematic relationship between semantic and syntactic properties of words, on the one hand, and their phonology, on the other. This line of work has not been pursued fully enough.

Models of derivational morphology (Plaut and Gonnerman 2000). These authors offer a model that maps from form to meaning, with such forms as government, predict, and so on. The model captures graded priming relations, depending on degree of compositionality. In ongoing work, Plaut and colleagues have considered other languages (e.g., Hebrew) that show stronger morphological effects. Here, one gets morphological priming even in cases where there is no actual morphological relationship. Extensions of the Plaut/Gonnerman model show that when trained with a more highly systematic corpus (characteristic of Hebrew), networks also exhibit this property. In other words, the network's tendency to parse morphologically complex forms into components (see the entries on parsing) reflects both the specific properties of individual items and the prevailing tendency found among items in the language.

Models of sentence processing (Elman 1990; McClelland, St. John, and Taraban 1989). Both models are trained with word sequences generated according to a simple grammar. J. L. Elman's model simply predicts the next word in the sequence after training with grammatical sequences. The model in J. L. McClelland, M. St. John, and R. Taraban maps from the word stream to a representation of the set of role-filler pairs characterizing the event described in a given input sentence. Both of these models illustrate many of the properties already considered, and operate strictly off of a linear sequence of inputs. There are also word reading models that read the sequence of sounds appropriate to a spelled word sequentially. This eliminates the concern that arises with some of the more parallel models that they're really hiding a specific commitment to structure in the representations of inputs and outputs.

Models that work from raw speech input (Keidel et al. 2003). There are several models, the first an unpublished effort by Elman from the late 1980s, in which the model works directly with recorded spoken language. More recent efforts (e.g., Keidel et al. 2003) along these lines are in their infancy, but when they have matured they will eliminate a residual drawback of all of the aforementioned models, namely, that they stipulate specific units on both their inputs and outputs. Models that work directly with raw speech afford the possibility of seeing phones, phonemes, and syllables as approximate descriptive conveniences similar to the other sorts of units we have already been discussing. This overcomes a contradiction inherent in the Plaut et al. reading models, and in most of the other models discussed: These models eschew units internally, on the one hand, but appear to depend on such units in their inputs and outputs, on the other. The computational demands of working directly with raw speech are daunting, but some progress is being made in ongoing work.
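The prediction task that Elman-style sentence-processing models perform can be sketched as follows. This toy is my own illustration and much simpler than Elman's simple recurrent network: a single-layer softmax next-word predictor, trained by error-driven weight adjustment on sequences from an invented toy grammar (vocabulary and grammar are assumptions for the sketch).

```python
# Toy next-word predictor trained on sequences from a simple grammar.
import math
import random

vocab = ["boy", "girl", "sees", "chases", "dog", "cat", "."]
idx = {w: i for i, w in enumerate(vocab)}

def sentence():
    # toy grammar: NOUN VERB NOUN .
    return [random.choice(["boy", "girl"]), random.choice(["sees", "chases"]),
            random.choice(["dog", "cat"]), "."]

V = len(vocab)
W = [[0.0] * V for _ in range(V)]   # W[prev][next]: connection strengths

def predict(prev):
    """Softmax distribution over the next word, given the previous word."""
    scores = W[idx[prev]]
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]

random.seed(1)
for _ in range(2000):                       # gradual, error-driven updates
    s = sentence()
    for prev, nxt in zip(s, s[1:]):
        p = predict(prev)
        for j in range(V):                  # cross-entropy gradient step
            target = 1.0 if j == idx[nxt] else 0.0
            W[idx[prev]][j] += 0.1 * (target - p[j])

# After "boy", the model concentrates its prediction on the verbs,
# without any explicit representation of the category "verb":
p = predict("boy")
print(vocab[p.index(max(p))] in {"sees", "chases"})   # True
```

As in the models discussed, categories like "verb" are descriptive conveniences here: the network only learns graded transition knowledge, from which the category-like behavior emerges.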
A radical connectionist hypothesis would be that approaches
to representing language structure and language knowledge that
eschew prior commitment to units of any type will ultimately be
the most informative and successful. It will, of course, always be
useful to summarize facts that characterize language more succinctly in the form of rules. Such rules, however, will ultimately
be seen only as summary descriptions that approximately capture important aspects of language, rather than as directly characterizing the way in which language knowledge is represented,
acquired, and used.

Rules or Connections?
S. Pinker reacted strongly against the connectionist model of D.
E. Rumelhart and McClelland (Pinker and Prince 1988), taking
strong exception to the notion that language exhibits the graded,
similarity-based properties characteristic of connectionist models. He subsequently (Pinker 1991, 1999) accepted that language
does indeed have some gradient-like properties similar to those
of neural networks, but maintained that such properties are
restricted to operating within the lexicon. He made the case that
there is a separate, pure, rule-based system, operating according to principles quite distinct from those characteristic of connectionist networks. He made his case on the basis of a series
of empirical claims, arguing for the special status of categorical, structure- but not content-sensitive rules in performance,
acquisition, and breakdown under brain damage, also claiming support from cross-linguistic evidence. McClelland and K.
Patterson (2002) evaluated all of these claims and found instead
that properties of performance, acquisition, breakdown, and
cross-language variation do not support the view that language is
based on categorical rules; rather, the evidence supports the idea
that language knowledge is graded, semantic- and phonological-content sensitive, and that there is no separate system for regular
as opposed to irregular aspects of language.
The fact that connectionist models can capture both systematic and idiosyncratic properties of linguistic forms leads to the
following suggestions:


There are no lexical entries in the mechanisms of language


processing as such, only sensitivity to the idiosyncratic properties of particular items.
The rules that characterize regularities in the relations
between, for example, sound and meaning (past tenses tend
to end in a variant of the dental stop) have no special status
either; however, they do capture the tendency that connectionist systems have to extend consistent relations among
constituent parts of expressions to new expressions containing the same constituents.
Similarly, other units besides the word, including subword
units such as phone, onset, rhyme, nucleus, coda, syllable, and morpheme, as well as supraword units, including
phrases, collocations, idioms, and constructions, may not
be stored as such; however, listing such units and noting the
regularities that describe their relations to other units may be
useful descriptively in characterizing aspects of the emergent
behavior of the system in which they need not be represented
as such.

A recent critique of connectionist approaches by Ray Jackendoff (2007) raises a number of concerns, centering around a perceived failure of connectionist models to be as systematic as Jackendoff takes natural language to be. In a reply, McClelland and J. Bybee (2007) address these concerns, emphasizing that connectionist models do impose a tendency toward systematicity and regularity, while they yet exploit quasi regularity in exceptions, something that the approaches of Pinker and Jackendoff are not able to do.
J. L. McClelland
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Bird, H., M. A. Lambon Ralph, M. S. Seidenberg, J. L. McClelland, and K. Patterson. 2003. Deficits in phonology and past-tense morphology: What's the connection? Journal of Memory and Language 48: 502–26.
Burzio, Luigi. 2002. Missing players: Phonology and the past-tense debate. Lingua 112: 157–99.
Bybee, Joan L. 1985. Morphology: A Study of the Relation between Meaning and Form. Philadelphia: John Benjamins.
Bybee, J., and J. L. McClelland. 2005. Alternatives to the combinatorial paradigm of linguistic theory based on domain general principles of human cognition. Linguistic Review 22: 381–410.
Elman, J. L. 1990. Finding structure in time. Cognitive Science 14: 179–211.
Jackendoff, Ray. 2007. Linguistics in cognitive science: The state of the art. Linguistic Review 24: 347–401.
Joanisse, M. F., and M. S. Seidenberg. 1999. Impairments in verb morphology following brain injury: A connectionist model. Proceedings of the National Academy of Sciences 96: 7592–7.
Keidel, J. L., J. D. Zevin, K. R. Kluender, and M. S. Seidenberg. 2003. Modeling the role of native language knowledge in perceiving nonnative speech contrasts. Proceedings of the 15th International Congress of Phonetic Sciences: 2221–4.
Lupyan, G., and J. L. McClelland. 2003. Did, made, had, said: Capturing quasi-regularity in exceptions. Proceedings of the 25th Annual Meeting of the Cognitive Science Society. Available online at: http://csjarchive.cogsci.rpi.edu/Proceedings/2003/mac/table.html.
McClelland, J. L., and J. Bybee. 2007. Gradience of gradience: A reply to Jackendoff. Linguistic Review 24: 437–55.
McClelland, J. L., and K. Patterson. 2002. Rules or connections in past-tense inflections: What does the evidence rule out? Trends in Cognitive Sciences 6.11: 465–72.
McClelland, J. L., M. St. John, and R. Taraban. 1989. Sentence comprehension: A parallel distributed processing approach. Language and Cognitive Processes 4: 287–335.
Pinker, S. 1991. Rules of language. Science 253: 530–5.
Pinker, S. 1999. Words and Rules. New York: Basic Books.
Pinker, S., and A. Prince. 1988. On language and connectionism: Analysis of a parallel distributed processing model of language acquisition. Cognition 28: 73–193.
Plaut, D. C., and L. M. Gonnerman. 2000. Are non-semantic morphological effects incompatible with a distributed connectionist approach to lexical processing? Language and Cognitive Processes 15: 445–85.
Plaut, D. C., J. L. McClelland, M. S. Seidenberg, and K. Patterson. 1996. Understanding normal and impaired word reading: Computational principles in quasi-regular domains. Psychological Review 103: 56–115.
Plunkett, K., and V. Marchman. 1991. U-shaped learning and frequency effects in a multi-layered perceptron: Implications for child language acquisition. Cognition 38: 43–102.
Rumelhart, D. E., and J. L. McClelland. 1986. On learning past tenses of English verbs. In Parallel Distributed Processing. Vol. 2: Psychological and Biological Models. Ed. D. E. Rumelhart and J. L. McClelland. Cambridge, MA: MIT Press.


CONSCIOUSNESS AND LANGUAGE


John Locke, one of the founders of the study of meaning, took it
for granted that consciousness has to play a role in our understanding of language. Ultimately, linguistic signs have their
meanings in virtue of their connections to perception. So, for
example, the meanings of the color words can only be grasped by
someone who has had experience of the colors. This is the general case. Although the meanings of many words are grasped by
grasping their connections to other words, ultimately the whole
system has to bottom out in connections to perception. The general picture was that our perceptual experiences are the contact
point between us and the world. When the world affects us causally in such a way that we can think and talk about it, it does so by
affecting our perceptual experiences. These experiences, therefore, can serve as signs of the external phenomena that cause
them (somewhat as smoke can serve as a sign of fire), and their
status as signs is inherited by any further linguistic signs that we
use as a consequence of having those experiences (Locke [1690]
1975). Something like this general picture is still popular today,
partly as a result of Saul Kripke's (1980) reinvigoration of a causal
view of reference. The principal modification is that the role of
experience in perception is given little weight; consciousness is
regarded as perhaps an inessential epiphenomenon in the process and as having little to do with our grasp of language.
There are two lines of argument behind this elimination of
consciousness from the study of language. One is the direction
taken by the scientific study of vision. On the cognitive science
approach exemplified by David Marr (1982), vision has to be
understood at three levels: the level of computation, at which we
specify the task being performed by a visual system; the level of
algorithm, at which we specify just how the task is performed;
and the level of implementation, at which we specify the neural
basis of the system. It is usually assumed that there is some sense



in which we find the mechanisms of causation only at the level of
implementation. Only at that level, the description of the biology
of the system, do we find interactions among physical units in
virtue of which the whole process counts as causal, even if there
is some secondary sense in which the higher-level descriptions
of the system can be said to be of causal significance. Conscious
experience as such does not enter into the characterization of
any of these levels, and it is hard to see how its causal significance could be recognized. Consciousness seems like a kind of
miasma that might pervade the system, but of no significance for
its functioning.
The second line of argument for the elimination of consciousness from the study of language is in the philosophical work of
Gottlob Frege. Frege's point was that no one can really know
what experiences someone else is having, and so if consciousness did play a key role in an understanding of language, communication would be impossible (1952). You would be trying
to convey something whose significance ultimately had to do
with the nature of your experiences, and I would have no way of
knowing what you were talking about. So the shareable, public
phenomena of communication and meaning have to be sharply
differentiated from the realm of conscious experience since the
nature of an experience can be known only to the person having it. Frege's point was pursued by the later Wittgenstein, who
argued that meaning cannot be grounded in the contents of the
stream of consciousness. Ludwig Wittgenstein (1953) argued
that the conception of meaning as grounded in the idiosyncratic
experiences that each of us has ultimately makes no sense; the
notion of experience that this conception uses is incoherent (see
private language argument). To characterize the meaning
of a sign, you have to look at its role in communication, and there
is no special role here for talk about conscious states.
These two lines of thought, from cognitive science and from
the work of Frege and Wittgenstein, are evidently forceful. There
is, nonetheless, a line of argument that asks: Suppose we do
eliminate consciousness from the study of language. Will we
be left with enough that we can give a recognizable account of
meaning and understanding? The argument was given its classic formulation by John Searle (1980). Searle's question was
put as a problem for computational approaches to meaning
and understanding. Suppose, as a premise for reductio, that we
have a computer that understands a language, say Chinese. The
operations of the computer can be described at Marrs levels of
computation, algorithm, and implementation. Fundamentally,
what we have is a machine capable of complex symbol manipulation. Can this complex symbol manipulation add up to understanding? Suppose we have an ordinary speaker of English
who understands no Chinese. We put this speaker into a room
with a massive rule book written in English that tells him what
operations to perform with Chinese symbols when he encounters them. In effect, he mimics the operations of the computer.
We feed Chinese messages into the room, and he manipulates
the symbols and outputs fresh symbols in response. Could any
amount of this kind of thing add up to an understanding of
Chinese? Well, plainly our English speaker has no understanding of Chinese after all this. So where is an understanding of
Chinese to be found? Despite an avalanche of literature written
in response, there has been no consensus about any particular reply to the argument. It is hard to vanquish the sense that no amount of this kind of thing is going to be enough on its own to
constitute an understanding of language.
Searle (1990) argued that intentional states must, in principle,
be accessible to consciousness, that insofar as we have what he
called "original intentionality" rather than merely some observer-relative ascription of intentionality, the state must be one of
which the subject could become aware. You might think this
suggests one diagnosis of what is missing in the Chinese Room,
where there is no such thing as the possibility of anyone becoming aware of the content held by the messages in Chinese. So
these meanings cannot be the contents of anyone's intentional states. That is, in Searle's scenario, there is no such thing as anyone grasping the meanings of Chinese statements.
The problem with this diagnosis is that it is hard to see its connection with what Searle himself highlighted as the moral of the
Chinese Room: that no amount of syntactic manipulation can
add up to a grasp of semantics, to a knowledge of what is being
talked about and what is being said about it. It is hard to see how
consciousness of the very intentional state itself does any work in
providing one with knowledge of what it is about. We have so far
no explanation of why consciousness should matter for the grasp
of semantics.
The traditional idea that consciousness plays a role in the
grasp of meaning had been developed in a somewhat different
direction by Bertrand Russell (1912) and G. E. Moore (1903).
Here, the emphasis is not on intentional states being conscious
but, rather, on consciousness of the things about which one is
talking or thinking. These are different phenomena. You might
have a conscious belief about the numbers, without being conscious of the numbers. And you might have unconscious beliefs
about something of which you are conscious, for example, an
unconscious belief that the man before you is dangerous. Russell
argued that all grasp of meaning must be provided by what he
called "acquaintance" with the objects referred to: "We must attach some meaning to the words we use, if we are to speak significantly and not utter mere noise; and the meaning we attach to our words must be something with which we are acquainted" (Russell 1912, 25). Acquaintance here is a matter of consciousness of the objects talked about: "We shall say that we have acquaintance with anything of which we are directly aware, without the intermediary of any process of inference or any knowledge of truth" (ibid., 25). Russell's idea here is evidently intuitive. The
problem is that he thought one could only be acquainted with
one's own sense-data and, possibly, oneself. This greatly restricts
the possible topics of conversation and seems to run immediately
into problems in explaining how meaning is communicable. The
natural proposal is that the range of acquaintance should be
broader; it should include the physical objects around one. What
is missing in the Chinese Room is any awareness of the objects
around the room.
It is not immediately obvious how this way of finding a role for
consciousness is supposed to work. If consciousness is thought
of as a matter of having sensations produced by the physical
world, then how exactly could the having of those sensations be
what enabled one to think about the objects around one? If, on
the other hand, consciousness of the objects around one is taken
to be a matter of thinking about those objects, then it simply presupposes capacities for thought and talk about those objects
and can't play any role in explaining those capacities.
Still, there is something intuitive about the idea that what is
missing in the Chinese Room is some awareness of the things
under discussion, that if that were provided, we might have the
foundation for an understanding of the language being used. In
fact, Moore articulated the needed conception of awareness of
objects in a famous article, "The refutation of idealism" (1903).
Moore thought that in any analysis of sensation, we have to recognize that there are two elements. There is, first, the generic relation, "consciousness of," in respect of which all sensations are
alike. This is a relation between the subject and something else,
the object of the sensation. The experiences of a single subject
are always alike in that it is always the same relation, consciousness, that is involved. The subjects experiences differ only in
that the objects of different experiences may be different. So, for
example, an experience of blueness and an experience of greenness differ in the things of which they are experiences (which
things they are encounters with); they are the same in that it is
the same generic relation of consciousness that is in question.
So the picture is
Subject – consciousness – object

The object varies, but the relation is always the same. Moore's point now is that there is no reason to think that the object must always be some psychological state. He writes, "I am as directly aware of the existence of material things in space as of my own sensations" (1903, 453). If we can indeed appeal to this generic relation of consciousness to external phenomena, doesn't this
give us our account of what is needed to provide a semantic
understanding of the terms we use?
For this approach to work, there has to be some response to
the lines of argument noted earlier. First, if consciousness does
do any work in our understanding of language, then it must do
some causal work. It must make some difference to what happens. But if we think of causality as a matter fundamentally of
the mechanistic interactions of physical particles, then how can
consciousness be causally significant? At best, it will seem that
an appeal to consciousness is an appeal to some ghostly quasi-mechanism. However, it is arguable that the trouble here is not
the appeal to consciousness but the mechanistic conception of
causation. The mechanistic conception is independently objectionable (cf., e.g., Woodward 2003). Arguably, we should think of
X causing Y as a matter of what would have happened to Y had
things been different with X. And we can certainly make sense of
the idea that things would have gone differently had one not been
conscious of this or that external phenomenon. So the idea that
the grasp of semantics is provided by experience of the things we
talk about does not seem to face any intractable difficulty over
the possibility of a causal role for consciousness. The idea that
consciousness as such must be a mere epiphenomenon has, in
any case, little intuitive appeal.
There were two classical objections to the idea that consciousness plays a role in our understanding of language. The first was
the point about causality. The second objection was that this
idea would make communication impossible; we would end up
each talking about only our own sensations. But this objection
disappears when we leave behind Russell's view that we can be acquainted only with our own sensations. Once we have Moore's picture, on which there is a generic relation of consciousness that
each of us can stand in to ordinary objects and properties, we can
see that consciousness of those objects and properties may be
what makes it possible for us all to be talking and thinking about
the same world. Suppose that, as Moore thought, all experiences
are alike in respect of the generic relation of consciousness and
differ only in their objects. Then, whether your experiences are
the same as mine can be a question only about the objects or
properties of which we are conscious. And we can know that we
are conscious of the very same ordinary physical things and their
properties.
There is, indeed, a traditional metaphysical issue in the background at this point. In the seventeenth century, advocates of
mathematical physics were at pains to stress that in the scientific image of the world, there were only atoms and the void; the
ordinary objects and properties about which we think and talk
had vanished. So the ordinary objects and properties could only
be projections of the mind onto the underlying scientific reality.
Although this idea still enjoys some popularity today, its time has
passed. We are now familiar with the alternative, that the world
can be described at many levels, and there is no particular reason to think that all but one of those levels must be a projection of
the mind. In giving the natural history of a species, for example,
we are describing the world at a different level than the level of
basic physics. But we are not describing a projection of the mind.
Similarly, when we describe the world in terms of everyday tables
and chairs and people, we are not describing the world in terms
of basic physics. But we are not describing a projection of the
mind. We are describing objects that are there to be encountered
by us. And our awareness of them makes it possible for us to
think and talk about them.
Continued resistance to the idea that there is a role for consciousness in the study of language seems likely to stem from
something like the following idea. The causal processes in virtue
of which the words of a language constitute signs of the phenomena around us may be mediated by perception, but they do not
thereby have to be mediated by experience. Linguistic terms are
signs of the phenomena around us in something like the sense in
which smoke is a sign of fire. And there is nothing special about
causation of the use of a sign that is mediated by experience; any
kind of causation would do. The trouble with this line of thought
is that it has proven very difficult to try to explain what kind of
causation is needed for language to represent the phenomena
around us (Millikan 1984). It may be that a causal link to the
object is needed only because the prototypical causal links are
those in virtue of which we can be said to be aware of the object
we are talking about. It is, after all, difficult not to be struck by
a thought experiment even simpler than Searle's. Suppose we had a robot that, though not conscious of its surroundings, had significant behavioral competencies – in particular, it could produce language that sounded just like a human's production of language – and suppose it interacted freely with its environment.
It is impossible not to suspect that, just because this robot has no
awareness of anything, it does not have the first idea what it is
talking about. It has no grasp of the semantics of its language.
John Campbell

WORKS CITED AND SUGGESTIONS FOR FURTHER READING

Frege, Gottlob. 1952. On sense and reference. In Translations from the Philosophical Writings of Gottlob Frege, ed. P. T. Geach and Max Black, 57–78. Oxford: Blackwell.
Kripke, Saul. 1980. Naming and Necessity. Cambridge: Harvard University Press.
Locke, John. [1690] 1975. An Essay Concerning Human Understanding, ed. P. H. Nidditch. Oxford: Oxford University Press.
Marr, David. 1982. Vision. San Francisco: W. H. Freeman.
Millikan, Ruth. 1984. Language, Thought and Other Biological Categories. Cambridge, MA: MIT Press.
Moore, G. E. 1903. The refutation of idealism. Mind 12: 433–53.
Russell, Bertrand. 1912. The Problems of Philosophy. Oxford: Oxford University Press.
Searle, John. 1980. Minds, brains and programs. Behavioral and Brain Sciences 3: 417–57.
———. 1990. Consciousness, explanatory inversion, and cognitive science. Behavioral and Brain Sciences 13: 585–642.
Wittgenstein, Ludwig. 1953. Philosophical Investigations. New York: Macmillan.
Woodward, James. 2003. Making Things Happen: A Theory of Causal Explanation. Oxford: Oxford University Press.

CONSISTENCY, TRUTH, AND PARADOX

The aim of this entry is to convey a flavor of (some) logical and semantic issues that arise from so-called semantic paradoxes. There is no aim, given space restrictions, to provide anything like a history or survey of the relevant issues. The focus, for simplicity, will be on truth-theoretic paradox, although the same issue – the consistency of language – arises with other semantic notions (denotation, satisfaction, etc.).

The Liar Paradox

The Liar paradox is one of the most familiar truth-theoretic paradoxes. It arises from a sentence that says (or may be used to say) of itself only that it is not true. English seems to have such sentences. By way of illustration, consider the italicized sentence immediately following this sentence, where "CTP" abbreviates "consistency, truth, and paradox" (this very encyclopedia entry).

The italicized sentence in CTP is not true.

Consider the following argument.

(1) The italicized sentence in CTP is true or the italicized sentence in CTP is not true.
(2) If the italicized sentence in CTP is true, then the italicized sentence in CTP is not true.
(3) If the italicized sentence in CTP is not true, then the italicized sentence in CTP is true.
(4) Hence, the italicized sentence in CTP is true and not true.

Let us say that a language is inconsistent if there's some sentence of the language such that both it and its negation are true. The argument, then, seems to indicate that English is inconsistent. That English is inconsistent is surprising, at the very least. (Many have said that the result is literally beyond belief – the original meaning of "paradox.")

The Meaningless Response

One response to the paradox is that the italicized sentence is not meaningful. Such a response, while natural, is implausible. Why isn't the italicized sentence meaningful? One might point to the conspicuous circularity (in particular, self-reference) in the sentence. Self-reference, however, is hardly sufficient for a meaningless sentence. (Witness the sentence immediately preceding the italicized sentence! Witness others, e.g., "All sentences are sentences," etc.) Moreover, if no meaningless sentence is true, then a fortiori the italicized sentence is not true. But, then, we are led to accept that the italicized sentence is not true – led to accept, apparently, the italicized sentence itself. If we should reject any untrue sentence, then we are now stuck. For this and other reasons, the meaningless thesis is not a plausible lesson to draw from the paradox. (See suggestions for further reading for a note on Tarski.)

Paracomplete Language?

The principle of excluded middle (PEM, sometimes LEM for "law") has it that, for any (declarative) sentence A, the disjunction of A and its negation is logically true (or valid). This is usually put by saying that, where ∨ is disjunction and ~ negation, all instances of A ∨ ~A are valid (in the given language). A paracomplete language is one in which PEM fails, that is, one in which A ∨ ~A is not valid.
A popular response to the Liar paradox (and related semantic paradoxes) is that it indicates the failure of PEM in English. After all, premise (1) of the previous argument toward paradox relies on PEM. Without PEM, the conclusion that some sentences are true and not true fails to find a sound argument.
Paracomplete responses to the paradox are the most common approaches today. One problem with them is similar to the meaningless response: They have trouble expressing their position. After all, if (say) the Liar-instance of PEM fails, then, presumably, neither the Liar nor its negation is true, in which case, the Liar (e.g., the italicized sentence) is not true. But the Liar says that it's not true, and so, if it really is not true, it seems to speak truly.

Paraconsistent Language?

Another response, being increasingly discussed, is a so-called paraconsistent thesis. A paraconsistent language is one in which arbitrary B does not follow from arbitrary A and ~A. (B does follow from A and ~A in classical logic, and in many paracomplete languages.) As a result, one can accept that some true sentences have a true negation; it doesn't follow, in a paraconsistent language, that all sentences are thereby true – a genuine absurdity. Accordingly, the paraconsistentist may truly say that (for example) the italicized sentence is not true, since, according to the proposal, it is true and so is its negation!

Closing Remarks and Curry

There are many other responses to paradox, and the sketches above only crudely gesture in various directions. There are also other, perhaps more difficult, paradoxes. Curry's paradox involves a conditional, and is largely independent of theories of negation. Curry's paradox involves sentences such as

* If the starred sentence in CTP is true, then everything is true.

A standard conditional proof (assume the antecedent, and derive the consequent) seems to indicate that the starred sentence in CTP is true, in which case, its antecedent is true. But, then, we have a true conditional with a true antecedent, and so, by modus ponens, detach the consequent: Everything is true!
Paracomplete and paraconsistent responses to the Liar offer, in the first instance, nonclassical theories of negation – proposals about negation's logical behavior. Curry's paradox requires careful theorizing about conditionals, and so calls for more work than is provided by standard theories of negation.
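The Liar argument and Curry's conditional proof can both be made fully explicit. Here is a minimal sketch in Lean 4, treating "the italicized (or starred) sentence is true" as a bare propositional atom; the self-referential sentences themselves are not formalized, only the premises the entry attributes to them:

```lean
-- The Liar argument: premise (1) is excluded middle for the Liar
-- sentence T; (2) and (3) are the two conditionals; the conclusion
-- (4) is the contradiction "true and not true."
theorem liar (T : Prop)
    (pem : T ∨ ¬T)     -- (1)
    (h2  : T → ¬T)     -- (2)
    (h3  : ¬T → T) :   -- (3)
    T ∧ ¬T :=          -- (4)
  match pem with
  | Or.inl t  => ⟨t, h2 t⟩
  | Or.inr nt => ⟨h3 nt, nt⟩

-- Curry's paradox: a sentence C equivalent to "if C, then P" yields
-- arbitrary P by conditional proof and modus ponens, with no use of
-- negation at all.
theorem curry (C P : Prop) (h : C ↔ (C → P)) : P :=
  have hc : C → P := fun c => (h.mp c) c
  hc (h.mpr hc)
```

As the `liar` proof shows, dropping premise (1) (the paracomplete move) blocks this particular route to (4); `curry`, by contrast, goes through without negation, which is why nonclassical theories of negation alone do not address it.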

J. C. Beall

WORKS CITED AND SUGGESTIONS FOR FURTHER READING

Beall, J. C., ed. 2008. Revenge of the Liar. Oxford: Oxford University Press. This is a collection of very recent papers on truth and paradox.
Martin, Robert, ed. 1984. Recent Essays on Truth and the Liar Paradox. Oxford: Oxford University Press. This collects a variety of contemporary approaches to truth, including so-called revision theory and contextual theories not mentioned here.
Tarski, A. 1983. Logic, Semantics, Metamathematics: Papers from 1923 to 1938. Ed. John Corcoran. Indianapolis: Hackett Publishing. Tarski's classic approach toward defining truth for a language remains influential, but it is highly implausible as an account of truth for natural languages (as Tarski himself thought).

CONSTITUENT STRUCTURE

Words in sentences cluster into groups called constituents (or phrases). For example, in (1), the dog, barked at a cat, at a cat, and a cat are constituents:

(1) The dog barked at a cat.

Among criteria used for constituency are the following:

- The words form a semantic unit; for example, the dog refers to a particular animal.
- The words form a phonological unit, with larger prosodic breaks at constituent boundaries.
- The same string of words (or categories) appears in various contexts; for example, article-noun can appear as subject, verbal object, prepositional object, and elsewhere.
- A string can be replaced by a single word without radically changing the meaning or structure of the sentence; for example, in (1), the dog could be replaced by it.
- A string of words occurs as a conjunct in a coordinate structure, such as the dog and two gerbils.

Unfortunately, such criteria sometimes diverge: In (2), large prosodic breaks come before that, but the relative clauses introduced by that are grouped semantically with the preceding nouns.

(2) This is the rat that ate the cheese that lay in the house that Jack built.

In (3), a new person functions as a constituent syntactically and prosodically but not semantically, since it is contracting AIDS, not the person, that is new.

(3) Every minute, a new person contracts AIDS.

Constituents are often assumed to be contiguous strings, but (4) undermines this assumption, since a man whom nobody knew is a semantic unit.

(4) A man walked in whom nobody knew.

This issue is even clearer in freer word order languages, such as German or Warlpiri:

(5) Dem Jungen schenken wollte Peter das Buch.
    the boy give wanted Peter the book
    'Peter wanted to give the boy the book.'
    (Uszkoreit 1987, 159)

(6) Maliki-rli ka kurtu kartirti-rli paji-rni wita-ngku
    dog-ERG PRES child tooth-ERG bite-NPST small-ERG
    'The small dog is biting the child with its tooth.'
    (Simpson 1991, 261)

Responses to inconsistencies among constituency criteria vary. transformational grammar assigns multiple constituent structures to every sentence (Chomsky 1957). Other responses include abandoning constituent structure altogether (Hudson 1984) or giving priority to some criteria (Pollard and Sag 1994).

Many constituents contain one word, called the head, that determines its distributional properties, its internal structure, and its semantic content. Names like noun phrase, verb phrase, and so on reflect assumptions about headedness. Noam Chomsky (1970) proposed a universal schema for the internal structure of constituents (X-BAR THEORY), varying only in the position of the head within this schema. Subsequent literature has explored constituent structures across languages.

While both descriptive and theoretical syntacticians rely heavily on the notion of constituency, the controversies raised here remain unresolved.

Thomas Wasow

WORKS CITED AND SUGGESTIONS FOR FURTHER READING

Bray, Norman W., Geoffrey J. Huck, and Almerindo E. Ojeda. 1987. Discontinuous Constituency. New York: Academic Press.
Chomsky, Noam. 1957. Syntactic Structures. The Hague: Mouton.
———. 1970. Remarks on nominalization. In Readings in English Transformational Grammar, ed. Roderick A. Jacobs and Peter S. Rosenbaum, 184–221. Waltham, MA: Ginn.
Hudson, Richard. 1984. Word Grammar. Oxford: Blackwell.
Pollard, Carl, and Ivan A. Sag. 1994. Head-Driven Phrase Structure Grammar. Chicago: University of Chicago Press.
Simpson, Jane. 1991. Warlpiri Morpho-Syntax: A Lexicalist Approach. Dordrecht: Kluwer.
Uszkoreit, Hans. 1987. Word Order and Constituent Structure in German. Stanford, CA: CSLI Publications.

CONSTRAINTS IN LANGUAGE ACQUISITION


One of the goals of developmental psycholinguists is to
understand how children acquire a system of linguistic knowledge that is equivalent to that of adults in the same linguistic
community. What kind of learning mechanisms ensure that
children's hypotheses about the sequences of sounds they hear map onto individuals, events, and concepts in the same way as they do for adults? Observations of children's language development reveal that acquisition occurs rapidly and that linguistic
knowledge is often achieved without the support of much, if any,
relevant experience. On the basis of these observations, it has
been proposed that childrens linguistic hypotheses are circumscribed by constraints (sometimes termed biases, assumptions,
or principles).

The Role of Constraints


The role of constraints in language development is to prevent
children from forming misguided hypotheses about the forms
and meanings of linguistic expressions. One way a hypothesis
might be misguided is by being too broad, in the sense that it
allows forms that are not in the language or extends the meanings of linguistic expressions to include individuals, events, and
concepts that are not part of the corresponding meanings generated by other speakers of the language. If children make hypotheses that are too broad, in this sense, then they will make errors
that may prove to be difficult for them to recover from, thereby
making convergence on the system of adult linguistic knowledge
slow and onerous (see learnability). The observation that
children rapidly master many complex linguistic facts suggests
that mistaken hypotheses are somehow avoided. This is where
constraints come in, by placing limits on the kinds of linguistic
hypotheses that children can entertain, so that real-world experience will provide relevant data to confirm a hypothesis, or
redirect children to a new hypothesis. Although the existence of
constraints is widely acknowledged, two issues about the nature
of constraints are subject to controversy. One is whether constraints are learned or are innately specified. A second issue is
whether constraints are specific to a single cognitive domain
(e.g., language) or cut across several cognitive domains. This
issue is called domain specificity.

The Innateness of Constraints


Constraints first assumed prominence in the late 1960s in discussions of structure dependency (Chomsky 1971). Before constraints were introduced, grammars were systems of rules. Rules
are positive statements, indicating forms and meanings that are
possible. In contrast to rules, constraints are often couched in
negative statements, dictating the forms and meanings that are
not possible. Armed with constraints, then, learners are prevented from producing illicit forms and from assigning illicit
meanings.
To take one example, there is a constraint that governs the
reference of pronouns in sentences (see binding). The constraint applies in the sentence He said John hid the rabbit, where
the constraint dictates that the pronoun he and the name John
cannot refer to the same individual; they must have disjoint
reference (Chomsky 1981). The disjoint reference of the pronoun he and the name John follows from the fact that the pronoun is positioned higher in the constituent structure
than the name (more precisely, the pronoun must c-command
the name). Suppose the pronoun is lower in the constituent
structure than the name (and, hence, does not c-command
it). In such cases, the pronoun and the name should be able to
corefer. Attesting to this is the acceptability of coreference in a

variety of examples: John said he hid the rabbit, When he was


in the woods, John hid the rabbit. In such cases the constraint
that blocks coreference fails to apply, and so the pronoun he is
free to refer to John. This illustrates how a single negative constraint can describe the same set of facts that would require a
host of positive rules, making for a more compact (parsimonious) grammar.
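The structural relation at work here can be made concrete. Below is a minimal sketch in Python; the tree shape and labels are simplifying assumptions, and the "parent dominates" test is a simplified version of the branching-node definition of c-command:

```python
# A toy constituency tree: a node is (label, children); a leaf is a word.
# Structure for "He said John hid the rabbit" (labels are illustrative
# assumptions, not the entry's own analysis).
sentence = ("S", [
    ("NP", ["he"]),
    ("VP", ["said",
            ("S", [("NP", ["John"]),
                   ("VP", ["hid", ("NP", ["the", "rabbit"])])])]),
])

def dominates(node, target):
    """True if `node` properly contains the subtree `target`."""
    if isinstance(node, tuple):
        return any(child is target or dominates(child, target)
                   for child in node[1])
    return False

def parent_of(root, target, parent=None):
    """Return the node whose child list contains `target`, or None."""
    if root is target:
        return parent
    if isinstance(root, tuple):
        for child in root[1]:
            found = parent_of(child, target, root)
            if found is not None:
                return found
    return None

def c_commands(a, b, root):
    """A c-commands B iff neither dominates the other and A's parent
    dominates B (a simplified, parent-based version of the relation)."""
    if dominates(a, b) or dominates(b, a):
        return False
    p = parent_of(root, a)
    return p is not None and dominates(p, b)

he_np = sentence[1][0]                # the pronoun's NP
john_np = sentence[1][1][1][1][1][0]  # the name's NP in the embedded S
```

On this toy tree the pronoun's NP c-commands the name's NP but not conversely, which is why the disjoint-reference constraint applies in only one direction.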
Because constraints are negative statements, they are likely
candidates for being innately specified. Suppose a child lacked
the constraint on coreference. Then, the child would allow the
sentence He said John hid the rabbit to mean that John said that
he, John, hid the rabbit. How could the child learn that the sentence cannot mean this? One way to unlearn something is to
be exposed to so-called negative evidence, such as corrective
feedback. As a matter of fact, however, parents do not provide
consistent corrective feedback (negative evidence) even when
children do make errors (Brown and Hanlon 1970; Morgan and
Travis 1989). Without negative evidence, it is difficult to see how
constraints could be learned. The alternative is to suppose that
constraints are innately specified, as part of human biology.
There are interesting empirical consequences of the innateness
hypothesis: Innate constraints are expected to emerge early
(despite their apparent complexity), and to be universal (see
Crain and Thornton 1998; innateness and innatism). The
issue of innate specificity versus learning crops up in discussions
of all kinds of constraints, including constraints on how children
learn the syntactic frames in which verbs can appear, and how
they learn word meanings. We discuss these topics in turn.

Constraints on Argument Structure


In some linguistic frameworks, human languages are characterized by constructions (see construction grammars). For
example, one set of verbs can appear in the dative construction (e.g., gave, as in John gave a book to the museum). Many of
the same verbs can appear in another construction, called the
double-object construction (John gave the museum a book). In
fact, changing the sentence from the dative to the double object
does not change its basic meaning, or communicative function.
However, suppose a child formed the following generalization: that all verbs that can appear in the dative construction can appear in the double-object construction, without a change in basic meaning. This generalization is flawed because there are verbs that can appear in the dative but not in the double-object construction. One exceptional verb is donate. Notice that
John donated a book to the museum is fine, but John donated
the museum a book is not acceptable. To prevent learners from
forming illicit double-object constructions for verbs like donate,
a constraint can be introduced. The constraint allows a structural option for a verb (say, the double-object word order) to
be entered into childrens grammars only if there is evidence
for that word order in the input to children. A proposal of this
kind is the unique argument-structure preference proposed by
Martin Braine and Patricia Brooks (1995). As Pinker points out, "[T]he need for negative evidence in language acquisition can be eliminated if the child knows that when he or she is faced with a set of alternative structures fulfilling the same function, only one of the structures is correct unless there is direct evidence that more than one is necessary" (1984, 113).
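The unique argument-structure preference can be sketched as a conservative learner that licenses a frame for a verb only when the input attests it; the function names and toy data below are assumptions for illustration:

```python
from collections import defaultdict

# Frames attested in the input, per verb (toy data).
attested_frames = defaultdict(set)

def observe(verb, frame):
    """Record a frame (e.g. "dative", "double-object") heard in input."""
    attested_frames[verb].add(frame)

def licenses(verb, frame):
    """Unique argument-structure preference: without direct evidence,
    the learner does not extend a verb to an unattested frame."""
    return frame in attested_frames[verb]

# The learner hears "give" in both frames but "donate" only in the dative.
observe("give", "dative")
observe("give", "double-object")
observe("donate", "dative")
```

On this sketch, John gave the museum a book is licensed, while the illicit John donated the museum a book is never generated, and no negative evidence is required.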



A similar uniqueness constraint has been advanced to explain
how children avoid errors in learning the past tense form of a verb.
The simple past tense rule "add -ed" (with its phonological variants) provides the right answer for many verbs, but this rule, too,
is plagued with exceptions; it outputs the incorrect forms comed
and bringed, for example, rather than came and brought. Gary
Marcus and colleagues (1992) proposed that the past tense forms
of regular verbs (lifted, walked, showed) are formed by the simple past tense rule, but irregular past tense forms (e.g., came and
brought) must be learned from the input and entered as exceptions into the child's mental dictionary (cf. Pinker 1999). This is
where a constraint comes into the story. Once an irregular past
tense form is entered in the lexical entry for a verb, a constraint
prevents the child from accessing the past tense rule to produce
regular forms like comed and bringed. Essentially, the constraint
ensures that a lexical entry contains only a single past tense form
unless there is direct evidence for more than one form.
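This blocking logic can be pictured as a lexicon lookup that pre-empts the default rule. The sketch below is a toy illustration, not the Marcus et al. model itself; the three-verb lexicon and the purely orthographic "add -ed" rule are illustrative simplifications.

```python
# A minimal sketch of the blocking account of the English past tense
# (cf. Marcus et al. 1992; Pinker 1999): a stored irregular form
# blocks the regular rule. The lexicon and spelling rule here are
# illustrative simplifications, not a real morphological model.

IRREGULARS = {"come": "came", "bring": "brought", "go": "went"}

def past_tense(verb):
    # A stored irregular form pre-empts the regular rule ...
    if verb in IRREGULARS:
        return IRREGULARS[verb]
    # ... otherwise the default "add -ed" rule applies (its
    # phonological variants are ignored in this sketch).
    return verb + "ed"

past_tense("walk")   # regular rule applies
past_tense("come")   # stored form blocks the overregularized "comed"
```

A child who has not yet stored came would fall through to the default rule and produce comed; once the irregular entry is acquired, the constraint blocks the regular output.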

Constraints on the Meanings of Words


Constraints also figure prominently in the literature on word
learning (see lexical acquisition). One family of constraints
assists children in associating labels with objects in the world.
The complexity of associating labels to objects was driven home
by W. V. O. Quine (1960), who invites us to consider how we
would interpret an expression used by a speaker of another language in the presence of a passing rabbit. Suppose the speaker
utters gavagai as the rabbit passes by. Quine asks how we can
be sure that gavagai is being used to refer to the rabbit and not
some property of the rabbit, such as "food," "furry," "white,"
or even "undetached rabbit parts." Until we become native
speakers of the language with gavagai, there is what Quine calls
an indeterminacy of translation between that language
and our own.
Constraints on word learning are part of the solution to the
indeterminacy of translation. One constraint, called mutual exclusivity, ensures that children assign only one label per category
(e.g., Markman and Wachtel 1988). Mutual exclusivity guides
children's initial hypotheses, but in many cases it is overridden as
it becomes clear that it has exceptions. For example, a child who
has mastered dog as the label for the pet at home will soon hear
that the dog is also labeled animal, so both labels will be incorporated into the child's mental dictionary. Some researchers suppose that mutual exclusivity is domain specific, applying just to
word learning. An alternative claim is advanced by Paul Bloom
(2000), who proposes that inferences about the communicative intentions of others, and not mutual exclusivity, are used
to derive children's associations of labels to unfamiliar objects.
Such inferences are likely to cross the boundaries of cognitive
domains, including logical reasoning in addition to language.
Labeling objects is just one aspect of word learning. Children
also have to achieve the mapping for abstract words and for function words (called closed class vocabulary items) and not just for
nouns and verbs (called open class vocabulary items). In these
cases, the environmental input clearly has less impact. Consider,
for example, how a child learns the meaning of the logical word
or (see semantics, acquisition of). In classical logic, a statement of the form A or B is true in three cases: (i) only A is
true, (ii) only B is true, and (iii) both A and B are true. So, the or of classical logic

is inclusive-or (including iii). For most sentences that children experience, however, one or the other of the expressions surrounding or (its disjuncts) is false (excluding iii): Eat your veggies, or you'll have to go to bed; Is his name Ted, or Fred? This
leads to the expectation that children should initially attribute
the exclusive-or meaning to the word or. In fact, there is evidence
that despite the input, children initially interpret or as inclusive-or, as in classical logic. One possibility is that classical logic (or a
universal grammar) imposes constraints on children's initial interpretations of logical words. On this scenario, children
initially assign or the truth conditions associated with inclusive-or, and later learn to limit these truth conditions to those associated with exclusive-or, based on principles of conversation. This
could be an example of an innate constraint but one that may or
may not be domain specific.
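The two candidate truth conditions can be set out as a small truth table. The function names below are hypothetical; the sketch only illustrates the logical difference between the two meanings, not the acquisition story itself.

```python
# Truth conditions for the two candidate meanings of "or".
# Inclusive-or (classical logic) is true in case (iii) as well,
# when both disjuncts are true; exclusive-or rules case (iii) out.

def inclusive_or(a, b):
    return a or b

def exclusive_or(a, b):
    return a != b

for a in (True, False):
    for b in (True, False):
        print(a, b, inclusive_or(a, b), exclusive_or(a, b))

# The two meanings diverge only in case (iii), when both A and B
# are true: inclusive-or yields True, exclusive-or yields False.
```

Note that exclusive-or is stronger in exactly one row of the table, which is why the input children hear (where both disjuncts are rarely true together) underdetermines the choice between the two meanings.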
To summarize, constraints typically direct children to begin with
narrow hypotheses (i.e., to start small) and only broaden these
hypotheses if the input demands it. Constraints function quite differently in different parts of the linguistic system. Some constraints
are viable candidates for being domain specific and innately specified, but others may be domain general and learned.
Stephen Crain, Rosalind Thornton
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Bloom, Paul. 2000. How Children Learn the Meaning of Words.
Cambridge, MA: MIT Press.
Braine, Martin, and Patricia Brooks. 1995. Verb argument structure and
the problem of avoiding an overgeneral grammar. In Beyond Names
for Things: Young Childrens Acquisition of Verbs, ed. M. Tomasello and
W. E. Merriman, 353–76. Hillsdale, NJ: Lawrence Erlbaum.
Brown, Roger, and Camille Hanlon. 1970. Derivational complexity and
order of acquisition in child speech. In Cognition and the Development
of Language, ed. J. Hayes, 1–53. New York: Wiley.
Chomsky, Noam. 1971. Problems of Knowledge and Freedom. New
York: Pantheon.
———. 1981. Lectures on Government and Binding. Dordrecht, the
Netherlands: Foris.
Crain, Stephen, and Rosalind Thornton. 1998. Investigations in Universal
Grammar. Cambridge, MA: MIT Press.
Marcus, Gary, Steven Pinker, Michael Ullman, Michelle Hollander, T. John
Rosen, and Fei Xu. 1992. Overregularization in Language Acquisition.
Monographs of the Society for Research in Child Development, no. 57.
Chicago: University of Chicago Press.
Markman, Ellen, and G. Wachtel. 1988. Children's use of mutual exclusivity to constrain the meanings of words. Cognitive Psychology
20: 121–57.
Morgan, James, and Lisa Travis. 1989. Limits on negative information in
language input. Journal of Child Language 16: 531–52.
Pinker, Steven. 1984. Language Learnability and Language Development.
Cambridge: Harvard University Press.
———. 1999. Words and Rules. New York: Basic Books.
Quine, W. V. O. 1960. Word and Object. Cambridge, MA: MIT Press.

CONSTRUCTION GRAMMARS
There has been a broad convergence in many quarters in recent
years toward a view of grammar in which constructions play a central role; approaches that share this view are here referred to as
construction grammars. Construction grammars view linguistic patterns of varying complexity as instances of conventional pairings of form and meaning. These pairings include words (with or without open slots), idioms (with or without open slots), and fully general abstract phrasal patterns (with or without any fixed words). Examples of constructions at varying degrees of complexity are given in Table 1. (See Goldberg 2006 for arguments that the shared formal properties and closely related semantics of dative and benefactive ditransitives warrant treating them as instances of the same construction.)

Table 1. Examples of constructions, varying in size and complexity

Simple words (filled or partially filled): e.g., the, theory, blue, re-V
Complex word: e.g., fish taco, firehouse
Idiom (filled): e.g., cock and bull story
Idiom (partially filled): e.g., the apple of <someone's> eye
Ditransitive (double object) construction: Subj V Obj1 Obj2 (e.g., She gave him a kiss; He baked her an apple pie.)
Passive: Subj aux VP[past participle] (PP[by]) (e.g., The man was hit by a meteor.)

Any linguistic pattern is recognized as a construction so long as some aspect of its form or some aspect of its function is not strictly predictable from its component parts or from other constructions recognized to exist. In addition, most constructionists argue that patterns are stored even if they are fully predictable, as long as they occur with sufficient frequency.

The emphasis on the pairing of function with form is what sets construction grammars apart from both other generative approaches (which tend to downplay function) and other functional approaches (which tend to downplay form). At the same time, constructionists bring these two approaches together in some ways. They recognize the importance of two major questions that have been brought to the fore by generative grammarians (see generative grammar):

(1) How can all of the complexities of language be learned such that we are able to produce an open-ended set of utterances?

(2) How are cross-linguistic generalizations (and language-internal generalizations) accounted for?

Moreover, constructionists recognize that the answers to these questions rely heavily on traditional functionalist methodology and findings that emphasize the usage-based nature of language and the importance of general cognitive processes.

Construction grammars, broadly conceived, share at least most of the following basic tenets:

(1) All levels of description are understood to involve form-function pairings, including morphemes or words, idioms, partially lexically filled patterns, and fully abstract phrasal patterns.

(2) A "what you see is what you get" (WYSIWYG) approach to syntactic form is adopted.

(3) An emphasis is placed on psychological validity. A linguistic theory must interface naturally with what we know about acquisition, processing, and historical change.

(4) An emphasis is placed on subtle aspects of the way we construe events and states of affairs.

(5) Constructions are understood to be learned on the basis of the input and general cognitive mechanisms and are expected to vary to some degree cross-linguistically.

(6) Cross-linguistic generalizations are explained by appeal to general cognitive constraints, together with the functions of the constructions involved.

(7) Language-specific generalizations across constructions are captured via inheritance networks much like those that have long been posited to capture our nonlinguistic knowledge.

(8) The totality of our knowledge of language is captured by a network of constructions: a "construct-i-con."
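The idea of constructions as form-meaning pairs organized in an inheritance network can be pictured with a small toy data structure. Every construction name, form string, and gloss below is a hypothetical illustration rather than part of any worked-out construction-grammar formalism.

```python
# Toy rendering of a "construct-i-con": constructions as form-meaning
# pairs linked by inheritance. All entries are hypothetical examples.

constructicon = {
    # name: (parent construction, form, meaning)
    "transitive":   (None,         "Subj V Obj",       "X acts on Y"),
    "ditransitive": ("transitive", "Subj V Obj1 Obj2", "X causes Y to receive Z"),
    "kick-the-bucket": ("transitive", "Subj kick the bucket", "X dies"),
}

def ancestors(name):
    """Follow inheritance links up the network."""
    chain = []
    parent = constructicon[name][0]
    while parent is not None:
        chain.append(parent)
        parent = constructicon[parent][0]
    return chain

ancestors("ditransitive")  # the ditransitive inherits from the transitive
ancestors("transitive")    # a top-level construction has no ancestors
```

The point of the sketch is only structural: language-specific generalizations (tenet 7) live in the shared ancestors, while idiosyncrasies (the meaning of kick the bucket) live in the individual entries.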

Constructionists have traditionally emphasized unusual phrasal patterns such as those in Table 2. As an example of an unusual pattern, consider the Covariational Conditional construction in Table 2. The construction is interpreted as involving an independent variable (identified by the first phrase) and a dependent variable (identified by the second phrase). The word the normally occurs
with a head noun, but in this construction it requires a comparative phrase. The two major phrases resist classification as either
noun phrases or clauses. The requirement that two phrases of this
type be juxtaposed is another nonpredictable aspect of the pattern. Because the pattern is not strictly predictable, a construction
is posited that specifies the particular form and function involved.
Research has revealed subtle syntactic and semantic properties of this and other construction types in Table 2 (e.g.,
Culicover 1999; Jackendoff 2002; Lambrecht 1990; Michaelis and
Lambrecht 1996; Williams 1994). The existence of these clearly
learned, partially productive, syntactically constrained patterns
leads to the implication that much more of grammar may be
learned on the basis of the input than had been generally recognized by the generative approach. That is, if these unusual
patterns could be learned, why should we assume that the more
frequent, regular patterns could not possibly be?
In fact, constructionists have offered construction-based
accounts of many of the more core aspects of grammar,
including argument structure (e.g., Goldberg 1995), control
phenomena (Culicover and Jackendoff 2005), aspectual (see
aspect) interpretation (Michaelis 2004), raising (Langacker
1992), existential constructions (Lakoff 1987), and island constraints (Deane 1991; Goldberg 2006).
Constructionists aim to provide accounts of such phenomena without appealing to underlying levels of representation,
traces, or phonetically null functional projections. That is, a
WYSIWYG approach to form is adopted. Beyond methodological
parsimony, this is due to the fact that there generally exist subtle
functional differences between surface forms (Goldberg 2002).
Therefore, surface forms are generated directly. Note that surface
form need not specify a particular word order, nor even particular grammatical categories, although there are constructions
that do specify these features.

Table 2. Examples of partially idiosyncratic constructions

Mad Magazine construction: Him, a doctor?!
N P N construction: time after time; day after day
Time away construction: Dancin' the night away
What's X doing Y?! construction: What's that fly doing in my soup?!
Nominal extraposition construction: It's amazing the difference!
Enough already construction: Enough with the examples!
Stranded preposition construction: Who did he give that to?
Covariational conditional construction: The more you have, the more you want.

Constructions are combined freely to form actual expressions as long as they can be construed as not being in conflict. That
is, an actual expression typically involves the combination of at
least a half dozen different constructions. The observation that
language has an infinite creative potential is accounted for, then,
by the free combination of constructions.
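One way to picture "combination without conflict" is as unification of feature bundles. The features and construction names below are hypothetical illustrations, not an actual constructionist formalism; the sketch only shows how free combination plus a compatibility check yields open-ended productivity.

```python
# Toy sketch of combining constructions: merge feature bundles,
# failing when any feature receives conflicting values. All
# feature names and example constructions are hypothetical.

def combine(*constructions):
    """Merge feature dicts; return None on any feature conflict."""
    merged = {}
    for con in constructions:
        for feature, value in con.items():
            if feature in merged and merged[feature] != value:
                return None  # conflict: the combination is ruled out
            merged[feature] = value
    return merged

subj_aux_inversion = {"aux_first": True}
question = {"force": "interrogative", "aux_first": True}
declarative = {"force": "declarative", "aux_first": False}

combine(subj_aux_inversion, question)     # compatible: a merged bundle
combine(subj_aux_inversion, declarative)  # conflict over aux_first: ruled out
```

Because any set of mutually compatible constructions can be merged, a finite inventory supports an unbounded set of combinations.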
Most constructionist approaches aim to provide motivation for
each construction posited. Motivation aims to explain why it is at
least possible and at best natural that this particular form-meaning correspondence should exist in a given language. Motivation
is distinct from prediction: Recognizing the motivation for a construction does not entail that the construction must exist in that
language or in any language. It simply explains why the construction makes sense or is natural (cf. Haiman 1985; Lakoff 1987;
Goldberg 1995). Functional and historical generalizations count
as explanations, but they are not predictive in the strict sense, just
as parallel generalizations in biology are not predictive. That is,
language, like biological evolution, is contingent, not deterministic. Just as is the case with species, particular constructions are
the way they are not because they have to be that way but because
their phylogenetic and ontogenetic evolution was motivated by
general forces.

Varieties of Construction Grammars


There are several variations of construction grammars, including
the following:
(1) SCxG: sign-based construction grammar (Fillmore 1999;
Fillmore, Kay, and O'Connor 1988; Kay 2002; Kay and Fillmore
1999; Sag, Wasow, and Bender 2003)
(2) CG: cognitive grammar (e.g., Langacker 1987a,
1987b, 1988, 1991, 2003)
(3) RCxG: radical construction grammar (e.g., Croft 2001)
(4) ECxG: embodied construction grammar (e.g., Bergen
and Chang 2005)
(5) CCxG: cognitive construction grammar (e.g., Bencini and
Goldberg 2000; Goldberg 1995, 2006; Lakoff, 1987)
(6) Fluid construction grammar (e.g., Steels and DeBeule 2006)
(7) Simpler syntax (Culicover and Jackendoff 2005)

C. J. Fillmore and P. Kay first coined the term construction grammar. Their early work on idioms and idiomatic phrasal patterns such as let alone, even, and What's X doing Y? laid the foundation for many of the variations of construction grammar that
have since developed. Yet their version, sign-based construction
grammar (SCxG), has developed quite distinctly from the other
construction grammars. Key differences include the fact that
SCxG is not uniformly usage-based nor does it generally seek
motivation for the relationship between form and function.

A Comparison with Mainstream Generative Grammar Proposals
Certain mainstream generative grammar frameworks share the
basic idea that some type of meaning is directly associated with
some type of form, independently of particular lexical items (cf.
also Borer 1994, 2003; Hale and Keyser 1997; Marantz 1997). To
the extent that syntax plays a role in contentful meaning, these
other approaches are constructionist, and they are occasionally
referred to that way in the literature. However, the approaches
are fundamentally different from the type of constructionist
approaches just outlined. For example, these mainstream generative accounts do not adopt a nonderivational (monostratal)
approach to syntax, but appeal instead to underlying levels of
representation in which constituents (or entities that are never
realized) move around abstract trees. Moreover, these accounts
emphasize rough paraphrases instead of speakers' detailed construals of situations. Empirical challenges faced by the accounts
are discussed in some detail in Goldberg (2006).
Adele E. Goldberg

WORKS CITED AND SUGGESTIONS FOR FURTHER READING


Bencini, G. M. L., and A. E. Goldberg. 2000. The contribution of argument
structure constructions to sentence meaning. Journal of Memory and
Language 43: 640–51.
Bergen, B. 2004. The psychological reality of phonaesthemes. Language
80.2: 290–311.
Bergen, B., and N. Chang. 2005. Embodied construction grammar
in simulation-based language understanding. In Construction
Grammar(s): Cognitive and Cross-Language Dimensions, ed. Jan-Ola
Östman and Mirjam Fried, 147–90. Philadelphia: John Benjamins.
Borer, H. 1994. The projection of arguments. In University of
Massachusetts Occasional Papers in Linguistics. Vol. 17. Ed. E. Benedicto
and J. Runner, 19–47. Amherst: GLSA, University of Massachusetts
Press.
———. 2003. Exo-skeletal vs. endo-skeletal explanations. In The Nature of Explanation in Linguistic Theory, ed. J. Moore and M. Polinsky, 31–67. Chicago: CSLI and University of Chicago Press.
Croft, W. 2001. Radical Construction Grammar: Syntactic Theory in Typological Perspective. Oxford: Oxford University
Press.
Culicover, P. W. 1999. Syntactic Nuts: Hard Cases, Syntactic Theory and
Language Acquisition. Oxford: Oxford University Press.
Culicover, P. W., and R. Jackendoff. 2005. Simpler Syntax.
Oxford: Oxford University Press.
Deane, Paul. 1991. Limits to attention: A cognitive theory of island phenomena. Cognitive Linguistics 2: 1–63.
Fillmore, C. J. 1999. Inversion and constructional inheritance. In
Lexical and Constructional Aspects of Linguistic Explanation, ed. Gert
Webelhuth, Jean-Pierre Koenig, and Andreas Kathol, 113–28. Stanford,
CA: CSLI.

Fillmore, C. J., P. Kay, and M. C. O'Connor. 1988. Regularity and idiomaticity in grammatical constructions: The case of let alone. Language
64: 501–38.
Goldberg, A. E. 1995. Constructions: A Construction Grammar Approach
to Argument Structure. Chicago: University of Chicago Press.
———. 2002. Surface generalizations: An alternative to alternations. Cognitive Linguistics 13.4: 327–56.
———. 2006. Constructions at Work: The Nature of Generalization in Language. Oxford: Oxford University Press.
Haiman, John. 1985. Iconicity in Syntax. Cambridge: Cambridge
University Press.
Hale, K., and J. Keyser. 1997. On the complex nature of simple predicators. In Complex Predicates, ed. A. Alsina, J. Bresnan, and P. Sells,
29–65. Stanford, CA: CSLI.
Jackendoff, R. 2002. Foundations of Language. Oxford: Oxford University
Press.
Kay, P. 2002. English subjectless tagged sentences. Language
78.3: 453–81.
Kay, P., and C. J. Fillmore. 1999. Grammatical constructions and linguistic generalizations: The whats X doing Y? construction. Language
75.1: 1–34.
Lakoff, G. 1987. Women, Fire, and Dangerous Things: What Categories
Reveal about the Mind. Chicago: University of Chicago Press.
Lambrecht, K. 1990. What, me worry? Mad Magazine sentences revisited. Proceedings of the 16th Annual Meeting of the Berkeley Linguistics
Society. Berkeley, CA: Berkeley Linguistics Society, 215–28.
Langacker, Ronald. 1987a. Foundations of Cognitive Grammar. Vol 1.
Stanford, CA: Stanford University Press.
———. 1987b. Nouns and verbs. Language 63: 53–94.
———. 1988. A usage-based model. In Topics in Cognitive Linguistics, ed. B. Rudzka-Ostyn, 127–61. Philadelphia: John Benjamins.
———. 1991. Foundations of Cognitive Grammar. Vol. 2. Stanford, CA: Stanford University Press.
———. 1992. Reference point constructions. Cognitive Linguistics 4: 1–39.
———. 2003. Construction grammars: Cognitive, radical, and less so.
Paper presented at the International Cognitive Linguistics Conference,
Logroño, Spain.
Marantz, A. 1997. No escape from syntax: Don't try morphological analysis in the privacy of your own lexicon. In University of Pennsylvania
Working Papers in Linguistics. Vol. 4.2. Ed. A. Dimitriadis and L. Siegel,
201–25. Philadelphia: University of Pennsylvania Press.
Michaelis, L. 2004. Implicit and explicit type-shifting in construction
grammar. Cognitive Linguistics 15: 1–67.
Michaelis, L. A., and K. Lambrecht. 1996. The exclamative sentence type
in English. In Conceptual Structure, Discourse and Language, ed. A. E.
Goldberg, 375–98. Stanford, CA: CSLI.
Sag, I. A., T. Wasow, and E. M. Bender. 2003. Syntactic Theory: A Formal
Introduction. Stanford, CA: Center for the Study of Language and
Information.
Saussure, F. de. [1916] 1959. Course in General Linguistics. Trans. W.
Baskin. New York: Philosophical Library.
Steels, L., and J. DeBeule. 2006. A (very) brief introduction to fluid
construction grammar. Paper presented at the Third International
Workshop on Scalable Natural Language Understanding, New York
City.
Williams, E. 1994. Remarks on lexical knowledge. Lingua 92: 7–34.

CONTACT, LANGUAGE
Language contact occurs when individuals who regularly use
different language varieties communicate with each other.
Language variety should be understood in a very broad sense,
including varieties that are traditionally considered to be different languages, as well as varieties of the same language, since the linguistic effects of either sort of contact are similar.
The social contexts in which language contact occurs are varied and have been common throughout human history. Most of
us, even if we are ourselves monolingual, interact with people
who are bilingual or bidialectal and, thus, are participants in
language contact situations whether we are aware of it or not.
Language contact can result in a wide variety of possible
outcomes for the language varieties involved, ranging from no
discernible effects to the borrowing of a few vocabulary items to
profound structural change. It may even result in the creation of
entirely new language varieties.

Outcomes of Language Contact


Language contact occurs when speakers of different language
varieties communicate. In general, for any two varieties of
speech, we can say that the relationship between the varieties is either one of autonomy or heteronomy. To say that they
are autonomous means that the two varieties are independent
socially and politically: English and Mandarin are autonomous
with respect to each other. However, to say that a variety is heteronomous with respect to another means that it is in some way
dependent or connected to it socially, politically, or both. So,
regional dialects of English are heteronomous with respect to
standard English and one another.
Heteronomy is not the same as mutual intelligibility: Standard
Czech and standard Slovak are more or less mutually intelligible
varieties, but they are definitely autonomous. Local Italian dialects may not be mutually intelligible but are heteronomous with
respect to standard Italian.
Heteronomous varieties frequently come into contact: in
national institutional settings, as a result of migration or trade,
and in colonial tabula rasa situations (i.e., where there
were no varieties of the language spoken in the region before).
One result of such contact may be the creation of a koiné. The
original koiné (the Koiné) was a variety of Ancient Greek that
had come to supplant other, local Greek dialects during the
Hellenistic and Roman periods. This dialect was based mostly
on the Athenian dialect but included many elements from other
dialects and involved a certain amount of simplification: the
disappearance of irregularities in favor of structurally regular
forms.
The term koiné has come to be used for any variety that supplants heteronomous varieties and that serves as a means of
intercommunication between speakers of these varieties. This
comes about as a result of dialect leveling: the loss of distinctive
features in favor of features with a high degree of mutual intelligibility and/or high prestige. Sometimes this involves a fair
amount of dialect mixture, though this needn't be the case. Where
dialect mixture is involved, the process of creating the koiné can
be referred to as koinéization. Koinéization has probably been a
fairly common feature of the history of languages.
Standard dialects these days typically fill the role of a koiné.
Standard dialects sometimes arise spontaneously through a process of koinéization; sometimes they are created by a deliberate
mixture of varieties through the actions of a language academy or
a government commission (language policy), and sometimes
they are a regional or social variety selected for the purpose.

Whether varieties are autonomous or heteronomous, linguistic elements of various sorts may be exchanged between them.
This exchange is referred to as borrowing. Virtually any linguistic
feature can be borrowed: vocabulary, grammatical morphemes,
grammatical constructions, semantic relations, sounds, and so
on. Vocabulary borrowing is the most common, but it's easy to
find instances of borrowing of virtually any grammatical element.
Borrowing, particularly of vocabulary, can occur even when contact between speakers of the source and target languages is fairly
casual; when languages coexist in bilingual situations, however,
really intensive borrowing can take place.
bilingualism implies communicative skill in two autonomous varieties. (The term bidialectalism refers to communicative skill in two heteronomous varieties.) When bilingual
situations persist over a long period, convergence may take
place: The two varieties may converge toward each other, but
more usually one variety converges toward another, reflecting
local political or social dominance. If the varieties are sufficiently
different to begin with, and the convergence is extensive, the
result may be metatypy, a complete change in language type.
Cases of metatypy have not been commonly attested, but they
certainly exist: Amharic, a Semitic language spoken in Ethiopia,
has a grammar that is rather different from its Semitic kin due
to metatypic convergence with the Cushitic languages spoken in
the region prior to the arrival of Semitic languages.
At the beginning, bilingual situations are usually characterized by varying degrees of imperfect learning: situations where
in learning variety B, speakers of variety A learn B imperfectly,
making various sorts of grammatical errors in B, speaking B with
an accent, and so on. These linguistic features may become permanent components of the bilingual communitys command of
language B and may ultimately affect the speech of others speaking language B. If the bilingual community undergoes language
shift if speakers of A cease to speak A and adopt B instead the
results of imperfect learning are referred to as substratic influences on the B variety they speak. The particular sort of English
spoken in Ireland, for example, is often said to be the result, at
least in part, of this sort of substratic influence, in this case from
the Irish language to the English now spoken in the country.
Superstratic influence is also possible. A superstratic language
is one with high prestige, either in all formal contexts or in some
specified domains. Classical languages (Latin, classical Greek,
Sanskrit, Koranic Arabic, classical Tamil, etc.) can serve as superstrata, but so can living languages when these languages have
sufficient prestige. French has served as a superstratic language
for Europe, a role now filled largely by English. Chinese served
as a superstratum for Japanese, Arabic for much of the Moslem
world, Russian for other languages within the old Soviet Union,
and so on. Superstratic languages do not require many speakers
within communities for their influence to be strong: In the last
few centuries, only a minority of people in Europe learned Latin
to any significant degree, yet Latin influence on the languages
of Europe remains profound and extends beyond the extensive
borrowing of vocabulary to the borrowing of syntactic constructions, rhetorical strategies, and so on.
The discipline of second language acquisition refers to a linguistic system that arises in the course of learning another language as an adult as an interlanguage. In the ideal case, the grammar of the interlanguage will eventually become identical
to (or nearly identical to) that of the target language. In most
cases, however, the interlanguage differs from the target language in significant ways, reflecting imperfect learning. In many
cases, the interlanguage may be a very simple, rudimentary system, consisting of a few vocabulary items and simple phrases if
the need for interlanguage-based communication is restricted
to a narrow range of activities, such as simple commercial or
workplace transactions. When such interlanguages are used by
a number of people and stabilize to a degree, they are usually
referred to as pidgins.
Pidgins typically take their vocabulary primarily from a single
language, referred to as the lexifier language. English is the lexifier language for the various pidgin Englishes that developed in
many parts of the world during the colonial period. Pidgins are
characterized by a very large degree of simplification vis-à-vis the
lexifier language: Even when most of the vocabulary derives from
a given language, the grammatical complexities of the language
are seldom found in the pidgin, which instead has a very simple
grammatical structure.
Of the set of stable interlanguages that can arise in contact
situations, only those with a considerable degree of grammatical simplification vis-à-vis the lexifier language are referred to as
pidgins. When the language varieties spoken by the people creating the interlanguage are similar (particularly when they are
close enough to be heteronomous), one would never describe
the interlanguage as a pidgin, even when there is some simplification involved: The task of learning the lexifier variety would
be sufficiently easy so that the radical simplification associated
with pidgins would not occur. (A koiné may be the product of
simplification, but it is not a pidgin.) Similarly, with autonomous
varieties that are similar, one would not describe stable interlanguages as pidgins: An interlanguage developed between speakers
of Romance languages or Slavic languages would not be considered a pidgin. For a variety to be described as a pidgin, the native
varieties spoken by the people involved in its creation must be
sufficiently different to make learning the other variety relatively
difficult, though the degree of imperfect learning associated with
pidgins reflects restricted opportunities for learning as well.
Further, the conditions for the creation of stable pidgins (as opposed to the creation of bilingual situations) seem to be the product of economic systems associated with states, suggesting that they may have been rare or nonexistent in
ancient times. The colonial period was an especially fertile time
for the creation of pidgins.
Stable pidgins typically originate in social situations where
they are used only in a very restricted set of social situations, for
example in commercial transactions. But if they persist over a
long period, they may come to be used in a wide range of social
contexts, in which case the pidgin may acquire a relatively large
vocabulary and a relatively large, stable set of grammatical constructions. The pidgin language Tok Pisin, the national language
of Papua New Guinea, is such a language. Such pidgins may
come to be acquired natively by children as their first language.
In many situations, this is a gradual process since pidgins arise
in multilingual situations, and therefore there are other languages that could serve as the native languages for some or even
all of the children in the community. For some communities, however, the process may be fairly abrupt: most adult members of the community speak the pidgin to one another and children
grow up learning only the pidgin. Creolization refers to a social
process by which a stable pidgin acquires native speakers; the
result of creolization is a creole.
Speakers of pidgins and creoles may remain in contact with
native speakers of the lexifier language, or they may not. If they
remain (or come to be) in contact with such speakers, and if the
lexifier language is a prestige language (for instance, a language of administration and/or of the social elite), the pidgin or creole
may become heteronomous with the lexifier language. In such
cases, the pidgin or creole may borrow vocabulary and/or grammatical constructions from the lexifier language, becoming more
like it in the process. This, to some degree, is the reverse of the
pidginization process. If, in particular, the lexifier language is the
official language, serving as the standard dialect in the region
where the pidgin or creole is spoken, the pidgin or creole may be
perceived by its speakers and by the local authorities as a dialect
of the lexifier language, and a process of dialect leveling may take
place similar to what was described under the label of koinéization. When this happens, we may find that there is a continuum
of speech varieties in the community, ranging from relatively
pure versions of the pidgin or creole to forms of the language
that resemble (or are identical with) the standard dialect. Such
situations, when they affect creoles, are referred to as post-creole
continua. The pidgin or creole may eventually lose its distinctive status as speakers come to speak versions of the language
that are no longer distinguished by the results of a pidginization
process. This leveling process, when it affects creoles, is referred
to as decreolization. A variety that descends from a creole but
has undergone extensive decreolization is referred to as a post-creole.
The processes that we have been discussing (pidginization and dialect leveling) can take place in tandem. That is, it is possible, for example, that a language could acquire new native speakers via pidginization and subsequent creolization, while at the same time undergoing dialect leveling (koinéization) in favor
of the creole-based dialects. This has happened, more often than
is usually acknowledged, in the historical development of languages. Languages that have undergone this sort of development
can be referred to as creoloids. Since the social dynamics of language shift situations can vary considerably, we have to examine
each instance of language shift to determine whether or not its
result is a creoloid. Afrikaans, the Germanic language spoken in
South Africa, is often described as a creoloid.
Michael Noonan
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Thomason, Sarah Grey. 2001. Language Contact: An Introduction.
Washington, DC: Georgetown University Press.
Thomason, Sarah Grey, and Terrence Kaufman. 1988. Language Contact,
Creolization, and Genetic Linguistics. Berkeley: University of California
Press.
Trudgill, Peter. 2000. Sociolinguistics. London: Penguin Books.
Weinreich, Uriel. [1953] 1968. Languages in Contact. The
Hague: Mouton.
Winford, Donald. 2003. An Introduction to Contact Linguistics.
Oxford: Blackwell.

CONTEXT AND CO-TEXT


The term context is used to refer very generally to the extralinguistic circumstances in which language is produced as a text
and to which the text is related: the setting in which the language is used, for example, and the participants involved. But
such circumstances are many and indeterminate, and only when
they relate to the text in the realization of meaning do they count
as context. Many circumstantial features may have no bearing
whatsoever on the meaning that is intended by a text or how it
is interpreted. The question is: How does one establish which
attendant circumstances are contextually significant and which
are not?
The importance of taking context into account as a matter
of principle in the definition of meaning has long been established. Early in the last century, the anthropologist Bronislaw
Malinowski argued that an understanding of the way in which
language functions as a mode of action depends on establishing
a relationship with its context of situation (Malinowski 1923).
Subsequently, the linguist J. R. Firth reformulated the notion as a "suitable schematic construct to apply to language events" (1957, 182). This construct makes mention of the "relevant features of participants" and the "relevant objects," but leaves unanswered the key question of how relevance is to be determined.
Context is a selection of circumstantial features that are recognized by the language user as relevant in that they key into
text to achieve communication. One set of criteria for determining relevance can be found in the conditions for realizing
pragmatic meaning as proposed in the theory of speech-acts
(Searle 1969). A piece of text, the uttering of a particular linguistic
expression, for example, can be said to realize a particular illocutionary force to the extent that circumstantial features are
taken to satisfy the conditions that define the illocution. Thus, the
illocutions of threat and promise have the conditions in common
of reference to a future event and one controlled by the first person, but they differ as to whether the event has a negative effect
(threat) or positive effect (promise) on the second person. Hence,
the utterance "I will call again tomorrow" could be interpreted as
either a threat or a promise, depending on the contextual factors
of who said it to whom and in what circumstances.
The recognition of relevance comes about because language
users are familiar with such conditions as part of their extralinguistic sociocultural knowledge. But familiarity with illocutionary conditions is only one kind of sociocultural knowledge that is
brought to bear in the recognition of contextual significance. The
world we live in is made familiar by projecting two kinds of order
onto it: linguistic encoding, on the one hand, and sociocultural
convention, on the other. Communication involves an interaction between them: We make texts with the first with a view to
keying them into the second. Sociocultural conventions take the
form of schemata (see schema): customary representations of
reality, in various degrees culture-specific, modes of behavior
and thought that are socially established as normal. Contexts are
features of a particular situation that are identified as instantiations of these abstract configurations of experience that are realized and recognized in text. These schematic constructs are not,
however, static and fixed since once they are engaged they can
be extended and changed. Although communication depends on some schematic convergence to get off the ground at all, it can then develop its own creative momentum.
Although context is generally understood as an extratextual
phenomenon, apart from text but a crucial concomitant to it, the
term is also often used, misleadingly, to refer to the intratextual
relations that linguistic elements contract with each other within
text. An alternative, and preferable, term for this is co-text.
Co-textual relations occur between linguistic elements at different levels. William Labov shows the tendency for segments of
spoken utterance at the morpho-phonemic level, for example, to
vary according to the phonetic and morphological environment in which they co-textually occur, and he is able to specify
variable rules for their occurrence. These are distinct from other
variable rules that Labov postulates, rules that have to do with
contextually motivated variation where speakers adjust their
pronunciation in relatively formal situations in approximation
to prestige social norms (Labov 1972). Co-textual variation is
a property of text and in itself has no social significance as discourse. Contextual variation, on the other hand, decidedly does.
Co-textual relations at the lexico-grammatical level have
attracted particular interest over recent years in the field of
corpus linguistics. Computers now provide the means for
collecting and analyzing vast quantities of text and for identifying in detail the regularities of co-textual patterning that occur
(Sinclair 1991). One such pattern is that of collocation, the regular occurrence of one word in the environment of another. But
co-textual patterning extends beyond the appearance of pairs of
words in juxtaposition and is also manifested in word sequences
of relative degrees of fixity. The identification of such co-textual
relations has led to the recognition that text is essentially formulaic in structure (Wray 2002).
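The kind of co-textual patterning extracted in corpus linguistics can be illustrated with a toy computation. The sketch below is illustrative only: the mini-corpus and window size are invented for the example, though pointwise mutual information (PMI) is a standard association measure for collocation. It counts how often word pairs co-occur within a small window and scores how strongly each pair is attracted:

```python
from collections import Counter
from math import log2

# Invented mini-corpus for illustration; real studies use millions of words.
corpus = ("strong tea and strong coffee please "
          "powerful computer and powerful engine "
          "strong tea is strong").split()

window = 2  # consider pairs up to this many words apart (rightward)
word_freq = Counter(corpus)
pair_freq = Counter()
for i, w in enumerate(corpus):
    for v in corpus[i + 1 : i + 1 + window]:
        pair_freq[(w, v)] += 1

n = len(corpus)

def pmi(w, v):
    # Pointwise mutual information: log2 of observed vs. chance co-occurrence.
    return log2((pair_freq[(w, v)] / n) / ((word_freq[w] / n) * (word_freq[v] / n)))

# "strong tea" recurs, so it scores higher than an incidental pairing.
print(round(pmi("strong", "tea"), 2), round(pmi("and", "strong"), 2))
```

On a corpus of realistic size, recurrent pairings such as "strong tea" score far above chance juxtapositions, which is how collocations of the kind Sinclair describes can be identified automatically.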
Whereas contextual relations bring about pragmatic
effects, co-textual relations of this lexico-grammatical kind have
semantic consequences to the extent that the mutual conditioning of meaning across co-occurring words becomes established as a conventional encoding. Another kind of semantic
linking is brought about by the co-textual function of cohesion
(Halliday and Hasan 1976). Here, there is a copying of one or
more semantic features from an antecedent expression on to
an expression that follows. Thus, a pronoun like "she" would link cohesively with a noun phrase like "the lady in red" occurring earlier in a text in that it copies the features of singular and female.
It should be noted, however, that the co-textual link of cohesion,
being semantic, does not guarantee that the appropriate pragmatic reference will be achieved. There may be more than one
antecedent to which the copying expression may semantically
relate. Co-textual cohesive links, therefore, do not themselves
result in referential coherence; the latter depends on contextual factors (Blakemore 2001).
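The point that cohesion is semantic rather than pragmatic can be made concrete with a small sketch. The noun phrases and feature sets below are invented for illustration: cohesion is modeled as the matching of copied features, and more than one antecedent may satisfy them, so the cohesive link alone cannot fix the referent:

```python
# Invented example: cohesion modeled as the matching of copied features.
pronoun_features = {"number": "singular", "gender": "female"}

# Candidate antecedent expressions occurring earlier in the text.
antecedents = {
    "the lady in red": {"number": "singular", "gender": "female"},
    "her daughter": {"number": "singular", "gender": "female"},
    "the waiters": {"number": "plural", "gender": None},
}

# "she" links cohesively with every expression sharing its features...
matches = [np for np, feats in antecedents.items() if feats == pronoun_features]

# ...but two candidates remain, so the semantic link does not determine
# the referent; that resolution depends on contextual factors.
print(matches)
```

Here two expressions copy the same features, so choosing between them requires exactly the contextual factors the entry describes.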
H. G. Widdowson

WORKS CITED AND SUGGESTIONS FOR FURTHER READING

Blakemore, D. 2001. Discourse and relevance theory. In The Handbook of Discourse Analysis, ed. D. Schiffrin, D. Tannen, and H. E. Hamilton, 100–18. Oxford: Blackwell.
Firth, J. R. 1957. Papers in Linguistics 1934–51. Oxford: Oxford University Press.
Halliday, M. A. K., and R. Hasan. 1976. Cohesion in English. London: Longman.
Labov, W. 1972. Sociolinguistic Patterns. Philadelphia: University of
Pennsylvania Press.
Malinowski, B. 1923. The problem of meaning in primitive languages. In The Meaning of Meaning, ed. C. K. Ogden and I. A. Richards, 296–336. London: Routledge and Kegan Paul.
Schiffrin, D. 1994. Approaches to Discourse. Oxford: Blackwell.
Searle, J. R. 1969. Speech Acts. Cambridge: Cambridge University Press.
Sinclair, J. M. 1991. Corpus, Concordance, Collocation. Oxford: Oxford
University Press.
Widdowson, H. G. 2004. Text, Context, Pretext: Critical Issues in Discourse
Analysis. Oxford: Blackwell.
Widdowson, H. G. 2007. Discourse Analysis. Oxford Introductions to
Language Study. Oxford: Oxford University Press.
Wray, A. 2002. Formulaic Language and the Lexicon. Cambridge:
Cambridge University Press.

CONTROL STRUCTURES
In many languages, when the subject of an embedded clause is
identical in reference (coreferential) with some noun phrase in
the main clause, the former may (or must) be left syntactically
unexpressed. Thus, sentence (1a) can be paraphrased as (1b),
where the understood subject of the embedded clause corresponds to the pronoun he in (1a). This unexpressed subject is
standardly notated by PRO, as represented in (1c). The referential dependence of PRO on George is expressed by the sharing of
an index (here, subscript i). This dependence is called control
(originally equi-NP deletion, see Rosenbaum 1967).
(1) a. George hoped that he would meet the Pope.
b. George hoped to meet the Pope.
c. Georgei hoped [PROi to meet the Pope].

Universally, PRO can only occur in a subject position. Furthermore, in most languages, PRO can only occur in nonfinite clauses. However, the latter is not a universal condition,
as controlled clauses in the Balkan languages, for example, are
systematically finite. Deriving the distribution of PRO has been a fundamental issue in the theory of control ever since Chomsky (1981).
Although occasionally challenged, the existence of PRO
receives strong empirical support in languages with case concord (Sigurðsson 1991), where PRO can be shown to bear the
same morphological case that an overt subject does. Further
evidence for PRO is provided by pairs like (2). Secondary predicates (like angry) cannot be predicated on arguments absent
from the syntax, like the implicit agent of serve. The fact that
they can be predicated on the understood subject of an infinitive implies that the latter is syntactically present, even if phonetically null.

(2) a. Michael served dinner angry / *Dinner was served angry.
b. Michaeli hated [PROi to serve dinner angry].

Examples (1c) and (2b) illustrate obligatory control. In contrast, when the nonfinite clause occurs as a subject, the reference of
PRO is free and can pick a remote linguistic antecedent or no
antecedent at all (arbitrary PRO). This situation is called nonobligatory control.

(3) a. Janei admitted that it was likely that [PROi perjuring herself]
was a mistake.
b. [PROarb to blame everything on fate] is all too common.

Thus, theories of control must explain, at a minimum: i) where PRO must/can/cannot occur, ii) what syntactic configurations
require obligatory versus nonobligatory control, and iii) the lexical/semantic/pragmatic factors that affect the choice of controller in particular environments. This unique combination makes
control an area where separate modules of grammar (lexicon, syntax, semantics, and pragmatics) converge.
Idan Landau
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Chomsky, Noam. 1981. Lectures on Government and Binding. Dordrecht,
the Netherlands: Kluwer.
Landau, Idan. 2000. Elements of Control: Structure and Meaning in
Infinitival Constructions. Dordrecht, the Netherlands: Kluwer.
Manzini, M. Rita. 1983. On control and control theory. Linguistic
Inquiry 14: 421–46.
Rosenbaum, Peter. 1967. The Grammar of English Predicate Complement
Constructions. Cambridge, MA: MIT Press.
Sigurðsson, Halldór Á. 1991. Icelandic case-marked PRO and the licensing of lexical arguments. Natural Language and Linguistic Theory 9: 327–63.

CONVERSATIONAL IMPLICATURE
The British philosopher Herbert Paul Grice observed that the total
significance of an utterance embraces not only what is said but
what is implied. His term of art for the latter was implicature,
and he identified conversational implicature as an important
type of implicit meaning or signification.
Grice used the following example to introduce this type of
implicit meaning in his 1967 William James lectures at Harvard: A
and B discuss a mutual friend, C, who has recently started working at a bank. When A asks B how C is getting on, B replies, "Oh quite well, I think; he likes his colleagues, and he hasn't been to prison yet." Given knowledge of English and of contextual factors, A can readily grasp that B has said that C is getting on well, likes his colleagues, and hasn't been to prison yet. A may also
understand that B has implied something else with the remark
about prison, and, according to Grice, a rational reconstruction
of the bases of this understanding reveals a complex inferential
process a process based on principles that the persons involved
probably would not be able to articulate unless they had studied
the Gricean literature.
At the top of Grice's list is the cooperative principle,
roughly, the idea that it is rational for participants in conversations to advance the accepted purpose or direction of the talk
exchange to which they contribute. As it is routinely recapitulated, Grices theory specifies that in addition to this basic presumption about rational cooperation, hearers should act and
think in terms of conversational maxims or imperatives, which
include the following:
Maxim of Quality. Make your contribution true; so do not
convey what you believe false or unjustified.
Maxim of Quantity. Be as informative as required.

Maxim of Relation. Be relevant.


Maxim of Manner. Be perspicuous; so avoid obscurity and
ambiguity, and strive for brevity and order.
Grice consistently allowed that this list is not exhaustive, and
at the end of his career, he raised some additional issues. Quality,
he suggests in his "Retrospective Epilogue" (1989, 370–2), differs from the other maxims in being essential to the making of
a genuine contribution. Nor are the maxims independent of one
another.
Grice contends that an implicatum can be conveyed by obeying the maxims, as well as by the flouting of a maxim (1989, 30–1). When what is said is patently irrelevant, false, uninformative, or obscure, the hearer is incited to search for some speaker's
intention that does contribute to the purpose of the conversation.
In one of his descriptions of the inferential pattern whereby the
hearer works out a conversational implicature, Grice has the
hearer reason as follows: S has said that p; I presume he is observing the cooperative principle, and p does not on its own suit the
purposes of the conversation, so he must have been implicating
some other proposition, q; the speaker knows (and knows that I know that he knows) that I can see that the supposition that he
thinks that q is required; he intends me to think, or is willing to
allow me to think, that q; and so he has implicated that q.
To apply this pattern to our example, A assumes that B was
being cooperatively rational (informative, sincere, relevant, perspicuous) in making the obscure remark about C not having been
to prison and so must draw upon background beliefs to come up
with the point of the remark. This background could pertain to
what A and B mutually believe about C, such as the idea that C
has venal inclinations. The remark about prison may be interpreted, then, as a hyperbolic comment meant to evoke this trait.
Grice states that a conversational implicature can be canceled without contradiction. For example, B could coherently add: "but of course there is no real danger of C going to prison." Implicatures can also be reinforced, which would be the case should B add: "so let's hope C does not get caught." To implicate
something insincerely can be dangerously misleading but does
not amount to lying: The hearer cannot reasonably complain
that B has performed the illocutionary action of stating or asserting that their friend has venal inclinations.
Grice distinguishes between conversational and conventional
implicature. Sentences, as opposed to utterances, implicate
what speakers or writers who follow the linguistic conventions
would normally use the sentence to implicate. He also contends
that there are generalized conversational implicatures that
are not conventional. One of his examples is that if someone
says "X went into a house," it is normally but not conventionally implied that the house is not X's own (1989, 37–8). A detailed
neo-Gricean account of generalized conversational implicature
for utterance-type meaning has been developed by Stephen C.
Levinson (2000), who argues that an implicature is generalized
just in case it is implicated by default or, in other words, unless
there are unusual contextual assumptions that prevent the implicature from being appropriate.
Grice describes implicature as a pervasive feature of discourse and extends his account to cover metaphor, irony,
and indirect speech-acts. His exploration of implicature was

223

Conversational Implicature
linked to larger philosophical themes, including his defense of a
causal theory of perception and contentions about ambiguity
and presupposition in ordinary language use. Grice argues,
for example, that the word "or" is not ambiguous in English since the exclusive interpretation (according to which "or" means that either but not both disjuncts is true) can be understood as a conversational implicature and not as a second meaning of "or."
According to the thesis known as Grice's Razor, it is better to
posit conversational implicatures than ambiguity (and in some
cases, presupposition) because implicatures can be derived
more economically from the independently motivated principles
of cooperative rationality.
Although it is widely acknowledged that implicature is an
important phenomenon, questions are raised about the explanatory and descriptive value of Grices theory. Wayne A. Davis
(1998) argues that Grices maxims and cooperative principle
predict a range of implicatures that do not actually occur, while
other implicatures that do occur cannot be derived from them.
He also argues that the theory has no genuine explanatory payoffs. The proximal causes of a speaker's conversational moves are that person's attitudes, not general tendencies to cooperate or an audience's presumptions about the latter. According
to Davis, Grice wrongly assumes that the production and recognition of implicature are processes explicable in terms of the
same principles and maxims. This premise is misleading if what
a speaker implicates or means is not caused by what others presume or know about that speaker. To implicate a meaning that
extends beyond what one says is to say something with certain
intentions, and the speaker's intentions do not directly depend
on what others know or presume. Having and expressing intentions is one thing, whereas communicating them to others is
something else. Grice appears to assume that audience uptake of
a certain kind is necessary or even sufficient to the realization of
communicative intentions.
Grices explicit analysis of conversational implicature can
be read as indicating that the very existence of implicature (as
opposed to its successful uptake or understanding by some audience) requires the presumption, on the part of a hearer, that the
speaker has observed or acted in accordance with the cooperative principle. Thus, Grice writes that S implicates q only if he "is presumed to be observing the conversational maxims" (1989, 30). On an alternative reading, the actual hearer's presumptions
and other beliefs are not necessary, since some implicatures
are made by a speaker but remain unrecognized by the target
audience. What Grice has proposed is an account of successfully communicated implicatures, but not implicature tout court.
Jennifer Saul (2002, 241) suggests that what matters for Grice
is not what particular hearers actually think, but what they are
required to think. Grice indeed stressed that his focus was on
the rationality or irrationality of conversational conduct (1989,
369). The thesis that implicature must be calculable or capable
of being worked out can, then, be taken as belonging to a normative theory of the conditions under which speakers can successfully realize the rational intention to implicate rather than
to state some thought. Yet it is unclear why the norms of communicative rationality should apply to both noncommunicative
and communicative linguistic behavior. Is it persuasive to argue
that it is simply impossible for a speaker to have implicated some proposition because that would have been irrational? In other words, why could there be no irrational implicatures?
Other challenges to Gricean theory target the interest and
adequacy of the normative account of the hearer's recognition of implicature. Kim Sterelny (1982, 191–3) observes that it is knowledge of the speaker that is crucial to the success of this kind of
interpretive project, not knowledge of conversational principles,
maxims, rules, or general tendencies. Interpreters do sometimes
discern the intentions of uncooperative, strategic, and even idiosyncratic interlocutors who violate the norms of rational, cooperative speech. Implicatures that are generated and understood
may be a prevalent feature of non-Gricean discursive exchanges,
that is, exchanges that diverge very significantly from the norm
of cooperative communicative activity. The comprehension of
implicature is assisted by the existence of various conventionalized forms, many of which vary from culture to culture, as Anna
Wierzbicka (1991) has documented.
Davis argues that the interpretation of implicit meaning does
not depend on any one specialized or characteristic pattern of reasoning, concluding that any principle general enough to hold in
all cases of implicature will be too general to yield specific predictions (1998, 99). This criticism also applies to the versions of the
principle of relevance advanced by Sperber and Wilson (1986) as
the successor to Grices bundle of maxims. It seems highly dubious to suppose that what people imply, and what others effectively take them to be implying, in all discourse is determined by a
quest for communicative efficiency defined as the maximization
of information conveyed per unit of processing cost.
An alternative to the Gricean recourse to broad psychosocial
principles is to focus on the role of conventions in both the generation and understanding of the speakers implicature. Often
when we intend to imply one thing by saying another we rely
upon some conventional, established idiom. For example, it is idiomatic that "S could have done y" normally implies that S did not do y, whereas the nearly synonymous "S was able to do y" implies that S did do y (Davis 1998, 37–8). Thus, "Bernard was able to make the final putt" can be used to implicate conventionally that Bernard made the putt, and the speaker's intention and
corresponding implicature can be grasped through knowledge
of the convention without recourse to the complex inferences
Grice postulated. As an alternative to the thesis that interpreters
reason from cognitive states to implicatures, Nicholas Asher and
Alex Lascarides (2003) argue that reasoning about implicature is
based on rhetorical structures understood as speech-act types.
Paisley Livingston
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Asher, Nicholas, and Alex Lascarides. 2003. Logics of Conversation.
Cambridge: Cambridge University Press.
Brown, Penelope, and Stephen C. Levinson. 1987. Politeness: Some
Universals in Language Usage. Cambridge: Cambridge University
Press.
Davis, Wayne A. 1998. Implicature: Intention, Convention, and Principle
in the Failure of Gricean Theory. Cambridge: Cambridge University
Press.
Grice, Herbert Paul. 1975. Logic and conversation. In Syntax and Semantics. Vol. 3. Ed. Peter Cole and Jerry L. Morgan, 41–58. New York: Academic Press.

———. 1989. Studies in the Way of Words. Cambridge: Harvard University
Press.
Kasher, Asa, ed. 1998. Pragmatics: Critical Concepts. London: Routledge.
Levinson, Stephen C. 2000. Presumptive Meanings. Cambridge, MA: MIT
Press.
Saul, Jennifer. 2002. Speaker meaning, what is said, and what is implicated. Noûs 36: 228–48.
Sperber, Dan, and Deirdre Wilson. 1986. Relevance: Communication and
Cognition. Cambridge: Harvard University Press.
Sterelny, Kim. 1982. Against conversational implicature. Journal of
Semantics 1: 187–94.
Wierzbicka, Anna. 1991. Cross-Cultural Pragmatics: The Semantics of
Human Interaction. New York: Mouton de Gruyter.

CONVERSATIONAL REPAIR
Conversational repair (hereafter, repair) refers to a common
practice in the interactive social organization of conversations
in which speakers suspend the smooth progressivity of the talk
to deal with some ostensible problem in speaking, hearing, or
understanding the talk. Repair does not always involve hearable
errors or mistakes that require correction. Therefore, the term
repair, rather than correction, is used to capture the more general
domain of such occurrence in conversation analysis.
The organization of conversation is a turn-taking system in
which speakers take turns to converse. A repair may be done
by the speaker of the trouble source in the same turn (same-turn self-repair). Or it may be done by anyone but the speaker.
Furthermore, a repair may be initiated by the speaker of the
trouble source or by others. Repair is often carried out with repetition/recycling, replacement, or restructuring of the utterance,
although not all repair attempts may be successful. Studies find
that self-repair prevails even when a repair is initiated by others.
Next is a brief discussion of same-turn self-repair, for which
an emerging utterance may be stopped, aborted, recast, continued, or redone. Such repair often involves self-initiation with
some nonlexical initiators, such as cutoffs, sound stretches, uhs,
and so on, followed by repair. Following is an example containing two instances of repair with cutoffs, replacements, insertion,
and repetition/recycling (the asterisk indicates where repair
initiates).
(1) And tshe-* this girl's fixed up onna da- * a blind da:te.

In the first instance of repair, the speaker cuts off the pronoun
tshe- (i.e., the repairable; the - indicates glottalized cutoff) and
replaces it with a full noun phrase, this girl. The second instance
is where date is cut off to introduce a modifier by recycling the
entire noun phrase: a blind date.
Repair is highly patterned, with some basic mechanisms
occurring cross-linguistically. But specifics of the mechanisms
differ. For instance, recycling often occurs at a turn beginning
when the utterance overlaps with the ending of the previous
speaker's turn. However, the syntactic unit that is recycled differs
from one language to another: Some allow repetition of single
words, whereas some require larger syntactic units to be recycled (e.g., in the example, the entire noun phrase is recycled).
Furthermore, speakers of a tone language such as Mandarin
make tone-related recycles. Therefore, repair mechanisms are
constrained by the grammar of individual languages.

Conversation analysis finds that the grammar of repair is
vital for syntax-for-conversation. Repair is closely related to
syntax because it affects the shape and/or components of
a sentence. Syntax organizes elements through which talk
is constructed. Syntax-for-conversation cannot exist without
repair because speakers constantly search for the next item
due for the interactive needs of the conversation. The study of
repair, therefore, demonstrates how interaction and grammar
shape each other.
Liang Tao
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Fox, Barbara, Makoto Hayashi, and Robert Jasperson. 1996. Resources and repair: A cross-linguistic study of the syntactic organization of repair. In Interaction and Grammar, ed. Elinor Ochs, Emanuel Schegloff, and Sandra Thompson, 185–237. Cambridge: Cambridge University Press.
Schegloff, Emanuel, Gail Jefferson, and Harvey Sacks. 1977. The preference for self-correction in the organization of repair in conversation.
Language 53: 361–82.

CONVERSATION ANALYSIS
Conversation analysis (CA) is the study of talk (and other conduct) in human interaction that began with the pioneering work
of Harvey Sacks (1995) and his collaborators Emanuel Schegloff
and Gail Jefferson (e.g., Sacks, Schegloff, and Jefferson 1974;
Schegloff, Jefferson and Sacks 1977). CA seeks to establish technical specifications of the practices people use to co-construct
orderly and mutually understandable courses of action. These
specifications constitute a cumulative, empirically derived body
of knowledge that is foundational to CA as a discipline. Since its
beginnings within sociology in the late 1960s and early 1970s,
CA has become hugely influential, both as an emerging discipline in its own right and across the fields of sociology, psychology, anthropology, linguistics, and education. It is increasingly
applied in studies of institutional and organizational interaction
(including news interviews, court proceedings, emergency and
help-line calls, and doctor–patient interaction; see Drew and
Heritage 1992; Heritage and Maynard 2006) and in sociological studies of the operation of social norms and the reproduction of culture (especially related to gender [see gender and
language] and sexualities; Kitzinger 2000, 2005). Although
it originated in the analysis of talk from American-English speakers, CA's basic findings have now been replicated across many
other languages.
The intellectual roots of CA lie in a synthesis of the sociological
traditions established by Erving Goffman and Harold Garfinkel – traditions that, like other broadly social constructionist theoretical frameworks, offer models of people as agents and of a
social order grounded in contingent, ongoing, interpretive work
(see Heritage 1984). CA aims to build a science of social action,
rather than to contribute to the study of language per se. It relies
on analysis of recordings of naturally occurring human interaction (i.e., not invented or hypothetical data and not data generated by researchers via interviews or in laboratories). Recordings
are transcribed according to a distinctive transcription notation system (Jefferson 2004), but it is the recordings themselves (and not transcripts of them) that are the primary data. Sound
files are increasingly being made available on the World Wide
Web (see http://www.sscnet.ucla.edu/soc/faculty/schegloff/sound-clips.html for sound clips from Schegloff's publications),
enabling readers of published work to access the original data.
Much early CA was based only on audio recordings (since the
technology for video recording was not yet available, but see
Goodwin 1981), which precluded analysis of such interactional
features as gesture, body deployment, and gaze. Video recordings of face-to-face interactions are now the norm. Although new
data are continually being collected, several core data sets have
been shared within the CA community since the 1970s (e.g., the
telephone conversation known as "Two Girls" [TG], which can
be accessed at http://www.cambridge.org/9780521532792,
Appendix 2). These shared data are widely used in teaching,
frequently reanalyzed for new phenomena, and appear in publications by a range of different authors. Analysis of these kinds
of ordinary conversations is the point of departure for studying
more specialized communicative contexts (the legal process, the
medical encounter) in which social institutions are talked into
being (Heritage 1984).
Conversation analysis has produced few theoretical manifestos but has, rather, concentrated on fine-grained empirical
studies of interaction. These studies rest upon three fundamental theoretical assumptions (Heritage 1984): i) Talk is a form of
action; that is, people use it to do things like complaining, complimenting, disagreeing, inviting, telling, and so on; ii) action
is structurally organized; that is, turns at talk are systematically
related to one another, such as (for example) when an acceptance follows an invitation or a self-deprecation follows a compliment (see adjacency pair); and iii) talk creates and maintains
intersubjectivity; that is, a first speaker understands, by what a
second speaker does, how that second speaker heard his or her
first turn, as when a second speaker produces a turn hearable as an answer, thereby showing herself/himself to have heard the
prior turn as a question.
The focus in CA research is on identifying generic orders of
organization in talk-in-interaction that are demonstrably salient
to the participants' analyses of one another's turns at talk in
the progressively unfolding interaction. Data are rarely coded
or quantified since manifest similarities in talk may turn out to
have very different interactional meanings. Key discoveries of CA
include turn-taking, action formation, sequence organization,
repair, word selection, and overall structural organization, each
of which is now sketched out.

Turn Taking
The classic paper by Sacks, Schegloff, and Jefferson (1974) presents a model to describe the practices whereby people (mostly)
speak one at a time. Summarized very simply, the model proposes that the building blocks out of which turns are composed
(turn constructional units or TCUs) can be whole sentences,
phrases, sometimes just single words, or even nonlexical
items which, in context, are recognizable to a co-participant
as possibly constituting a complete turn. Each speaker is initially entitled to just one TCU, after which another speaker has
the right (and sometimes the obligation) to speak next. As a
speaker approaches the possible completion of a first TCU in a turn, transition to a next speaker can become relevant: This is a transition relevance place. Turn-taking organization is designed
to minimize turn size, such that a turn with one (and only one)
TCU is the default, and extended turns with lengthy and/or
multiple TCUs are accomplishments. This has important implications for the analysis of overlapping talk and of longer turns
at talk (including, but not limited to, storytelling), both of which
have been extensively researched. The model also encompasses
speaker-selection techniques in multiparty interaction.

Action Formation
Researchers have focused on how speakers deploy talk (and other
conduct) in order to fashion a turn designed to be recognizable to
their recipients as doing a particular action, that is, how people
do complaining, or inviting, or declining, and so on (Atkinson
and Heritage 1984). Since CA (unlike speech-act theory) starts
from the analysis of singular episodes of human interaction and
undertakes to understand action as the co-participants understand it, one outcome of this kind of analysis is a very detailed
understanding of how (for example) complaining or inviting
are done that often departs from vernacular understandings.
Another outcome is the discovery of actions that have no vernacular name (e.g., confirming an allusion; Schegloff 1996a).

Sequence Organization
The most basic type of sequence involves two turns at talk by different speakers, the first constituting an initiating action (first
pair part) and the second an action responsive to it (second pair
part): for example, an invitation and an acceptance or declination of it; a news announcement and a news receipt (see adjacency pair). Most initiating actions can be followed by a range
of sequentially relevant (i.e., appropriately fitted) next actions,
some of which further the action of the prior turn (e.g., accepting
an invitation) and are termed preferred responses, and others of
which do not (e.g., rejecting an invitation) and are termed dispreferred. The basic two-turn adjacency pair sequence can be and
frequently is expanded. Pre-expansions are turns that come
before and are recognizably preliminary to some other action;
for example, a turn such as "What are you doing tonight?" can be
recognizable in context as preliminary to an invitation (hence, a
pre-invitation); a turn such as "Guess what" is virtually dedicated
to preannouncement. Insert expansions come between the first
and second pair parts, for example, between an invitation and
the acceptance or declination of it ("Do you wanna come round tonight?" / "What time?" / "About six." / "Okay" – where the invitation and its acceptance are separated by an insert sequence).
Postexpansions come after the second pair part and may accept
or assess it. For example, "You want me to bring you anything?" (offer: first pair part) / "No, no nothing" (declination: second pair part) / "Okay" (acceptance of the declination, expanding the
sequence to a third turn). The authoritative work on adjacency
pairs and expansions of them, the organization of preference
and dispreference, and other types of sequence organization is
Schegloff's (2007) primer.

Repair
Interactional co-participants must manage troubles in speaking, hearing, and/or understanding talk if the interaction is not to founder when trouble arises (see conversational repair).
Repair is a method for fine-tuning a turn in the course of its production and for maintaining intersubjectivity. Researchers have
shown some of the practices that speakers use across a range of
different positions in talk, both in repairing their own talk (e.g.,
by deleting, inserting, or replacing a word; Schegloff, Jefferson,
and Sacks 1977) and in initiating repair on the talk of others (e.g.,
with open-class repair initiations like "huh?"; Drew 1997). Most
repairs are completed by the speaker of the trouble source in the
same turn (more accurately, the same TCU) as the trouble source
but can be delayed to third turn or third position, or even later
(Schegloff 1992).

Word Selection
Turns at talk are composed of lexical items selected from among
alternatives. For example, when English-language speakers refer
to themselves, they can often select between "I" or "we" (the latter
choice sometimes being used, for example, to index that they are
speaking on behalf of an organization or a couple). Alternatively,
they can self-reference in distinctive (marked) ways (e.g., self-naming or self-description) that perform analyzable actions. Likewise, explicit self-reference in so-called zero-anaphora languages, in which this is not required, has been shown to be
interactionally meaningful (see the Lerner and Kitzinger 2007
collection on selection issues in self-reference). Category-based
reference to nonpresent persons also involves choices between
alternatives (Schegloff 1996b); for example, law enforcement
officers can be referred to as "police" or "cops," and a speaker's selection of one or the other may be responsive to whether the speaker
is appearing in court (Jefferson 1974) or talking with adolescent
peers (Sacks 1995). CA explores how word selection is done as
part of turn design and how it informs and shapes the understanding achieved by the turn's recipient.

Overall Structural Organization


Talk-in-interaction is organized into phases, for example, most
obviously, openings and closings (Schegloff and Sacks 1973).
Within ordinary conversation, however, matters are comparatively fluid. Within organizational talk, by contrast, there are
component phases or activities that characteristically emerge in a
particular order. Acute doctor–patient interactions, for example,
have a highly structured overall organization (opening, presenting complaint, examination, diagnosis, treatment, and closing;
Heritage and Maynard 2006), and doctors' and patients' conduct
can be analyzed for the way in which they orient to and negotiate the boundaries of each phase of the interaction. Many recent
studies draw on analyses of overall structural organization as
part of research designed to be of practical use by organizations
in improving the quality of their services.

Conclusion
Although much research remains to be done, there is, for each of
these orders of organization, an established set of core findings,
foundational to the discipline of CA. An outstanding bibliographical source of information about CA is available on the Ethno/CA
Web site maintained by Paul ten Have at http://www2.fmg.uva.nl/emca/resource.htm.
Celia Kitzinger

WORKS CITED AND SUGGESTIONS FOR FURTHER READING


Atkinson, J. Maxwell, and John Heritage, eds. 1984. Structures of Social
Action. Cambridge: Cambridge University Press.
Clayman, Steven, and John Heritage. 2002. The News Interview.
Cambridge: Cambridge University Press.
Drew, Paul. 1997. Open class repair initiators in response to sequential
sources of trouble in conversation. Journal of Pragmatics 28: 69–101.
Drew, Paul, and John Heritage, eds. 1992. Talk at Work: Interaction in
Institutional Settings. Cambridge: Cambridge University Press. A classic collection of studies exploring the application of conversation
analysis to the study of language and interaction in applied settings,
including doctor–patient consultation, legal hearings, news interviews,
and emergency calls. For more recent work in talk in organizational
settings, see Heritage and Maynard (2006) and Clayman and Heritage
(2002).
Goodwin, Charles. 1981. Conversational Organization. New
York: Academic Press.
Heritage, John. 1984. Garfinkel and Ethnomethodology. Cambridge: Cambridge University Press.
Heritage, John, and Douglas Maynard, eds. 2006. Communication in
Medical Care. Cambridge: Cambridge University Press.
Jefferson, Gail. 1974. Error correction as an interactional resource.
Language in Society 2: 181–99.
———. 2004. Glossary of transcript symbols with an introduction. In Conversation Analysis, ed. Gene Lerner, 13–31. Amsterdam: John
Benjamins.
Kitzinger, Celia. 2000. Doing feminist conversation analysis. Feminism
and Psychology 10: 163–93.
———. 2005. Heteronormativity in action. Social Problems 52.4: 477–98.
Lerner, Gene. 2004. Conversation Analysis: Studies from the First
Generation. Amsterdam: John Benjamins. A collection of early but
previously unpublished research by many of the central figures in the
development and advancement of CA.
Lerner, Gene, and Celia Kitzinger, eds. 2007. Referring to self and others
in conversation. Discourse Studies 9.4 (Special Issue).
Sacks, Harvey. 1995. Lectures on Conversation. Vols. 1 and 2.
Oxford: Blackwell. Useful for understanding the early beginnings of
CA, these two volumes present lectures, transcribed and edited by Gail
Jefferson, from one of the founders of conversation analysis, as delivered to classes at the University of California between 1965 and 1972.
Each volume has an introduction by Emanuel Schegloff.
Sacks, Harvey, Emanuel A. Schegloff, and Gail Jefferson. 1974. A simplest systematics for the organization of turn-taking for conversation.
Language 50: 696–735.
Schegloff, Emanuel A. 1992. Repair after next turn. American Journal of
Sociology 95: 1295–1345.
———. 1996a. Confirming allusions: Toward an empirical account of action. American Journal of Sociology 104.1: 161–216.
———. 1996b. Some practices for referring to persons in talk-in-interaction. In Studies in Anaphora, ed. Barbara Fox, 437–85.
Amsterdam: John Benjamins.
———. 2007. Sequence Organization in Interaction: A Primer in
Conversation Analysis. Vol 1. Cambridge: Cambridge University Press.
A landmark text providing the definitive introduction to sequence
organization and capsule reviews of other key concepts such as turns,
actions, and repair, each of which will constitute the subject matter of
forthcoming primers by this leading authority on CA.
Schegloff, Emanuel A., Gail Jefferson, and Harvey Sacks. 1977. The preference for self-correction in the organization of repair in conversation. Language 53: 361–82.
Schegloff, Emanuel A., and Harvey Sacks. 1973. Opening up closings.
Semiotica 7.4: 289–327.


COOPERATIVE PRINCIPLE
Introduced by the British philosopher Herbert Paul Grice
(1913–88), the cooperative principle and related maxims are part
of his theory of conversational implicature.
Grice formulates the principle as an imperative: "Make your conversational contribution such as is required, at the stage at which it occurs, by the accepted purpose or direction of the talk exchange in which you are engaged" (1975, 45). He observes that it is a well-recognized empirical fact that this ceteris paribus principle applies
to all talk exchanges that do not consist of wholly disconnected
remarks, and he adds that he would like to be able to argue
that the principle is grounded in rationality. To that end, he suggests that persons participating in conversational exchanges
do so with certain shared purposes, such as exchanging information and influencing and being influenced by others. These
shared purposes, Grice suggests, are in general only realized if
the exchanges are conducted in accordance with the cooperative principle. For those who know this, it is rational to behave in
accordance with the cooperative principle and to expect others
to do so as well. Thus, this principle and presumption are rational, given assumptions about shared conversational ends, effective means to those ends, and rationality.
Many researchers (e.g., Brown and Levinson 1987; Clark
1996) describe the cooperative principle's applications and take
up thorny questions about its relation to the associated maxims
of quantity, quality, relation, manner, and politeness. There is
disagreement as to whether norms of conversational etiquette
derive from, are complementary to, or are in tension with the
cooperative principle. Asa Kasher (1976, 1982) argues against
the assumption of shared conversational purposes and contends
that the cooperative principle is superfluous since the needed
maxims can be derived from a more fundamental principle of
rational behavior: Given a goal, adopt the most effective and
least costly means to its realization. Wayne A. Davis (1998) contends that the cooperative principle lacks explanatory value and
is hopelessly ambiguous among normative, motivational, behavioral, and cognitive readings. As it is only the speaker's motives
and beliefs that are causally involved in the intentional production of implicit meanings, what the speaker implicated or implicitly expressed does not depend on the thoughts or presumptions
of the audience. In other words, Grice erred when he made the
hearer's presumption that the speaker observes the cooperative principle a condition on the speaker's expressing one thing by
saying something else.
Paisley Livingston
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Brown, Penelope, and Stephen C. Levinson. 1987. Politeness: Some
Universals in Language Usage. Cambridge: Cambridge University
Press.
Clark, Herbert H. 1996. Using Language. Cambridge: Cambridge
University Press.
Davis, Wayne A. 1998. Implicature: Intention, Convention, and Principle
in the Failure of Gricean Theory. Cambridge: Cambridge University
Press.
Grice, Herbert Paul. 1975. Logic and conversation. In Syntax and Semantics.
Vol. 3, ed. Peter Cole and Jerry L. Morgan, 41–58. New York: Academic Press.

———. 1989. Studies in the Way of Words. Cambridge: Harvard
University Press.
Kasher, Asa. 1976. Conversational maxims and rationality. In Language
in Focus: Foundations, Methods, and Systems, ed. A. Kasher, 197–211.
Dordrecht, the Netherlands: Reidel.
———. 1982. Gricean inference revisited. Philosophica 29: 25–44.

CORE AND PERIPHERY


Mainstream generative grammar makes two basic divisions
among linguistic phenomena. The first is the traditional division
between grammar and the lexicon, taken to be the locus of all
irregularity. The second (Chomsky 1981) distinguishes between
two parts of the grammar itself, the core and the periphery.
The core rules represent the deep regularities of language. The
periphery represents marked exceptions, such as irregular verbs,
for which there are no deep regularities.
The core–periphery distinction (henceforth C/P) is related
to markedness. For Noam Chomsky (1965), markedness is a
graded phenomenon that reflects relative centrality, naturalness,
simplicity, ease of learning, and related notions. The introduction
of C/P can be seen as a distillation of the notion markedness hierarchy into a binary distinction. The consequence is a dramatic
conceptual simplification, which ties naturally to the characterization of universal grammar in terms of parameters and
a related perspective on the language acquisition device
and learnability (see also syntax, universals of). On this
view, the core is part of the human biological endowment for language, and the value of a parameter is set by the learner on the
basis of minimal linguistic input. One important consequence of
C/P, particularly in syntactic theory, is that it has focused considerable attention on understanding how languages do and do
not realize phenomena such as argument structure and wh-interrogatives. Another is that it has led to the uncovering of a wide
range of empirical phenomena in the attempt to integrate apparent exceptionality, idiosyncrasy, and counterexamples into a
general framework of universals and parametric restrictions.
Despite the value of such a simplification, Chomsky himself notes that "we do not expect to find chaos in the theory of markedness, but rather an organized and structured system, building on the theory of core grammar" (1981, 216), and that marked structures "have to be learned on the basis of slender evidence too, so there should be further structure to the system outside of core grammar" (1981, 8). It is in fact not clear that C/P
is a principled distinction and that it reflects anything beyond
generality of function and frequency of use. Apparent syntactic
idiosyncrasies beyond the level of individual words are learned,
they display various degrees of specificity, and native speakers
have sharp and reliable intuitions about them. Furthermore,
Occams razor demands that it be shown that a learning mechanism that can acquire the peripheral cases cannot also acquire
the core. Hence, C/P may be nothing more than a rough and
tentative distinction, one drawn "for working purposes (and nothing more than that)" (Chomsky 1993, 17–18).
Here are some illustrations of peripheral phenomena
(Culicover and Jackendoff 2005). First, there are words that go in
the wrong place.



Enough modifies adjectives and adverbs, alternating with
so, too, and as. However, unlike these, it follows its head: so/
too/as/*enough big; big enough. As a nominal modifier, it can
go either before or after its head: much/more/sufficient/enough
pudding; pudding *much/*more/*sufficient/enough.
The quantifiers galore and aplenty also go after the head
rather than before it, obligatorily: money galore, *galore money.
Responsible, unlike other adjectives, can occur either before or
after its head. Notwithstanding parallels other prepositions, such as despite, in spite of, and regardless of, in its semantics, but unlike them it can go on either side of its complement noun phrase (NP). The
related word aside goes on the right of its complement; aside
from goes on the left.
Each of these cases constitutes an idiosyncratic departure
from strict x-bar theory.
There is sluice-stranding too. (1a) means the same as (1b).
(1) a. John went to NY with someone, but I couldn't find out who with.
b. John went to NY with someone, but I couldn't find out who John went to NY with.

(1a) is a case of sluice-stranding, where an isolated wh-phrase stands in place of an understood indirect question.
It contains not only the wh-phrase but also a preposition
from whose complement the wh-phrase has apparently been
moved. It is technically possible to derive this construction
through some combination of wh-movement and deletion.
The difficulty is that sluice-stranding is both more productive
and more restricted than a derivational account would suggest.
Sluicing in general is possible where the purported extraction
site normally forbids extraction (Ross 1969). (2a) illustrates for
ordinary sluicing of a prepositional phrase; (2b) illustrates for
sluice-stranding.
(2) I saw a fabulous ad for a Civil War book, but I can't remember
a. by whom.
b. who by.
c. *by whom I saw a fabulous ad for a Civil War book.
d. *who I saw a fabulous ad for a Civil War book by.

On the other hand, sluice-stranding severely constrains what combinations of wh-word and preposition are acceptable, while sluicing is productive.
(3) Normal pied-piped preposition in sluicing:
but I couldn't figure out
a. with/to/from/for/next to/about/beside whom.
b. with/for/from/of/on/in/about/at/before/into/near/beside what.
c. for/by/with how much.
d. to/from/near where.
e. with/to/from/next to/about/beside which (book).
(4) Sluice-stranding:
but I couldn't figure out
a. who with/to/from/for/*next to/*about/*beside.
b. what with/for/from/of/on/in/about/at/*before/*into/*near/*beside.
c. how much for/*by/*with.
d. where to/from/*near.
e. *which (book) with/to/from/next to/about/beside.

There are other cases as well (Culicover 1999):


(5) a. no matter (how heavy the load/what the cost/the difficulty)
b. -ever [as in whatever the cost]
c. the comparative correlative (the more he eats the hungrier
he gets)
d. would rather
e. had better
f. infinitival relatives [as in someone with whom to speak;
*someone who to speak with]
g. parasitic gaps
h. Not-topics (not in my car (you won't))
i. Italian loro
j. dative NP in English
k. the possibility of clitic climbing
l. English tags

For any apparently peripheral phenomenon, further research may show that its properties follow from general principles without construction-specific stipulations, or that there may be some
irreducible idiosyncrasy.

Conclusions
Syntactic constructions appear to be ranged on a continuum
from words through idioms through truly idiosyncratic constructions through more general but still specialized constructions to the most general corelike structures and principles of
universal grammar. It is likely that certain peripheral constructions may be related to the core in systematic ways, say, by "relaxing certain conditions of core grammar" (Chomsky 1986, 147). But C/P per se, however valuable heuristically, may not
merit genuinely theoretical status. (Cf. head-driven phrase
structure grammar and construction grammar.)
The implication for learning is that the learner stores current
analyses of novel utterances in the lexicon, with idiosyncratic
and general properties (see lexical acquisition). The learning procedure attempts to construct more general lexical entries
on the basis of positive experience, where common parts of existing lexical entries are retained and differing parts are replaced by
a variable. The resulting lexical entry functions as a schema or
rule that encompasses existing entries and permits construction
of new utterances. In turn, this schema, along with others, may
be further abstracted into a still more general schema by replacing further dimensions of variation with variables (Tomasello
2003), producing in the limit grammatical rules of full generality
where warranted (see also syntax, acquisition of).
Peter W. Culicover
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Chomsky, Noam. 1965. Aspects of the Theory of Syntax. Cambridge, MA:
MIT Press.
———. 1981. Lectures on Government and Binding. Dordrecht, the
Netherlands: Foris.
———. 1986. Knowledge of Language. New York: Praeger.
———. 1993. A minimalist program for linguistic theory. In The View
from Building Twenty, ed. Kenneth Hale and Samuel J. Keyser, 1–52.
Cambridge, MA: MIT Press.

Culicover, Peter W. 1999. Syntactic Nuts. Oxford: Oxford University
Press.
Culicover, Peter W., and Ray Jackendoff. 2005. Simpler Syntax.
Oxford: Oxford University Press.
Ross, John R. 1969. Guess who. In Proceedings of the Fifth Annual
Meeting of CLS, ed. Robert I. Binnick et al., 252–86. Chicago: Chicago Linguistics Society.
Tomasello, Michael. 2003. Constructing a Language. Cambridge: Harvard
University Press.

CORPUS CALLOSUM
Although language processing relies predominantly on left
hemisphere networks, certain functional units are also localized in the right hemisphere. As a result, there is a strong
need for an interaction between the two hemispheres during
most language-related processes. The neuronal basis for this
interaction is provided by a brain structure located between the
two hemispheres, the so-called corpus callosum (CC), which is
the major interhemispheric fiber tract. The more than 200 million
axons forming the CC originate from nearly all cortical regions,
including the language areas, and they primarily link homologue
regions of the hemispheres. The fibers cross the interhemispheric
gap ordered by their cortical origin. Due to functional specialization of the cerebral cortex, this anatomical organization also
establishes a functional topography within the CC. Thus, different subregions of the tract are related to specific functional
networks.
Viewed in a midsagittal section of the brain (see Figure 1),
two subregions of the CC seem to be particularly relevant for language processing: Fibers passing through portions of the anterior
CC connect the language production network situated in the left
inferior frontal cortex (see frontal lobe and broca's area)
with its contralateral homologue, while axons in the posterior
CC interconnect the cortical areas in the temporal lobes (see
also wernicke's area) which are responsible for language
perception.

The functional relevance of the CC in general, and in language processing in particular, was impressively demonstrated by Roger
W. Sperry and Michael S. Gazzaniga in their research on patients
with a complete surgical transection of the CC (Gazzaniga
2000). In an everyday situation, these patients are able to process language in a seemingly appropriate way. However, when
tested with special experimental paradigms, the lack of interhemispheric communication becomes obvious, indicating that
an intact CC is not obligatory for language processing per se but seems necessary for achieving optimal and efficient processing.
The exact role of the callosal axons in interhemispheric interaction is still a matter of debate, however. At least two different
classes of possible callosal functioning can be distinguished. The
CC might be seen either as 1) a channel to exchange information
between the two hemispheres (information transfer function) or
as 2) a mechanism through which one hemisphere exerts inhibitory or excitatory influence on the ongoing processing in the
opposite hemisphere (modulatory function).
Information transfer becomes important whenever one
hemisphere needs to access information that is available only in
the other hemisphere. This might be the sensory input initially
transferred to only one hemisphere (e.g., visual input from the
lateral periphery of the visual field) or the outcome of preceding unilateral processing steps. An instructive example related
to language is the interplay of (right-hemispheric) prosodic
and (left-hemispheric) syntactic information processing during speech comprehension (see speech perception). In an
electroencephalographic (EEG) study, Angela D. Friederici,
D. Yves von Cramon, and Sonja A. Kotz (2007, 135) examined
how patients with lesions in the CC respond to a mismatch
between the syntactic and prosodic structure of a sentence.
While the healthy control subjects and patients with anterior CC
lesions showed a clear difference between their EEG responses
to prosodically correct and incorrect sentences, no such effect
was found in patients with lesions in the posterior CC. Thus,
the destruction of the direct connections between left and right

Figure 1. A midsagittal view of the brain


acquired with magnetic resonance imaging. The
characteristic cross-sectional shape of the corpus callosum (CC) is indicated. CC subregions
connecting frontal (1) and temporal (2) language
networks are marked by hatched areas.

230

Corpus Linguistics
temporal lobes seems to prevent the interhemispheric exchange
required to integrate prosodic and syntactic information.
One often-quoted modulatory role of the CC is the functional
inhibition of the contralateral hemisphere while the ipsilateral
hemisphere is engaged in a task for which it is specialized. The
advantage of such an inhibitory mechanism might be the reduction of interfering influence coming from the opposite hemisphere. A finding recently published by Alexander Thiel and
coworkers (2006) could be interpreted in this vein. The authors
measured the activation of the left and right inferior frontal gyrus
(IFG) in a verb generation task using positron emission tomography (see neuroimaging). In some of the administered trials,
repetitive transcranial-magnetic stimulation, a method to induce
a temporary disruption of ongoing neuronal activity, was simultaneously applied over the left IFG. Besides a reduction of the
activation in the stimulated left IFG, this virtual brain lesion also
induced a relative increase in the response measured in the right,
nonstimulated IFG. Thus, the suppression of the left IFG area
seems to result in a disinhibition of its contralateral homologue.
The studies cited here illustrate that not only the exchange of
information but also the coordination of bihemispheric processing
is supported by transcallosal connections. Furthermore, a recent
functional imaging study has shown that interindividual differences, which can be found in size and micro-architecture of the
CC, have consequences for language processing (Westerhausen
et al. 2006, 80). Here, the degree of activation differences between
left and right inferior frontal language areas (in a word production task) appeared to be directly related to differences in the fiber
architecture of the callosal connection. Whether structural CC
differences between individuals also trigger differences in performance or are even associated with language disorders (as was
hypothesized for dyslexia) has still to be confirmed.
René Westerhausen
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Friederici, Angela D., D. Yves von Cramon, and Sonja A. Kotz. 2007. Role
of the corpus callosum in speech comprehension: Interfacing syntax
and prosody. Neuron 53: 135–45.
Gazzaniga, Michael S. 2000. Cerebral specialization and interhemispheric communication: Does the corpus callosum enable the human condition? Brain 123.7: 1293–1326.
Thiel, Alexander, Birgit Schumacher, Klaus Wienhard, Stefanie Gairing,
Lutz W. Kracht, Rainer Wagner, Walter F. Haupt, and Wolf-Dieter
Heiss. 2006. Direct demonstration of transcallosal disinhibition in
language networks. Journal of Cerebral Blood Flow and Metabolism
26.9: 1122–7.
Westerhausen, René, Frank Kreuder, Sarah Dos Santos Sequeira, Christof
Walter, Wolfgang Woerner, Ralf A. Wittling, Elisabeth Schweiger, and
Werner Wittling. 2006. The association of macro- and microstructure of the corpus callosum and language lateralisation. Brain and
Language 97: 80–90.

CORPUS LINGUISTICS
This term refers to linguistic research that uses corpus data as
the primary object of study. The term, therefore, describes a
methodology rather than a field of linguistics; corpus research
has been carried out in most areas of formal and applied

linguistics, including phonetics, phonology, morphology, syntax, semantics, pragmatics, discourse analysis (linguistic), sociolinguistics, language acquisition,
psycholinguistics, historical linguistics, dialectology, and lexicography.

Corpus Data as an Object of Study


It is appropriate to begin a discussion of corpus linguistics with the
question of whether the language found in corpora is a legitimate
object of study. Corpora, after all, contain performance data.
Noam Chomsky (1957 and elsewhere) and others have argued
that linguists should model competence rather than performance; this has been widely interpreted to mean that the source
of linguistic data should be introspective judgments, rather than
naturally occurring spoken or written text. Additional arguments
commonly put forward against the use of corpus data are 1) that
performance may be affected by factors that are not linguistic in
nature, such as memory limitations and the speaker's state of
mind, degree of tiredness, and so on, and 2) that performance
data include utterances that are judged ungrammatical by native
speakers of the language. In response to the first argument, introspective judgments can also be affected by nonlinguistic factors.
Grammaticality judgments often depend on context: Utterances
may seem unacceptable in isolation but perfectly natural in the
proper context, for example, embedded in discourse within a
corpus. The inability to imagine an appropriate context is clearly
irrelevant to grammaticality, but may also be affected by nonlinguistic factors, such as tiredness. Thus, introspective judgments
may lead to the wrong results (Bresnan 2007). In response to the
second, the number of ungrammatical utterances in a corpus is
usually small (Labov 1969).
In their favor, corpora provide at least two types of information that are not easily available via speaker judgments: frequency
data, which have applications ranging from lexical studies to
research into language impairment; and historical and longitudinal data, which can be used to model language acquisition and
language change.
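The first of these, frequency information, falls out directly once a corpus is machine-readable. A minimal sketch (the tiny "corpus" below is invented for illustration; real corpora would require proper tokenization and normalization):

```python
# Sketch: extracting word-frequency data from a corpus.
# The eleven-token "corpus" is invented; real corpora need tokenization,
# case normalization, and usually lemmatization.
from collections import Counter

corpus = "the book that I read was the book which you read".split()
freq = Counter(corpus)

print(freq["book"])          # 2
print(freq.most_common(3))   # [('the', 2), ('book', 2), ('read', 2)]
```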

Defining Characteristics of Corpora


Corpora themselves vary widely in size, form, and content, and
in fact, almost any collection of data (a single text, a collection
of the works of a single author, speech recorded from a single
individual at a specific time) could be considered a corpus.
But modern corpora are usually assumed to have the following
characteristics:
1. They are representative samples of the language under
investigation and have a finite size. Complete corpora for modern spoken languages are of course impossible to construct,
because the number of utterances is constantly increasing as
the language is used by its speakers. The goal of corpus builders is to collect a sample that provides a good picture of the
possible utterances of the language, including both rare and
common constructions with representative frequencies. Most
corpora contain a broad range of texts from different authors/
speakers and genres. There are, of course, corpora that by their
very nature cannot be representative in this way. These include
corpora of dead languages, where the texts are finite and restricted to those that have survived over time, and corpora of child language during the period of acquisition, which are
intentionally restricted to a specific type of speech and a specific class of speakers.
2. Modern corpora are machine-readable; that is, they exist
as computer files and can be transmitted and manipulated
electronically. This characteristic has two consequences: First,
corpora can be searched quickly and easily; second, they can
be annotated with linguistic and extralinguistic information to
make them more useful.
3. Modern corpora are publicly available and are considered
standard tools for research in particular languages. This characteristic has major implications for language study: Empirical
results can be replicated and verified; studies can more easily
build on one another, since they are working from the same
empirical base; and differing results must be attributed to different methodology or different interpretation, rather than to
different databases. Thus, corpora may have the overall effect
of raising the quality of linguistic research.

Corpora and Their Annotation


One benefit of large corpora can also be a disadvantage: Having
a million words or more of text available is of very little use if
the corpus cannot be easily searched. Even for lexical studies,
the researcher must use concordances and other software to
determine and then collect all of the variant forms and spellings of individual words in the corpus. For syntactic research,
both part-of-speech and structural annotation are necessary.
Consider the standard problem of retrieving relative clauses in
a corpus of modern standard English. Relative clauses may be
introduced by that, which, or who/whom, or even by nothing
at all:
(1) a. the book that I read
    b. the book which I read
    c. the book I read

Searching lexically for that and which/who/whom will not only miss clauses like (1c) but will also find the examples in (2), which
are not relative clauses at all:
(2) a. I like that book.
    b. Which one do you want?

These problems are termed low recall (missing wanted data like [1c]) and low precision (getting unwanted data like [2a–b]) by Ann Taylor (2007). They can be solved by annotating corpora
for part of speech and (abstract) syntactic structure, and by
using a search engine designed for corpus annotations.
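The contrast between lexical and annotation-based search can be sketched in a few lines of code. This is a toy illustration only: the sentences come from the examples above, but the tag labels and search functions are invented for the demonstration, not taken from any actual corpus tool:

```python
# Toy POS-annotated "corpus": each sentence paired with (word, tag) pairs.
# The tag labels (WDT-REL, ZERO-REL, DET, WH-Q) are invented for this sketch.
tagged = [
    ("the book that I read",  [("that", "WDT-REL")]),   # relative clause
    ("the book which I read", [("which", "WDT-REL")]),  # relative clause
    ("the book I read",       [("", "ZERO-REL")]),      # zero relativizer
    ("I like that book",      [("that", "DET")]),       # not a relative clause
    ("Which one do you want", [("Which", "WH-Q")]),     # not a relative clause
]

def lexical_search(corpus):
    """Naive search: any sentence containing that/which/who/whom."""
    keys = {"that", "which", "who", "whom"}
    return [s for s, _ in corpus if keys & set(s.lower().split())]

def annotation_search(corpus):
    """Annotation-based search: sentences tagged as containing a relative clause."""
    return [s for s, tags in corpus
            if any(tag.endswith("REL") for _, tag in tags)]

# Lexical search misses "the book I read" (low recall) and returns the two
# non-relatives (low precision); annotation search has neither problem.
print(lexical_search(tagged))
print(annotation_search(tagged))
```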
The construction and use of publicly available corpora have
revolutionized the way that empirical linguistic research is
conducted. Rather than spending most of their time collecting data, linguists can now concentrate on asking questions,
retrieving the relevant data quickly and easily from corpora,
and constructing analyses. Large searchable corpora are now
publicly available for many different languages, both written
and spoken, historical and contemporary, in many different
styles and registers.
Susan Pintzuk


WORKS CITED AND SUGGESTIONS FOR FURTHER READING


Bresnan, Joan. 2007. A few lessons from typology. Linguistic Typology
11.1. Available online at: http://www.stanford.edu/bresnan.
Chomsky, Noam. 1957. Syntactic Structures. The Hague, Paris: Mouton.
Labov, William. 1969. The logic of non-standard English. Georgetown
Monographs on Language and Linguistics 22.
McEnery, Tony, and Andrew Wilson. 1997. Corpus Linguistics.
Edinburgh: Edinburgh University Press. Contains a full discussion of
many of the topics introduced in this entry.
Sampson, Geoffrey, and Diana McCarthy. 2004. Corpus
Linguistics: Readings in a Widening Discipline. London and New
York: Continuum. A variety of corpus studies.
Taylor, Ann. 2007. The York-Toronto-Helsinki parsed corpus of Old
English prose. In Using Unconventional Digital Language Corpora. Vol.
2. Ed. J. C. Beal, K. Corrigan, and H. Moisl. Basingstoke, UK: Palgrave Macmillan. A detailed description of a historical English corpus, its
morphosyntactic annotation scheme, and its search engine.

CREATIVITY IN LANGUAGE USE


Linguists speak of creativity in two senses. One is what Noam
Chomsky called the generativity of language. The other is the
imaginative use of language in novel ways.
Chomsky described language as a system wherein a finite
set of formal rules (plus vocabulary) can generate an indefinite
number of hierarchical structures or sentences. Many of
these will be first-time new. This, said Chomsky, implies that a
(behaviorist) theory based on surface probabilities cannot predict language use, nor explain the nested dependencies within
sentences.
The imaginative use of language wasn't stressed by Chomsky.
In calling himself a Cartesian linguist, he noted that predecessors such as Wilhelm von Humboldt had also stressed the creativity of language. However, Humboldt was referring not to the formal generativity of syntax but to the linguistic expression of novel thoughts (Boden 2006, 9.iv.f–g). This includes straightforward sentences/phrases conveying new facts or ideas and also
imaginative uses such as metaphor, analogy, and poetic
metaphor, or imagery.
Cognitive science has defined various creative information-processing mechanisms underlying those imaginative uses.
The basic principles of mental association are implemented
by connectionism. This models the fluidity of concepts
(Hofstadter and Mitchell 1993), and (in parallel distributed
processing [PDP] networks) their definition by Wittgensteinian
family-resemblances, rather than necessary and sufficient
rules. They illuminate, for instance, how Coleridge could produce the imagery in The Ancient Mariner (Boden 2004, 125–46).
conceptual blending theory (Fauconnier and Turner
2002) outlines how various metaphors and analogies could arise.
Classical artificial intelligence (AI) has identified some hierarchical conceptual structures in long-term memory, including general and culture-specific assumptions about human motivation (Boden 2004, 170–92). And it has modeled the generation of jokes
(see verbal humor) of the form "What do you get if you cross an x with a y?" (Binsted and Ritchie 1997). Compare: Q. What do you get if you cross a sheep with a kangaroo? A. A woolly jumper. (This joke doesn't work in American English, wherein jumpers are called sweaters.)
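Such riddles lend themselves to a simple template-filling sketch. The following toy code is my own illustration, far simpler than Binsted and Ritchie's actual system; the attribute lexicon is invented:

```python
# Toy sketch of template-based riddle generation, loosely in the spirit of
# (but much simpler than) Binsted and Ritchie's punning-riddle work.
# The attribute lexicon is invented for this illustration.
attributes = {
    "sheep": "woolly",     # sheep are associated with wool
    "kangaroo": "jumper",  # kangaroos jump; "jumper" also means sweater (UK)
}

def riddle(x, y):
    """Fill the 'cross an X with a Y' template from the lexicon."""
    return (f"Q. What do you get if you cross a {x} with a {y}? "
            f"A. A {attributes[x]} {attributes[y]}.")

print(riddle("sheep", "kangaroo"))
# Q. What do you get if you cross a sheep with a kangaroo? A. A woolly jumper.
```

The humor, of course, depends on the filled template having a second, lexicalized reading, which is exactly the kind of constraint the AI models encode.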

As that example illustrates, the tacit knowledge of language
(syntax, semantics, phonetics, morphology, categorization, dialect, orthography) and also of the world
that is needed to use language imaginatively is richly detailed
and widely various. Both hearer and speaker must possess this
knowledge if the creative usage is to be understood.
Margaret A. Boden
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Binsted, Kim, and G. D. Ritchie. 1997. Computational rules for punning
riddles. Humor: International Journal of Humor Research 10: 25–76.
Boden, M. A. 2004. The Creative Mind: Myths and Mechanisms. 2d ed.
London: Routledge.
Boden, M. A. 2006. Mind as Machine: A History of Cognitive Science.
Oxford: Oxford University Press.
Fauconnier, G. R., and Mark Turner. 2002. The Way We Think: Conceptual
Blending and the Mind's Hidden Complexities. New York: Basic Books.
Hofstadter, D. R., and Melanie Mitchell. 1993. The copycat project: A model of mental fluidity and analogy-making. In Advances
in Connectionist and Neural Computation Theory. Vol. 2: Analogical
Connections. Ed. Keith Holyoak and John Barnden, 31–112. Norwood,
NJ: Ablex.
Humboldt, Wilhelm von. [1836] 1988. On Language: The Diversity
of Human Language-Structure and Its Influence on the Mental
Development of Mankind. Trans. Peter Heath. Cambridge: Cambridge
University Press.

CREOLES
SocioHistorical, Terminological, and Epistemological
Background
The concept Creole has not been operationalized with rigorous
and reliable criteria in linguistic theory. At best, it is a sociohistorically and politically motivated concept, often misidentified
as linguistic (DeGraff 2005b, 2009; Mufwene 2008).
Etymologically, the word Creole derives from the Portuguese crioulo and/or Spanish criollo "raised in the home" (from criar "to raise, to breed"). In Caribbean history, the labeling of biological species, including humans, as Creole seems to have
preceded the labeling as Creole of certain speech varieties.
Both uses referred to nonindigenous varieties that developed
locally, in contrast to their counterparts from Europe and Africa.
The original uses of the word were thus devoid of any specific
structural correlates (Mufwene 2001, 3–11; Chaudenson and Mufwene 2001, Chap. 1).
In keeping with this original usage and to avoid circularity and the sort of controversial linguistic assumptions that
are noted in Mufwene 2008, I here ostensively use Creole as a
label for certain speech varieties that became emblematic of the
newly created communities, the Creole communities, on and around colonial Caribbean plantations. These are the classic
Creole languages.
Caribbean Creole languages developed mostly among
Europeans and Africans via language acquisition by adults and
children in a complex mix of language-contact settings. The
complex sociohistorical factors therein included a continuum
of social divides and power asymmetries (Chaudenson and
Mufwene 2001). One end of this continuum was marked by

drastic opposition and inequality between the dominant and dominated groups: speakers of the European superstrate and the African substrate languages, respectively. At the opposite
end, the superstrate and substrate speakers had relatively intimate interactions, especially during the settlement period when
substrate speakers were outnumbered by, and in relatively close
contact and interdependence with, superstrate speakers and
then, throughout the colonial period, among and around the
groups that played an intermediate buffer role, race- and class-wise. These continua would entail, throughout colonial history,
corresponding continua of second-language (L2) learner varieties of the superstrate language. These non-native varieties,
alongside native varieties, of the superstrate language would
in turn become the target for increasingly numerous cohorts of
native Creole speakers (DeGraff 2002, 37494; 2005b, 2009).
My working assumption is uniformitarian: Normal processes
of first- and second-language acquisition (L2A) and use
have underlain the formation of Creoles as they have the formation of non-Creoles. The sociohistorical evidence, as documented
by (e.g.) Salikoko Mufwene (2008), suggests that Caribbean
Creoles were not seeded by any sort of structureless pidgins
(i.e., these Creoles were not created with input from early Pidgins
allegedly spoken by the parents of the first generation of Creole
speakers). Such early Pidgins as the immediate predecessors of
Caribbean Creoles have never been documented, and neither
does the contemporary structural evidence support the postulation of such Pidgins as the primary ancestors of Caribbean
Creoles (see the following).

Creole Exceptionalism
The term Creole exceptionalism (DeGraff 2003) covers a subset of long-standing hypotheses whereby Creole languages
constitute a sui generis class on phylogenetic and/or structural
grounds. Here is a sample:
(i) Creoles are degenerate offshoots of their European ancestors;
(ii) Creoles are special hybrids with exceptional genealogy;
(iii) Creoles are the only contemporary languages with a history of abnormal transmission that deprives them of any structurally full-fledged ancestors;
(iv) The Pidgin-to-Creole transition recapitulates the transition from pre-human protolanguage to human language.
(For a fuller development of these arguments, see DeGraff 2005a,
2009.)

CREOLES AS DEGENERATE OFFSHOOTS? It's only in the latter part of


the twentieth century that linguists started refuting the received
wisdom that Creoles are structurally impoverished variants of
their European norms. In Julien Vinson's scientific dictionary (1889, 345–6), Creole languages "result from the adaptation of a language, especially some Indo-European language, to the (so to speak) phonetic and grammatical genius of a race that is linguistically inferior. The resulting language is composite, truly mixed in its vocabulary, but its grammar remains essentially Indo-European, albeit extremely simplified." For Leonard
Bloomfield (1933, 472), "The creolized language has the status of an inferior dialect of the masters' speech."
Even in the latter half of the twentieth century, certain linguists claimed that structural linguistic factors, related to (e.g.)
morphological simplicity and a vocabulary [that] is extremely
poor, are among the greatest obstacles to the blossoming of
Creoles (Valdman 1978, 345; cf. Whinnom 1971, 110; Samarin
1980, 221; Seuren and Wekker 1986; and Quint 1997, 58). Pieter
Seuren (1998, 292) has elevated the alleged extraordinary simplicity of Creole languages to a historical universal.
There is no reliable empirical or theoretical basis for the claim
that Creole languages are uniformly less complex than their
European ancestors. For example, certain aspects of my native
Haitian Creole (HC) signal an increase in complexity to the extent
that these properties of HC have no counterpart in French, HC's
European ancestor (DeGraff 2001b, 284). Furthermore, HC, like
any other language, expands its vocabulary as needed, via productive affixation, neologisms, borrowings, and so on (DeGraff
2001a; Fattier 1998).
CREOLES AS SPECIAL HYBRIDS? Lucien Adam's (1883) hybridologie linguistique hypothesis posited different linguistic templates for different races. The latter belong to distinct evolutionary rungs, with their respective linguistic templates ranked
in a corresponding hierarchy of complexity. Upon language
contact, these templates will cross-fertilize (i.e., hybridize),
and the most primitive grammar (in this scenario, the grammar of the lower race of speakers, i.e., the non-European
speakers) imposes an upper bound of complexity on the hybrid
grammar. In such a scenario, the European contribution to the
hybridization of European and non-European languages is limited to superficial traits, such as the phonetic shapes of words: only these shapes, and not the complex grammars of European
languages, can be acquired by the allegedly inferior minds of
the non-Europeans.
Claire Lefebvre's (1998) relexification hypothesis is far removed from Adam's race-theoretical postulates. For Lefebvre,
it is because the Africans in Haiti had very limited access to
French that they were virtually unable to learn any aspect of
French grammar. Thus, they could only overlay French-derived
phonetic strings on their native substrate grammars, with the latter being kept nearly intact in the original Creole languages.
Consider again HC. Current results from L2A research predict that HC structure would have indeed evolved under some
influence from the substrate languages. L2A research also documents that adult learners at every stage acquire more than phonetic strings from their target. Unsurprisingly, HC instantiates,
alongside substrate-influenced patterns, a wide range of superstrate-derived properties that apparently have no analogues in
the substrate languages (DeGraff 2002).
Adam's and Lefebvre's proposals share one non-uniformitarian assumption, namely, that Creole creators, unlike L2 learners elsewhere, were unable to learn anything abstract about
their target language. Yet the lexicon and morphology of HC
demonstrate that Creole creators were able to segment and
parse target speech (here, French), including affixes. Such segmentation and parsing contradict the claim that the creators of
HC could not access any abstract property of French grammar.


Segmentation and parsing of target speech necessarily tap into substantial aspects of target grammar.
CREOLIZATION AS ABNORMAL/BROKEN TRANSMISSION AND
CREOLES AS LIVING FOSSILS? In keeping with the postulated
congruence in nineteenth-century philology between the evolution of races and that of languages, Alfred de Saint-Quentin
([1872] 1989, 40) considered it a property of emerging languages to be naive and claimed Guyanais Creole as "a spontaneous product of the human mind, freed from any kind of intellectual culture." Similarly, Isle de France Creole was considered "an infantile language for an infantile race" (Reinecke 1980, 11).
In twentieth-century linguistics, the abnormal/broken transmission doctrine excludes Creoles from the scope of the comparative method and turns them into new linguistic phyla
without ancestry (Thomason and Kaufman 1988).
This doctrine seems related to another "myth of origins," as writers in cultural studies (see deconstruction and critical discourse analysis) might put it: that of Creoles as contemporary (quasi-)replicas of human language at its evolutionary incipience (Bickerton 1990, 171, Chap. 5; 1998, 354;
Bickerton and Calvin 2000, 149). In Derek Bickertons scenario,
the hypothetical Pidgin-to-Creole cycle recapitulates the evolution of Homo erectus's protolanguage into the most primitive instantiations of Homo sapiens's language: "What happened [in the formation of Hawaiian Creole] was a jump from protolanguage to language in a single generation" (Bickerton 1990, 171).
In this scenario, one sui generis process that allegedly disrupts normal language transmission and leads to catastrophic
language genesis is some form of radical pidginization. The latter is claimed to obliterate virtually all stable structural patterns,
including morphology (Bickerton 1999, 69, n. 16), and to lead to a
structureless early pidgin. Such a Pidgin is putatively unlike any
full-fledged human language and more like the hypothetical protolanguage of Homo erectus, our prehistoric hominid ancestors
(Bickerton 1990, 169, 181; 1998, 354; Bickerton and Calvin 2000,
149). This early Pidgin, by definition, is non-native, unstable, and
used as an emergency lingua franca across languages. This early
Pidgin is argued to abruptly seed the Creole when the former
becomes the acquisition target for the first generation of locally
born children (see Bickerton 1999, 49) in a way similar to how
Homo erectus protolanguage seeded the early forms of human
language as spoken by the first cohorts of Homo sapiens.
How could the documented pidgins of modern humans and
the hypothetical protolanguage of Homo erectus evince any
enlightening similarity? How could the hypothetical Pidgin-to-Creole transition in modern history resemble the evolution in
prehistory from Homo erectus's structureless protolanguage to Homo sapiens's full-fledged human language? If the transition
from Homo-erectus protolanguage to Homo-sapiens human
language is a reflex of brain reorganization via natural selection
in the course of human evolution, then Bickerton's hypothetical
Pidgin-to-Creole cycle has nothing to say about such brain reorganization and its linguistic structural consequences. Indeed,
Pidgins, under any definition, reflect mental properties of Homo
sapiens. Acquisition data suggest that learners at every age and
stage, including Pidgin speakers, have access to the same faculty
of language as any other human being (Mufwene 2008, ch. 5).

The "broken transmission" and "linguistic fossils" doctrines
are further undermined by a vast range of comparative data and
empirical and theoretical observations. As mentioned earlier,
there is ample evidence for systematic lexical and morpho-syntactic correspondences between radical Creoles and their European
lexifiers from the onset of Creole formation onward (Fattier 1998;
DeGraff 2001a, 2005b, 2009; Mufwene 2008). There is also ample
evidence for transfer from the African substrate languages into
Creole grammars. This is as expected given the aforementioned
facts of Caribbean history and the results from L2A research. The
sort of structureless pidgin that is an essential ingredient in the
traditional Pidgin-to-Creole scenario renders mysterious any systematic set of structural correspondences between Creoles and
their ancestor languages. Besides, the magnitude of structural
gaps in the history of non-Creole languages seems comparable
to, and sometimes even greater than, that of their counterparts
in Creole diachrony (DeGraff 2005b, 2009), pace Thomason and Kaufman (1988, 8–12, 206) and Thomason (2002, 105).
If the rigorous criteria of the Comparative Method [CM] include "the establishment of recurring phonological correspondences in morphemes of identical or similar meanings, including much basic vocabulary … the establishment of systematic morphosyntactic correspondences" (Thomason 2002, 103), then the
available evidence puts Caribbean Creoles squarely in the scope
of the CM (DeGraff 2005b, 2009; Mufwene 2008, pace Thomason).
Such evidence militates against the postulation, in Creole formation, of an exceptional and abnormal break in transmission
with subsequent creation of all new linguistic structure from the
hypothetical scraps of a Pidgin.

The End of Creole Exceptionalism?


Creolization differs from language change on sociohistorical and political, not linguistic, grounds. For example, conquered
peoples involved in forming Caribbean Creoles may have spoken more languages than their counterparts in the formation
of, say, the Romance languages. Furthermore, oppression in
the Caribbean was correlated with race. Caribbean Creoles and
Romance languages thus evolved in distinct ecologies, with
Caribbean vernaculars ending up disfranchised for sociohistorical reasons. "Creolization is a social, not a structural, process" (Mufwene 2001, 138). The individual speakers engaged in
language contact, whether in the genesis of Creole or Romance
languages, would have made use of "the same [mental] process adopted [for the] formation of [their respective new] language" (Greenfield 1830, 51f.). If so, Creole grammars do not, and could
not, form a typological class that is aprioristically and fundamentally distinguishable from non-Creole grammars (DeGraff 2005b,
2009; Mufwene 2008).
Michel DeGraff
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Adam, Lucien. 1883. Les idiomes négro-aryen et maléo-aryen: Essai d'hybridologie linguistique. Paris: Maisonneuve et cie.
Bickerton, Derek. 1990. Language and Species. Chicago: University of
Chicago Press.
Bickerton, Derek. 1998. Catastrophic evolution: The case for a single
step from protolanguage to full human language. In Approaches to the

Evolution of Language, ed. J. Hurford, M. Studdert-Kennedy, and C.


Knight, 341–58. Cambridge: Cambridge University Press.
Bickerton, Derek. 1999. How to acquire language without positive evidence. In Language Creation and Language Change, ed. Michel DeGraff, 49–74.
Cambridge, MA: MIT Press.
Bickerton, Derek, and William Calvin. 2000. Lingua Ex
Machina: Reconciling Darwin and Chomsky with the Human Brain.
Cambridge, MA: MIT Press.
Bloomfield, Leonard. 1933. Language. New York: H. Holt and Co.
Chaudenson, Robert, and Salikoko Mufwene. 2001. Creolization of
Language and Culture. London: Routledge.
DeGraff, Michel. 2001a. Morphology in Creole genesis: Linguistics and
ideology. In Ken Hale: A Life in Language, ed. M. Kenstowicz, 53121.
Cambridge, MA: MIT Press.
DeGraff, Michel. 2001b. On the origin of Creoles: A Cartesian critique of Neo-Darwinian linguistics. Linguistic Typology 5.2/3: 213–310.
DeGraff, Michel. 2002. Relexification: A reevaluation. Anthropological Linguistics 44.4: 321–414.
DeGraff, Michel. 2003. Against Creole exceptionalism. Language 79: 391–410.
DeGraff, Michel. 2004. Against Creole exceptionalism (redux). Language 80: 834–9.
DeGraff, Michel. 2005a. Linguists' most dangerous myth: The fallacy of Creole exceptionalism. Language in Society 34: 533–91.
DeGraff, Michel. 2005b. Morphology and word order in creolization and beyond. In Handbook of Comparative Syntax, ed. G. Cinque and R. Kayne, 249–312. New York: Oxford University Press.
DeGraff, Michel. 2009. Language acquisition in creolization and, thus, language change. Language and Linguistics Compass 3: 888–971.
Fattier, Dominique. 1998. Contribution à l'étude de la genèse d'un créole: L'Atlas linguistique d'Haïti, cartes et commentaires. Ph.D. diss., Université de Provence. Distributed by Presses Universitaires du Septentrion, Villeneuve d'Ascq, France.
Greenfield, William. 1830. A Defence of the Surinam Negro-English
Version of the New Testament. London: Samuel Bagster.
Lefebvre, Claire. 1998. Creole Genesis and the Acquisition of
Grammar: The Case of Haitian Creole. Cambridge: Cambridge
University Press.
Mufwene, Salikoko. 2001. The Ecology of Language Evolution.
Cambridge: Cambridge University Press.
Mufwene, Salikoko. 2008. Language Evolution. London: Continuum.
Quint, Nicolas. 1997. Les îles du Cap-Vert aujourd'hui: Perdues dans l'immensité. Paris: L'Harmattan.
Reinecke, John. 1980. William Greenfield, a neglected pioneer creolist.
In Studies in Caribbean Language, ed. L. Carrington, 1–12. Saint-Augustine, Trinidad: Society for Caribbean Linguistics.
Saint-Quentin, Alfred de. [1872] 1989. Introduction à l'histoire de Cayenne, with Étude sur la grammaire créole by Auguste de Saint-Quentin. Antibes: J. Marchand. 1989 edition: Cayenne: Comité de la culture, de l'éducation et de l'environnement, Région Guyane.
Samarin, William. 1980. Standardization and instrumentalization of
Creole languages. In Theoretical Orientations in Creole Studies, ed.
A. Valdman and A. Highfield, 213–36. New York: Academic Press.
Seuren, Pieter. 1998. Western Linguistics: An Historical Introduction.
Oxford: Blackwell.
Seuren, Pieter, and Herman Wekker. 1986. Semantic transparency as a factor in Creole genesis. In Substrata Versus Universals in Creole Genesis, ed. P. Muysken and N. Smith, 57–70. Amsterdam: Benjamins.
Thomason, Sarah. 2002. Creoles and genetic relationship. Journal of
Pidgin and Creole Languages 17: 101–9.
Thomason, Sarah, and Terrence Kaufman. 1988. Language
Contact, Creolization, and Genetic Linguistics. Berkeley and Los
Angeles: University of California Press.



Valdman, Albert. 1978. Le créole: Structure, statut et origine. Paris: Éditions
Klincksieck.
Vinson, Julien. 1889. Créoles. In Dictionnaire des sciences anthropologiques, ed. A. Bertillon, 345–7. Paris: Doin.
Whinnom, Keith. 1971. Linguistic hybridization and the special case of
pidgins and creoles. In Pidginization and Creolization of Languages,
ed. D. Hymes, 91–115. Cambridge: Cambridge University Press.

CRITICAL DISCOURSE ANALYSIS


This term has been used since the 1990s by a group of academics
initially in the United Kingdom, but also increasingly in the rest
of Europe, Australia, South America, and more recently in Asia.
The various practitioners of critical discourse analysis (CDA)
who would associate themselves with this label have in common
some concept of what it means to be critical, various notions of
discourse influenced strongly by sociology and social theory, and
a range of descriptive methods borrowed from various linguistic
theories. Broadly speaking, what all CDA practitioners are concerned with is the way language is integrated with society, but
unlike most sociolinguists, they espouse an overtly ethical or
political stance in engaging with this question. While its goal is to
increase understanding of the relationship between society and
language, CDA does not in general contribute to the description
or theorization of human language systems.
The term critical was first used to characterize an approach
to language study that was dubbed critical linguistics by Roger
Fowler and his colleagues (1979) and by Gunther Kress and
Robert Hodge ([1979] 1993). These scholars took some inspiration
from George Orwell's informal critique of the use of language in political life and his dystopian fantasy of Newspeak in the novel
Nineteen Eighty-Four. But they also acknowledged intellectual
debts to Valentin Voloshinov (see dialogism and heteroglossia) and to Frankfurt School critical theory, especially the
work of Jürgen Habermas (ideal speech situation). The initial impetus of critical linguistics was, to some extent, grounded
in the Enlightenment philosophical notion of critique. However,
in many respects, the work produced by critical linguistics and
its successor CDA has been colored by, even tainted by, the
everyday sense in which one speaks negatively of criticizing a
person or group of persons. Among other ideas, critical linguists
held that the use of language could lead to mystification, which
analysis could elucidate. For example, a missing by-phrase in
English passive constructions might be seen as an ideological
means for concealing or mystifying reference to an agent. The
same is claimed for nominalizations such as destruction or arrest,
which have neither tense nor aspect and can also appear without
a by-phrase specifying an agent. Conspicuous clustering of synonymous or near-synonymous lexical items around a particular topic, or overlexicalization, is felt to indicate some problematic social process or institution. Analysis of the referents associated with different kinds of participant roles in clauses
(e.g., actor, goal, beneficiary) is regarded as a way of detecting
patterns in the way social relations, especially power relations,
are represented. The most significant principle of critical linguistics, carried over into CDA, is the important observation that use
of language is a social practice, that is, a form of action constitutive of and constituted by social processes and structures.
The label critical linguistics has given way to critical discourse analysis, as the field developed to include wider areas
of social concern and more social theory. In particular, in addition to the writings of Marx and Marxian writers (see marxism
and language ), CDA practitioners have often made appeal
to Antonio Gramsci, Michel Foucault, and the British sociologist Anthony Giddens. The label CDA became associated with
the work of, in particular, Norman Fairclough (e.g. 1989, 1992),
Teun van Dijk (e.g. 1993, 1998, 2005) and Ruth Wodak (1996;
Wodak et al. 1999). Fairclough's work, mainly cast within a neo-Marxian mold, has developed detailed concepts and models of
discourse, intertextuality, and genre, while leaving the
linguistic dimension of discourse comparatively undeveloped.
His work can be characterized as social theory that gives full
recognition to the constitutive role of language in society, in the
form of interlinked discursive practices. Van Dijk's work is
rooted in formal discourse analysis of the 1970s with a cognitive
tendency, and has sought to provide deeper discourse-based
understanding of major pragmatic notions such as context,
as well as of the social-theoretic notion of ideology. Wodak's
work has developed the discourse historical method (Wodak
and Meyer 2001), a methodology for empirical investigation
that advocates the study of intersecting texts and samples of
talk collected from various milieus and representing various
genres, for which analysis of the historical context is regarded
as crucial. Numerous other scholars of the same period, whose
accomplishments it is not possible to describe here, produced
work that was overtly or implicitly critical, particularly in
France, Germany, the Netherlands and Belgium, the Hispanic
world, and Australia.
The CDA literature has introduced a number of key concepts and claims. The principle that discourse is constitutive of
social processes and structures has been extended to include
other semiotic systems, notably pictorial ones (Kress and van
Leeuwen 2001). Discourses, in the plural, are understood as relatively stable uses of language serving the structuring of social
life, organizations, and political systems. Such discourses consist of interlinked discursive practices or genres. Discourses
may be of various kinds and operate in different ways. Thus,
political ideologies, scientific worldviews, ethical systems, and
the like are said to represent the world in particular ways. Genres
regulate interaction and thus control social behavior, examples
being interviews, news broadcasts, medical consultations, educational examinations, and so forth. When these discursive
practices are viewed as interlinked, they involve intertextuality and interdiscursivity, leading to the colonization of one
discourse by another and to hybrid genres, a process regarded
as integral to and an index of social change. Discursive practices viewed as a complex network are referred to as an order of
discourse (Fairclough 1992). A crucial claim of CDA is that the
details of particular instances of text and talk are related in complex ways to these structures, structures regarded as embodying
power.
CDA has not produced a theory of language that explains how the meanings produced by utterances have the tight connection to social structures that is often claimed. There are nonetheless
numerous examples of the description, analysis, and interpretation of utterances in the CDA literature, and these are dependent
on existing linguistic frameworks. In critical linguistics, early
transformational and generative grammar was used as
a framework for explaining the supposedly ideological effects of
the language forms found in texts. One questionable claim was
that a linguistic system that is, the syntax, lexicon, and if not
the phonology then the writing system of a language could be
inherently ideological (Kress and Hodge [1979] 1993), an idea
revived in equally questionable fashion by Robert Hodge and
Kam Louie (1998) in their discussion of Chinese. A more lasting
influence has been M. A. K. Halliday's systemic-functional grammar,
which CDA has drawn on for its classification of clause types, its
model of modality and theory of register. This framework
has the advantage of being formulated within a social-semiotic
perspective. However, it has the disadvantage that it is inadequate for the analysis of certain textual phenomena that CDA
writers have wanted to talk about, particularly such theoretically
difficult areas as implied meanings and metaphor. The reason for
this is that systemic-functional grammar has an encoding model
of linguistic meaning and fails, as do certain kinds of CDA work,
to take account of the fact that language understanding depends
on nonlinguistic knowledge. By contrast, alongside these theoretical frameworks, cognitive approaches have increasingly
provided theoretical resources, as is shown by the work of van
Dijk (e.g., 1998, 2005), who has drawn on models from cognitive psychology, and Paul Chilton (1996, 2005), who has drawn
on conceptual metaphor theory and blending theory (see
conceptual blending).
CDA has itself been the focus of principled criticism for at
least three reasons. One is its claim to be socially committed and
objective (Widdowson 2005). Another criticism has come from
within CDA itself on the basis of the view that CDA has been
too negative and should examine, even advocate, positive
analysis of discourse that it approves of (Martin 2004). Kieran
O'Halloran (2003) has criticized CDA's inadequate notion of
cognition, in particular its symbolicism, that is, the assumption that mental processes correspond to and can be influenced by the manipulation of symbols. Chilton (2005) criticizes CDA's
failure to engage with developments in cognitive linguistics
and evolutionary psychology, arguing also that a critical faculty
may be universal and that CDA may exaggerate the power of
discourse. The fact remains, however, that the notion of critical analysis of discourse is increasingly influential throughout the world.
In general, it can be argued that, taken as a whole, CDA writing provides something more like a social theory than a linguistic
theory, although this is not to overlook many examples of perspicacious analysis of individual utterances in their sociopolitical
contexts, for example, analyses of leadership speeches, various
media genres, teacher-pupil exchanges, and the like. This case-by-case approach may in itself be a disadvantage, since CDA is
premised on claims about entire discourse networks, a problem addressed by Wodaks proposals for an empirical method
based on the analysis of linked instances of different genres
within organizational structures (Wodak et al. 1999; Wodak and
Meyer 2001). The most general characteristic of CDA has been
not its linguistics but its ethically or politically committed stance.
Indeed, a number of CDA practitioners make an emancipatory
mission the core element of their work, and this is evident in the
special focus in many CDA works on issues such as racism, militarism, media bias, marketization, and gender.
This survey necessarily neglects much work that has a critical
stance without subscribing to the CDA label. Such work would
include the writing on language and gender that has arguably had
an impact on modern social behavior (for an overview see Eckert
and McConnell-Ginet 2003). It would also include work on society and on controversial political matters in North America (G.
Lakoff 1996; R. Lakoff 1990; Lemke 1995). Further, it should also
include the emerging scholarship on discourse in its sociopolitical context in China (cf. Gu 2001) and on a smaller scale in the
Middle East and Africa. It may be the case that something like the
CDA approach emerges in periods of significant socioeconomic
or political change: In some respects, CDA may be considered to
have the character of a social movement.
Paul Chilton
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Bourdieu, Pierre. 1982. Ce que parler veut dire: L'économie des échanges
linguistiques. Paris: Fayard.
Chilton, Paul. 1996. Security Metaphors: Cold War Discourse from
Containment to Common House. New York: Peter Lang.
———. 2004. Analysing Political Discourse: Theory and Practice.
London: Routledge.
———. 2005. Missing links in mainstream CDA: Modules, blends and the critical instinct. In A New Agenda in (Critical) Discourse
Analysis: Theory and Interdisciplinarity, ed. Ruth Wodak and Paul
Chilton, 19–50. Amsterdam: Benjamins.
Eckert, Penelope, and Sally McConnell-Ginet. 2003. Language and
Gender. Cambridge: Cambridge University Press.
Fairclough, Norman. 1989. Language and Power. London: Longman.
———. 1992. Discourse and Social Change. Cambridge, UK: Polity.
Fowler, Roger, Gunther Kress, Robert Hodge, and Tony Trew, eds. 1979.
Language and Control. London: Routledge.
Goatly, A. 2007. Washing the Brain: Metaphor and Hidden Ideology.
Amsterdam: Benjamins.
Gu, Yueguo. 2001. The changing orders of discourse in a changing China. In Studies in Chinese Linguistics, vol. 2, ed. Haihua Pan, 31–58.
Hong Kong: Linguistic Society of Hong Kong.
Hodge, Robert, and Kam Louie. 1998. The Politics of Chinese Language
and Culture. London and New York: Routledge.
Kress, Gunther, and Robert Hodge. [1979] 1993. Language as Ideology.
London: Routledge.
Kress, Gunther, and Theo van Leeuwen. 2001. Multimodal Discourse: The
Modes and Media of Contemporary Communication. London: Arnold.
Lakoff, George. 1996. Moral Politics: What Conservatives Know That Liberals Don't. Chicago: University of Chicago Press.
Lakoff, Robin. 1990. Talking Power: The Politics of Language in Our Lives.
New York: Basic Books.
Lemke, Jay. 1995. Textual Politics: Discourse and Social Dynamics.
London: Taylor & Francis.
Martin, Jim R. 2004. Positive discourse analysis: Power, solidarity and change. Revista Canaria de Estudios Ingleses 49: 179–200.
O'Halloran, Kieran. 2003. Critical Discourse Analysis and Language
Cognition. Edinburgh: Edinburgh University Press.
Saussure, Louis de, and Peter Schulz, eds. 2005. Manipulation and
Ideologies in the Twentieth Century: Discourse, Language, Mind.
Amsterdam: Benjamins.
Van Dijk, Teun. 1993. Elite Discourse and Racism. Newbury Park,
CA: Sage.
———. 1998. Ideology. London: Sage.
———. 2005. Contextual knowledge management in discourse production: A CDA perspective. In A New Agenda in (Critical) Discourse
Analysis: Theory and Interdisciplinarity, ed. Ruth Wodak and Paul
Chilton, 71–100. Amsterdam: Benjamins.
Widdowson, H. G. 2005. Text, Context, Pretext: Critical Issues in Discourse
Analysis. Oxford: Blackwell.
Wodak, Ruth. 1996. Disorders of Discourse. London: Longman.
Wodak, Ruth, and Michael Meyer. 2001. Methods of Discourse Analysis.
London: Sage.
Wodak, Ruth, Rudolf de Cillia, Martin Reisigl, and Karen Liebhart. 1999.
The Discursive Construction of National Identity. Edinburgh: Edinburgh
University Press.
CRITICAL PERIODS
The idea of a critical period for language acquisition is one of the
most debated issues in language acquisition theory. One reason
for controversy in the empirically based discussion of the critical
period hypothesis (CPH) has been the different understandings
of what a critical period actually means and what effects it might
have on language. Another more basic philosophical cause for
lack of unanimity is the notion's central symbolic role in the nature–nurture divide and the concomitant ideological preferences among researchers to stress biological or environmental
aspects of language development. It is probably fair to say that
the rhetorical tone that is sometimes noticeable in debates on
the CPH originates to a considerable extent from the various
epistemological commitments of the protagonists, nativist or
constructivist, cognitive or social constructivist, general cognitive or modular, and so on.
The CPH is relevant for all kinds of language learning. That much more of the discussion concerns second language learning (see second language acquisition) than first language learning reflects the fact that first language learning, except in cases of isolation from language input, starts from age zero. This means that data from delayed first language acquisition are rare. By contrast, for second language acquisition (SLA), massive data from language learning at different
phases of the life span are available. Most of this entry, therefore,
addresses second language acquisition rather than first language
acquisition.
The notion of critical period has been used by ethologists to
explain the fact that the development of several aspects of species-specific behavior is dependent on early stimulus exposure
or experiences. A critical period can generally be defined, therefore, as a time span in early life during which the organism is
responsive to those stimuli in the external environment that are
crucial or relevant for a behavior (or capacity) to develop eventually in keeping with a species-specific standard. If the organism
does not encounter or experience the particular stimuli during
the time span for sensitivity, that behavior will either not develop
at all or eventually reach an end state that differs from the species-specific ultimate standard. In addition to maturationally
constrained learning in humans and other species, there are two
other kinds of learning, namely, learning that occurs with equal
success at any time over the lifespan and learning that becomes
effective only at later phases of cognitive development.
Examples of maturationally constrained behaviors range
from imprinting in geese and song learning in songbirds (see
birdsong and human language) to bonding in sheep and
vision in cats and primates. The effects of maturation on minute
details of behavior are continuously being mapped out, not least
in neurobiological research. Adding to the classic results on sensitive periods for human vision obtained by the 1981 Nobel medicine laureates David Hubel and Torsten Wiesel, one example
is the more recent understanding obtained from experimental
studies on rhesus monkeys that certain irreversible visual disorders, such as impaired vision of specific movements or gaze
holding, result from failed visual experience during a three-week
neonatal sensitive period (Boothe 1997). As the visual systems
of rhesus monkeys and humans are in principle identical, these
results are claimed to be interpretable for humans where the
three-week neonatal sensitive period for monkeys corresponds
to a three-month period for humans.
Human language as a system of communication is an
extremely complex type of behavior. As in other kinds of complex
behavior, such as vision, it is highly likely that maturation affects
some but not all aspects of language acquisition. Different details
of language may be constrained by different phases of maturation, something that is covered by the notion of multiple critical
periods as suggested, for example, by H. W. Seliger (1978). Much
language learning occurs with reasonable ease over the whole life
span, for example, the learning of new vocabulary. What seems
to be the key distinguishing parameter between child learners
and adult learners is the fact that young learners in the majority
of cases seem to be able to reach an overall proficiency level in
the second language, phonetics and phonology included,
that allows them to be taken for native speakers of that language,
while this is extremely rare in adult learners. Therefore, an obvious candidate for what may be maturationally constrained in
language learning is the ability to reach ultimate nativelikeness.
A central role for nativelikeness was already outlined in Eric Lenneberg's original formulation of the CPH in his volume
Biological Foundations of Language:
[A]utomatic acquisition from mere exposure to a given language seems to disappear [after puberty], and foreign languages
have to be taught and learned through a conscious and labored
effort. Foreign accents cannot be overcome easily after puberty.
However, a person can learn to communicate at the age of forty.
This does not trouble our basic hypothesis on age limitations
because we may assume that the cerebral organization for language learning as such has taken place during childhood, and
since natural languages tend to resemble one another in many
fundamental aspects the matrix for language skills is present.
(1967, 176)
Lenneberg's formulation, in actual fact, addressed many of the
issues that have been researched and debated over the years: 1)
the difference between the automatic, or implicit, acquisition
assumed to be possible within the critical period and conscious,
or explicit, learning postulated to be the only remaining option
for late learners; 2) puberty as the end point for a critical period;
3) the ability to reach nativelike ultimate proficiency for L2 learners who start at ages below that point; and 4) the effect that any
early language learning can have on subsequent languages.
In addition, albeit not mentioned in this particular quotation,
Lenneberg proposed 5) lateralization as the neural mechanism
that could explain an end point for the critical period at puberty.
However, such a role for lateralization was soon demonstrated
not to be correct: "The widely accepted theory in the 1960s that lateral specialization is progressive, increasing with age from infancy to adolescence has long been abandoned in the face of accumulated evidence that indicates that cerebral lateral specialization is established from early infancy or even during fetal development" (Paradis 2004, 107).
Lenneberg's proposals were based mainly on general informal
observations, but subsequent research has provided theoretical
frameworks and substantial empirical data that, taken together,
can be given an interpretation that is compatible with the predictions of the hypothesis. The first point, on the distinction between
(implicit) acquisition and (explicit) learning, has been addressed
in a series of theoretical discussions about language learning differences between children and adults (see DeKeyser 2003), with
some of its most well-known exponents in S. Krashen's (1988) acquisition-learning hypothesis, R. Bley-Vroman's (1989) fundamental difference hypothesis, and S. Felix's (1985) competition
hypothesis. Also, the various perspectives on access to universal grammar (UG) (full/direct, partial/indirect, or no access)
in UG-framed SLA theories (see White 2003) can be translated
into a CPH framework (cf. Pinker's 1994 use-it-then-lose-it
hypothesis).
As for the second point, a multitude of studies have singled
out puberty or early adolescence as an end point for high or
nativelike proficiency levels in a second language. Several other
studies have pointed to a discontinuity at earlier ages, in particular around age six, especially for phonology and grammar (Long
2005; see, especially, Johnson and Newport 1989).
In relation to point three on obtained nativelikeness in
younger and older starters, empirical observations are by and
large compatible with the CPH; that is, child learners frequently
have results in the range of native controls, whereas this is rare,
or even claimed never to have been demonstrated, in adult
learners (Hyltenstam and Abrahamsson 2003; Long 2005). In
many studies, however, nativelikeness is not focused upon. It is,
rather, the level of ultimate attainment, nativelike or not, among
younger and older learners that is correlated with age of onset
(AO). One of the most robust results in CPH-related research is a
strong negative correlation between AO and ultimate attainment
in a second language.
In addition, in relation to Lennebergs fourth point, the
hypothesis correctly seems to predict the fact that delayed exposure to an L1 more severely affects the level of ultimate attainment than delayed exposure to an L2. Case studies of abused
children who have not been exposed to any language before
puberty (Curtiss 1977) show severe limitations in the development of grammar and pronunciation, whereas learning a second
language from the same age allows high levels of proficiency in
that language. It has frequently been pointed out that deprivation data are difficult to interpret in terms of a critical period for
language, but similar results have been obtained from studies
of delayed first language exposure to American Sign Language
(ASL) (e.g., Mayberry 1993).
A number of counterarguments to the CPH have also been
put forward over the decades. There is a current consensus
that some of these are, in fact, not valid arguments, as they deal
either with phenomena that are not covered by the hypothesis
or issues that are not decisive components of it. An example
of the former are early objections to the CPH based on results
showing that younger learners are, in fact, not better than older
learners in initial rate of learning a second language (Snow and
Hoefnagel-Höhle 1977). The CPH is not about what happens in initial stages of second language learning but indeed about long-term impacts, that is, what is ultimately attainable in language
learning. An example of objections to nondecisive components
is the type of criticism that says that if lateralization is not the
cerebral mechanism behind the differential behavior of children
and adults (Krashen 1973), there can be no critical period. The
CPH is not dependent on lateralization as such; other cerebral
mechanisms may be at work.
More substantial criticisms to the CPH have frequently
focused on the correlation between AO and ultimate attainment.
For various reasons, most prominently because correlation does
not equal cause, results showing a strong negative correlation
between AO and ultimate proficiency are not accepted by everyone as evidence of maturational constraints. It has been suggested that there may be other factors of a social or psychological
nature, such as length of residence, input frequency, motivation,
general cognitive changes, or L1 use, that would explain these
correlations (cf. Flege, Frieda, and Nozawa 1997). However, in
studies that have used statistical measures that are able to assess
the relative weight of different independent variables, it has consistently been shown that AO is the strongest, and often the only, factor
in predicting ultimate attainment. Results are often in the range
of 50 percent of the variance explained by AO, while other factors
add only 2–6 percent in explaining the variance (see DeKeyser
and Larson-Hall 2005). As these other factors do not correlate
highly with achieved ultimate attainment, and as none of them
has a convincing link to age, this leaves us with maturation as the
strongest candidate for explaining the correlation.
In addition, it has been claimed that AO and ultimate attainment correlations exhibit patterns of a linear decrease, rather
than one of discontinuity, which would be expected for the
CPH: At the end of the period, there should be an obvious offset, after which we would expect a flattening of the interaction.
Studies such as E. Bialystok and B. Miller (1999) obtained results
indicating a linear decline through all AOs, and D. Birdsong
and M. Molis (2001) saw age effects among postpuberty and
adult learners generally. These authors interpret their results as
evidence against the CPH and suggest general age-dependent
cognitive factors as causes. However, another possible background for a linear decrease might be a combined effect where
maturation plays the dominating role up through puberty and
adolescence and where social/psychological factors become
more important for explaining the variation after maturation is
complete (Birdsong 1999; Hyltenstam and Abrahamsson 2003).
It is probably fair to say that the issue of discontinuity is far from
solved.
Finally, counterarguments have addressed the issue of
nativelikeness evidence for the CPH. Some authors claim that
nativelike ultimate attainment among late learners is not as rare
as has been previously thought, something that would indeed
falsify the hypothesis. Birdsong (2005) reviewed frequencies of
L2 participants who performed in the range of native speakers
and claimed that studies show that up to 15 percent of investigated postpuberty samples reach this level. At the other end,
many studies have shown that nativelike ultimate attainment is
far from always obtained among learners with prepuberty AOs
(Abrahamsson and Hyltenstam 2009).
Research related to the idea of a critical period for language
is central to language acquisition theory, in particular for the
understanding of defining differences between first and second
language acquisition. Intensive efforts at understanding the factors behind age differences in language acquisition outcomes
have been made over the last 15 years, but the field is far from
approaching a consensus about the role of maturational constraints or critical periods. However, current claims for reconsidering definitions of concepts, analytical instruments, and
research methodologies (see, e.g., Birdsong 2005; Hyltenstam
and Abrahamsson 2003; Long 2005) are promising for decisive
steps forward in the near future.
Kenneth Hyltenstam
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Abrahamsson, N., and K. Hyltenstam. 2009. Age of onset and nativelikeness in a second language: Listener perception versus linguistic scrutiny. Language Learning 59: 249–306.
Bialystok, E., and B. Miller. 1999. The problem of age in second-language acquisition: Influences from language, structure, and task.
Bilingualism: Language and Cognition 2: 127–45.
Birdsong, D. 2005. Interpreting age effects in second language acquisition. In Handbook of Bilingualism: Psycholinguistic Perspectives, ed. J.
Kroll and A. M. B. De Groot, 109–27. Cambridge: Cambridge University
Press.
Birdsong, D., ed. 1999. Second Language Acquisition and the Critical
Period Hypothesis. Mahwah, NJ: Lawrence Erlbaum.
Birdsong, D., and M. Molis. 2001. On the evidence for maturational
constraints in second-language acquisition. Journal of Memory and
Language 44: 235–49.
Bley-Vroman, R. 1989. What is the logical problem of foreign language
learning? In Linguistic Perspectives on Second Language Acquisition,
ed. S. Gass and J. Schachter, 41–68. Cambridge: Cambridge University
Press.
Boothe, R. G. 1997. A neonatal visual deprivation syndrome. Perception
26: 766.
Curtiss, S. 1977. Genie: A Psycholinguistic Study of a Modern-day Wild
Child. New York: Academic Press.
DeKeyser, R. M. 2003. Implicit and explicit learning. In Doughty and
Long 2003, 313–48.
DeKeyser, R. M., and J. Larson-Hall. 2005. What does the critical
period really mean? In Handbook of Bilingualism: Psycholinguistic
Approaches, ed. J. F. Kroll and A. M. B. De Groot, 88–108. Oxford: Oxford
University Press.
Doughty, C., and M. Long, ed. 2003. Handbook of Second Language
Acquisition. Oxford: Blackwell.
Felix, S. 1985. More evidence on competing cognitive systems. Second
Language Research 1: 47–72.
Flege, J. E., E. M. Frieda, and T. Nozawa. 1997. Amount of native-language (L1) use affects the pronunciation of an L2. Journal of Phonetics
25: 169–86.
Hyltenstam, K., and N. Abrahamsson. 2003. Maturational constraints in
SLA. In Doughty and Long 2003, 539–88.
Johnson, J. S., and E. L. Newport. 1989. Critical period effects in second language learning: The influence of maturational state on the acquisition of English as a second language. Cognitive Psychology 21: 60–99.
Krashen, S. 1973. Lateralization, language learning, and the critical
period: Some new evidence. Language Learning 23: 63–74.
———. 1988. Second Language Acquisition and Second Language
Learning. Boston: Prentice-Hall.
Lenneberg, E. 1967. Biological Foundations of Language. New York: Wiley
and Sons.
Long, M. H. 2005. Problems with supposed counter-evidence to the
critical period hypothesis. International Review of Applied Linguistics
43: 287–317.
Mayberry, R. I. 1993. First-language acquisition after childhood differs from second-language acquisition: The case of American Sign
Language. Journal of Speech and Hearing Research 36: 1258–70.
Paradis, M. 2004. A Neurolinguistic Theory of Bilingualism.
Amsterdam: Benjamins.
Pinker, S. 1994. The Language Instinct: How the Mind Creates Language. New York: Morrow.
Seliger, H. W. 1978. Implications of a multiple critical periods hypothesis for second language learning. In Second Language Acquisition
Research, ed. W. Ritchie, 11–19. New York: Academic Press.
Snow, C., and M. Hoefnagel-Höhle. 1977. Age differences in the pronunciation of foreign sounds. Language and Speech 20: 357–65.
White, L. 2003. On the nature of interlanguage representation: Universal
grammar in the second language. In Doughty and Long 2003, 19–42.

CULTURE AND LANGUAGE

Culture and language are connected in myriad ways. proverbs, politeness, linguistic relativism, cooperative
principle, metaphor, metonymy, context and co-text,
semantic change, discourse (see discourse analysis
[foucaultian] and discourse analysis [linguistic]),
ideology and language, print culture, oral culture,
literacy, sociolinguistics, and speech-acts are just some
of the entries in this encyclopedia that deal with some obvious
connections between culture and language. Several disciplines
within the language sciences attempt to analyze, describe, and
explain the complex interrelations between the two broad areas.
(For a brief and clear survey, see Kramsch 1998.)

Culture and Language as Meaning Making


Can we approach this vast variety of topics from a more unified
perspective than is traditionally done and currently available?
The relationship between culture and language can be dealt with
if we assume that both culture and language are about making
meaning. This view of culture comes closest to that proposed by
Clifford Geertz, who wrote: "Man is an animal suspended in webs of significance he himself has spun. I take culture to be those webs, and the analysis of it to be therefore not an experimental science in search of law but an interpretative one in search of meaning" (1973, 5). In this spirit, I suggest that we approach both
culture and language as webs of significance that people both
create and understand. The challenge is to see how they are created and understood, often in multiple and alternative ways.
We have a culture when a group of people living in a social,
historical, and physical environment make sense of their experiences in a more or less unified manner. This means, for example,
that they understand what other people say, they identify objects
and events in similar ways, they find or do not find behavior appropriate in certain situations, they create objects, texts, and
discourses that other members of the group find meaningful,
and so forth. In all of these and innumerable other cases, we have
meaning making in some form: not only in the sense of producing and understanding language but also in the sense of correctly
identifying things, finding behavior acceptable or unacceptable,
being able to follow a conversation, being able to generate meaningful objects and behavior for others in the group, and so forth.
Meaning making is a cooperative enterprise (linguistic or otherwise) that always takes place in a large set of contexts (ranging
from immediate to background) and that occurs with varying
degrees of success. People who can successfully participate in
this kind of meaning making can be said to belong to the same
culture. Spectacular cases of unsuccessful participation in joint
meaning making are called culture shock.
This kind of meaning-based approach to culture can be
found in George Lakoff's (1996) work on American politics, Mark Turner's (2001) investigations into the cognitive dimensions of social science, and Zoltán Kövecses's (2005) study of metaphorical aspects of everyday culture. Gary Palmer makes such a meaning-based approach the cornerstone of what he calls cultural linguistics and applies it to three central areas of anthropological linguistics: Boasian linguistics, ethnosemantics, and the ethnography of speaking (1996, 45).
What is required for meaning making? The main meaning-making organ is the brain/mind. The brain is the organ that
performs the many cognitive operations that are needed for
making sense of experience and that include categorization,
figure-ground alignment, framing knowledge, metaphorical
understanding, and several others. Cognitive linguists and cognitive scientists in general are in the business of describing these
operations. Cognitive linguists believe that the same cognitive
operations that human beings use for making sense of experience in general are used for making sense of language. On this
view, language is structured by the same principles of operation
as other modalities of the mind. However, these cognitive operations are not put to use in a universally similar manner; that is,
there can be differences in which cognitive operations are used
to make sense of some experience in preference to another, and
there can be differences in the degree to which particular operations are utilized in cultures. This leads to what is called alternative construal in cognitive linguistics (see Langacker
1987). Moreover, the minds that evolve on brains in particular
cultures are shaped by the various contexts (historical, physical,
discourse, etc.) that in part constitute cultures (Kövecses 2005).
This leads to alternative conceptual systems.
Many of our most elementary experiences are universal.
Being in a container, walking along a path, resisting some physical force, being in the dark, and so forth, are universal experiences that lead to image schemas of various kinds (Johnson
1987; Lakoff 1987). The resulting image schemas (container,
source-path-goal, force, etc.) provide meaning for much of
our experience either directly or indirectly in the form of conceptual metaphors. Conceptual metaphors may also receive
their motivation from certain correlations in experience, when,
for instance, people see correlations between two events (such
as adding to the content of a container and the level of the substance rising), leading to the metaphor MORE IS UP (see Lakoff

and Johnson 1980). When meaning making is based on such elementary human experiences, the result may be (near-)universal meaning (content), though under a particular interpretation (construal), that is, "conceived of in a certain manner," to use Hoyt Alverson's phrase (1991, 97).
Language, on this view, consists of a set of linguistic signs, that
is, pairings of form and meaning (which can range from simple
morphemes to complex syntactic constructions). Learning
a language means the learning of such linguistic signs. Thus,
language can be regarded as a repository of meanings stored in
the form of linguistic signs shared by members of a culture. This
lends language a historical role in stabilizing and preserving a
culture. This function becomes especially important in the case
of endangered languages (see extinction of languages),
and it often explains why minorities insist on their language
rights (see language policy).
Members of a culture interact with one another for particular purposes. To achieve their goals, they produce particular discourses. Such discourses are assemblies of meanings that relate
to particular subject matters. When such discourses provide a
conceptual framework within which significant subject matters are discussed in a culture, and when they function as latent
norms of conduct, the discourses can be regarded as ideologies (see, e.g., Charteris-Black 2004; Musolff 2004; Goatly 2007).
Discourse in this sense is another source of making meaning in
cultures. A large part of socialization involves the learning of how
to make meaning in a culture.

Three Examples of Meaning Making


As the first example, consider how people make sense of the spatial orientation of objects around them. What we find in language
after language is that speakers conceptualize the spatial orientation of objects relative to their own bodies (Levinson 1996). This
means that they operate with such orientations as right and left
or in front of and behind. Both pairs of concepts make use of the
human body in order to locate things in space. Thus, we can say
that the window is on my left and that the church is in front of
us. If we did not conceptualize the human body as having right
and left sides and if we did not have a forward (and backward)
orientation aligned with the direction of vision, such sentences
would not make too much sense. But in our effort to understand
the world we do rely on such conceptualization. This is called an
ego-centered, or relativistic, spatial orientation system.
Since so many of the world's languages have this system and
because the system is so well motivated in our conception of
the human body, we would think that the ego-centered system
is an absolute universal and that no culture can do without it.
However, as Stephen Levinson (1996) points out, this is just a
myth. The native Australian language of Guugu Yimithirr has a
radically different system:
Take, for example, the case of the Guugu Yimithirr speakers
of N. Queensland, who utilize a system of spatial conception
and description which is fundamentally different from that
of English-speakers. Instead of concepts of relativistic space,
wherein one object is located by reference to demarcated regions
projected out from another reference object (ego, or some landmark) according to its orientation, Guugu Yimithirr speakers use a system of absolute orientation (similar to cardinal directions)
which fixes absolute angles regardless of the orientation of the
reference object. Instead of notions like "in front of," "behind," "to the left of," "opposite," etc., which concepts are uncoded in the language, Guugu Yimithirr speakers must specify locations as (in rough English gloss) "to the North of," "to the South of," "to the East of," etc. The system is used at every level of scale,
from millimeters to miles, for there is (effectively) no other system available in the language; there is simply no analogue of the
Indo-European prepositional concepts. (Levinson 1996, 180)

Thus, according to Levinson, the Guugu Yimithirr speakers must carry a mental map in their head of everything surrounding them, with the map aligned for the four quadrants. With the
help of such a mental map, they can identify the location of any
object with a high degree of precision, far exceeding the ability
of speakers of languages that have a relativist system of spatial
reckoning.
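The contrast between the two systems can be made concrete with a small sketch (the coordinates, angle cutoffs, and function names below are invented for illustration and are not drawn from Levinson's formalism): as the speaker turns, the ego-centered description of a fixed object changes, while its absolute description does not.

```python
import math

def relative_description(ego_pos, ego_heading_deg, obj_pos):
    """Ego-centered system: describe obj_pos relative to the speaker's
    body axes (in front of / behind / left / right)."""
    dx, dy = obj_pos[0] - ego_pos[0], obj_pos[1] - ego_pos[1]
    bearing = math.degrees(math.atan2(dx, dy))       # 0 = north, clockwise
    angle = (bearing - ego_heading_deg + 360) % 360  # angle in the body frame
    if angle < 45 or angle >= 315:
        return "in front of"
    if angle < 135:
        return "to the right of"
    if angle < 225:
        return "behind"
    return "to the left of"

def absolute_description(ego_pos, obj_pos):
    """Absolute system (Guugu Yimithirr-style): describe obj_pos by fixed
    quadrants, regardless of which way the speaker is facing."""
    dx, dy = obj_pos[0] - ego_pos[0], obj_pos[1] - ego_pos[1]
    bearing = math.degrees(math.atan2(dx, dy)) % 360
    return ["north of", "east of", "south of", "west of"][int(((bearing + 45) % 360) // 90)]

# A church due east of the speaker: "right" when facing north,
# "left" when facing south, but always "east of" the speaker.
print(relative_description((0, 0), 0, (10, 0)))    # to the right of
print(relative_description((0, 0), 180, (10, 0)))  # to the left of
print(absolute_description((0, 0), (10, 0)))       # east of
```

The point of the sketch is that the relative function needs the speaker's heading as an input, whereas the absolute function does not; this mirrors the claim that Guugu Yimithirr descriptions fix angles independently of the reference object's orientation.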
The second example deals with the cognitive process of categorization. We can suggest that there is a close connection
between the nature of our categories and many important cultural and social issues. The classical view of categories is based
on the idea of essential features. In that view, the members of
the category must share certain essential features. In the new
rival view, categories are defined not in terms of necessary
and sufficient conditions (i.e., essential features) but with
respect to prototypes and various family resemblance
relations to these prototypes.
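The difference between the two views can be sketched computationally (the feature sets below are invented toy data, not a claim about any actual analysis): classical membership is all-or-nothing against essential features, while prototype membership is a graded similarity score.

```python
def classical_member(item, essential_features):
    """Classical view: an item belongs iff it has every essential feature."""
    return essential_features <= item

def prototype_similarity(item, prototype):
    """Prototype view: membership is graded -- here, crudely, the
    proportion of the prototype's features the item shares."""
    return len(item & prototype) / len(prototype)

# Invented toy features for the category 'art'.
prototype_art = {"physical object", "representational", "depicts reality"}
landscape_painting = {"physical object", "representational", "depicts reality"}
cubist_painting = {"physical object"}  # not representational, not depictive

print(classical_member(cubist_painting, prototype_art))        # False: excluded outright
print(prototype_similarity(cubist_painting, prototype_art))    # about 0.33: a peripheral member
print(prototype_similarity(landscape_painting, prototype_art)) # 1.0: the central case
```

On the classical view the cubist painting simply falls outside the category; on the prototype view it remains a member, just a peripheral one, which is exactly the structure the discussion of art below exploits.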
How do we make sense of social debates? The emergence,
existence, and often the resolution of cultural and social issues
may hinge on how we think about the nature of our categories.
To see how this is possible, let us consider the concept of art. The
discussion of the structure of the concept of art can shed light
on why art has been a debated category probably ever since its
inception and particularly in the past two centuries. Kövecses
(2006) examines some of the history of the category of art in
the past 200 years on the basis of the Encyclopedia Britannica
(2003). What he finds in this history is that the category undergoes constant redefinition in the nineteenth and twentieth
centuries. Different and rival conceptions of art challenge the
traditional view, that is, the most prevalent conservative
view. Impressionism, cubism, surrealism, pop art, and the like
are reactions to the traditional view and to each other. But what
is the traditional view of art?
The traditional conception of art can be arrived at by examining those features of art that are challenged, negated, or successfully canceled by the various movements of art. For example,
most people believe that a work of art represents objective reality.
This feature of art is canceled by the art movements of impressionism, expressionism, and surrealism. Another feature of art
that most people take to be definitional is that a work of art is
representational, that is, it consists of natural figures and forms.
This feature is effectively canceled by symbolism, cubism, and
abstract art. Finally, most believe that a work of art is a physical
object. This feature is canceled by conceptual art.
As can be seen, even those features of art that many would take
to be definitional for all forms of art (such as the one that art represents objective reality, the one that it is representational, and the one that it is some kind of physical object) can be explicitly negated and effectively canceled. This is how new art movements
were born out of a successful new definition. More importantly,
there are always some people who do not accept the definition
that most people take to be definitional. This small but significant minority can constantly challenge, undermine, or plainly
negate every one of the features that the majority take to be definitional and essential. If they were essential, they could not be
so easily challenged and canceled. We can suggest that the concept of art has a central member, the traditional conception,
and many noncentral ones. The noncentral ones may become
the prototypes of art for some people, and then these new prototypes can be further challenged. Concepts like art assume a
prototype-based organization, and it is their very structure that
invites contestation. We can only understand the nature of the
widespread phenomenon of cultural and social debates if we
study and understand the nature of our categories that give rise
to and invite debates by virtue of their very structure.
Our third example has to do with how we represent knowledge in the mind. Categories are mentally represented as frames,
schemas, or mental models (see, e.g., Schank and Abelson
1977; Fillmore 1982; Langacker 1987; Lakoff 1987). We can use
the following working definition of frames: A frame is a structured mental representation of a coherent organization of
human experience.
Frames are important in the study of almost any facet of life
and culture and not just language. The world as we experience
it is always the product of some prior categorization and framing
by ourselves and others. A crucial aspect of framing is that different individuals can interpret the same reality in different ways.
This is the idea of alternative construal mentioned earlier.
How do we categorize the various objects and events we
encounter in the world? Clearly, many of our categories are
based on similarity (especially of the family resemblance kind)
among members of a category. That is, many categories are held
together by family resemblances among the items that belong
to a particular category. In this sense, most of our conventional
categories for objects and events are similarity-based ones. For
example, the things that one can buy in a store are commonly
categorized on the basis of their similarity to one another; thus,
we find different kinds of nails (short and long ones, thick and
thin ones, etc.) in the same section of a hardware store. They
form a similarity-based category. However, we can also find nails
in other sections of the store. Some nails can occur in sections
where, for example, things for hanging pictures are displayed.
Clearly, a nail is not similar to any of the possible things (such as
picture frames, rings, short strings, adhesive tapes, maybe even a
special hammer) displayed in this section. How is it possible that
certain nails appear in this section? Or, to put it in our terms, how
is it possible that nails are put in the same category with these
other things? The answer is that in addition to similarity-based
categories, we also have frame-based ones. That is to say, categories can be formed on the basis of the things that go commonly
and repeatedly together in our experience. If we put up pictures
on the wall by first driving a nail into the wall and then hanging
the picture frame on the nail by means of attaching a metal ring
or a string on the frame, then all of the things that we use for this purpose may be placed in a single category. But this category will be frame-based, not similarity-based.
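The hardware-store example can be sketched as follows (the item inventory, feature labels, and frame names are invented toy data): one routine groups items by shared intrinsic features, the other by joint participation in a recurring scenario.

```python
from collections import defaultdict

# Toy data: each item has intrinsic features and the everyday
# routines (frames) it participates in.
items = {
    "short nail":    {"features": {"metal", "pointed"}, "frames": {"carpentry"}},
    "long nail":     {"features": {"metal", "pointed"}, "frames": {"carpentry", "picture-hanging"}},
    "picture frame": {"features": {"wooden", "flat"},   "frames": {"picture-hanging"}},
    "string":        {"features": {"fiber", "flexible"}, "frames": {"picture-hanging"}},
}

def similarity_categories(items):
    """Similarity-based: group items that share the same feature set."""
    groups = defaultdict(list)
    for name, info in items.items():
        groups[frozenset(info["features"])].append(name)
    return [sorted(g) for g in groups.values()]

def frame_category(items, frame):
    """Frame-based: collect every item that participates in a given frame."""
    return sorted(n for n, info in items.items() if frame in info["frames"])

print(similarity_categories(items))
# the two nails group together; frame and string each stand alone
print(frame_category(items, "picture-hanging"))
# ['long nail', 'picture frame', 'string'] -- dissimilar things, one routine
```

The frame-based category cuts across the similarity-based ones: nothing in the picture-hanging group shares features with the nail, yet the routine binds them into a single, perfectly natural category.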
Now there can be differences across and even within cultures in the use of this meaning-making device. An interesting
example is provided by a study by J. Glick (1975) conducted
among the Kpelle of Liberia. Kpelle farmers consistently sorted
objects into functional groups (such as knife and orange and
potato and hoe), rather than into conceptual categories (such
as orange and potato and knife and hoe). The former is what we
would call a frame-based categorization, whereas the latter is a
similarity-based one. On the whole, Westerners prefer to categorize objects on the basis of similarity. When Glick asked the
Kpelle how a fool would categorize the objects, they came up
with such neat similarity-based piles. Clearly, cultures can differ in the use of meaning-making devices, and these differences
may produce differences in the use of categories and language
in general.

Conclusion
Culture and language are connected in many ways, and the
interconnections can be studied from a variety of different
perspectives. Following Geertz, I tried to develop a view of the
relationship that is based on how we make sense of our experiences linguistic or otherwise. Recent cognitive science and cognitive linguistics provide us with new ideas and methodological
tools with which we can approach the issue of meaning making
in cultures, both in its universal aspects and in its infinite cross-cultural variety.
Zoltán Kövecses
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Alverson, Hoyt. 1991. Metaphor and experience: Looking over the
notion of image schema. In Beyond Metaphor: The Theory of Tropes
in Anthropology, ed. J. Fernandez, 94–117. Stanford, CA: Stanford
University Press.
Charteris-Black, Jonathan. 2004. Corpus Approaches to Critical Metaphor
Analysis. Houndmills, UK: Palgrave Macmillan.
Encyclopedia Britannica Ready Reference. 2003. Chicago: Encyclopedia
Britannica. Electronic version.
Fillmore, Charles. 1982. Frame semantics. In Linguistics in the Morning
Calm, 111–37. Seoul: Hanshin.
Foley, William A. 1997. Anthropological Linguistics: An Introduction.
Oxford and Malden, MA: Blackwell.
Geertz, Clifford. 1973. The Interpretation of Cultures. New York: Basic
Books.
Gibbs, Raymond W. 2006. Embodiment and Cognitive Science. New
York: Cambridge University Press.
Glick, J. 1975. Cognitive development in cross-cultural perspective. In
Review of Child Development Research. Vol. 4. Ed. F. Horowitz, 595–654. Chicago: University of Chicago Press.
Goatly, Andrew. 2007. Washing the Brain: Metaphor and Hidden Ideology.
Amsterdam: John Benjamins.
Johnson, Mark. 1987. The Body in the Mind. Chicago: University of
Chicago Press.
Kimmel, Michael. 2001. Metaphor, Imagery, and Culture: Spatialized
Ontologies, Mental Tools, and Multimedia in the Making. Ph.D. diss.,
University of Vienna.
Kövecses, Zoltán. 2005. Metaphor in Culture: Universality and Variation.
Cambridge: Cambridge University Press.

———. 2006. Language, Mind, and Culture: A Practical Introduction. Oxford and New York: Oxford University Press.
Kramsch, Claire. 1998. Language and Culture. Oxford: Oxford University
Press.
Lakoff, George. 1987. Women, Fire, and Dangerous Things.
Chicago: University of Chicago Press.
———. 1996. Moral Politics: How Liberals and Conservatives
Think. Chicago: University of Chicago Press.
Lakoff, George, and Mark Johnson. 1980. Metaphors We Live By.
Chicago: University of Chicago Press.
Lakoff, George, and Mark Johnson. 1999. Philosophy in the Flesh: The
Embodied Mind and Its Challenge to Western Thought. New York: Basic
Books.
Langacker, Ronald. 1987. Foundations of Cognitive Grammar: Theoretical
Prerequisites. Vol. 1. Stanford, CA: Stanford University Press.
Levinson, Stephen C. 1996. Relativity in spatial conception and description. In Rethinking Linguistic Relativity, ed. J. Gumperz and S. C.
Levinson, 177–202. Cambridge: Cambridge University Press.
Musolff, Andreas. 2004. Metaphor and Political Discourse: Analogical
Reasoning in Debates about Europe. London: Palgrave Macmillan.
Palmer, Gary. 1996. Toward a Theory of Cultural Linguistics. Austin: University of Texas Press.
Schank, Roger, and Robert Abelson. 1977. Scripts, Plans, Goals, and
Understanding. Hillsdale, NJ: Lawrence Erlbaum.
Shore, Bradd. 1996. Culture in Mind: Cognition, Culture, and the Problem
of Meaning. Oxford and New York: Oxford University Press.
Strauss, Claudia, and Naomi Quinn. 1997. A Cognitive Theory of Cultural
Meaning. Cambridge: Cambridge University Press.
Turner, Mark. 2001. Cognitive Dimensions of Social Science. Oxford and
New York: Oxford University Press.
Whorf, Benjamin Lee. 1956. Language, Thought, and Reality: Selected
Writings of Benjamin Lee Whorf. Ed. John B. Carroll. Cambridge,
MA: The MIT Press.
Wolf, Hans-Georg. 2001. The African cultural model of community in
English language instruction in Cameroon: The need for more systematicity. In Applied Cognitive Linguistics. Vol 2: Language Pedagogy.
Ed. M. Pütz, S. Niemeier, and R. Dirven, 225–58. Berlin: Mouton de
Gruyter.

CYCLE, THE
The syntactic cycle was originally formulated in Chomsky (1965)
as a general principle of grammar that constrains the way transformational rules can apply in the derivation of sentences
containing embedded clauses. The term cycle refers to the property of syntactic derivations whereby transformations apply
within syntactic subdomains before they apply to larger syntactic domains that contain them. Thus, in (1), where A, B, and C
denote syntactic domains to which transformational rules can
apply, the set of transformations must apply to C before B, and
then to B before A (see Figure 1). If a rule X applies to domains
A–C in a single derivation, it applies successive-cyclically to C, then B, and finally A, that is, starting with the smallest cyclic
domain and proceeding stepwise to the largest (see Boeckx 2007
for discussion).
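The inside-out ordering can be mimicked in a toy sketch (a deliberately simplified model, not a rendering of any actual grammar formalism): domains are nested as in (1), and every rule applies to the most deeply embedded domain before the domains containing it.

```python
def cyclic_apply(domain, rules, log=None):
    """Apply each rule to the smallest domains first, then to the domains
    containing them (toy model of successive-cyclic rule application).
    A domain is a pair (label, list_of_subdomains)."""
    if log is None:
        log = []
    label, subdomains = domain
    for sub in subdomains:      # recurse into contained domains first
        cyclic_apply(sub, rules, log)
    for rule in rules:          # only then apply rules at this level
        log.append(f"{rule} applies to {label}")
    return log

# Nested domains as in (1): A contains B, which contains C.
tree = ("A", [("B", [("C", [])])])
for step in cyclic_apply(tree, ["X"]):
    print(step)
# X applies to C
# X applies to B
# X applies to A
```

The recursion enforces exactly the ordering in the text: rule X reaches C, then B, then A, stepwise from the smallest cyclic domain to the largest.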
Noam Chomsky sharpens his original formulation as the
strict cycle condition (SCC):
No rule can apply to a domain dominated by a cyclic node A in
such a way as to affect solely a proper subdomain of A dominated
by a node B which is also a cyclic node. (1973, 243)

(1) Figure 1: [a tree diagram in which cyclic domain A properly contains domain B, which in turn properly contains domain C]
The SCC includes the further restriction that transformations
may not revisit a subdomain after they have applied to the larger
domain that contains it. For example, once a transformation
applies to domain B in Figure 1, no transformation can apply
solely within the subdomain C. Since the earliest formulations,
clauses (complementizer phrase (CP)/inflection phrase (IP)
in current analysis) have been designated as cyclic domains.
Nominal phrases (i.e., noun phrase [NP] and determiner phrase
[DP]) and more recently light verb phrase (vP) have also been
proposed as additional cyclic domains.
Empirical motivation for the SCC involves deviant sentences
whose derivation violates the SCC, for example (2) under the
partial derivation given in (3).
(2) *Who did you wonder what bought?
(3) a. [CP1 C [IP1 you wonder [CP2 C [IP2 who bought what ] ] ] ]
    b. [CP1 C [IP1 you wonder [CP2 who C [IP2 bought what ] ] ] ]
    c. [CP1 who C [IP1 you wonder [CP2 C [IP2 bought what ] ] ] ]
    d. [CP1 who C [IP1 you wonder [CP2 what C [IP2 bought ] ] ] ]

Specifically, the SCC blocks the countercyclic derivational step that moves what in (3c) to the specifier position of CP2, as illustrated in (3d).
Whether a cyclic principle of rule application like the SCC
constitutes an axiom of syntactic theory depends on two
things: one concerning the formulation of transformations and
the other involving the empirical overlap with independently
motivated conditions. If the countercyclic movement of what
in (3) is prohibited by the basic formulation of transformations
(e.g., merge), then stipulating an independent cyclic principle
is redundant. The cyclic application of rules simply follows from
the formulation of rules (see Freidin 1999).
Suppose, however, that the formulation of transformations
does not prohibit the countercyclic movement in (3). Then, a
cyclic principle is required only if (2) under derivation (3) cannot
be excluded by other independently motivated constraints. For
example, under trace theory, the derivation (3) yields a representation (4).
(4) [CP1 who_i C [IP1 you wonder [CP2 what_j C [IP2 t_i bought t_j ] ] ] ]

The connection between who and its trace t_i violates the subjacency principle. Taking this condition to be a constraint
on trace binding (i.e., on representations), one empirical effect of
the SCC is subsumed under subjacency. (See Freidin 1978 for an
account that generalizes this kind of analysis.) Other proposals to
derive the empirical effects of a cyclic principle include Chomsky's extension condition (1993), the phase impenetrability condition (Chomsky 2000), and cyclic linearization (Fox and Pesetsky 2005).
It appears that syntactic derivations do not allow countercyclic operations, either because the formulation of grammatical
operations will not allow them or because such operations violate
general constraints (aside from the SCC) on either derivations
or the representations that are produced. Whichever approach
turns out to be correct, it is clear that the theory of grammar need
not include a specific cyclic principle along the lines of the SCC,
given that virtually all of its empirical effects follow from independent factors. Either way, the cycle is deeply embedded in
syntactic theory.
Robert Freidin
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Boeckx, C. 2007. Understanding Minimalist Syntax: Lessons from Locality
in Long-Distance Dependencies. Oxford: Blackwell.
Chomsky, N. 1965. Aspects of the Theory of Syntax. Cambridge, MA: MIT
Press.
———. 1973. Conditions on transformations. In A Festschrift for Morris Halle, ed. S. Anderson and P. Kiparsky, 232–86. New York: Holt,
Rinehart and Winston.
———. 1993. A minimalist program for linguistic theory. In The View
from Building 20: Essays in Linguistics in Honor of Sylvain Bromberger,
ed. K. Hale and S. Keyser, 1–52. Cambridge, MA: MIT Press.
———. 2000. Minimalist inquiries: The framework. In Step by Step: Essays
on Minimalist Syntax in Honor of Howard Lasnik, ed. R. Martin,
D. Michaels, and J. Uriagereka, 89–155. Cambridge, MA: MIT Press.
Fox, D., and D. Pesetsky. 2005. Cyclic linearization of syntactic structure. Theoretical Linguistics 31: 1–45.
Freidin, R. 1978. Cyclicity and the theory of grammar. Linguistic Inquiry
9: 519–49.
———. 1999. Cyclicity and minimalism. In Working Minimalism, ed.
S. Epstein and N. Hornstein, 95–126. Cambridge, MA: MIT Press.
Lasnik, H. 2006. Conceptions of the cycle. In Wh-Movement: Moving
On, ed. L. Cheng and N. Corver, 197–216. Cambridge, MA: MIT
Press.

D
DECONSTRUCTION
Deconstruction is a practice of exceptionally close and vigilant
critical reading that aims to reveal the various contradictions
or moments of aporia (of paradox or strictly irresolvable doubt)
endemic to the texts of Western philosophy, literature, and other
kinds of writing. It is best approached through the work of Jacques
Derrida (1930–2004), the most vigorous exponent of deconstruction and a thinker centrally concerned with issues in semantics,
hermeneutics (see philology and hermeneutics),
speech-act theory, and philosophy of language and logic (see
especially Derrida 1973, 1978, 1982, 1989).
Perhaps the most striking example is Derrida's lengthy treatment of Jean-Jacques Rousseau in Of Grammatology (Derrida 1976). Here, he shows how Rousseau's ideas about a vast range of topics (nature, culture, language, society, ethics, politics, history, sexual relations, personal identity, literature, and music) are affected by a curious logic of supplementarity
that constantly twists his argument back against itself and thereby
subverts its manifest intent. Rousseau wants to say (and does quite explicitly state) that in each case, there is (or once was)
an original, authentic, natural, uncorrupted state that then gives
way to a decadent, artificial, and degraded state where human
beings are condemned to live at a distance from their true nature
and enter into all kinds of intrinsically bad (since by very definition unnatural) relationships with themselves, each other, and
the world around them. Yet in each case, his argument comes up
against a stubbornly insistent counterlogic that throws its claims
into doubt by reversing the conceptual order of priorities upon
which that argument relies.
Thus, Rousseau sees language and music as having their
common point of origin in a mode of passionate speech-song
that expresses human feelings directly and without, as yet, any
need for those various bad supplements of syntactic and
lexical structure in language or harmony and counterpoint in
music that had since come to work their insidious, corrupting
effects. Indeed, as Derrida notes, the key word supplement
occurs with remarkable frequency and weight of semantic
implication throughout Rousseaus writing on these topics. Yet
when he treats them in a more sustained and reflective way,
Rousseau has perforce to concede that the melodious aspect of
music is always dependent on a background sense of its harmonic implications, just as the expressive aspect of language
depends on the existence of grammatical and lexical structures in the absence of which it could communicate nothing
whatsoever.
The same applies to his thinking about matters of history, politics, and civil society. Here, Rousseau purports to trace a process
of epochal decline from the close-knit, organic, natural communities that once existed before the advent of all the corrupting
forces of power, class, education, authority, political influence,
acquired expertise, and so forth, which are falsely considered "progress" or "civilization" by those same decadent standards.
And again, what passes for culture among the denizens of
modern society is, in truth, just another melancholy sign of the
falling away from nature or the ever more false and artificially
cultivated manners, practices, and modes of expression that figure as mere supplements to an otherwise perfectly self-sufficient
natural state. Yet here also Derrida shows that the term supplement is subject to a kind of dislocating force, or logico-semantic
torsion, that twists the operative sense of Rousseau's argument
against his avowed intent. In each case, the supplementary item
turns out to be not so much a mere supplement (= add-on, accessory, optional extra) as a supplement in the opposite,
palliative sense: that which is required in order to complete or
make good an otherwise defective, non-self-sufficient, or inadequate mode of being. Thus, it is strictly impossible (a downright contradiction in terms) to posit the existence of a social state
of nature that would somehow precede and contrast with those
subsequent states whose hallmark, according to Rousseau, was
their basis in various, increasingly complex forms of societal and
cultural distinction. This attempt to describe what can never in truth have existed, a society unmarked by any of those structures (however primitive) that constitute the very conditions of possibility for social life, can in fact be seen to define the condition of impossibility for the sorts of claim put forward by Rousseau and like-minded thinkers.
Such is the logic of supplementarity whereby the text
bears involuntary witness to this strain on its powers of articulate expression through various, often extreme, complexities of
logical and syntactic structure. Chief among them are complexities of a modal and temporal type, the former brought about by
Rousseaus constant switching back and forth between talk of
what must, might, or should properly have been the case with
regard to the aforementioned orders of priority, the latter by his
likewise ambivalent, grammatically and logically elusive turns of
phrase when it comes to establishing a time-indexed (i.e., historical, rather than mythic) sequence for the process of decline that
his text narrates. Hence Derrida's claim that the writer "writes in a language and in a logic whose proper system, laws, and life his discourse by definition cannot dominate absolutely," since "[h]e uses them only by letting himself, after a fashion and up to a point, be governed by the system" (1976, 158). More specifically,
what Rousseau wishes to say about the proper, authentic order
of priority between passion and reason, speech and writing, melody and harmony, primitive and civilized stages of society, or
(subsuming all these) nature and culture is everywhere implicitly subject to challenge or thrown into doubt by that same
supplementary logic.
So critics of Derrida like John Searle and Jürgen Habermas, along with some of his admirers such as Richard Rorty, are wide of the mark when they take deconstruction to consist in nothing more than a routine technique for inverting or subverting the various distinctions between reason and rhetoric, philosophy and literature, or conceptual and linguistic issues (see Habermas 1987; Rorty 1982; Searle 1977). On their account, his work exemplifies the "textualist" ne plus ultra of that widespread "linguistic turn" that has been such a prominent feature of philosophy in both the analytic and the continental traditions during the past seven decades or so (Rorty 1967). Of course, one can see how this idea took hold, given Derrida's sharp focus on matters of textual detail and his extreme attentiveness to elements of figural or metaphoric language that must complicate any straightforward appeal to literal, express, or intended meaning. However, he is equally insistent that those who claim to turn the page on philosophy always end up by just philosophizing badly, since it is a pointless (and in any case self-refuting) gesture that affects to have done with all those old philosophical concepts and categories while, in fact, surreptitiously or involuntarily deploying them at every turn (Derrida 1982; Norris 1989; Rorty 1982, 1989).
These issues receive their most explicit treatment in his essay "The Supplement of Copula," where Derrida offers a full-dress transcendental argument, from the conditions of possibility for thinking or reasoning in general, against the idea put forward by the linguist Emile Benveniste that our entire stock of philosophical concepts and categories can be seen to derive from a certain language (the ancient Greek) and its distinctive range of lexico-grammatical structures (Benveniste 1971; Derrida 1982, 175–205). On the contrary, Derrida maintains, Benveniste cannot advance a single proposition in support of his linguistic-relativist case without falling back upon those same conceptual resources, such as the distinction between language and thought. So there is no making sense of Benveniste's claim to invert the received (philosophical) order of priorities by treating language as the condition of possibility for thought, or in narrowly professional terms linguistics as the discipline now poised to occupy the high academic ground. Rather, what emerges from a critical reading of Benveniste's texts is the absolute necessity that any such argument should turn out to controvert its own leading premise by taking for granted a whole range of indispensable distinctions that derive from a prior philosophic discourse, in this case one that finds its first clear statement in Aristotle's doctrine of the categories.
There is a similar twist of argument, and again one that is ignored by most commentators, in Derrida's essay "White Mythology: Metaphor in the Text of Philosophy" (1982, 207–71). On the usual account, what he here purports to show, taking a lead from Nietzsche, is the saturation of philosophic discourse by various (predominantly visual or tactile) types of metaphor that cannot be expunged, as some philosophers would wish, or even brought within the bounds of rational acceptability through a systematic treatment or method of classification. That is, they are so pervasive and go so far toward defining the very nature, self-image, and operative scope of that discourse that it is strictly impossible for philosophy either to manage without them or come up with some rigorously theorized account that would finally reduce them to order on its own methodological terms. To suppose that philosophy has managed to resolve this problem, to achieve a clear demarcation between concept and metaphor or literal and figural language, is to take it for granted that "the sense aimed at through these figures is an essence rigorously independent of that which transports it," which is "an already philosophical thesis, one might even say philosophy's unique thesis, the thesis which constitutes the concept of metaphor" (Derrida 1982, 229). However, this confidence may appear ill-placed if one considers the extent to which philosophy depends upon a range of metaphorical terms and distinctions, like that between metaphor (etymologically a means of transport or carrying away) and concept (grasping, comprehending, holding together in thought), which make up its very element.
What I have said so far about "White Mythology" fits in well enough with the received view among mainstream analytic philosophers: that Derrida's approach has more in common with literary criticism than with philosophy properly so called, that is, the practice of rigorous conceptual analysis. However, this is a partial and highly prejudicial reading, as soon becomes clear if one looks beyond the opening section, where his approach might plausibly be construed along these echt-Nietzschean lines, to later passages where Derrida goes out of his way to forestall or disqualify that interpretation. His counterargument (as with the response to Benveniste) is that philosophy has provided all the terms and categorical distinctions that must be seen as absolutely prerequisite to any discussion of these issues, among them most crucially the distinctions between concept and metaphor, reason and rhetoric, or philosophy and literature. Hence Derrida's cardinal point: that any such discussion will require not only the highest degree of conceptual precision but also a detailed knowledge of their history and various stages of elaboration and refinement to date.
"White Mythology" makes good this claim by examining a great range of texts by philosophers, linguists, rhetoricians, and historians of science and showing how they take for granted not only the existence of certain prior philosophical concepts and categories but also the necessity of bringing them to bear in the process of deconstructing the kinds of uncritical or prejudicial thinking that often go along with them. For if the concept of metaphor is itself what Coleridge dubbed a "philosopheme" (a distinctively philosophic notion), then we can have no means of questioning the supposed priority of concept over metaphor except by way of the discourse wherein that topic has received its most decisive statements and elaborations. Thus, it makes no sense to proclaim, with "postphilosophical" adepts like Rorty, that we should give up the old deluded quest for truth, clear and distinct ideas, conceptual precision, and so on and henceforth embrace the Derridean ideal of philosophy as just another kind of writing that at best offers new and adventurous modes of creative self-description (Rorty 1982). This involves not only a snippety reading of Derrida but also a failure to grasp his point that such gestures can amount to no more than a kind of rhetorical hand waving or a claim to have come out on the far side of philosophy, while in fact regressing to a prephilosophical stage of unreflective immersion in language.
Such is indeed the charge that Habermas brings against Derrida, namely, his having leveled or annulled the crucial genre distinction between philosophy and literature, or language in its constative (i.e., truth-based or logical) and its performative (suasive and rhetorical) modes (Habermas 1987; see also performative and constative). It seems to me, on the contrary, that Derrida's most significant achievement will be seen to lie in his contributions to philosophy of language and logic and, above all, his remarkably inventive and original yet none the less rigorous rethinking of the relationship between these disciplines. (For some early indications, see Norris and Roden 2002.)
Christopher Norris
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Aristotle. 1990. Categories and De Interpretatione. Trans. J. L. Ackrill. Oxford: Clarendon.
Benveniste, Emile. 1971. Problems in General Linguistics. Trans. Mary Meek. Coral Gables, FL: University of Miami Press.
Dasenbrock, Reed Way, ed. 1989. Re-Drawing the Lines: Analytic Philosophy, Deconstruction, and Literary Theory. Minneapolis: University of Minnesota Press.
Derrida, Jacques. 1973. Speech and Phenomena and Other Essays on Husserl's Theory of Signs. Trans. David B. Allison. Evanston, IL: Northwestern University Press.
———. 1976. Of Grammatology. Trans. Gayatri C. Spivak. Baltimore: Johns Hopkins University Press.
———. 1978. Writing and Difference. Trans. Alan Bass. London: Routledge and Kegan Paul.
———. 1982. Margins of Philosophy. Trans. Alan Bass. Chicago: University of Chicago Press.
———. 1989. "Afterword: Toward an ethic of conversation." In Limited Inc, ed. Gerald Graff, 111–54. Evanston, IL: Northwestern University Press.
Habermas, Jürgen. 1987. "On levelling the genre-distinction between philosophy and literature." In The Philosophical Discourse of Modernity: Twelve Lectures, trans. Frederick Lawrence, 185–210. Cambridge: Polity Press.

Norris, Christopher. 1989. "Philosophy as not just a 'kind of writing': Derrida and the claim of reason." In Dasenbrock 1989, 189–203.
———. 1990. "Deconstruction, postmodernism and philosophy: Habermas on Derrida." In What's Wrong with Postmodernism: Critical Theory and the Ends of Philosophy, 49–76. Hemel Hempstead: Harvester.
———. 2000. Deconstruction and the Unfinished Project of Modernity. London: Athlone.
———. 2002. "Derrida on Rousseau: Deconstruction as philosophy of logic." In Norris and Roden 2002, II: 70–124.
Norris, Christopher, and David Roden, eds. 2002. Jacques Derrida. 4 vols. London: Sage.
Rorty, Richard. 1982. "Philosophy as a kind of writing." In Consequences of Pragmatism, 89–109. Brighton: Harvester.
———. 1989. "Two versions of logocentrism: A reply to Norris." In Dasenbrock 1989, 204–16.
Rorty, Richard, ed. 1967. The Linguistic Turn. Chicago: University of Chicago Press.
Searle, John R. 1977. "Reiterating the differences: A reply to Derrida." Glyph 1: 198–208. Baltimore: Johns Hopkins University Press.

DEFINITE DESCRIPTIONS

A phrase of the form "the F" is a definite description. In his 1905 "On Denoting," Bertrand Russell sought an account of definite descriptions that would speak to why it is worthwhile to produce an identity statement with such a phrase ("a = a" is trivial, but "Jupiter = the largest planet" is informative), why one cannot substitute "Jupiter" for "the largest planet" in "George Bush often wonders whether Jupiter is the largest planet," and why sentences that contain definite descriptions that appear to denote nonexistent objects ("the present king of France") can be meaningful if they denote nothing. (For details, see Ludlow 2009.)

James McGilvray

WORKS CITED AND SUGGESTIONS FOR FURTHER READING

Ludlow, Peter. 2009. "Descriptions." In The Stanford Encyclopedia of Philosophy, ed. Edward N. Zalta. Available online at: http://plato.stanford.edu/entries/descriptions/.
Russell, B. 1905. "On denoting." Mind 14: 479–93.

DEIXIS

By deixis (from the Greek "to point") we mean here all cues provided by a language that localize a speech event and its participants in space and time. By contrast, reference is based on the privative distinction "related to the deictic center (origo)" / "not related to the deictic center." There are two reference systems to localize a referent. In a positional system, the speaker is, or both speaker and hearer are, the deictic center whose position is used to localize an entity. A dimensional system relies on the speech participants' orientation and understanding of the environment, in which case the deictic center may be something else instead of the speech participants. The choice for one or both of these systems evokes major differences among languages.

There are six categories of deixis: person, space, time, discourse, empathy, and social status (Levinson 1983). Person deixis usually distinguishes the speaker (first person) from the hearer (second person) and the non-speech or narrated participant (third person). This can be encoded by inflection on the verb, pronouns, or a combination of both. Many deixes add information in the pronoun on the referent's number (singular, paucal, plural, dualis, or trialis) and its classification (masculine, feminine, neuter, animate, inanimate, edible). In Asian and Native American languages, first person plural pronouns often distinguish whether or not the hearer is included in the narrated event (labeled inclusive or exclusive), as shown in Table 1. Languages may use a two-term or three-term system to localize the referent in space. This tripartite distinction also applies to languages with elaborate deictic systems like Malagasy (Madagascar) and Venda (South Africa) (see Table 2).

In one-term systems, nouns and verbs may be used for a complete deictic reference as, for example, the verb ro "seaward" in the following Ewaw (Indonesia) sentence.

1. Om-liik=ken nung afa en-ho ded=i en-ro?
   2sg-see=hit my thing 3sg-pass road=DEM 3sg-seaward
   "Did you see something of mine on that road?"

Time deixis can be encoded by tense inflections on the verb or adverbs in which the speech moment is the deictic center. Maori (New Zealand) seems atypical in that it signals past, present, or future tense by means of special locative markers (respectively, i, kei, and hei in the following examples).

2a. I tepoti rua inanahi.
    LOC.past Dunedin 3d yesterday
    "They were in Dunedin yesterday."

2b. Kei raro te ngeru.
    LOC inside ART cat
    "The cat is inside."

2c. Hei te ata tua haere ai.
    LOC.fut ART morning 1d go PART
    "We will go in the morning."

Often, space deictics are used originally to locate a referent in the discourse (as, for example, in Leti (Indonesia): ptal=d (bottle=here) "this bottle here"; ptal=di (bottle=now) "the bottle we are discussing now"; and ptal=d=di (bottle=here=now) "this bottle here that we are discussing now").

A special type of discourse deixis, labeled switch-reference in the literature, occurs in Native American, Papuan, and Australian languages where special verb inflections or pronouns signal whether or not the subject in a clause has the same referent as the subject in the following clause (same subject [SS] or different subject [DS]).

3. U-hu ma or hari-k limu teyen ya-ha lafaura.
   Do-SS man he died-DS they bench make-SS placed
   "Then the man died, people (in the village) made a bench and placed him there."
   (Mende, Papua Niugini, after Nozawa 2000)

In Algonquian languages (North America), third person pronouns signal whether their referent is more or less topical in the narration (for example, proximal bi versus obviative yi in Navajo). Similarly, a language's deictic system may signal the speaker's empathy toward the referent and its status within society. In Javanese (Indonesia), for example, social
Table 1. Pronoun systems in four languages

(columns: English | Quechua (Peru) | Tamil (India) | Biak (Indonesia))

1st person singular: I | nuqa | nn | ai
2d person singular: you | qan | – | au
3d person singular: he/she/it | pay | avan (masculine), ava (feminine), atu (neuter) | –
1st person plural inclusive: we | nuqanchis | nm | u (dualis), o (trialis)
1st person plural exclusive: we | nuqayku | nka | nu (dualis), mo (plural)
2d person plural: you | qankuna | nka | mu (dualis), mo (plural)
3d person plural: they | paykuna | avar | su (dualis), so (trialis), si (animate plural), na (inanimate plural)

Table 2. Demonstratives in Malagasy (Madagascar)

(columns, by distance to speaker and boundedness: Proximal bounded | Proximal unbounded | Distance-neutral | Medial | Distal)

Visible, singular: ito ~ ity | itsy | iry | io | iny
Visible, plural: ireto | iretsy | irery | ireo | ireny
Invisible, singular: izato | izaty | izary | izao | izany
Invisible, plural: izatero | – | – | – | –

Source: After Imai 2003, 201.

deixis created separate low and high lexicons that permeate the entire language. The low style is represented in 4a, the high style in 4b.

4a. Dewek-e sing kok tuko-ni iwak.
    self-POS REL you buy-APPL fish

4b. Piyambak-ipun ingkang sampeyan tumbas-aken ulam.
    self-POS REL you buy-APPL fish

    "It is him whom you bought fish for."

Deixis is a major research topic in typology, pragmatics,
anthropological linguistics and, recently, in poetics.
Aone van Engelenhoven
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Bühler, K. [1934] 1982. "The deictic field of language and deictic worlds." In Speech, Place and Action: Studies in Deixis and Related Topics, ed. R. J. Jarvella and W. Klein, 9–30. Chichester: John Wiley.
Duchan, J. F., G. A. Bruder, and L. E. Hewitt. 1995. Deixis in Narrative: A Cognitive Science Perspective. Hillsdale, NJ: Lawrence Erlbaum.
Felson, Nancy, ed. 2004. "The poetics of deixis in Alcman, Pindar and other lyric." Arethusa 37.3 (Special Issue).
Garry, Jane, and Carl Rubino, eds. 2001. Facts About the World's Languages: An Encyclopedia of the World's Major Languages, Past and Present. New York and Dublin: H. W. Wilson.
Green, Keith, ed. 1995. New Essays in Deixis: Discourse, Narrative, Literature. Amsterdam: Rodopi.
Imai, Shingo. 2003. "Spatial deixis." Ph.D. thesis, State University of New York at Buffalo.
Levinson, Stephen C. 1983. Pragmatics. Cambridge: Cambridge University Press.
Levinson, Stephen C., and David P. Wilkins. 2006. Grammars of Space: Towards a Semantic Typology. Explorations in Cognitive Diversity, Language, Culture & Cognition 6. Cambridge: Cambridge University Press.
Nozawa, Michiyo. 2000. "Participant identification in Mende." Available online at: http://www.sil.org/pacific/png/abstract.asp?id=506.
Te Aka Māori-English, English-Māori Dictionary and Index. Available online at: http://www.maoridictionary.co.nz/index.cfm.

DESCRIPTIVE, OBSERVATIONAL, AND EXPLANATORY ADEQUACY
In empirical science, the adequacy of a theory is determined by the
degree to which it gives insight into the real nature of certain aspects
of the world. The evaluation of a theory can concentrate on the correspondence between the predictions of the theory and observed
phenomena, on the plausibility of the system described by the theory as underlying these phenomena, or on the compatibility of the
theory with theories for adjacent fields. Ideally, a theory scores well
on all three of these accounts. In linguistics, a special set of terms formalizing these criteria was introduced by Noam Chomsky (1964).

Origin of the Terms

The terms observational adequacy, descriptive adequacy, and explanatory adequacy were introduced by Chomsky in his plenary address to the Ninth International Congress of Linguists in 1962. These terms are not widely used outside linguistics. They are collectively referred to as the levels of adequacy. Chomsky (1964, 28–9) describes them as follows:

A grammar that aims for observational adequacy is concerned to account for observed linguistic utterances.
A grammar that aims for descriptive adequacy is concerned to account for the speaker's underlying system of intuitions.
A linguistic theory that aims for explanatory adequacy is concerned to provide a principled basis for selecting a descriptively adequate grammar.

Position in Chomskyan Linguistics

The interpretation of the three levels of adequacy cannot be separated from Chomsky's view of the nature of language and the way it should be studied. Schematically, this view can be represented as in Figure 1, which is based on ten Hacken (2007), where a detailed discussion of and motivation for the elements of the diagram may be found. On the left-hand side, real-world entities are represented. Observable facts are phenomena and events that can be observed by the linguist, for example, grammaticality judgments. Competence is the knowledge of language in the mind/brain of the speakers that enables them to produce these facts. The language faculty is a set of genetically determined predispositions of human beings that enable them to acquire this competence. The gray arrows between them can be read as "underlies."

[Figure 1. Chomskyan linguistics and levels of adequacy.]

The rounded rectangles in the middle of Figure 1 represent theoretical entities. An observation is a theoretical entity in the sense that it imposes a certain structure on the world and selects relevant aspects. A grammar in Chomskyan linguistics is a theory of the speaker's competence. Universal grammar (UG) is a theory of the language faculty. The grammar can be tested by observations and can explain these observations because they correspond to consequences of the competence. UG can be tested by individual grammars because for each competence, a grammar must be available that is allowed by the language faculty. UG explains the individual grammars in the sense that it describes the mechanism that makes the emergence of competence in the individual (i.e., language acquisition) possible.

As indicated in Figure 1, the levels of adequacy correspond closely to the levels of theoretical depth. For observational adequacy, it is sufficient that the observable facts are covered. For descriptive adequacy, it is required that they be covered by a grammar that describes the speaker's competence. Although both observational and descriptive adequacy are described as properties of grammars, only for descriptive adequacy does the grammar correspond to "grammar" in Figure 1. A grammar that ignores the need to describe the speaker's competence is a grammar of a kind not represented in this figure. The opposition between descriptive and explanatory adequacy is characterized by the fact that the latter is a property of a linguistic theory of a higher level of abstraction than a grammar.
Discussion until the Emergence of Principles and Parameters

The main purpose of the introduction of the concept of observational adequacy seems to have been to set off Chomskyan linguistics from post-Bloomfieldian linguistics. In post-Bloomfieldian linguistics, a grammar was not supposed to describe the speaker's competence because the speaker's competence is a mental entity. Zellig S. Harris ([1951] 1960) and Charles F. Hockett (1954), for instance, reject any appeal to mental states because it is impossible to observe them directly. Any recourse to mental entities was deemed unscientific. In the framework of Figure 1, this is tantamount to a rejection of descriptive adequacy as a legitimate goal of linguistic theory. Of course, post-Bloomfieldian linguists could not accept the allegation that they were aiming only for observational adequacy, as shown by Fred W. Householder (1965). As analyzed by Pius ten Hacken (2007), post-Bloomfieldian linguistics assumed a different set of criteria for the selection of grammars, which was not compatible with the framework of Figure 1.
The criteria of descriptive and explanatory adequacy create a
certain tension because descriptive adequacy is served by a weak
UG and explanatory adequacy by a strong UG. The weaker the constraints imposed by UG, the more different grammars it allows and
the easier it is to find one for a particular language. The stronger
the constraints imposed by UG, the smaller the range of grammars
a child has to choose from in language acquisition and the more
aspects of the grammar that are determined by genetic factors.
The tension between descriptive and explanatory adequacy is
mentioned by Chomsky (1981, 3) as beneficial because it directs

linguistic theory to an optimal balance between the power of UG
and the power of individual grammars. It provides the basis for a
solution to the problem of selecting the grammar that corresponds
to the way competence is actually organized in the speaker. If we try
to devise a grammar for a language on the basis of a finite set of data,
there are indefinitely many candidates. No (finite) amount of additional data can constrain the range of candidate grammars to a finite
set. A very similar problem was known to the Post-Bloomfieldians
as the problem of non-uniqueness (cf. Chao [1934] 1957).
By exploiting the tension between descriptive and explanatory
adequacy, Chomsky hoped to solve the indeterminacy and, at the
same time, reach a deeper level of explanation. In Chomskyan
linguistics, it is assumed that there is a single correct grammar,
that is, the one describing the actual system in the speaker's
mind. If this system can come into existence, it has to be learnable on the basis of the language faculty and a limited amount
of input data from the environment. Whatever is contributed to
language acquisition by the language faculty must be common
to all human languages. This reasoning implies that descriptive
and explanatory adequacy can only be achieved simultaneously.
Without a proper theory of the genetically determined language
faculty, it is not possible to find a grammar describing the actual
knowledge of language that a speaker has.
An alternative tradition in the approach to the problem of
non-uniqueness is to deny the relevance of the problem. Harris
([1951] 1960) and W. V. Quine (1972), for instance, assume that
any grammar that covers the data is a correct grammar and that
there is no principled way to choose between alternative correct
grammars (cf. indeterminacy of translation). A common
assumption in this tradition is that explanatory adequacy can
only be achieved after descriptive adequacy has been achieved.
Gerald Gazdar and colleagues observe that "a description of the relevant phenomena is a necessary precondition to explaining some aspect of the organization of natural languages" (1985, 2).
Taken literally, their observation does not have a direct bearing
on the order in which descriptive and explanatory adequacy can
be achieved. They only mention description and explanation, not
descriptive adequacy and explanatory adequacy. For the application of the latter pair of terms, it is necessary to conceive a grammar as a theory of a speaker's competence. As this conception is
generally foreign to theories in this tradition, they are all classified
as aiming only for observational adequacy in the original sense of
the levels of adequacy introduced by Chomsky (1964).

Recent Developments
Chomsky (1981) introduced the principles and parameters
(P&P) model. In this model, the language faculty is considered as
a set of genetically determined principles that are operative in all
languages. In order to account for the differences between languages, principles are assumed to have parameters. A parameter
is a variable in the principle with a predetermined set of values.
Language acquisition is then analyzed as finding the right values
for the parameters. The P&P model solves the tension between
the demands of descriptive and explanatory adequacy because it
provides a basis for explaining language acquisition while describing the range of attested languages. It does not provide an obvious
constraint on the proliferation of parameters, however. This means
that the indeterminacy problem arises again, but at a higher level.


The higher-level indeterminacy problem raised by the proliferation of parameters cannot be solved within the framework of
Figure 1. Chomsky's (1995) minimalist program (MP) (see
minimalism) addresses this problem by considering the questions of the unification of linguistics with biology and the emergence of the language faculty in evolution (see biolinguistics).
The latter aspect is elaborated by Marc D. Hauser, Chomsky,
and W. Tecumseh Fitch (2002). In the analysis proposed by ten
Hacken (2007), the MP adds a new pair of a real-world entity and
a theoretical entity on top of the three represented in Figure 1.
These entities determine and explain evolution, respectively.
They are not part of linguistics proper but operate more generally. It is not necessary to specify their exact nature in order to
use them as constraints on the way the language faculty can have
emerged and should be shaped.
If the levels of adequacy are considered only in the way they
are defined by Chomsky (1964), they have lost their relevance in
the MP, as stated by Chomsky (2002, 12933). If they are considered as reflecting a wider-ranging concern in empirical science,
they have to be reformulated in a more general way. Ten Hacken
(2006) considers two alternative ways of adapting the levels of
adequacy to the expanded framework arising from the MP. One
is to add a new level of adequacy, directly connected with the
new pair of entities added on top of Figure 1. The other approach
is to relativize descriptive and explanatory adequacy with respect
to the level of the entities to which they are applied.
Pius ten Hacken
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Chao, Yuen-Ren. [1934] 1957. "The non-uniqueness of phonemic solutions of phonetic systems." Bulletin of the Institute of History and Philology 4: 363–97. Repr. in Readings in Linguistics: The Development of Descriptive Linguistics in America 1925–1956, ed. Martin Joos, 38–54. Chicago: University of Chicago Press.
Chomsky, Noam. 1964. Current Issues in Linguistic Theory. Den Haag: Mouton. This book is the original source for the terms descriptive, observational, and explanatory adequacy.
———. 1981. Lectures on Government and Binding. Dordrecht: Foris.
———. 1995. The Minimalist Program. Cambridge, MA: MIT Press.
———. 2002. On Nature and Language. Cambridge: Cambridge University Press.
Gazdar, Gerald, Ewan Klein, Geoffrey Pullum, and Ivan Sag. 1985. Generalized Phrase Structure Grammar. Cambridge, MA: Harvard University Press.
Harris, Zellig S. [1951] 1960. Methods in Structural Linguistics. Chicago: University of Chicago Press. Repr. as Structural Linguistics, 1960.
Hauser, Marc D., Noam Chomsky, and W. Tecumseh Fitch. 2002. "The faculty of language: What is it, who has it, and how did it evolve?" Science 298: 1569–79.
Hockett, Charles F. 1954. "Two models of grammatical description." Word 10: 210–31.
Householder, Fred W. 1965. "On some recent claims in phonological theory." Journal of Linguistics 1: 13–34.
Quine, W. V. 1972. "Methodological reflections on current linguistic theory." In Semantics of Natural Language, ed. Donald Davidson and Gilbert Harman, 442–54. Dordrecht: Reidel. This article offers an alternative conception of adequacy for grammars.
ten Hacken, Pius. 2006. "The nature, use and origin of explanatory adequacy." In Optimality Theory and Minimalism: A Possible Convergence? ed. Hans Broekhuis and Ralf Vogel, 9–32. Linguistics in Potsdam 25.

This article offers a version of descriptive and explanatory adequacy that adapts them to the latest developments in Chomskyan linguistics.
———. 2007. Chomskyan Linguistics and Its Competitors. London: Equinox.

DHVANI AND RASA


The central concept of ancient Indian aesthetics is rasadhvani. A combination of two interrelated terms, rasa and dhvani, the term refers to a complex theory of emotion-genres based on suggestiveness of language, allusion, and imagery. The theory developed during the classical period of Sanskrit literature. More recently, rasadhvani has drawn the attention of scholars in cognitive science, especially those who study representational emotion (Oatley 2004, 152). The term rasa refers to a reader's or viewer's aesthetic experience in relation to a work of art, music, or literature, his/her enjoyment (or "relish") of it. Dhvani refers to the verbal processes of suggestiveness, or vyañjakatva (Ingalls 1990, 9). Vyañjakatva of language and other representational signs is essential for triggering memory traces in the mind of the recipient. When this happens, an intersubjective mirroring (see mirror systems, imitation, and language) aligns remembered emotion with the represented rasa (Hogan 1996, 170–71).
Given the emphasis on an essentially transactive relationship between the actual and the imaginary, rasadhvani can be conceived of as a theory of reader response. However, it is not merely a reader-response theory but also a systematic theory of representation. Among the prominent rasa theorists whose treatises have been translated and reprinted are Bharata-Muni (second century b.c.e.), Ānandavardhana (ninth century c.e.), and Abhinavagupta (tenth century c.e.). A major contribution of these three, among many others, is the linking of emotion to genre in a systematic way (Oatley 2004, 153). Although each has his own pet project and theoretical (or practical) obsession, the common assumption is that a basic emotion (insofar as one can conceive of an emotion as basic), such as anger, fear, or love in its rasa format, can be and often is the unifying principle for an artwork, while ancillary emotional states elaborate the unifying rasa through antithesis and collaborative synthesis.
In their account of aesthetic experience, the rasa theorists refer to physical processes, such as senses and sense perceptions, and body, hand, and facial gestures, with the same ease with which they refer to subtler processes, such as the prāṇa (breath), and mental entities like manas (the mind), citta (cognition), buddhi (cognizing intelligence), and ahaṃkāra (ego consciousness). In his widely known work Yogasūtra, Patañjali (third century c.e.) searched for points of alignment between bodily processes that affect the mind and higher cognition (Patañjali 1971, 66–94). His ideas on the theory and practice of yoga aim for personal development, as well as social practice based on compassion and non-violence. Later, Ānandavardhana and Abhinavagupta considered aesthetic experience, too, as a form of yoga, approximating samādhi (contemplative realization), though it remains grounded in the materiality of experience. To this purpose, they refined the folk science of emotions while trying to give an account of why and how literary works induce emotional states and thought trends that produce aesthetic pleasure, why representations of fear, anger, horror, and so forth are enjoyable and deeply satisfying.

Just as the substantive body of Greek literature provided materials for Aristotle's therapeutic idea of katharsis, the rasa
theorists also had literary materials available to them. These included "all of the epic material and most of the early classical material that we now possess," and they were acquainted with "a substantial literature in Prākrit [vernacular], most of which is now lost" (Ingalls 1990, 5). Although earlier Sanskrit poetics tended to be somewhat prescriptive, rasa and dhvani aesthetics is primarily descriptive. No doubt, this is in part due to widely shared knowledge of canonical and local literatures, a fully developed literary culture, and the associated reception aesthetics. From the engagement of centuries, thus, emerged the eight emotion-genres, or rasas: the erotic (śṛṅgāra), the comic (hāsya), the tragic (karuṇa), the furious or cruel (raudra), the heroic (vīra), the fearsome or timorous (bhayānaka), the gruesome or loathsome (bībhatsa), and the wondrous (adbhuta). To these a ninth was added later, the rasa of peace (śānta) (Ingalls 1990, 16).
Elaborate identification of the determinants (vibhāvas) of emotion, its consequents (anubhāvas), permanent mood-congruent states of mind (sthāyibhāvas), and transient states of mind (vyabhicāri, or sañcāri, bhāvas) involved the rasa theorists in making a distinction between two sorts of emotion, rasa and bhāva. It is not uncommon for early theorists, as well as their modern commentators, to give differing accounts of the distinction between rasa and bhāva, because this is the most controversial area in rasadhvani studies (Pandit 2003, 165–72). The general consensus, however, is that bhāva, as indicated, is everyday emotion grounded in self-interest and ego attachment, and rasa is what we feel empathically in relation to the objective determinants of rasa, most often via characters in fiction. English renditions of the Nāṭyaśāstra translate bhāvas as emotional "tracts and states," providing long lists under each subtype of what would today be called emotion categories, ranging from physical emotion to mental states and thought trends (Bharata-Muni n.d., 87–113). Insofar as bhāvas are already part of representation, the differentiation of rasa as a separate emotive entity assumes that within the deictic fields of a narrative, characters' emotions will function, deictically, as raw emotions, rooted in egotism and various forms of misrecognition, but for the reader the emotional experience will be of rasa. Some of this applies to characters as well.
In contrast to real people, characters as deictic subjects come to
have fictionally complete lives (i.e., their stories resolve either
in death or with lovers uniting in the "happily ever after"); hence,
some of their emotive experience, especially toward the end, will
be that of rasa. In desiring, they will go beyond desire to achieve
an emotional state based on recognition and understanding.
The rasa versus bhāva distinction becomes more definitively clear in Ānandavardhana's focus on the importance of śānta rasa, which he considers "the greatest happiness" (Ingalls 1990, 16). Citing a verse from the epic Mahābhārata as supporting evidence, Ānandavardhana says that śānta (the peaceful) "is characterized by the full development of happiness that comes from the dying of desire" (Ānandavardhana 1990, 520). Not invested in the dying of desire but its aesthetic transformation, his successor Abhinavagupta too considers śānta an essential part of the rasa emotionality and, hence, a part of all rasas. Countering objections that śānta cannot be regarded as an emotion, he asks: "What shall we call the heroism of compassion? Is it the heroism of religion, or the heroism of generosity? It is neither; it is simply another name for the peaceful [śānta]" (Abhinavagupta 1990, 525).
Upon careful examination of the primary texts on rasadhvani, Ānandavardhana's Dhvanyāloka and Abhinavagupta's commentary, one is led to believe that while bhāva covers a range of emotions, representational and nonrepresentational, rasa is emotion aligned with the mind-steadying potentialities of śānta. The critics of this idea, who are rebuffed by Abhinavagupta, mistake steadiness of the mind for stillness and wonder if śānta can be a rasa.
In addition to considering śānta an emotion-genre as well as an overarching aesthetic for all rasas, Abhinava combines the concept of rasa with the concept of dhvani. It is through patterned verbal suggestion (dhvani) that the violent, the hateful, the horrific, the furious will give rise to aesthetic enjoyment: the rasa experience. There are various forms of dhvani, but Abhinava considered rasadhvani the most important. The most basic form of dhvani is vastudhvani, which refers to the suggestion of a thing, or a fact. It is not necessarily emotive, but emotive dhvani can build on it. The dhvani movement began with a paradigm shift away from the emphasis on figures of speech (alaṃkāra) in earlier Sanskrit poetics. However, it did not abandon that idea. The concept of alaṃkāradhvani combined connotation through figures of speech with suggestion, though rasadhvani was still considered more important.
In this connection, Daniel H. H. Ingalls notes that in Greek rhetoric, significatio may seem like a close parallel to dhvani, but it is the figure that draws attention to itself. He continues: "[O]nly under allegory and irony does Greco-Latin rhetoric come to what would qualify with Ānanda as dhvani, and at that only vastudhvani" (Ingalls 1990, 38). Unlike significatio, dhvani is not a trope; it does not draw attention to itself. It is a suggestive process, an aesthetic strategy that erases the primary meaning, mukhyārtha, of a word, or phrase, to incline it toward suggestiveness within the textual context of a representational schema.
It is important to keep in mind that the term dhvani was originally borrowed from the grammarians, for whom it had a technical meaning. In the Vākyapadīya, Bhartṛhari defines dhvani in the following manner: "The true form [that is the semantic content] in the word that is manifested by the dhvani is determined by a series of cognitions [viz., the cognitions of successive phonemes], which are unnameable [that is to say, each phoneme-cognition in itself is unassignable to this word or that], but favorable to the final [word-identifying] cognition" (quoted in Ingalls 1990, 170; bracketed insertions from Ingalls).
In poetic theory, the technical, linguistic meaning of dhvani is used only metaphorically. Graduated phoneme-cognition stands for the temporality of the reading process, where one is engaged but is not sure what the dhvani meaning might be until the narrative, musical, or other equivalent of the final word-identifying phoneme-cognition is registered on the mind.
A brief example of the operations of dhvani used in the Sanskrit tradition is a Prākrit verse, where a wife gives sleeping directives to the stranger who is a guest for the night (while the husband is away). She says, "Mother-in-law sleeps here, I there; / look, traveler, while it is light. / For at night when you cannot see, / You must not fall into my bed" (Ingalls 1990, 14). Clearly, by erasing the primary sense, mukhyārtha, of her words, the lonely wife makes an erotic suggestion through a literal negation of it. This is a very simple example. Others from the epics Rāmāyaṇa and the Mahābhārata are much more complicated.
In explicating some of these epic and classical examples from the rasadhvani perspective, Abhinavagupta developed a theory of memory equipped with the notion of memory banks, storage and retrieval processes. The connection to memory (semantic, emotional, and episodic) allowed the subsequent integration of rasadhvani into contemporary cognitive science (Hogan 1996, 170–76).
Following traditional theories of consciousness, Abhinavagupta believes that all experiences (perceptual, cognitive, emotional, etc.) leave traces in the mind (Gnoli 1968, 79). Reflecting on why representational grief is relished, Abhinavagupta explains that the basic emotion for grief is compassion. And, compassion consists of relishing (or aesthetically enjoying) grief. That is to say, where we have the basic emotion of grief, "a thought trend that fits with the vibhāvas and anubhāvas of this grief, if it is relished (literally, if it is chewed over and over), becomes a rasa and so from its aptitude [towards this end] one speaks of any basic emotion as becoming rasa" (Abhinavagupta 1990, 117; insertions by Ingalls). Drawing a general conclusion, he adds: "The basic emotion is put to use in the process of experiencing the rasa of literary and art works, as thought-trends are transferred from what one has already experienced in one's own life to one which one infers in another's life" (ibid.). This process involves distancing of one's own emotion from self-interested concerns to something larger.
Abhinavagupta's notions of memory-trace (saṃskāra) and desire-trace (vāsanā) are not unrelated to his theological preoccupation with how to free the mind from egocentric attachments (and ephemeral satisfactions) to incline it toward transcendental joy: ānanda. The importance given to śānta rasa has the same origin. The materiality of aesthetic experience, for Abhinavagupta, is a means of moving toward the nonmateriality of transcendental experience. Socially, the rasa theory focuses on the education of emotions that would produce sahṛdaya citizens. Abhinavagupta defines sahṛdayatva (literary sensitivity) as "the faculty of entering into the identity of the poet" (Ingalls 1990, 72). In a more modern context, sahṛdaya is a person trained in rasadhvani-generated understanding of emotion and social obligation (Hogan 2003, 12–17).
An instance from John Webster's Duchess of Malfi ([1623] 1961), a work clearly not known to the rasadhvani theorists, will demonstrate to what extent applications of the rasadhvani aesthetics are contingent neither on a shared origin nor on areal connectedness of traditions. Briefly, Webster's play revolves around a duchess's secret marriage, upon her early widowhood, to her steward, against the wishes of her socially powerful brothers. The marriage remains a secret for some time, but in the middle of the play the duchess is separated from her husband, Antonio, and is imprisoned along with her children and servant woman, Cariola. Toward the end of the play, a loyal friend, Delio, and Antonio approach the cardinal's house in hopes of reconciling Antonio to his wife's brothers. While the reader knows that the duchess, her children, and Cariola have been murdered, Delio and Antonio do not, at least not with certainty.

The deserted point of entry they choose is a walkway around a fortification near the ancient ruin of an abbey. A piece of the old cloister around there "Gives the best echo," says Delio to Antonio. The words repeated by the echo are so clear in their enunciation, Delio remarks, that it is regarded by some as a spirit (5.3.18). As one would expect, the echo repeats the last words and phrases of what Antonio and Delio say to each other, and these repeated sentence fragments (italicized in many editions of the play) compose a text of their own with a dhvani message to Antonio: a message that holds a mirror to his suspicions and dread.
As the conversation continues, Delio remarks that like human beings, ancient monuments also become deceased and so "Must have like death that we have" (5.3.19). The echo repeats: "Like death that we have" (5.3.20), changing the referent for "we" from humans in general to the duchess and her children. For the reader, the echo-utterance is her utterance, not because she has become a ghost but because we know that the duchess would like to inform and warn her husband. On his part, Antonio imagines that the echo spoke in "A very deadly accent" (23), as instantly it echoes back: "Deadly accent" (21). Delio, less filled with dread and more with anxious anticipation, thinks this utterance signifies "some thing of sorrow" (25). Expectedly, the phrase "A thing of sorrow" is reverberated and collated with the repetition of "That suits it best" (25–6). Drawn thus far into the echo-utterance, Antonio says: "'Tis very like my wife's voice"; the echo promptly repeats: "Ay, wife's voice" (27). The next few syntactic and semantic fragments, similarly taken out of the stated context when repeated by the echo, are "Do not" (30), suggesting that they should not go in. This pattern continues for some time, till the text develops through echoes of "Fly your fate" (38); "For thou art a dead thing" (41).
At the narrative level, clearly, the echo phenomenon foreshadows the end; yet there is no need for foreshadowing. The reader knows that nothing good can come out of Antonio's going into the house. It is equally clear that Webster is not using the supernatural trope. The echo-text is like a code; it uses vastudhvani in the way the primary sense in the sentences spoken by each speaker, their mukhyārtha, is erased. A suggested meaning, or the dhvani meaning, is inserted systematically. Up to this point in the play, the plot line involving the duchess and her family, though it has had moments of tragic grandeur, was mostly marked by abjection and horror. Into this general scenario, the well-placed junctural scene inserts a reflective moment of tranquility (of śānta), with a muted flavor of retrospective and prospective experience of grief.
Through the dhvani resonance of the echo scene, the tragic fate of this holy family (Antonio last parted from his wife near the Loretto convent, where she was captured by her brother, the cardinal [3.5.1140]) is, thus, represented through karuṇa rasa, the aesthetic feeling of compassionate sorrow. In contrast, what happens to the brothers is represented through the emotion-genre of absurd horror. From the start, Webster walls off the inner story of the duchess's secret marriage, her motherhood, and marital happiness from the overall clamminess of a sordid world, where the most heinous sinner is the one who punishes others for violation of social rules. Despite Webster's many ironies (embodied in the character of Bosola), the emotional logic of this inner story is like a clear stream, resonant with muted śṛṅgāra (the rasa of romantic love), and is constituted by various determinants, consequents, and permanent and transient emotional states. For instance, one permanent mood-congruent state of mind (sthāyi bhāva) is the duchess's love for her husband and her enjoyment of her secret marriage, despite the dread. Another is her vātsalya, mother-love. At the moment of her brutal murder in prison, when Cariola cries out that she will die with her, the duchess pleads: "I pray thee, look thou giv'st my little boy / Some syrup for his cold, and let the girl / Say her prayers ere she sleep" (4.2.188–90). Some of the transient emotional states (sañcāri bhāvas) are her hopes, fears, worries, and anxieties as she conceals her pregnancies and plans for her family's safety.
The precision and economy of Webster's semantic technique presents the reverberated sentence fragments not as random repetitions. Rather, they are put together like semantic parallels to the sequence of phoneme-cognitions, where the final phoneme, retrospectively, forms a meaningful unit (the grammarians' concept of dhvani, as mentioned earlier). Here, the final semantic unit, like the final phoneme, brings forth the full efficacy of rasadhvani when the two auditors (Antonio and Delio) stand awed by the final words: "Never to see her more" (5.3.45), echoed in answer to Antonio's "My duchess is asleep now, / And her little ones, I hope sweetly, O heaven, / Shall I never see her more" (42–4). After his conditional statement is changed to an assertion by the echo, "never see her more," Antonio says: "I marked not one repetition of the echo / But that; and on the sudden, a clear light presented me a face folded in sorrow" (46–9). What Antonio has known subconsciously for some time, but has not dared to believe, is communicated to him with concern, consolation, and infinite sorrow, minutes before he will have no time for reflection (on his fate), or for emotion of any kind.
Lalita Pandit Hogan
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Abhinavagupta. 1990. The Dhvanyāloka of Ānandavardhana with the Locana of Abhinavagupta. Trans. Daniel H. H. Ingalls, Jeffrey Moussaieff Masson, and M. V. Patwardhan. London: Harvard University Press.
Ānandavardhana. 1990. The Dhvanyāloka of Ānandavardhana with the Locana of Abhinavagupta. Trans. Daniel H. H. Ingalls, Jeffrey Moussaieff Masson, and M. V. Patwardhan. London: Harvard University Press.
Bharata-Muni. n.d. The Natya Sastra. Delhi: Satguru Publications.
Gnoli, Raniero. 1968. The Aesthetic Experience According to Abhinavagupta. Varanasi, India: Chowkhamba Sanskrit Series.
Hogan, Patrick. 1996. Towards a cognitive science of poetics: Ānandavardhana, Abhinavagupta, and the theory of literature. College Literature 23.1: 164–78.
———. 2003. Introduction: Tagore and the ambivalence of commitment. In Rabindranath Tagore: Universality and Tradition, ed. Patrick Colm Hogan and Lalita Pandit, 9–23. Madison, NJ: Fairleigh Dickinson University Press.
Ingalls, Daniel H. H., ed. 1990. The Dhvanyāloka of Ānandavardhana with the Locana of Abhinavagupta. London: Harvard University Press.
Oatley, Keith. 2004. Emotions: A Brief History. Oxford: Blackwell.
Pandit, Lalita. 2003. The psychology and aesthetics of love: Śṛṅgāra, Bhāvanā, and Rasadhvani in Gora. In Rabindranath Tagore: Universality and Tradition, ed. Patrick Colm Hogan and Lalita Pandit, 141–74. Madison, NJ: Fairleigh Dickinson University Press.

Patañjali. 1971. Yogasūtra. Ed. and trans. J. R. Ballantyne and Govind Shastri Deva. Varanasi, India: Indological Book House.
Webster, John. [1623] 1961. Duchess of Malfi. In Elizabethan Drama, ed. Leonard Dean, 271–360. 2d ed. Englewood Cliffs, NJ: Prentice-Hall.

DIALECT

In pursuit of a useful definition of dialect, one may begin by distinguishing language from dialect. Think of a language as a set. A dialect is a set member of that set. Just as a set does not exist in the absence of its set members, so a language does not exist in the absence of its dialects. A dialect is often identified as a variety of a language spoken by a group of people. Thus, a dialect involves the shared linguistic behavior and knowledge of a group, not of the idiosyncratic individual. This variety may be understood, to varying degrees, by people who speak other contemporary varieties of the same language. This criterion of mutual intelligibility is problematic, as different languages can also show degrees of mutual intelligibility. (Consider Swedish versus Norwegian.) Consequently, to speak of a dialect presupposes a prior identification of the language to which the variety belongs. Within this language, each variety will differ from others in terms of pronunciation, vocabulary, and/or grammar. More precisely, each variety may differ in its phonetic implementation rules, phonology, lexical items or semantics, morphology, syntax, and the pragmatic functions of discourse markers or syntactic structures. Although differences of speech-act realizations, narrative types and topics, or conversational routines can also exist, less research has explored these issues.

Generally, differences between varieties of the same language may be quantitative or qualitative in nature. What is a quantitative difference? Consider words like the or that. Most speakers of English will sometimes pronounce the initial consonant as a fricative [ð] and sometimes as a stop [d]. This alternation between [ð] and [d] can be quantified. In turn, the frequencies with which individuals say [ð] versus [d] may correlate to categories of identity, such as age, gender, class, or ethnicity. Alternating forms, like [ð] ~ [d], are called variants of a sociolinguistic variable. This variable is represented as (dh) in the literature. Sociolinguistic variables like (dh) show quantitative differences across social classes (upper favors [ð] more than lower), genders (women favor [ð] more than men), and age (working adults favor [ð] more than young children). Such frequency differences have also been termed group-preferential differences.

What is a qualitative difference? Consider the Spanish la casa (house) and la caza (hunt). In Puerto Rico, they are both pronounced [kasa]. In Spain, casa is [kasa] but caza is [kaθa]. Thus, in one dialect we find a feature that is absent from another. This is a qualitative difference. Some researchers identify this as a group-exclusive difference.

Qualitative and quantitative dialect differences may map onto groups defined by geographical regions or by social characteristics. Thus, researchers speak of regional or social dialects. This differs from the common belief that dialects are only regional. Because differences among the genders or age groups occur, we may ask if varieties associated with gender or age are dialects as well. In practice, researchers reserve social dialects for class and ethnic varieties, even though gender and age differences in a community may lead to new dialects. Because dialect boundaries are fuzzy, contiguous regional dialects may form dialect chains. Within the chain, two adjacent dialects will display greater similarity than two dialects that exist at a distance from one another.
Richard Cameron

SUGGESTION FOR FURTHER READING

Labov, William, Sharon Ash, and Charles Boberg. 2006. The Atlas of North American English: Phonetics, Phonology and Sound Change. Berlin: Mouton/de Gruyter.

DIALOGISM AND HETEROGLOSSIA


Dialogism and heteroglossia, as well as polyphony, chronotope, and an array of related terms used in literary criticism, cultural studies, and postcolonial studies, are most commonly associated with the writings of Mikhail Bakhtin (1895–1975). The concepts connected with these terms serve to question various forms of epistemologism or fixing of putative knowledge in a timeless, generalized, absolute form, a tendency common to many European philosophical systems of the nineteenth century and before (Holquist 1990, 18). Detailed elaborations of dialogic terminology by the original authors and commentators suggest a general distrust of any transformation of temporal experience into objectivist abstraction through a reifying vocabulary. Even though dialogical theory has accumulated its own privileged vocabulary, its deeper project is to challenge the monologic "discourses of authority" in favor of "internally persuasive discourses," which are anchored in the polyglossia of social life (Bakhtin 1981, 3345). From this perspective, dialogue is essential to the generation, accumulation, and dispersion of knowledge.
Given this emphasis on dialogue, it is somewhat ironic that the disputed authorship of various writings of the Bakhtin Circle has given rise to a great deal of debate over the years (Holquist 1990, 210–11). In their definitive biography, Katerina Clark and Michael Holquist take pains to establish that the volumes attributed to Valentin Voloshinov and P. N. Medvedev were really authored by Bakhtin. His involvement in "reverse plagiarism," the biographers attest, is a consequence of Bakhtin's love of conversation, "the give and take of good talk," which is the keystone of his dialogism (Clark and Holquist 1984, 146–70). Looking toward applications of dialogical theory to digital media, classroom pedagogy, cognitive theory, and empirical studies, others maintain that for searching beyond early dialogism, it is important to keep in mind that "there were important differences in ideology among the members of the Bakhtin Circle" (Bostad et al. 2004, 7). Nevertheless, they all shared a conviction that the sociohistorical embeddedness of symbolic tools implies that signs "carry their previous use with them without having entirely fixed meanings" (ibid., 11). This provides sufficient internal coherence to dialogical theory.

Some Basic Concepts


In the course of discussing the novel as a polyphonic genre, Bakhtin defines heteroglossia as the diversity of speech genres that are rooted in social life. The novel, he says, "orchestrates all its themes, the totality of the world of objects and ideas expressed in it by means of the diversity of speech genres (raznorečie) and by differing individual voices that flourish under such conditions" (1981, 263). His preference for the novel over the epic is
based on the notion that high genres distance literary form from
the diversity of speech genres by choosing an elevated style, a
unitary language, at the expense of the common language
and its polyphonic utterances. This joining of diversity must
be simultaneous, not segmented into different times. In other
words, it must be heteroglossic simultaneity. That simultaneity is
neither unitary nor directed at a telos. Its only goal is interactive
self-understanding.
Interaction is produced by dialogue, which brings us to dialogism. Foregrounding utterance as the most salient unit of language and dialogue as the social and biological condition in which language flourishes, Bakhtin states: "We are taking language not as a system of abstract grammatical categories, but rather language conceived as ideologically saturated, language as world view, even as concrete opinion, insuring a maximum of mutual understanding in all spheres of ideological life" (1981, 271). Just as language is not merely a system of abstract grammatical categories, but concrete opinion and world view, truth (pravda) is ongoing dialogue, never-ceasing talk. Since talk brings into play both the cognitive structure of the brain and one's immersion in the social life of a language, consciousness is necessarily participative consciousness, and the primary unit of polyphony in the novel is the author-hero dialogue that produces an open-ended, unfinalizable hero (Reed 1999, 117).
Although the specific subject for these formulations is Dostoevsky, the implications apply to literature, philosophy, linguistics, ethics, and social life in general. Similarly, the carnivalesque, as explored by Bakhtin in Rabelais, includes dialogically significant speech genres, such as "curses, oaths, slang, humour, popular tricks and jokes, scatological forms, in fact all the low and dirty sorts of folk humor" (Peter Stallybrass and Allon White, quoted in Emerson 1999, 247). For Bakhtin, however, the structures of addressivity and answerability (the ways in which all speech is oriented toward an addressee and calls for a dialogic answer) have a larger constitutive function than one might at first infer from the phrase speech genres. Within the frames of heteroglossic creativity and constraint, Bakhtin suggests, "Heroes are genres, and trends and schools are second and third rank protagonists" (1990, 78).
To say that some discourse is or is not dialogical itself risks reifying that discourse in a monological way, however. Thus, Bakhtin stresses, we may dialogize a discourse. Indeed, this perspective can be used to enter into dialogue with Bakhtin's relegation of the epic to the discourse of authority. A Bakhtinian approach entails that epics are utterances grounded in temporal social realities that, at some point in time, attain a degree of evaluative finalization and absorb a degree of valorized perception. The Indian epic Rāmāyaṇa, for instance, with its regional versions still available, along with their link to lively speech genres, can be understood as an ongoing societal conversation about ideas of order in family and state, about right rule, duties of wife to husband, brother to brother, ruler to subject, and so forth. However, it should not be forgotten that the Rāmāyaṇa's long life in cultural memory comes from its lasting value, determined not only by continual mainstreaming of its message but also by continual and only partially successful marginalization of its residual polyphony and heteroglossia.

What Dialogism Is Not


One might reasonably ask here if dialogism is just another way of
referring to Hegelian (or Marxist) dialectic or Saussurean binarism. In fact, there are striking differences, even contradictions,
among these concepts. Dialogism may, in fact, be seen as quarreling with the tropes of dialectic and binary opposition. On the
other hand, these quarrels never resort to confronting systems
with their elaborated antitheses because of the complicities this
entails (Pechey 1999, 326). In other words, Bakhtinian dialogism refuses an agonistic identification with the aggressor, as it
resists any kind of maximal codification of its own discourse and
terminology.
Consider dialectical thinking. Although it allows for a thesis
and an antithesis, dialectical thinking resolves differences in a
synthesis. Dialogue, on the contrary, advocates a copresence of
differences. More importantly, in dialogue, differences are never
only two that can be conveniently resolved into one. Differences
are many and dispersed because the voices in a society are plural
and dispersed.
Similarly, even though dialogism defines itself against monologism and, thus, sets up an initial binary opposition, dialogical theory does not sponsor binaryism, nor can it be confused with antibinaryism. In its most productive forms, dialogism is "asymmetric dualism" (Holquist 1990, 52). Reflecting on the shift from bipolar thinking to triadic thinking in dialogism, Sigmund Ongstad draws attention to the super-addressee as the invisible third party in dialogue. In Bakhtin's words, any dialogue takes place against the background of the invisible third's "responsive understanding" (quoted in Ongstad 2004, 78). The role of the super-addressee is, it must be noted, participatory, not telic or ontological.

Dialogism, Heteroglossia, and Answerability: An Example from King Lear
A brief example from King Lear will clarify this point. In the opening scene of the play, Lear asks his daughters how much they love him. After the first two have spoken, Lear asks what Cordelia can say to draw a third more opulent share of royal largesse. Cordelia's cryptic answer, "Nothing," clashes violently with the extravagant professions of her sisters. The addressee for all their speeches is Lear, also the others who are present. The super-addressee is the invisible viewer/auditor of Shakespeare's time and ours. When Lear asks Cordelia to mend her speech, lest she mar her fortunes (1.1.94–5), it is clear that the king
has already decided to punish by withholding patrimony (and
love), but the father worries. His caution expects filial answerability (of speech and action). Lears later characterization of his
loved daughter as Dowered with our curse and strangered with
our oath (1.1.205) does not merely show that he is hiding his
shame in anger; it attests to the fact that, though the relationship
has fallen apart, the frames of addressivity and answerability
are intact. Consequently, the dramaturgy of the scene is marked
by a violent clash between the discourse of authority shown
in Lears dividing of the kingdom and the mutually struggling


internally persuasive discourses of the ambitious, unloving daughters and the not-ambitious, loving daughter.
The heteroglossic vitality of this moment in the play is enhanced by Lear's own utterances as they move back and forth from the discourse of authority to an internally persuasive discourse. When Cordelia does not mend her speech enough and chooses, instead, to mar her fortunes, risking her hurtful answer, Lear asks: "But goes thy heart with this?" The poignant question betrays an involuntary admission that the heart does not need to go with the words one speaks, as much as it is indicative of his growing sense that the logic of the discourse (of love) he initiated has been undermined. Having used up his kinglike and fatherlike powers, Lear resorts to accusation: "So young and so untender?" uttered, once again, to the answerability of Cordelia's self-defense: "So young, my lord, and true" (1.1.105-8).
This kind of responsiveness to the other is the defining feature
of heteroglossic simultaneity, perhaps its central ethical norm.
At the beginning of the scene, Lear is not responsive to Cordelia
as his other. The super-addressee, embodied later in Lear's Fool, senses that Lear is like a monologic author, and his daughters are his heroines and villains. As an unsurprising parallel,
Bakhtin, too, thinks of the hero as a model for a person in society
and insists that a hero is not a voiceless object but a legal person with rights. Deictically imagined as someone who acts, talks,
and exists in a world of heteroglossic simultaneity, heroes are
formally equal subjects of law that is immanent to the relations
between persons themselves (Brandist 2004, 30; see deixis). In
this case, Cordelia, though not the hero of this play, is intent on
establishing herself as a person with a voice and a legal person
with rights, just as her sisters are invested in gaining access to
social power.
In stylistic terms, Lear's utterances conflate the formal language of the court with speech genres of familial conversation. He changes nouns to unusual verbs, as in "strangered" and "dowered"; repeats the first-person plural pronoun, "our," twice in the same line, calling attention to himself as speaker; substitutes curse and oath for authoritative command; and echoes Cordelia's "Nothing" at the moment of first shock that a daughter, who is also his subject, would be so insubordinate (1.1.87-90).
In the course of the play, Lear's unitary consciousness has to become a participative consciousness. Most importantly, Lear has to acquaint himself with the consciousness(es) of his daughters as others. In this way, the play seems to point directly toward Bakhtinian conclusions. The raison d'être for heteroglossic simultaneity as an aesthetic-ethical norm is to sustain the chaos of human experience and the noise of language within a polyphonic system, not to underwrite it through monological glossing.
Lalita Pandit Hogan
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Bakhtin, M. M. 1981. The Dialogic Imagination: Four Essays. Ed. Michael Holquist. Trans. Caryl Emerson and Michael Holquist. Austin: University of Texas Press.
———. 1984. Problems of Dostoevsky's Poetics. Ed. and trans. Caryl Emerson. Introduction by Wayne C. Booth. Minneapolis: University of Minnesota Press.
———. 1990. Art and Answerability: Early Philosophical Essays. Ed. and trans. Michael Holquist, Vadim Liapunov, and Kenneth Brostrom. Austin: University of Texas Press.
———. 1993. Toward a Philosophy of the Act. Ed. Michael Holquist and Vadim Liapunov, trans. Vadim Liapunov. Austin: University of Texas Press.
Bostad, Finn, Craig Brandist, Lars Sigfried Evenson, and Hege Charlotte Faber, eds. 2004. Bakhtinian Perspectives on Language and Culture: Meaning in Language, Art and New Media. New York: Palgrave Macmillan.
Brandist, Craig. 2002. The Bakhtin Circle: Philosophy, Culture and Politics. London: Pluto Press.
———. 2004. Law and genres of discourse: The Bakhtin Circle theory of language and the phenomenology of right. In Bostad et al. 2004, 23-45.
Clark, Katerina, and Michael Holquist. 1984. Mikhail Bakhtin. London: Harvard University Press.
Emerson, Caryl, ed. 1999. Critical Essays on Mikhail Bakhtin. New York: G. K. Hall.
Holquist, Michael. 1990. Dialogism: Bakhtin and His World. 2d ed. New York: Routledge.
Ongstad, Sigmund. 2004. Bakhtin's triadic epistemology and ideologies of dialogism. In Bostad et al. 2004, 65-88.
Pechey, Graham. 1999. Boundaries versus binaries: Bakhtin in/against the history of ideas. In Emerson 1999, 321-37.
Reed, Natalia. 1999. The philosophical roots of polyphony: A Dostoevskian reading. In Emerson 1999, 117-52.
Shakespeare, William. 2001. King Lear. Ed. R. A. Foakes. London: Thomson Learning.

DIFFUSION
The study of linguistic diffusion is concerned with describing and
explaining how languages or language features spread over time
and space. On a macrolevel, diffusion refers to the dispersion of
languages from a common point of origin. Through migration
and subsequent isolation, thousands of languages have developed from a highly limited set of protolanguages. The physical,
demographic, and social constraints on language dispersion cannot be reduced to a simple algorithm, and the reconstruction of
protolanguages, language family relationships, and patterns
of spatial dispersion remains a primary challenge in historical linguistics.
On a language-specific level, diffusion is concerned with the
spread of particular linguistic innovations across the varieties of
a language or, in some cases, across languages. Linguists, particularly sociolinguists, dialectologists, and historical linguists,
seek to identify the mechanisms of transmission and the factors
that promote or inhibit the spread of language traits. A linguistic
change is initiated in a particular locale at a given point in time
and spreads outward from that point in progressive stages so that
earlier changes reach the outlying areas later. The wave model
assumes that a change spreads in concentric layers, as waves
radiate outward from a central point of contact when a pebble is
dropped into a pool of water. Forms that obey this straightforward time-and-distance relation exhibit the pattern of contagious diffusion (Bailey et al. 1993).
Because of physical, social, and psychological factors, a model
that considers only time and distance is too simplistic to account
for the spread of linguistic forms. Diffusion researchers cite at least five factors that influence the dispersion of customs, ideas,
and practices: 1) the phenomenon itself, 2) communication networks, 3) distance, 4) time, and 5) social structure. Although linguistic structures are inherently quite different from phenomena
such as technological innovations, they are subject to many of
the same social and physical factors that influence the nature of
diffusion in general.
A gravity model or hierarchical model of language (Trudgill
1974) often provides a better profile of the diffusion of linguistic
forms than a simple wave model. In the gravity model, which is
borrowed from the physical sciences, diffusion is a function not
only of the distance from one point to another, as with the wave
model, but of the population density of areas that stand to be
affected by a nearby change. The interplay between the population density of two areas and the distance that separates them
parallels the effects of density and distance on gravitational pull.
Changes are most likely to begin in heavily populated cities that
serve as cultural hearths. From there they radiate outward, but
not in a simple wavelike pattern; rather, innovations first reach
moderate-size cities that fall under the area of influence of some
large, focal city, leaving nearby, sparsely populated areas unaffected. Gradually, innovations filter down from more populous,
denser areas to less densely populated areas, affecting rural
areas last, even if such areas are quite close to the original focal
area of the change. The spread of change is thus like skipping a
stone across a pond, rather than dropping a stone into a pond, as
in the wave model. The model of change following this pattern is
referred to as cascade diffusion.
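The gravity relation underlying cascade diffusion can be sketched in code. The formula below is a deliberately simplified illustration (influence proportional to the product of two populations and inversely proportional to their squared distance); Trudgill's published index includes further weighting terms, and all of the population and distance figures here are hypothetical.

```python
# A simplified gravity-style influence score: interaction between two
# centers grows with the product of their populations and shrinks with
# the square of the distance between them. (Illustrative only; the
# index in Trudgill 1974 includes additional weighting terms.)
def influence(pop_a, pop_b, distance_km):
    return (pop_a * pop_b) / distance_km ** 2

# Hypothetical figures: a focal city of 1,000,000 exerts more pull on a
# mid-size city 50 km away than on a small town only 20 km away --
# the "skipping a stone" (cascade) pattern described above.
focal_to_city = influence(1_000_000, 100_000, 50)  # distant mid-size city
focal_to_town = influence(1_000_000, 5_000, 20)    # nearby small town
assert focal_to_city > focal_to_town
```

This is why, under the model, an innovation can reach a distant large city before it reaches a sparsely populated village close to the focal area.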
One of the noteworthy examples of cascade diffusion is a
vowel shift currently taking place in the northern cities of the
United States. Part of this elaborate rotation involves the shift
of the vowel of thought so that it sounds more like the vowel of
lot. The lot vowel, in turn, sounds more like the vowel of trap,
which moves closer to the pronunciation of the vowel of dress.
This vowel shift proceeds from larger cities in the North to successively smaller ones, leaving in-between rural areas relatively
unaffected until the later stages of the change.
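The rotation just described forms a chain, which can be written out schematically; the mapping below is only a mnemonic for the direction in which each lexical-set vowel moves, not a phonetic analysis.

```python
# Schematic of the chain shift: each keyword's vowel moves toward the
# pronunciation of the next keyword's vowel (mnemonic only).
northern_cities_shift = {
    "thought": "lot",  # thought comes to sound more like lot
    "lot": "trap",     # lot, in turn, sounds more like trap
    "trap": "dress",   # trap moves closer to dress
}

# Following the chain from "thought" to its end:
chain, v = ["thought"], "thought"
while v in northern_cities_shift:
    v = northern_cities_shift[v]
    chain.append(v)
assert chain == ["thought", "lot", "trap", "dress"]
```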
Gravity models of change include factors of distance and
communication networks as a function of population density,
but they do not recognize the role of other social and psychological factors. For example, changes do not spread evenly across
all segments of the population. Members of upwardly mobile
social classes usually adopt linguistic innovations more quickly
than do members of other classes, and women and younger
people are often leaders in certain kinds of language change. It
is therefore essential to track changes not only across geographical space and population density but also across different age,
ethnic, gender, and social status groups (cf. age groups and
gender and language).
In terms of social networks, the first people to adopt changes
are those with loose ties to many social groups but strong ties to
none due to the fact that strong ties inhibit the spread of change.
In order for the changes to make their way into more close-knit
groups, they need to be picked up by people who are central figures in these groups but who are willing to adopt change nonetheless, perhaps for reasons of prestige. Because these early
adopters are well regarded in their social groups, the changes
they adopt are likely to be picked up by other members of these groups, thereby diffusing throughout the group and to other
groups or communities within a population (Labov 2001).
One important study of language diffusion in the southern
United States (Bailey et al. 1993) shows that although many
linguistic innovations follow the more common hierarchical
pattern of cascade diffusion, some features may display the
opposite diffusion pattern. For example, the use of the special
intentional modal fixin' to, as in "They're fixin' to go now," once
heavily concentrated in rural areas of the American South, has
now been adopted in some larger, urban population centers.
The explanation for this contrahierarchical diffusion pattern is
tied to its symbolic marking of traditional southern American
speech. In the face of a large influx of outsiders into the region,
native urban residents may seek to assert their southern identity by adopting selected structures strongly associated with the
regional South, showing that the social meaning attached to
linguistic forms has to be considered along with geographical,
demographic, and interactional factors in explaining linguistic
diffusion.
Walt Wolfram
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Bailey, Guy, Tom Wikle, Jan Tillery, and Lori Sand. 1993. Some patterns
of linguistic diffusion. Language Variation and Change 5: 359-90.
Labov, William. 2001. Principles of Linguistic Change. Vol. 2. Social
Factors. Malden, MA, and Oxford: Blackwell.
Rogers, Everett M. 1995. Diffusion of Innovations. 5th ed. New York: Free
Press.
Trudgill, Peter. 1974. Linguistic change and diffusion: Description and
explanation in sociolinguistic dialect geography. Language in Society
3: 215-46.

DIGITAL MEDIA
Digital media are those media whose means of production and
distribution are digitized via computers; the term is commonly
used in contrast to older forms of media, such as print (for text)
or analog devices (for sound and images). In language studies,
digital media most commonly refer to the Internet, the World
Wide Web, mobile telephony, and other networked and wireless
technologies that support human communication (known as computer-mediated communication, or CMC) and the transmission of information. Digital media can also refer to digital storage
devices for data, sound, video, and graphics. Here, we are concerned primarily with the former sense, especially the impact of
digital communication technologies on people's individual and
collective use of and relation to language.

History
Communication via digital media can be traced to the invention of packet switching technology in the 1960s, which enabled
messages to be exchanged among networked computers. The
ARPANET, the predecessor of the Internet, was implemented as
a United States Defense Department project in 1969; by the mid-1970s, it had become popular for human communication via
e-mail and mailing lists. In 1979, the USENET was created as an
alternative, grassroots network; USENET newsgroups, along with
various BBS (bulletin board systems) and networks hosted on

private servers during the 1980s, were eventually integrated into
the Internet, the term used after 1983 for the collection of networks that had grown around the ARPANET. By the late 1980s,
the Internet offered public real-time chat via Internet Relay Chat
and MUDs (Multi-User Dimensions), along with email, mailing
lists, and newsgroups. Around the same time, Internet service
providers (ISPs) were starting to make the Internet accessible
to people in their homes, rather than just from businesses and
universities.
The introduction of the World Wide Web in 1991 and the first
graphical browser in 1993 transformed the Internet by enabling
networked multimedia. By the mid-1990s, Internet telephony
and videoconferencing were available, along with graphical virtual worlds. Despite the increasing availability of bandwidth to
support multimedia, however, text retained its popularity. The
late 1990s saw the emergence of several text-based applications: instant messaging, weblogs (blogs), and text messaging on
mobile phones (especially in Europe and Asia).
A more recent trend has been toward mobile media and flexible access. Starting with external hard drives for data
storage and continuing with laptops, personal digital assistants
(PDAs), iPods, and smartphones, digital media have moved
away from desktop computing toward more distributed, lightweight, faster devices.

Language-Related Issues
The rapid rise in popularity of digitally mediated communication
over the past two decades has attracted considerable interest
from language scholars. The central debates have focused on
how to classify such communication relative to speech and writing, the effects of technology on language and language use, the
purported anonymity of text-based CMC and its social and linguistic consequences, and the long-term effects of digital media
on individual languages and the global language ecology.
Computer-mediated communication is sometimes claimed
to constitute a third modality of language, alongside speech and
writing. Text-based CMC, by far the most common manifestation
of digital communication, blends the production and reception
features of writing (typing on a keyboard or otherwise entering
characters into an alphanumeric interface; reading messages on
a screen) with the structural and interactional features of spoken
conversation (e.g., informality, phatic content, relatively rapid
exchange of messages), making it a hybrid modality with distinctive characteristics (Crystal [2001] 2006). Moreover, the personal
accessibility and wide public reach of the Internet have led some
to characterize it as fundamentally transformative of human
communication, a revolution as profound as that triggered by
the printing press.
At the same time, the novelty of digital language should not
be overstated. It is often possible to trace the roots of so-called
emergent or digitally native CMC genres (Crowston and Williams
2001) to older written and oral genres. An example is the blog,
which, while arguably a historically unprecedented hybrid of
personal, interpersonal, and mass communication, manifests
continuities with handwritten diaries, phone calls to friends and
family, project logs, and letters to the editor. Ultimately, what may
be most unique about digital media is their tendency to support
a convergence of language features, genres of communication,


and communication technologies that were previously considered distinct. The incorporation of text chat into multiplayer
online games and the ability to send text messages from mobile
phones to interactive television (iTV) programs illustrate the latter trend.
Theoretical debate has also centered around the effects of
digital technology on human communication. A strong technological determinism position holds that production and reception constraints on CMC inevitably shape digitally mediated
language and language use. Such a position finds support in
research findings that technical constraints on message exchange
disrupt and reshape turn-taking patterns across a range of digital
genres (Herring 1999). A weaker version of technological determinism holds that features of specific technologies predispose
users to communicate in certain ways, but that users may override those predispositions. For example, the synchronicity of
CMC systems tends to affect message length, complexity, and
formality (with messages in asynchronous modes being generally longer, more syntactically complex, and more formal than
in synchronous modes), although both formal and informal language can be found, for example, in email (asynchronous) and
chat (synchronous), depending on the topic and purpose of the
communication.
The social construction of technology theory goes further to
assert that users shape technologies through their use as much or
more than their use is shaped by those technologies (Bijker and
Law 1992). This view receives support from computer-mediated
cooperative work and online education, where the nature of
the tasks structures communication in often predictable ways.
Further, many face-to-face social and interactional dynamics,
including gendered patterns of communication, are reproduced
in digital discourse, albeit differently in academic discussion
forums than in chat. In an effort to account for such variation,
a fourth position holds that there is no single way in which technology influences mediated language; rather, it depends on the
particular constellation of technical and social variables that
characterizes a given sample of mediated discourse (Herring
2007). A desideratum for future research is a coherent theory
that can predict when specific types of media will have particular
communicative effects.
Another nexus of debate concerns the purported anonymity of digitally mediated communication. Because social cues
conveyed through prosody, facial expression, and physical
appearance of message senders are filtered out in text-based
CMC, many early scholars believed that digitally mediated communication was depersonalized and that users' identities were
masked or irrelevant. This was thought to give rise to flaming or
hostile language (and antisocial behavior, in general); play with
identity and liberatory (or inauthentic, depending on one's perspective) online self-presentations; and compensatory linguistic strategies, such as creative spellings and emoticons (faces made out of ASCII characters), in order to enhance one's social presence and signal one's intentions. These linguistic strategies have
been referred to as textspeak by David Crystal ([2001] 2006; for
examples, see Figure 1).
Alternative perspectives have also been advanced on these
phenomena, however. True anonymity is infrequent, since most
people who communicate digitally use consistent identifiers, and

Digital Media
Figure 1. Examples of textspeak (bolded in the original) in Hong Kong English, French, Romanized Arabic, and Japanese (from Danet and Herring 2007):
HK English: Hee hee . . . dunno why I always like to send u mails ar! Part is becoz I wanna keep contact with u la!
French: Ca sera donc tjs 1 plaisir 2te revoir! :-) [So it will always be a pleasure to meet you again :-)]
Arabic: w 3laikom essalaaam asoomah ^_^ [Hi there, Asoomah ^_^]
Japanese: [Japanese characters not reproduced] (*^ ^*) [Congratulations on your comeback (as if singing) That was good (*^ ^*)]

in the case of private communication (e.g., via e-mail, instant messaging, or short message service [SMS]), the communicators usually already know one another. Flaming may be better
explained by the lack of accountability characteristic of public
Internet forums than by anonymity per se, given that many hostile messages are sent by people with known identities. Play with
identity, while fashionable in some chat environments, occurs
less often in practice than was implied by early theorists, in part
due to the difficulty of maintaining a false identity over time.
Recent years have also seen an increasing tendency for people to
post photographs of themselves, for example, on social networking sites, although false and digitally modified photos can, of
course, be posted. Finally, textspeak is also shaped by the impetus to type quickly, especially in real-time message exchanges,
resulting in creative, often abbreviated, spellings. Nonetheless,
it remains the case that digital media afford new and increased
opportunities for selectively crafting one's self-presentation,
both linguistically and visually, and for deceptive communication to take place.
The scope and spread of digitally mediated communication,
both globally and over time, give rise to other language-related
issues. Digital media enable unprecedented large-scale conversations (e.g., in public discussion forums) and provide vast,
potentially interactive audiences (e.g., for websites and blogs) in
which many participants are unknown to one another and participation is open to a wide spectrum of society. Conversations
involving hundreds (or thousands) of people raise new challenges for maintaining interactional coherence, and unknown
audiences constitute new kinds of addressees when the broadcast content is personal, as is the case for many blogs. As ordinary language users come to grips with these challenges, new
media-specific norms are emerging, much as people a century
ago evolved new interactional and pragmatic norms for speaking
over the telephone.
The Internet enables new kinds of social formations to arise, known as virtual communities, which often develop characteristic communicative practices; these, in turn, may spread. New
lexical items, as well as textspeak features, have diffused rapidly across the Internet and have become integrated to varying
degrees into everyday speech and writing, especially those of
young people, giving rise to the claim that digital media are accelerating processes of language change. This includes introducing

new morphological formatives such as e- and cyber- into the English language; however, there is less evidence that digital
media are associated with syntactic changes, which typically take
place more slowly. The fears of some educators and journalists
that digital communication is accelerating language decline and
interfering with children's learning of standard written language
appear to have no basis in empirical fact (Thurlow 2006).
Digital media also have global implications for cross-cultural
communication, multilingualism and language choice, and the
status of individual languages. Although still a small percentage
of the world's languages, those used on the Internet are growing in number. Figure 1 gives examples of textspeak in four
languages.
There is debate, however, as to whether linguistic diversity
equal to that in the offline world will eventually be achieved,
or whether digital media are promoting and accelerating the
dominance of English and other large languages. Evidence from
multilingual contact situations, such as cross-national Internet
discussion forums, suggests that English or the regional language
(e.g., Spanish, German, Russian) tends to be used as a lingua
franca in order to ensure the widest comprehension; this trend
bodes ill for the use of minority languages in such forums. At
the same time, many Internet forums have national rather than
international audiences, and localization efforts are producing hardware and software in local languages. Some speculate
that these trends are leading toward a global diglossia, with
English as the High (international) variety and local languages
as the Low, or colloquial, variety. The Internet has also been used
with some success as a tool to support revitalization efforts for
endangered languages (Danet and Herring 2007).

Current State of Research


From the outset, scholarship on digital media was broadly interdisciplinary. In the first two decades of CMC research, scholars
trained in communication, rhetoric, social psychology, management, linguistics, human-computer interaction, anthropology,
and education came together in interdisciplinary fora to try to
meet the challenge of characterizing online communication,
and in recent years, new interdisciplinary fields have arisen in
which digital media play a central role, such as new media studies
and social informatics. At the same time, there is a trend toward
increasing disciplinary specialization, as new media become

accepted into mainstream disciplinary approaches. In language
studies, new media currently provide application domains (e.g.,
for language learning) and sources of data for empirical analysis
and, increasingly, for theorizing about language from cognitive,
social, and evolutionary perspectives.
Susan C. Herring
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Bijker, Wiebe, and John Law, eds. 1992. Shaping Technology/Building
Society: Studies in Sociotechnical Change. Cambridge, MA: MIT Press.
Crowston, Kevin, and Marie Williams. 2001. Reproduced and emergent
genres of communication on the World-Wide Web. The Information
Society 16.3: 201-16.
Crystal, David. [2001] 2006. Language and the Internet.
Cambridge: Cambridge University Press.
Danet, Brenda, and Susan C. Herring, eds. 2007. The Multilingual
Internet: Language, Culture, and Communication Online. New
York: Oxford University Press.
Herring, Susan C. 1999. Interactional coherence in CMC. Journal of
Computer-Mediated Communication 4.4. Available online at: http://jcmc.indiana.edu/vol4/issue4/herring.html.
———. 2004. Slouching toward the ordinary: Current trends in computer-mediated communication. New Media & Society 6.1: 26-36.
———. 2007. A faceted classification scheme for computer-mediated
discourse. Language@Internet. Available online at: http://www.languageatinternet.de/articles/761.
Thurlow, Crispin. 2006. From statistical panic to moral panic: The
metadiscursive construction and popular exaggeration of new
media language in the print media. Journal of Computer-Mediated
Communication 11.3: article 1. Available online at: http://jcmc.indiana.edu/vol11/issue3/thurlow.html.

DIGLOSSIA
In his seminal article, Charles Ferguson (1959, 435) defined
diglossia as
a relatively stable language situation in which, in addition to the
primary dialects of the language (which may include a standard
or regional standards), there is a very divergent, highly codified
(often grammatically more complex) superposed variety, the
vehicle of a large and respected body of written literature, either
of an earlier period or in another speech community, which is
learned largely by formal education and is used for most written
and formal spoken purposes but is not used by any section of the
community for ordinary conversation.

Using the examples of Greek, Arabic, Haitian Creole, and Swiss German, Ferguson discussed several characteristics common
across diglossic situations. First of all, there is a strict division
of labor between the two varieties: The superposed variety or
the H(igh) variety is used mostly in prestigious domains (e.g.,
education; see prestige), and the vernacular or the L(ow)
variety is restricted to informal domains (e.g., neighborhood).
Second, although the two varieties are genetically related, the
H variety is structurally more complex than the L variety (e.g.,
the H variety has more overt case markers than the L variety). Third, the H variety is more highly valued than the L variety: While there is a sizable body of literature written in the H
variety, the L variety is rarely used in the written form except
in dialect poetry and advertising. Fourth, the H variety tends


to be more standardized than the L variety: Grammars and dictionaries are written for the H variety, but not usually for the
L variety. Fifth, while the L variety is the language of the home,
the H variety is not spoken natively by anyone in the community and has to be learned through schooling. Finally, although
the L variety may gradually replace the H variety due to such
factors as more widespread literacy and broader communication among different social groups, a diglossic situation usually
persists for centuries or even millennia.
Diglossic situations are different from other commonly
found language situations in several respects. In contrast to
diglossic situations, many bilingual situations do not maintain a clear functional compartmentalization of the two varieties
(see bilingualism). In Arabic-speaking countries, colloquial
Arabic serves as the basic medium of interaction, but modern
standard Arabic is the preferred variety for formal purposes.
However, in a bilingual community such as Flemish- and
French-speaking Belgium, both varieties are used to perform
similar functions in formal and informal domains. Diglossic
situations are also different from standard-with-dialects situations. In the German-speaking regions of Switzerland, the H
variety (Hochdeutsch) is learned through formal schooling and
is not used as the medium of everyday interaction. On the other
hand, in Italy (a standard-with-dialects situation), many people
speak standard Italian natively and use it in formal as well as
informal settings.
Over the years, numerous scholars have reworked Ferguson's
definition of diglossia. While maintaining the criterion of strict
functional compartmentalization, Joshua Fishman (1967)
broadened the definition of diglossia to include genetically unrelated varieties. According to this broad definition, Spanish- and
Guaraní-speaking Paraguay would be classified as a diglossic
community, in that the two genetically unrelated varieties function like H and L varieties in diglossic situations. However, some
have criticized this definition as diluting the original meanings of
diglossia. Although the Spanish-Guaraní situation resembles the
diglossic situation in the Arabic-speaking world, the two differ in
their social origin and course of development. While the former
came into being through the confluence of two sociolinguistic
traditions as a result of colonial contact, the latter was derived
from the internal functional differentiation within a single
sociolinguistic tradition (see colonialism and language).
Furthermore, when language shift occurs in a bilingual community, it is usually the H variety that replaces the L variety.
In contrast, in the terminal stages of Fergusonian diglossia,
the L variety often displaces the H variety. More recently, Alan
Hudson (2002) argued that the absence of native H speakers
distinguishes diglossia from bilingual situations like the one in
Paraguay. This characteristic, Hudson maintains, enhances the
stability of diglossia. Without a prestigious community of native
H speakers, L speakers lack the motivation to adopt H for everyday communication.
Another point of contention is the discreteness of H and L. In
many diglossic communities, there exists a continuum of forms
between the H and L varieties. In addition, speakers sometimes
mix H and L in the same functional domain and even in the same
utterance. In the Arabic-speaking world, speakers sometimes
engage in diglossic switching (Walters 2003). In this case, one
variety (i.e., the matrix) provides the frame for an utterance,
while the other supplies lexical items that are inserted into the
frame. In formal interviews, speakers may use modern standard
Arabic as the matrix but draw on lexical items from the L variety. In other cases, the L variety serves as the matrix. This may
occur when Arabs who speak different Arabic varieties interact
with one another. They use mostly their own varieties but with
lexical items from Modern Standard Arabic and other spoken
Arabic varieties. A closer look at diglossic switching is warranted
because it may yield important insights into the nature of code
mixture in diglossic communities.
In his final major statement on the subject, Ferguson (1991)
lamented the fact that studies in the last few decades have focused
mostly on individual cases and examined whether or not they
are instances of diglossia. He called for more cross-community
studies that investigate the origins and developments of different
diglossic situations, as well as research that examines diglossic
situations during rapid social change.
Andrew Wong
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Ferguson, Charles. 1959. Diglossia. Word 15: 325–40.
Ferguson, Charles. 1991. Diglossia revisited. Southwest Journal of Linguistics 10.1: 214–34.
Fishman, Joshua. 1967. Bilingualism with and without diglossia; diglossia with and without bilingualism. Journal of Social Issues 23.2: 29–38.
Hudson, Alan. 2002. Outline of a theory of diglossia. International Journal of the Sociology of Language 157: 1–48.
Walters, Keith. 2003. Fergie's prescience: The changing nature of diglossia in Tunisia. International Journal of the Sociology of Language 163: 77–109.

DISCOURSE ANALYSIS (FOUCAULTIAN)


The type of analysis of discourse developed by Michel Foucault has a complex history, and the term discourse itself is used in a range of ways by different theorists. Even Foucault himself draws attention to the difficulty of fixing on a particular type of analysis that follows from his definition of the term. As he comments, "Instead of gradually reducing the rather fluctuating meaning of the word discourse, I believe I have in fact added to its meanings: treating it sometimes as the general domain of all statements, sometimes as an individualizable group of statements, and sometimes as a regulated practice that accounts for a number of statements" ([1969] 1972, 80).
This quotation is crucial for understanding the range of meanings that the term has accrued within Foucault's work and within that of other discourse theorists and, consequently, crucial to the type of analysis of discourse that they undertake. The first definition that Foucault gives is the most general one: "the general domain of all statements"; that is, all utterances or texts that have meaning and some effect in the real world count as discourse. This is a broad definition and is generally used by Foucault in this way, particularly in his earlier, more structuralist work, such as Archaeology of Knowledge ([1969] 1972), when he is discussing the concept of discourse at a theoretical level. It may be useful to consider this usage to be more about discourse in general than about a discourse or discourses, with which the second and third definitions are concerned. The second definition that he gives, "an individualizable group of statements," is one that is used by Foucault when he is discussing particular structures within discourse; thus, he is concerned to be able to identify discourses, that is, groups of utterances that seem to be regulated in some way and to have a coherence and a force to them in common. Within this definition, therefore, it would be possible to talk about a discourse of femininity, a discourse of imperialism, and so on. His third definition of discourse is perhaps the one that has the most resonance for many theorists: "a regulated practice which accounts for a number of statements." I take this to mean that he is interested less in the actual utterances/texts that are produced than in the rules and structures that produce particular utterances and texts. It is this rule-governed nature of discourse that is of primary importance.
Within most discourse theorists' work, these definitions are used sometimes almost interchangeably. One of the most productive ways of thinking about discourse is not as a group of signs or a stretch of text but as "practices that systematically form the objects of which they speak" (Foucault [1969] 1972, 49). In
this sense, a discourse is something that produces something
else (an utterance, a concept, an effect), rather than something
that exists in and of itself and can be analyzed in isolation. A discourse is generally something that is affirmed by an institution
and, therefore, constitutes an intervention in power relations. A
discursive structure can be detected because of the systematicity
of the ideas, opinions, concepts, ways of thinking, and behaving
that are formed within a particular context, and because of the
effects of those ways of thinking and behaving.
The theorists who have drawn on Foucault's work on discourse most extensively to develop a form of discourse analysis
have been critical discourse analysts. They have tried to
develop a form of linguistic analysis of texts that is openly political and, therefore, draws on a more social model of discourse
than conventional linguistics generally does (Fairclough 1992;
Thornborrow 2002; Wodak 1998; see Mills 2004 for a fuller discussion). Very often, critical discourse analysts examine texts
and utterances that seem to display extreme power differentiation (see inequality, linguistic and communicative),
and they draw attention to some of the more troubling aspects
of these texts in order to bring about change at a discoursal level
but also, more importantly, at a material level. For them, as for
Foucault, discourse is crucial for constructing a social identity
and for resisting or affirming the social roles that others construct
for us. By becoming aware of the systemic nature of some of the
ways in which institutions position individuals through discourse,
it is possible to challenge them and construct alternative modes
of representation. These theorists often fuse linguistic analysis,
such as systemic linguistics or conversation analysis, with
a more Foucaultian analysis of discourse. (Other theorists, such as D. Smith (1990), use Foucault's work on discourse in a more thoroughly social or cultural analysis, without focusing on language as such; they would be considered discourse theorists rather than discourse analysts.) However, some theorists, such as J. Blommaert (2004), are critical of the use of Foucault's work within a broadly linguistic analysis, as, for him, this constitutes a distortion of Foucault's overall project.
N. Fairclough (1992) draws on Foucault's conception of discourse in order to develop a very systematic type of analysis of text. He provides working models and forms of practice from Foucault's theoretical interventions, together with a description of the effects of discursive structures on individuals. For him, critical discourse analysis is not only concerned to describe discursive structures but also shows how discourse is shaped by "relations of power and ideologies, and the constructive effects discourse has upon social identities, social relations and systems of knowledge and belief, neither of which is normally apparent to discourse participants" (Fairclough 1992, 12). Furthermore, Fairclough uses Foucault's conception of discourse because of the stress that Foucault lays on the constitutive nature of discourse: the fact that discourse structures the way that we perceive objects and reality. For Fairclough, critical discourse analysts can unpick commonsense knowledge and views of the world that present themselves as self-evident and natural, as all of these types of knowledge will inevitably be profoundly ideological. By foregrounding the constructed and ideological nature of this knowledge, it will be possible to suggest ways of seeing that are more productive and egalitarian.
The influence of Foucault can be seen in the emphasis that
these theorists accord to the workings of power. Generally,
within critical discourse analysis, there is an emphasis on what
Foucault would term repressive power, that is, a view of power
relations that stresses the way that individuals are prevented
from doing what they wish because of other individuals or institutions. However, Foucault stresses that power is not simply
the imposition of someone's will upon another but, rather, that
power should be seen as a network of power relations among
all members of a social group. Discourse is a key element in the
working out of power relations since discourse not only marks
perceptions of power difference (one displays one's self or position within a hierarchy through one's discursive choices) but also
affirms and contests those perceptions or power differences.
In that sense, individuals engage in power relations even in the
most mundane interactions. For example, within everyday conversation, critical discourse analysts would draw attention to the way
that only certain people consider it their role to sum up an interaction or to comment on the point of an interaction, which is a
very powerful position to construct for oneself through discourse.
J. Thornborrow's (2002) work draws on Foucault's notions of discourse and productive power, together with an analytical framework from conversation analysis, to develop a form of discourse
analysis that focuses on the way power relations are effected within
institutions (see also Goodwin 1994). Analysis of discourse, here,
focuses on the language resources available and the way that
they are used by those who have institutional power. Rather than
assuming that certain elements of language are powerful in themselves, Thornborrow considers that there are certain language
styles and procedures likely to be used by those who are in positions of power. Some of these styles, such as the use of indirectness
or politeness, which appear to be relatively neutral styles, will
be understood by others within a framework of power relations.
For example, if a manager makes an indirect request to an office
worker, that request will be understood as a command rather than
as a simple request. However, in addition to analyzing the strategies of those in positions of power, Thornborrow also focuses on
the strategies used by those who are less powerful but who use language strategically to achieve what she terms "local status," that is,
a form of interactional power achieved at a local level.
Thornborrow challenges a great deal of the work by critical discourse analysts who focus solely on the way that language is used
to oppress others, the standard example being the way that doctors speak to patients (asking more questions, providing information, deciding the topic of the interaction, interrupting, and so on).
Instead, she examines interactions such as that of a woman who
is being interviewed by police officers in relation to a rape allegation that she had made. She focuses on the way that the police
officers try to take control of the interaction by drawing on powerful language resources, such as interruption. The woman interviewee, however, does not simply submit to their interruptions but
instead tries to structure the interaction from her own perspective
and to meet her own needs. Through the woman's interventions in
the interview, it is possible to see that she is not simply a victim of
oppressive linguistic strategies but that she employs a range of discursive tools, such as persistently asking questions, to assert her
right to have her point of view considered.
Other discourse analysts, such as C. Walsh (2001), have
focused on the discursive structures that act upon women who
enter the public sphere and that categorize the interventions
of women as feminine or as trivial. She examines the way that
newspapers report on women in positions of power and the fact
that they often focus on their appearance, their sexuality, and the
way they dress, rather than on the work that they do. She also
examines the way that women are often represented as if they
were in the private sphere instead of in the public sphere. She
focuses on the systematic nature of this type of representation so
that it can be seen to be a general trait, rather than a tendency in
certain newspapers (see also gender and language).
The critical and analytical perspective of theorists and analysts such as Fairclough, Blommaert, C. Goodwin, Thornborrow,
and Walsh is a significant reinterpretation of Foucault's work through the matrix of linguistics' concern for verifiable, replicable analyses.
Sara Mills
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Blommaert, J. 2004. Discourse. Cambridge: Cambridge University Press.
Fairclough, N. 1992. Discourse and Social Change. London: Polity.
Foucault, M. [1969] 1972. Archaeology of Knowledge. Trans. A. M.
Sheridan Smith. New York: Pantheon.
Goodwin, C. 1994. Professional vision. American Anthropologist 96: 606–33.
Mills, S. 2004. Discourse. 2d ed. London: Routledge.
Smith, D. 1990. Texts, Facts, and Femininity. London: Routledge.
Thornborrow, J. 2002. Power Talk: Language and Institutional Discourse.
Harlow, UK: Longman.
Walsh, C. 2001. Gender and Discourse: Language and Power in Politics,
the Church and Organisations. Harlow, UK: Longman.
Wodak, R. 1998. Disorders of Discourse. Harlow, UK: Longman.

DISCOURSE ANALYSIS (LINGUISTIC)


Although discourse analysis is variously defined (see examples
in the Introductions to Jaworski and Coupland [1999]; Schiffrin,
Tannen, and Hamilton [2001]; Johnstone [2006]), a generally accepted linguistic definition of discourse itself is "language above and beyond the sentence." An advantage of this definition is that
it allows several different entry points into linguistic analyses of
discourse. Some discourse analysts, for example, focus on the
ways in which smaller language units (e.g. noun phrases, clauses,
sentences) combine to create a coherent text that makes sense
to others. Other discourse analysts focus on features that help
to co-constitute the text. In other words, just as the structures,
meanings, and functions of a text are continuously projected by
the combinatory patterns of smaller units, so too are they a result
of those combinatory patterns of smaller units. Such features
might include the topic structure of the text or the various relationships across sentences (such as repetition, lexical collocations, or conjunctions) that help create cohesion among smaller
parts. Still other discourse analysts focus on how sequences of
language units (be they clauses or turns at talk) contribute to
social meanings and functions. Interest in the functions of language in social contexts leads to a range of other issues, for example, how repeated use of a particular noun, or distribution of
speech, can reproduce power or initiate resistance in social and
political spheres (see, e.g., entries on inequality, linguistic and communicative; politics of language; gender and language). Discourse analyses thus address features of
language within text, context, qualities of texts, and how language in texts is related to contexts.
After a brief history of key works in early discourse analysis,
I show how current approaches address phenomena and processes of discourse and close with some general principles.

Early Approaches to Discourse

Although many linguists in the mid-1900s stopped their analyses at the level of sounds, words, and then sentences, some moved toward the next level of discourse by examining morphological patterns across sentences in written texts, the structure of relationships across clauses within spoken narrative, and aspects of language that display connections across sentences.
Zellig Harris (1952) derived procedures for analyzing arrangements of morphemes across sentences by building on the tools of descriptive linguistics. Consistent with models of the time (predating Chomsky's turn to the sentence), Harris took a bottom-up approach that viewed discourse as the next level in a hierarchy of morphemes. His procedure examined morphemes in terms of their co-occurrence with (or distribution in relation to) other morphemes (or sets of morphemes). Included were not only actual sequences of equivalent morphemes but also chains of morphological equivalencies that were seen as representative of different genres or registers. Yet nothing but linguistic structure within a given text was included: "[T]he analysis of the occurrence of elements in the text is applied only in respect to that text alone and not in respect to anything else in the language" (Harris 1952, 1).
William Labov and J. Waletzky (1967) developed a formal model of oral narrative (see narratives of personal experience) that was based on temporal relationships among clauses that had different functions in the verbalization of experience. Later work (Labov 1972) focused not just on formal relationships among clauses but also on the distribution and function of information within the text. In addition to function, parts of narratives (e.g., abstract, orientation, complicating action, coda) were identifiable by linguistic (syntactic, semantic) properties. Likewise, syntactic modifications of a basic "X did Y" event structure on a clause-by-clause basis convey the subjective meanings of the narrative: the point of the story.
In contrast to the more formal approaches of Harris and Labov, Michael Halliday and R. Hasan (1976) focused on how language reveals cohesive connections within a text (see text linguistics) so that the reader (or listener) can understand not just the meaning of each sentence but the meanings being conveyed throughout the entire text. The following text, part of a recipe, is annotated with subscripts for cohesive devices: Reference1, repetition2, substitution3, ellipsis4, conjunctions5, and lexical relations6 provide cohesive ties.

Apple1a pudding1b. First5 you1c peel6 and chop6 the fruit6 (with 1a). Then5 __4 sprinkle it3 =1a with sugar and toss with the raisins6. __4 Bake the mixture3 (for 1a / 1b) for one hour. You1c + 2 may serve the pudding3 (for 1a and 1b) with vanilla ice cream1.

Whereas the recipe is relatively dense in cohesion (roughly half of the words in the excerpt are linked in cohesive ties), other types of texts may be less dense.
In sum, key works in early discourse analysis focused, relatively separately, on patterns of sentence-internal forms, clause sequences, and cross-sentence meanings, those that arise across sentences. Recent approaches continue the search for various types of patterns, but add an interest in how those patterns emerge in texts, often as a dialogic process (see dialogism and heteroglossia), in relation to context.
Contemporary Approaches to Discourse


This section illustrates different approaches by means of a brief analysis of two examples. Both are the opening phase of a longer discourse: the first, a classroom lesson; the second, an oral history interview.
OPENING A CLASSROOM LESSON. The discourse analyzed in (1)
is from a fifth grade class during parent visitation day. The classroom was overflowing with parents squeezed into the crowded
room. After greeting the parents and students, Mr. Clark (the
teacher) proceeded to what we see in the following:
(1) Mr. Clark: (a) Okay, let's get started.
(b) First we'll review the problems from last night.

In applying the approaches to the language of this short discourse, we focus on some aspects of text (the information in,
structure of, and relationships between successive clauses) and
context (social identities, relationships, and institutions). We
progress from a focus on knowledge (how to communicate information and take social action) to the use of the linguistic code
to convey a variety of situated meanings by people in particular roles that are embedded within (and sustain) larger cultural
practices and social structures.
pragmatics analyzes how we communicate more than
the semantic content of language by depending upon our ability to draw inferences based on such general principles as the
cooperative principle (appropriate quantity, quality,
manner, and relevance of information), as well as semantic and
logical meaning, adjacent features of text, and the social context. For example, we can infer to whom Mr. Clark is speaking and therefore who will "get started" and "review" even though he does not explicitly designate their identity. First-person plural pronouns include the speaker and someone else, but the other can be inclusive or exclusive of the hearer. Once we know about the context and shared schematic knowledge of who typically does what, when, and where, we can infer that the first we includes the students and their parents, but the second we includes only the students.
Another aspect of communication that goes beyond what is literally meant and said, actions performed through speech, is the focus of speech-act analysis and theory. Actions can be accomplished through language (e.g., requests, promises, warnings, assertions, thanks) only when specific conditions (involving linguistic knowledge, assumptions about speakers'/hearers' needs and wants, and background situations) are appropriate to the realization of that specific action. "Okay, let's get started," for example, is a directive, a general class of actions (including commands, e.g., "Begin!" and hints, e.g., "It's getting late") through which a speaker directs hearer(s) to take a future (not a past) action that is something the speaker wants, is not likely to be done by the hearer otherwise, but is within the hearer's ability.
Linguistic alternatives also appear in how we pronounce,
select, or arrange our words. Alternatives that maintain semantic
meaning are studied by variation analysis. Instead of saying let's (a) and we'll, for example, Mr. Clark could use the full forms let us and we will. Although these two variants have the same semantic
meaning, they have different possible social meanings. Because
full forms provide more explicit information, and allow stress on
us and will, they can emphasize the need for particular (possibly
reluctant) people to take an undesired action. Lexical variants
would have different effects. Repetition of go in "Let's get going" and "We'll go over the problems" would create a cohesive tie highlighting the continuity of the actions and grouping them together.
Another alternative concerns how to organize information
and arrange sentences: What should be first? What are the consequences of different orders? Narrative analysis focuses on the
organization of information across sentences by analyzing different ways of verbalizing past experiences in textual units that
are also attuned to their contexts. One common feature of narratives is to present events in temporal order. Although Mr. Clark
is not telling a story, he is anticipating the future actions, and
the way he does so reflects some of the underlying features of
narrative: He presents upcoming actions roughly in the order in
which they will occur (rather than saying we'll review after we get
started) and highlights the transition from one event to another
with language that focuses on the beginnings of activities (get
started).
Sentences appear not just within one person's turn at talk but also across different people's turns at talk. How social order is
constructed through sequences of both grammatical units (sentences) and other units of speech production (e.g., clauses, intonation units) on a turn-by-turn basis in talk-in-interaction is a
major focus of conversation analysis. Mr. Clark's "Okay, let's get started" has three features (syntactic closure, final intonation,
semantic wholeness) common to turn transitions. Instead of a
continuation, then, Mr. Clark's statement could have been followed by others' actions. Indeed, "Let's get started" and the utterance that follows are parts of a particular pair of sequentially related actions: They sequentially implicate another action, the students' response to the summons (see adjacency pair).
Mr. Clark's ability to maintain his turn and develop the
sequence of classroom activities is one way that his role in the
situation is established and reinforced. Interactional sociolinguistics reveals how numerous features of language provide
clues to (or indices of) the social situation, activities, participant
identities, and relationships that may actually have a role in creating the context of interaction. The use of let's, for example, suggests
a situation in which participants have an asymmetrical power
relationship (e.g., doctor/patient, parent/child). We can narrow
down the nature of the authority and the situation by noting that
the activity being started by Mr. Clark (review), and the object of
the review (problems), indicates a learning environment or one
in which an expert is instructing a novice. That the problems
were from last night reveals a cyclical pattern, a structured routine often found in formal institutions.
Language choices, inferences about meaning, actions, roles, relationships, and participation are all embedded in broader
cultural matrices of recurrent practices, knowledge, and meanings, which include beliefs about who should do what and how
they should do so, as well as the evaluations based on larger
values and ideologies (see ideology and language) of
particular outcomes of what is said and done. Ethnography of
communication elucidates these connections. For example,
the collaborative review of the problems portrays a cultural
belief system in which learning and attaining information arises
when novices work on their own (problems from last night)
and then, at a given point in time, present and review their solutions with an expert.
Just as language, inferences, actions, roles, relationships,
and participation are all embedded in culture, so too are they
intertwined with social processes and structures that sustain (or
restrict) power and privilege. critical discourse analysis
(see also discourse analysis [foucaultian]) explores how
ways of speaking can put those processes into place and reinforce
(or challenge) received means of authority. Mr. Clarks ability to
manage the use of time, select the activity in which to engage,
and organize the way in which information becomes distributed
as knowledge is consistent with a school setting in which his role
is not challenged. The power created by the institutional setting
links his discourse to broader social, cultural, and civic agendas.
What Mr. Clark says, and more fundamentally his ability to do so, thus positions him as one who can reinforce social structural
norms (who teaches whom? how? when?) and as an arbiter of the
official set of values, beliefs, and ideologies that are sanctioned
means of maintaining a stock of received knowledge. And the
fact that he is speaking on a special day that happens only once
a year when parents are permitted to visit en masse to observe
firsthand their children's education highlights the public and
civic function of his role.
Figure 1. List structure.

OPENING AN ORAL HISTORY INTERVIEW. The discourse analyzed in (2) is from the beginning of an oral history interview. After greeting and introducing the Interviewee (IVee), the Interviewer (IVer) asks a question. Rather than illustrate different approaches
to discourse, we use (2) to show how forms, structures, and
meanings are co-constructed. Key features are annotated with
subscripts: question/answer (Q/A) pairs1, turn-taking devices2,
and the use of and3 to build and indicate topic structure (see
Figure 1). In (2), Q and A indicate question and answer, respectively; lowercase letters (e.g., Qa) indicate the successive Q/A pairs. Dual numbers (e.g., 2/3 in line 2) indicate multiple features of organization.
(2)

1. IVer: 1Q(a) I'd like you to tell me a li- something about yourself now.
2. Your family and2/3
3. IVee: Mmhmm2.
4. 1A(a) Uh I've been living in Cleveland for the last 36 years.
5. IVer: mmhmm2
6. IVee: I uh at the present time uh I am a housewife,
7. and3 uh uh occupy myself uh uh sometimes helping my husband with his office, when needed.
8. IVer: 1Q(b) What does he do?
9. IVee: 1A(b) He's a podiatrist.
10. IVer: uhhuh2
11. IVee: 1A(a) And3 uh other times, I pursue, uh really uh um things that I enjoy um going to the museum, and3 swimming, and3 uh visiting ill people, and3 uh um spending time uh decorating my home,
12. IVer: Mmhmm2
13. IVee: and3 that's about2/1A(a)
14. IVer: 1Q(c) May I ask how old you are?
15. IVee: 1A(c) Yes2, I'm sixty years old.
16. IVer: Mmhmm.2 Sixty2 1A(c).
17. IVee: Mmhmm.2

Space prohibits discussion of each feature, but note the
following:

Interviewer's mmhmm (lines 3, 5, 12) and uh huh (line 10)
allow Interviewee to continue to the end of her intonation
and information units (line 13). Interviewee's mmhmm (line 3)
or yes (line 15) opens a turn that will answer a question.
Reciprocal uses of mmhmm (lines 16, 17) open an opportunity
for turn exchange; in initial or medial position in a turn
(lines 7, 11), mmhmm connects lateral items in the list (as in
Figure 1).

A multifaceted question (line 1) receives answers that occupy
several turns, during which short question/answer pairs
(lines 8 and 9, 14 and 15) are embedded. Question forms
(compare lines 1, 14, 18, and 19 to line 8) are more complex
when they shift topic or level of information. Interviewer and
Interviewee co-construct a hierarchical topic structure in
which information is organized on the basis of lexical
relationships (e.g., family, husband) and ad hoc categories
(e.g., things I enjoy).

In sum, key works in early discourse analysis focused, relatively
separately, on patterns of sentence-internal forms, clause
sequences, and meanings that arise across sentences. Recent
approaches continue the search for various types of patterns,
but add an interest in how those patterns emerge in texts
produced by more than one person in relation to context.

Conclusion
Discourse analysis provides a range of methodologies that are
applicable to different facets of language in text and context.
Although we have been able to consider only some components
of discourse analysis, our discussion and sample analyses help
us extract several general principles (Schiffrin 1994):
(1) Analysis of discourse is empirical: Data are based on people
using language, not linguists thinking about how people use
language.
(2) Analyses are accountable to the data: They have to explain
the data in both sequential and distributional terms.
(3) Analyses are predictive: They produce hypotheses that can
be falsified or modified by other data.
(4) Discourse is not just a sequence of linguistic units: Its
coherence (see coherence, discourse) cannot be understood if
attention is limited just to linguistic form and meaning.
(5) Resources for coherence jointly contribute to participant
achievement and understanding of what is said, meant, and
done. In other words, linguistic forms and meanings work
together with social and cultural meanings, and interpretive
frameworks, to create discourse.
(6) The structures, meanings, and actions of everyday spoken discourse are interactively achieved.
(7) What is said, meant, and done is sequentially situated;
that is, utterances are produced and interpreted in the local
contexts of other utterances.
(8) How something is said, meant, and done (speakers'
selection among different linguistic devices as alternative
ways of speaking) is guided by relationships among the
following:
(a) speaker intentions;
(b) conventionalized strategies for making intentions
recognizable;
(c) the meanings and functions of linguistic forms in relation to the text and context in which they appear;
(d) the sequential context of other utterances;
(e) properties of the textual mode, for example, narrative,
description, exposition;
(f) the social context, for example, participant identities
and relationships, structure of the situation, the setting;
(g) a cultural framework of beliefs and actions.
When brought together, this set of heuristic tools leads to the
following principle: Our uses of language, and the functions
that it accomplishes, are interactively constructed by people
using language together (e.g., taking turns at talk, drawing inferences about communicative intentionality) and drawing
upon properties of language and its ability to join smaller units
(clauses, sentences) into larger units (texts) that both reflect and
create the social contexts in which they emerge.
Deborah Schiffrin
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Halliday, Michael, and R. Hasan. 1976. Cohesion in English.
London: Longman.
Harris, Zellig. 1952. Discourse analysis. Language 28: 1–30.
Jaworski, Adam, and N. Coupland, eds. 1999. The Discourse Reader.
London: Routledge.
Johnstone, Barbara. 2006. Discourse Analysis. Oxford: Blackwell.
Labov, William. 1972. The transformation of experience in narrative
syntax. In Language in the Inner City. Philadelphia: University of
Pennsylvania Press, 354–96.
Labov, William, and J. Waletzky. 1967. Narrative analysis. In Essays
on the Verbal and Visual Arts, ed. June Helm. Seattle: University of
Washington Press, 12–44.
Schiffrin, Deborah. 1994. Approaches to Discourse. Oxford: Blackwell.
Schiffrin, Deborah, D. Tannen, and H. Hamilton, eds. 2001. Handbook of
Discourse Analysis. Oxford: Blackwell.

DISCRETE INFINITY
This locution was brought into linguistics by Noam Chomsky
(for instance, Chomsky 1988, 170) to characterize the fact that
human languages are built up from discrete units (morphemes
or words) that can be combined into infinitely many possible
sentences. Marc D. Hauser, Chomsky, and W. Tecumseh Fitch
(2002) argue that humans are the only species whose language
is characterized by discrete infinity, and there are debates concerning the evolution of the property.
The opposition discrete versus continuous applies to systems, models, domains, and variables. A digital clock represents
time as discrete; an (idealized) analog clock represents time as
continuous.
A set is finite if its size (cardinality) is some natural number
(0, 1, 2, …), otherwise infinite. The set of all natural numbers and
the set of all points on a line are infinite. It is standardly argued
that the set of all possible sentences in English is infinite, even
though only finitely many sentences have ever been uttered. If
the set were finite, there would be a longest sentence. But from
any sentence we can construct a longer one, for instance, by adding an and-clause; so the set is infinite.
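The argument can be mirrored in a short sketch (an invented illustration, not part of the entry; the sample sentence and clause are made up): a single finite rule that lengthens any given sentence guarantees that no "longest sentence" can exist, so the set of sentences is infinite.

```python
# From any sentence we can construct a strictly longer one by
# adding an and-clause; repeating the rule never terminates in
# principle, so the set of sentences cannot be finite.

def extend(sentence: str, clause: str = "and the dog barked") -> str:
    """Return a strictly longer sentence by appending an and-clause."""
    return f"{sentence} {clause}"

sentence = "the cat slept"
for _ in range(3):
    print(sentence)
    sentence = extend(sentence)
# Each pass yields a new, longer sentence: a finite rule,
# an unbounded (denumerably infinite) set of outputs.
```

This is the sense in which recursion lets a finite description cover a denumerably infinite set built from discrete building blocks.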
Some nonhuman communication systems (see animal communication and human language) are arguably infinite but,
because of continuous variation on one or a few parameters,
not discrete. The honeybee's waggle dance language involves
continuous variations in tempo, body orientation, and intensity,
indicating distance, direction from the hive, and quality of food
source. If these continuous variables can take any real-number
value within some range, then the resulting language is nondenumerably infinite, a cardinality greater than standardly attributed
to human languages, even though each message has only three
words. In Chomsky's terminology it is a nondiscrete infinity.
The enterprise of generative grammar aims to account
for the discrete infinity (and other properties) of language; all
versions of generative grammar employ recursion in some
form to provide a finite description of a denumerably infinite set
built from discrete building blocks.
D. T. Langendoen and Paul M. Postal (1984) argue that sentences need not be finite in length and that the class of sentences
of a language is nondenumerably infinite. Would this also be a discrete infinity? There is no known definition of discrete in this context
that would settle the issue (the phrase is rare outside Chomskyan
contexts). The conventional answer is no, apparently because
typical examples of nondenumerably infinite sets, such as the set
of points on a line, involve continuous domains. But if discrete
in discrete infinity is meant to characterize the building blocks
of the system, the answer would be yes. Some mathematicians
think the phrase is unclear and best avoided altogether.
Barbara H. Partee
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Chomsky, Noam. 1988. Language and Problems of Knowledge: The
Managua Lectures. Cambridge, MA: MIT Press.
Hauser, Marc D., Noam Chomsky, and W. Tecumseh Fitch. 2002. The
faculty of language: What is it, who has it, and how did it evolve?
Science 298: 1569–79.
Langendoen, D. T., and Paul M. Postal. 1984. The Vastness of Natural
Languages. Oxford: Basil Blackwell.
Studdert-Kennedy, M., and L. Goldstein. 2003. Launching language: The
gestural origin of discrete infinity. In Language Evolution, ed. M.
Christiansen and S. Kirby, 235–54. Oxford: Oxford University Press.


DISORDERS OF READING AND WRITING


literacy is one of our most important cultural tools. It allows us
to communicate our thoughts and ideas across space and across
time in a manner that seems as natural and effortless as speaking and listening. It is easy to forget that like cars and musical
instruments, writing systems are inventions that need to be
learned, much as one learns to drive a car or play an instrument.
Much has been learned about the reading and writing process by
studying people who have impairments in reading or writing.
Disorders of written language can be split into two broad
camps: developmental disorders and acquired disorders. Recent
years have seen a tendency to use the term neurodevelopmental disorder, rather than developmental disorder, reflecting
the growing consensus that atypical development is often the
product of genetic and/or environmental influences on early
brain development. In contrast, acquired disorders are a consequence of brain damage, typically caused by disease or head
injury. The difference between developmental and acquired
disorders is not analogous to disorders that affect children versus adults: Developmental dyslexia is a lifelong condition
that continues to manifest itself throughout adulthood, and an
acquired disorder of written language can arise following brain
damage inflicted during childhood. However, there are important differences between acquired and developmental disorders.
As developmental disorders affect developing systems, they
are very rarely sharply defined, and one tends to see associated
deficits across a range of behaviors; in contrast, acquired disorders reflect selective damage to what (one assumes) was a fully
working system. Consequently, different patients with damage
to different subsystems can show remarkably different and
remarkably specific types of reading impairment (Bishop
1997).

Defining Reading and Writing


Consider what you are doing as you read a text. Letters and
words are processed visually (see word recognition,
visual) at a rate of many items per minute, their forms recognized and meanings decoded or inferred. Words are only part of
the story: Phrases and sentences need to be interpreted, relevant background knowledge activated, and inferences generated as information is integrated during the course of reading.
Control processes are needed to monitor both ongoing comprehension and the internal consistency of text, allowing the reader
to initiate repair strategies (for example, rereading) if comprehension breakdown is detected. In short, readers need to form a
mental model of the text they are reading. To some extent, one
can think of writing being the reverse of reading, with the writer
beginning with a conceptual message that he or she wishes to
communicate and ending with inkmarks on a page.
Although written language clearly involves visual and motor
processes (identifying letters, scanning text, handwriting), the
cognitive psychology of reading and writing has been most concerned with an understanding of the language bases of reading
and writing. Thus, visuo-motor aspects of reading and writing will
not be considered here. Instead, we focus on reading and writing
as linguistic skills, skills that have their roots in our biological
endowment for spoken language (see genes and language).

Even when we restrict our focus to the linguistic bases of written
language impairments, it is clear that in both developmental and
acquired cases, reading and writing can go wrong for a variety of
reasons.

Developmental Disorders of Written Language


As developmental disorders of written language are disorders of
development, it is important to consider them within the context offered by models of typical development (see spelling
and writing and reading, acquisition of). It is useful to
make a distinction between impairments that affect word-level
processes (recognizing words, spelling) and impairments that
affect the higher level processes involved in comprehending text and producing written narrative. As children learn
to read, there is generally a strong association between decoding (defined as the ability to read a word aloud) and comprehension: Children who are good at decoding tend to have good
comprehension, and children who are poor at decoding tend
to have weak comprehension. For some children, however, the
two sets of skills develop out of step. In dyslexia, a developmental disorder experienced by 3–10 percent of children, decoding is
slow, effortful, and error prone, yet their actual comprehension
of what they have read can be impressive (Snowling 2000). In
contrast, approximately 10 percent of children can be described
as poor comprehenders: Despite having well-developed decoding skills, they are poor at understanding what they have read
(Nation 2005).
DEVELOPMENTAL DYSLEXIA. Developmental dyslexia is typically
diagnosed when a child experiences profound difficulty reading and spelling words, despite normal educational opportunity
and normal-range general intelligence. Dyslexia runs in families
and there is good evidence from behavioral genetics demonstrating genetic heritability of the disorder. Although genes are
yet to be identified, regions of interest have been implicated on
chromosomes 1, 2, 3, 6, 15, and 18, suggesting that patterns of
inheritance are complex and polymorphic (Nation and Coltheart
2006; Pennington and Olson 2005). It seems likely that genetic
factors (in interaction with environmental factors) influence
the development of brain areas implicated in the neural circuitry that underpins reading and spelling (Price and McCrory
2005; see also writing and reading, neurobiology of).
However, our understanding of how genetic factors influence
brain development and lead to developmental dyslexia is relatively unspecified.
Cognitive explanations of developmental dyslexia are more
specified. It is widely accepted that many people with dyslexia
have underlying impairments in processing phonological
aspects of oral language. According to the phonological deficit hypothesis (Snowling 2000), children with dyslexia have
difficulty representing and processing phonological information. This leads to difficulties on tasks that tap phonological
processing, including aspects of speech perception and
speech production and, most notably, difficulties with
phonological awareness. In an alphabetic language, at
least, learning to read and spell places heavy demands on phonological skills inasmuch as children need to learn to make
fine-grained mappings between phonology and orthography.



Children with poor phonological skills find this process more
difficult, as evidenced by the well-replicated finding that people
with dyslexia are poor at reading novel words: Young children
with dyslexia find nonword reading extraordinarily difficult, and
even well-compensated adults whose more obvious difficulties
with reading have resolved are slower and often less accurate at
reading nonwords, a lasting legacy of their dyslexia.
There is some debate as to whether there are subtypes of
developmental dyslexia. According to the dual-route model
(Coltheart 2005), words can be read via one of two independent
routes: a sublexical route, mediated by phonological rules dictating mappings between graphemes and phonemes, and a lexical
route, mediated by visual-orthographic mappings. The sublexical route is needed to read nonwords, whereas the lexical route is
needed to read words that do not obey grapheme-phoneme correspondence rules (i.e., exception words, such as yacht, chaos,
and enough). The majority of children with dyslexia have phonological impairments, and therefore in dual-route terms they
have phonological dyslexia, caused by an impaired sublexical
route. In contrast, a small proportion of children with dyslexia
have less severe phonological deficits. For these children, typically referred to as having developmental surface dyslexia, their
greatest difficulty is with reading and spelling exception words,
caused by an impaired lexical route.
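The division of labor between the two routes can be caricatured in a few lines of code. This is a hedged toy sketch, not Coltheart's implemented model: the mini-lexicon, the grapheme-phoneme table, and the phonetic strings are all invented for illustration.

```python
# Toy dual-route reader: a lexical route (whole-word lookup) and a
# sublexical route (grapheme-phoneme rules). Real dual-route models
# are far more elaborate; this only shows the division of labor.

LEXICAL_ROUTE = {            # lexical route: visual-orthographic lookup
    "yacht": "jɒt",          # exception words depend on this route
    "pint": "paɪnt",
    "mint": "mɪnt",
}

GPC_RULES = {                # sublexical route: toy grapheme-phoneme rules
    "b": "b", "e": "ɛ", "m": "m", "i": "ɪ",
    "n": "n", "t": "t", "p": "p",
}

def sublexical(word: str) -> str:
    """Letter-by-letter grapheme-phoneme conversion (handles nonwords)."""
    return "/" + "".join(GPC_RULES.get(ch, "?") for ch in word) + "/"

def read_aloud(word: str, lexical_intact: bool = True) -> str:
    """Try the lexical route first; otherwise fall back on the rules."""
    if lexical_intact and word in LEXICAL_ROUTE:
        return "/" + LEXICAL_ROUTE[word] + "/"
    return sublexical(word)

print(read_aloud("bem"))                         # nonword: sublexical route
print(read_aloud("pint"))                        # exception word: lexical route
print(read_aloud("pint", lexical_intact=False))  # impaired lexical route:
                                                 # now rhymes with mint,
                                                 # a regularization error
```

Disabling the lexical route regularizes exception words (pint read to rhyme with mint), the pattern ascribed to an impaired lexical route; emptying the rule table instead leaves nonwords like bem unreadable, the pattern ascribed to an impaired sublexical route.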
Theorists disagree about the validity and stability of the surface subtype. More generally, they debate the need to evoke
two separate routes to word reading. An alternative account is
provided by the triangle model, a connectionist model in
which reading aloud is accomplished via sets of interactive connections between three sets of units: phonological, orthographic,
and semantic. Unlike the dual-route model, individual differences in reading are not a consequence of an impaired lexical
or sublexical route. Instead, differences in the quality of representations (phonological, orthographic, or semantic) or in the
strength of mappings between different representations bring
about different patterns (or subtypes) of reading behavior (see
Plaut 2005 for discussion of the triangle model and how it differs
from the dual-route model; see Snowling 2000 for a discussion of
how developmental dyslexia can be accommodated by the triangle model).
POOR COMPREHENDERS. Unlike children with dyslexia, poor
comprehenders do not have difficulty with decoding and producing words; however, they are poor at understanding what
they read. In particular, they are poor at making inferences
when reading text, and they are less able to integrate information across sentences in order to resolve anomalies (Oakhill and
Yuill 1996). It is also clear that poor comprehenders' difficulties
are not restricted to reading comprehension. They are also poor
at listening comprehension and they have relative weaknesses in
oral vocabulary and word knowledge, understanding figurative
language, and with aspects of grammar (Nation et al. 2004). Thus,
poor comprehenders' difficulties with reading comprehension
should be seen against a backdrop of more general difficulties
in processing and comprehending language, leading to difficulties in building a mental model of text or discourse. In contrast
to these deficits, poor comprehenders show strengths in phonological processing and phonological awareness, facilitating the
development of good decoding and word recognition. Clearly,
however, adequate decoding and strengths in the phonological
skills that underpin decoding are not sufficient to guarantee adequate comprehension.
A similar dissociation exists between word-level versus
higher-level aspects of writing. L. Cragg and K. Nation (2006)
found that poor comprehenders spell at age-appropriate levels.
However, when asked to write a story from a series of picture
prompts, the same children produced narratives that captured
less of the story content and contained a less sophisticated story
structure. These findings are consistent with what we know about
poor comprehenders' oral language skills, with strengths in phonological skills promoting adequate spelling but weaknesses in
language comprehension constraining the more compositional
aspects of narrative production.

Acquired Disorders of Written Language


Acquired disorders of reading and spelling are observed in
patients with aphasia following stroke, head injury, or progressive brain disease (see brain and language). Some
patients show deficits that are a consequence of impairments
in visual processing and letter recognition (e.g., pure alexia;
Behrmann, Plaut, and Nelson 1998). In line with our discussion of developmental disorders, however, we focus here on
acquired disorders of written language that have their bases in
spoken language. The majority of work on acquired disorders
of reading and writing has focused on patients' ability to read
single words aloud or spell single words to dictation. Some of
this work is reviewed very briefly here; rather surprisingly, few
studies have investigated aspects of reading comprehension
and narrative production, although difficulties in discourse-level processing have been noted in patients with right hemisphere brain damage.
SURFACE AND PHONOLOGICAL DYSLEXIA: TRADITIONAL COGNITIVE
NEUROPSYCHOLOGY AND THE DUAL-ROUTE MODEL OF READING
ALOUD. The study of patients with acquired dyslexia has played
a central role in the field of cognitive neuropsychology. In
particular, the dissociation between patterns of intact and
impaired behaviors in two types of acquired dyslexia, namely,
surface dyslexia and phonological dyslexia, provided important
support for the dual-route model of reading aloud (Coltheart
2005).
Patients with surface dyslexia are poor at reading exception
words (words that have irregular mappings between orthography and phonology), which they tend to regularize (e.g.,
reading pint to rhyme with mint). According to the dual-route
framework, this is a consequence of damage to the lexical route,
meaning that patients overrely on the sublexical route; hence,
they produce overregularization errors on irregular forms. The
term phonological dyslexia is used to describe the condition of
patients who show particular impairments in decoding novel
words. Traditionally, this has been interpreted within a dual-route framework as a consequence of damage to the sublexical
route, responsible for translating graphemes to phonemes via
phonological rules. Thus, nonwords tend to be lexicalized, with
patients reading a nonword as if it were a visually similar familiar
word (e.g., reading bem as ben).



SURFACE AND PHONOLOGICAL DYSLEXIA: THE PRIMARY SYSTEMS
HYPOTHESIS. The dual-route model is a model of the reading system, relatively divorced from the underlying cognitive and linguistic skills that subserve reading. An alternative approach is to
consider the extent to which acquired disorders are a consequence
of impairment to one or more of those underlying primary skills
upon which reading is parasitic (for example, phonology, semantics, and visual processing). This perspective, termed the primary
systems hypothesis, is reviewed in detail by M. A. Lambon Ralph
and K. E. Patterson (2005). It draws heavily on connectionist models of reading, especially the triangle model described previously.
Space precludes a full description of the model (see Plaut 2005 for
detailed review), but it differs fundamentally from the dual-route
model in a number of key ways. As a model of the language system rather than the reading system, it predicts that patients with
reading problems should also show concomitant weaknesses in
aspects of language processing more generally.
How does the triangle model account for surface and phonological dyslexia? Surface dyslexia (i.e., poor exception word reading) is proposed to be a consequence of reduced activation from
semantic representations impacting on the connections between
phonology and orthography. Lambon Ralph and Patterson
(2005) provide technical details underpinning this proposal, and
explain the balance of evidence supporting it. According to the
primary systems hypothesis, if surface dyslexia is a consequence
of impaired semantics, then patients should exhibit semantic
impairments, that is, impairments on nonreading tasks that
require knowledge of, or access to, word meanings. In support of
this, patients with semantic dementia show a variety of semantic impairments and show a surface dyslexia reading profile
(Graham, Hodges, and Patterson 1994).
In contrast to the semantic weaknesses considered to underpin surface dyslexia, the triangle model proposes that impairments in phonology underpin phonological dyslexia. Inasmuch
as reading nonwords places heavy demands on the connections
between phonology and orthography (as nonwords have no
meaning, contributions from semantic knowledge are minimal),
if patients have weaknesses in the phonological domain, these
should be exhibited as relatively stronger nonword than word
reading deficits, exactly the pattern seen in patients with phonological dyslexia. And, consistent with the primary systems
hypothesis, patients with acquired phonological dyslexia also
show weaknesses on nonreading tasks that tap phonological
skills, including word and nonword repetition and phonological
awareness (Bird et al. 2003).
ACQUIRED DYSGRAPHIA. Often, patients with acquired dyslexia
show associated impairments in spelling words to dictation
(dysgraphia); however, some patients show selective impairments in spelling. As with reading impairment, damage to different aspects of the language system produces different patterns
of spelling impairment. Some patients are very poor at using
spelling-sound conversion rules to spell novel words, similar
to the pattern of reading behavior seen in patients with phonological dyslexia; others tend to make regularization errors when
spelling exception words, akin to surface dyslexia (see Romani,
Olson, and Di Betta 2005 for discussion of these and other types
of acquired [and developmental] dysgraphias).

Summary and Conclusions


Reading and writing are complex processes, and it is clear that
they may be impaired for a variety of reasons. Written language
is parasitic upon spoken language, and, therefore, it is no surprise to find that oral language weaknesses are associated with
impairments of reading and writing in both developmental and
acquired disorders. More specifically, different aspects of the oral
language system (e.g., phonology and semantics) appear to be
more or less associated with different aspects of reading or writing failure. A challenge for future work is to understand how oral
language skills interact with each other and with orthographic
factors to produce different patterns of written language impairment. In addition, many challenges remain for understanding
how genetic and environmental risk factors interact to influence
brain development so as to cause a developmental disorder of
reading or writing.
Kate Nation
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Behrmann, M., D. C. Plaut, and J. Nelson. 1998. A literature review and
new data supporting an interactive account of letter-by-letter reading.
Cognitive Neuropsychology 15: 7–51.
Bird, H., M. A. Lambon Ralph, M. S. Seidenberg, J. L. McClelland, and
K. Patterson. 2003. Deficits in phonology and past tense morphology.
Journal of Memory and Language 48: 502–26.
Bishop, D. V. M. 1997. Cognitive neuropsychology and developmental disorders: Uncomfortable bedfellows. Quarterly Journal of
Experimental Psychology 50A: 899–923.
Coltheart, M. 2005. Modeling reading: The dual-route approach. In
Snowling and Hulme 2005, 6–23.
Cragg, L., and K. Nation. 2006. Exploring written narrative in children with poor reading comprehension. Educational Psychology
21.1: 55–72.
Graham, K., J. R. Hodges, and K. E. Patterson. 1994. The relationship
between comprehension and oral reading in progressive fluent aphasia. Neuropsychologia 32: 299–316.
Lambon Ralph, M. A., and K. E. Patterson. 2005. Acquired disorders of
reading. In Snowling and Hulme 2005, 413–30.
Nation, K. 2005. Children's reading comprehension difficulties. In
Snowling and Hulme 2005, 248–66.
Nation, K., P. Clarke, C. M. Marshall, and M. Durand. 2004. Hidden
language impairments in children: Parallels between poor reading
comprehension and specific language impairment. Journal of Speech,
Hearing and Language Research 47: 199–211.
Nation, K., and M. Coltheart, eds. 2006. The genetics of reading.
Journal of Research in Reading 29. Special issue containing a number
of articles exploring the heritability of reading and related issues.
Oakhill, J. V., and N. Yuill. 1996. Higher order factors in comprehension
disability: Processes and remediation. In Reading Comprehension
Difficulties, ed. C. Cornoldi and J. V. Oakhill, 69–92. Mahwah, NJ:
Lawrence Erlbaum.
Pennington, B. F., and R. K. Olson. 2005. Genetics of dyslexia. In
Snowling and Hulme 2005, 453–72.
Plaut, D. C. 2005. Connectionist approaches to reading. In Snowling
and Hulme 2005, 24–38.
Price, C. J., and E. McCrory. 2005. Functional brain imaging studies of
skilled reading and developmental dyslexia. In Snowling and Hulme
2005, 473–96.
Romani, C., A. Olson, and A. M. Di Betta. 2005. Spelling disorders. In
Snowling and Hulme 2005, 431–48.


Snowling, M. J. 2000. Dyslexia. Oxford: Blackwell.


Snowling, M. J., and C. Hulme, eds. 2005. The Science of Reading.
Oxford: Blackwell.

DIVISION OF LINGUISTIC LABOR


According to Hilary Putnam's (1975) division of linguistic labor,
speakers routinely use terms whose extension (see intension
and extension, reference and extension) they would not
be able to fix. For example, most of us cannot tell the difference
between gold and fool's gold. Nevertheless, we know that the two
are different, and when we use the word gold, we mean to refer to
the real thing, to the material that experts who can distinguish
between gold and fool's gold call gold. If there is ever a dispute
about whether our use of the word is appropriate, we can consult
one of these experts. Using examples like this one, Putnam proposed that knowledge of word meaning is not a private mental
property. Instead, it is the responsibility and achievement of the
collective linguistic community: Metallurgists can fix the extension
of the word gold, botanists can fix the extension of the word elm,
and so on. The average speakers use of such terms depends upon
an implicit structured cooperation between that person and the
experts in the relevant domains (see socially distributed
cognition).
Putnam (1975) proposed the division of linguistic labor as
part of a seminal argument against traditional accounts of word
meaning. Many of these accounts hold that knowing the meaning of a word is a function of being in a particular psychological state. In fact, he argued, two speakers can share the same
psychological state (neuron for neuron) but mean different
things. For example, imagine two speakers who know exactly
the same things about beech and elm trees: They know that
both are large deciduous trees, but they cannot tell them apart
(for that, they defer to experts). If one speaker uses the word
beech to refer to an elm and the other uses the word to refer to
a beech, the two speakers share the same psychological state,
but they mean different things. As Putnam (1975, 144) famously
put it, "Cut the pie any way you like, meanings just ain't in the
head!"
The notion that speakers do not know much about many of the
words they use is not controversial. However, there has been vigorous debate about which words are subject to a division of linguistic
labor and whether a division of linguistic labor necessarily implies
that meanings "ain't in the head" (see Pessin and Goldberg 1996).
For example, J. Searle (1983) argued that knowing that there are
experts who can be called upon to fix a word's extension should be
considered part of knowing the meaning of a word.
Vikram K. Jaswal
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Pessin, A., and S. Goldberg, eds. 1996. The Twin Earth Chronicles: Twenty
Years of Reflection on Hilary Putnam's "The Meaning of Meaning."
New York: M. E. Sharp.
Putnam, Hilary. 1975. "The Meaning of 'Meaning.'" In Minnesota Studies
in the Philosophy of Science. Vol. 7: Language, Mind, and Knowledge.
Ed. Keith Gunderson, 131–93. Minneapolis: University of Minnesota
Press.
Searle, J. 1983. Intentionality. Cambridge: Cambridge University Press.


DYSLEXIA
Introduction: What Is Dyslexia?
Dyslexia is a specific learning difficulty affecting literacy
development. Children and adults with developmental dyslexia
show difficulties in reading and spelling that are not explicable in terms of their age, intelligence, or educational experience.
Children with dyslexia typically have marked difficulties in learning to read and spell words, though their understanding of what
they read may be good. These difficulties are often accompanied
by difficulties in short-term memory and organization. In adulthood, the word-reading difficulties may resolve, but spelling and
other underlying difficulties remain.

Behavioral Manifestations of Dyslexia


Reading development depends on two foundation skills, letter-sound knowledge and phonological awareness, the
ability to identify the small sounds in speech (Byrne 1998). A
child's ability to establish mappings between the letter strings
of printed words and these speech sounds (phonemes) allows
printed words to be decoded and is the basis for the acquisition
of later and more automatic reading skills. Thus, individual differences in phonological awareness predict differences in the
ability of children to learn to read. The most common pattern
of reading deficit in dyslexia in English is poor nonword reading, a task that requires the decoding of unfamiliar words. To
some extent, spelling draws on the same processes as decoding;
however, English words cannot be spelled solely on the basis of
sound-letter mapping rules but also require knowledge of graphotactic or morphological rules and sometimes rote learning. Thus, for children with dyslexia, spelling poses an even greater challenge than reading.
An important issue is whether dyslexia has the same symptoms in more consistent or transparent languages than English.
Findings from a variety of transparent languages show that
orthographic consistency of grapheme-phoneme correspondences affects the rate at which children acquire reading skills.
Specifically, when correspondences between letters and phonemes are regular, children quickly learn the phonological skills
required for reading and spelling. Thus, children with dyslexia
learning to read in transparent orthographies have less serious
difficulties than their English-speaking counterparts; for them,
the main behavioral feature of dyslexia is a problem in reading fluency (Caravolas 2005). Conversely, in languages such as
Mandarin Chinese, in which the orthography does not consistently signal the corresponding phonology, one might expect
that the relationship between dyslexia and phonological awareness differs again. To date, there has been little research on this
issue (Hanley 2005), but the extant literature suggests that both
phonological and morphological processing skills are associated
with reading difficulties in Chinese.

Theories of Dyslexia
Current theories of dyslexia are cast at either the biological or the cognitive level of explanation. The predominant cognitive account
of dyslexia views the primary cause as a phonological processing impairment (Vellutino et al. 2004). According to this hypothesis, children with dyslexia have phonological deficits that cause a wide range of symptoms, not all of which are directly related
causally to the reading deficits (e.g., verbal short-term memory
problems and word-finding difficulties). As far as is known, such
symptoms are equally common among children learning to read
in all languages.
Many other theories of dyslexia accept phonological difficulties as a proximal cause of reading problems but cite more low-level deficits as their distal cause. For example, the automatization
deficit hypothesis (Nicolson and Fawcett 1990) proposes that
difficulties in the cerebellum in dyslexic children place similar constraints on learning of all skills, including phonology,
naming abilities and basic motor skills. The proposal of William
Lovegrove, Frances H. Martin, and Walter L. Slaghuis (1986) that
people with dyslexia have impairments of the magnocellular
system (the division of the visual system that responds to rapid
changes) has also generated much research. Findings are mixed,
with some studies reporting no evidence of abnormal sensitivity and others suggesting that group differences between people
with dyslexia and normal readers may be related to uncontrolled
differences in IQ. Research investigating visual attention problems in dyslexia is also inconclusive.
An influential hypothesis is that dyslexia stems from a deficit in basic auditory processing. Specifically, a rapid auditory processing deficit found with both speech and nonspeech
sounds would affect the perception of consonants distinguished
by rapid changes in the speech signal, and further, poor speech
perception would affect the development of phonological
processing skills (Tallal 2004). Investigation of auditory deficits
in dyslexia has extended to such tasks as frequency discrimination, frequency modulation, binaural processing, and backward
masking. However, as with findings on visual impairments, the
literature is replete with conflicting results, and an alternative
suggestion is that the deficit is not a general auditory impairment
but is specific to the processing of speech sounds. Investigations
of speech perception in dyslexia have highlighted subtle impairments, although again there are conflicting results. The lack of
consensus in the field regarding sensory impairments has led to
the proposal that they frequently occur in dyslexia but are not
causally linked to it (Ramus 2004). Further investigation of this
complex issue is needed.

Etiology of Dyslexia
GENETIC FACTORS. It has long been known that dyslexia runs in
families; however, because families share genes as well as environments, it is important to attempt to disentangle genetic and
environmental influences. Twin studies have been helpful in this
regard (Pennington and Olson 2005). Most twin studies of reading and reading disability report that both reading and phonological awareness are heritable skills, and thus it can be inferred
that dyslexia has a genetic basis. Furthermore, molecular genetic
studies have found gene markers of dyslexia as well as some candidate genes, though it is far from clear what the genetic mechanisms are (Fisher and Francks 2006).
It is important to note that the genes implicated in dyslexia
indicate a susceptibility to reading difficulties but not that reading problems are fully genetically determined. The interaction
of different skills in determining reading outcomes can be seen
in studies of children at family risk of dyslexia followed from the preschool years (e.g., Snowling, Gallagher, and Frith 2003).
These studies highlight a wide range of different literacy outcomes. Although many are slow in the early stages of reading,
some recover from this slow start to go on to be normal readers,
whereas others have persistent problems.
NEUROBIOLOGICAL BASES. Most children with specific reading
difficulties do not have any detectable neurological abnormality.
However, evidence suggests that atypical brain development is
implicated (Leonard et al. 2001). Other symptoms that co-occur
with dyslexia may also be important in defining subtypes of dyslexia, and the neuroanatomical markers of different forms may
differ.
In addition to studies of brain structure, much recent work
has focused on functional abnormalities in the brains of people
with dyslexia. Typically, people with dyslexia have been reported
to show less activation than controls in the left temporal and
parietal lobes (Price and McCrory 2005). However, it remains
unclear whether differences in brain activation are a sign of
some constitutional limitation of brain processing or whether
they simply reflect a person's inability to read words
using a phonological approach (a task that uses these language
regions).
ENVIRONMENTAL FACTORS. School, home, and broader environmental factors contribute to a child's risk of developing reading
problems. At the broadest level, reading disorders show social
class differences, and direct literacy-related activities in the
home are also important, though evidence suggests that these
activities primarily affect reading comprehension via vocabulary
growth (Phillips and Lonigan 2005).
It is important to note that genes and the environment interact, and there is evidence that children with dyslexia tend to
avoid reading activities, such that their reading problems may
become magnified over time. Where parents themselves have literacy problems, home literacy experiences may also be less than
optimal. In addition, comparisons of children from the same
area attending different schools have emphasized that schooling can make a substantial difference to reading achievement
(Rutter and Maughan 2002). Over time, the cumulative impact
of environmental processes can have a very significant effect on
reading progress.
In keeping with the relevance of both genetic and environmental factors, there is currently a move away from single-deficit
models toward multifactorial models that explain the nature and
causes of dyslexia (Pennington 2006).

Comorbidity
Dyslexia shows some similarities with specific language
impairment, and there is some debate as to whether they
should be characterized as the same disorder (Bishop and
Snowling 2004). There is also evidence of comorbidity between
dyslexia and various emotional and behavioral problems. Most
strikingly, dyslexia is highly comorbid with attention-deficit
hyperactivity disorder (ADHD) and, in particular, attention difficulties (Willcutt and Pennington 2000). Recent research suggests shared genetic risk factors as a possible cause. Children
with dyslexia also show an increased risk of developing clinically significant emotional difficulties, possibly as a result of their reading difficulties (Carroll et al. 2005).

Reading Intervention
Theoretical knowledge of the relationship between phonological
skills and learning to read has led to the development of effective
reading intervention programs that promote phonological skills
in the context of reading (National Reading Panel 2000) (see
teaching reading). Such interventions are effective both for
diagnosed dyslexics and for children who are at risk of reading
problems. An underresearched issue is the problem of children
who, despite high-quality intervention, do not respond to teaching and continue to have reading impairments. These children
are often socially disadvantaged and may show additional emotional and behavioral difficulties.

Conclusions
Dyslexia is a highly researched developmental disorder. There
is now clear evidence that difficulties in phonological skills are
a major proximal cause of reading difficulties across languages.
There is also evidence that reading is a complex skill influenced
both by genetics and by the environment. However, outstanding issues remain. Notably, models of the disorder are moving toward a multiple-deficit account, and it remains unclear how best to support children who do not respond to standard phonics-based reading intervention.
Margaret J. Snowling and Julia M. Carroll

WORKS CITED AND SUGGESTIONS FOR FURTHER READING

Bishop, Dorothy V. M., and Margaret J. Snowling. 2004. Developmental dyslexia and specific language impairment: Same or different? Psychological Bulletin 130: 858–88.
Byrne, Brian. 1998. The Foundation of Literacy: The Child's Acquisition of the Alphabetic Principle. Hove, UK: Psychology Press.
Caravolas, Marketa. 2005. The nature and causes of dyslexia in different languages. In Snowling and Hulme 2005, 336–57.
Carroll, Julia M., Barbara Maughan, Robert Goodman, and Howard Meltzer. 2005. Literacy difficulties and psychiatric disorders: The case for comorbidity. Journal of Child Psychology and Psychiatry 46: 524–32.
Cunningham, Anne, and Keith Stanovich. 1990. Assessing print exposure and orthographic processing skill in children: A quick measure of reading experience. Journal of Educational Psychology 82: 733–40.
Fisher, Simon E., and Clyde Francks. 2006. Genes, cognition and dyslexia: Learning to read the genome. Trends in Cognitive Sciences 10: 250–7.
Hanley, J. Richard. 2005. Learning to read in Chinese. In Snowling and Hulme 2005, 316–35.
Leonard, Christine M., Mark A. Eckert, Linda J. Lombardino, Thomas Oakland, John Kranzler, Cecile M. Mohr, Wayne M. King, and Alan Freeman. 2001. Anatomical risk factors for phonological dyslexia. Cerebral Cortex 11: 148–57.
Lovegrove, William, Frances H. Martin, and Walter L. Slaghuis. 1986. The theoretical and experimental case for a visual deficit in specific reading disability. Cognitive Neuropsychology 3: 225–67.
National Reading Panel. 2000. Report of the National Reading Panel: Reports of the Subgroups. Washington, DC: National Institute of Child Health and Human Development Clearing House.
Nicolson, Rod I., and Angela J. Fawcett. 1990. Automaticity: A new framework for dyslexia research. Cognition 35: 159–82.
Pennington, Bruce F. 2006. From single to multiple deficit models of developmental disorders. Cognition 101: 385–413.
Pennington, Bruce F., and Richard K. Olson. 2005. Genetics of dyslexia. In Snowling and Hulme 2005, 453–72.
Phillips, Beth M., and Christopher J. Lonigan. 2005. Social correlates of emergent literacy. In Snowling and Hulme 2005, 173–87.
Price, Cathy J., and Eamon McCrory. 2005. Functional brain imaging studies of skilled reading and developmental dyslexia. In Snowling and Hulme 2005, 473–96.
Ramus, Franck. 2004. Neurobiology of dyslexia: A reinterpretation of the data. Trends in Neurosciences 27: 720–6.
Rutter, Michael, and Barbara Maughan. 2002. School effectiveness findings 1979–2002. Journal of School Psychology 40: 451–75.
Snowling, Margaret J. 2000. Dyslexia. 2d ed. Oxford: Blackwell.
Snowling, Margaret J., Alison Gallagher, and Uta Frith. 2003. Family risk of dyslexia is continuous: Individual differences in the precursors of reading skill. Child Development 74: 358–73.
Snowling, Margaret J., and Charles Hulme, eds. 2005. The Science of Reading: A Handbook. Oxford: Blackwell.
Tallal, Paula. 2004. Improving language and literacy is a matter of time. Nature Reviews Neuroscience 5: 721–8.
Vellutino, Frank R., Jack M. Fletcher, Margaret J. Snowling, and Donna M. Scanlon. 2004. Specific reading disability (dyslexia): What have we learned in the past four decades? Journal of Child Psychology and Psychiatry 45: 2–40.
Willcutt, Erik, and Bruce Pennington. 2000. Psychiatric co-morbidity in children and adolescents with reading disability. Journal of Child Psychology and Psychiatry 41: 1039–48.


E
ELLIPSIS
Ellipsis is the nonexpression of some lexical material, specifically a word or words forming a syntactic constituent, that is needed for the full interpretation of a sentence but is not expressed because it can be recovered from the linguistic or real-world context. Under a traditional syntactic definition of ellipsis, elliptical gaps must be able to be filled with overt material, thus distinguishing them from other types of gaps, like traces of moved constituents. All natural languages permit ellipsis, but they differ with respect to which constituents can be elided in which configurations. Ellipsis falls within the larger field of reference resolution.
Most studies of ellipsis concentrate on formalizing the licensing and recoverability conditions for elided constituents. The
former must account for what makes ellipsis grammatical in
given configurations, whereas the latter concerns the ways in
which the meaning of the elided material can be understood
from the context. When the meaning of an elided constituent is
understood by coreference with a previously introduced linguistic constituent, that constituent is called the antecedent.
A cross-linguistic sampling of the many types of constituents
that are subject to ellipsis includes arguments of a verb (1); head
nouns in noun phrases with an overt quantifier, modifier, and so
on (in [2], laps); main verbs in so-called gapping constructions

Embodiment
(in [2], swam); verb phrases selected by an overt auxiliary (3);
and main verbs in sentences containing two or more overt arguments or adjuncts (4). The elided categories in the examples are
indicated by []. Textual antecedents, when present, are shown
in boldface.
1. [] Pomozhesh' mne? [Russian; the subject is elided]
   [] help-2.SG.FUTURE me-DATIVE
   'Will you help me?'

2. Jack swam_i 20 laps_j and Beth []_i 25 []_j.

3. Greg is almost finished swimming but Bruce has just started [].

4. Kuda ty []? [Russian; the main verb is elided]
   where-DIRECTIONAL you-NOM
   'Where are you going?'

Although ellipsis is generally defined syntactically, syntactic approaches to the study of ellipsis (e.g., Lobeck 1995) are, by
necessity, partial because ellipsis decisions can be affected by
nonsyntactic factors like the semantics of the utterance, the
potential for ambiguity, the physical context of the speech situation, and so on (McShane 2005).
Certain types of ellipsis, like gapping, either require or are
promoted by syntactic and/or semantic parallelism.
Ellipsis is particularly challenging for natural language processing (NLP) systems since parsers (see parsing, machine)
must be able to detect the virtual presence of elided constituents,
and language generators must be supplied with rules of ellipsis
usage that go beyond the relatively broad generalizations found
in theoretical treatments.
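To make the generation problem concrete, here is a toy resolver for one English pattern, auxiliary-stranding verb phrase ellipsis of the kind in example (3). This is an illustrative sketch only: the function name, auxiliary list, and clause-splitting heuristic are invented for this example, and the antecedent is copied verbatim rather than morphologically adjusted.

```python
AUX = {"does", "did", "is", "was", "has", "have", "will", "can"}

def resolve_vp_ellipsis(sentence):
    """Toy resolver: in 'X VP and/but Y AUX (too)', fill the gap after
    the stranded auxiliary with the first conjunct's verb phrase.
    Assumes a one-word subject in each conjunct."""
    for conj in (" and ", " but "):
        if conj in sentence:
            first, second = sentence.rstrip(".").split(conj, 1)
            words = second.split()
            # A clause ending in a bare auxiliary signals an elided VP.
            if words[-1] in AUX or (words[-1] == "too" and words[-2] in AUX):
                antecedent = " ".join(first.split()[1:])  # drop the subject
                return f"{second} [{antecedent}]"
    return sentence  # no ellipsis detected

print(resolve_vp_ellipsis("Sam swims daily and Beth does too"))
# Beth does too [swims daily]
```

A real system would need a parser to delimit the antecedent VP and rules to repair agreement (Beth does [swim daily], not [swims daily]), which is precisely the gap between broad theoretical generalizations and usable generation rules.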
Marjorie J. McShane
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Lobeck, Anne. 1995. Ellipsis: Functional Heads, Licensing and
Identification. New York: Oxford University Press.
McShane, Marjorie J. 2005. A Theory of Ellipsis. New York: Oxford
University Press.

EMBODIMENT
Embodiment refers to the ways in which persons' bodies and bodily interactions with the world shape their minds, actions, and personal and cultural identities. Embodied accounts of mind and language embrace the idea that human symbols are grounded in recurring patterns of bodily experience, and therefore reject traditional dualistic, disembodied views of human
cognition and linguistic meaning. The study of embodiment
demands recognition that thought and language arise from the
continuous dynamic interactions among brains, bodies, and
the world. There are, in fact, three levels of embodiment that
together shape the embodied mind (Lakoff and Johnson 1999).
Neural embodiment concerns the structures that characterize
concepts and cognitive operations at the neurophysiological
level. The cognitive unconscious consists of the rapid, evolutionarily given mental operations that structure and make possible conscious experience, including the understanding and
use of language. The phenomenological level is conscious and accessible to consciousness and consists of our awareness of our own mental states, our bodies, our environment, and our physical and social interactions.
Scholars' opinions about the proper locus of embodiment in
cognition and language tend to privilege their own methodological preferences. For instance, neuroscientists tend to privilege
the brain and some peripheral aspects of the nervous system in
their studies of thought, language, and emotion; anthropologists
focus on culture-specific behaviors and generally explore how
culture both is written onto bodies and gives cultural meanings
to bodily experiences and behaviors; cognitive linguists, and
some literary theorists, concentrate on the embodied nature
of linguistic structure and behavior, as well as on the embodied nature of speaking/listening and writing/reading; and psychologists tend to study the role of different bodily actions in
various cognitive activities. Despite these differing approaches,
many agree that an embodied understanding of mind and language requires attention to all three levels of embodiment and
their interaction.
There is now a large body of linguistic research demonstrating that the existence and specific meanings of many words and
phrases emerged from recurring patterns of bodily experience.
For instance, people's frequent experiences of taking physical journeys (i.e., beginning at some source, moving along a path, and reaching some destination) appear to influence the development of metaphorical ways of talking about abstract ideas and events, such as achieving a personal goal (e.g., "I finally am getting close to my Ph.D.") or having difficulties in personal relationships (e.g., "Our marriage has hit a dead-end street"). In this way, people's bodily experience of taking journeys is metaphorically extended to conceive of many ideas related to LIFE IS A JOURNEY (Lakoff and Johnson 1999).
Embodied experience may also directly influence contemporary speakers' understandings of many words and phrases.
Neuroscience research demonstrates that perceptual and motor
systems are specifically activated during immediate language
processing. Thus, areas of motor and premotor cortex associated
with specific body parts are activated when people hear language
referring to those body parts. Listening to different verbs associated with different effectors (i.e., mouth/chew, leg/kick,
hand/grab) leads to different firing rates in different regions of
motor cortex (i.e., areas responsible for appropriate mouth/leg/
hand motions exhibit greater activation) (Hauk, Johnsrude, and Pulvermüller 2004).
Psycholinguistic studies also demonstrate the automatic
recruitment of perceptual and motor systems in immediate
language understanding. For instance, people are slower to
understand a phrase like "aim a dart" when they first form a fist
than when they shape their hand into a dart-throwing position,
which suggests that semantic comprehension may engage relevant motoric processes (Klatzky et al. 1989). People also more
quickly understand a statement like "grasp the concept" when
they first make, or imagine making, a grasping motion than
when no grasping motion is made (Wilson and Gibbs 2007).
Thus, people need not necessarily inhibit the physical meanings
of certain metaphorically used words, like "grasp," because these
meanings are recruited during the on-line construction of metaphorical meanings, such as when concepts are metaphorically understood as things that can be grasped. Studies also show
that people understand metaphorical fictive motion sentences,
such as "The road runs along the coast," in terms of imaginary sensations of the movement implicit in these sentences
(Matlock 2004). People are not aware of these simulations, and
so language processing is not dependent on deliberate thought
about motion. In general, psycholinguistic studies provide additional support for the broad claim, also now made in computational modeling research known as simulation semantics (Feldman and Narayanan 2004), that language use is closely tied to embodied imagination.
The empirical work in cognitive science on embodiment in
language and thought (see Gibbs 2006) mirrors other debates in
philosophy and literary studies on the role of embodied imagination in literary and aesthetic experience. Readers' emotional
involvement with fiction, for instance, may arise from their simulations of themselves as the characters they read about and their
fictional actions (Nichols 2006). In this manner, reading may not
be an abstract, purely mental process with little engagement of
the bodily imagination but is fundamentally tied to our powers
to recreate what it must be like to be and move like the people
we are reading about. Debate about this, and about other issues
related to embodiment in thinking and language, is central in
much contemporary scholarship in the humanities and cognitive sciences.
Raymond W. Gibbs, Jr.
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Feldman, J., and S. Narayanan. 2004. Embodied meaning in a neural theory of language. Brain and Language 89: 385–92.
Gibbs, R. 2006. Embodiment and Cognitive Science. New York: Cambridge University Press.
Hauk, O., I. Johnsrude, and F. Pulvermüller. 2004. Somatotopic representation of action words in human motor and premotor cortex. Neuron 41: 301–7.
Klatzky, R. L., J. W. Pellegrino, B. P. McCloskey, and S. Doherty. 1989. Can you squeeze a tomato? The role of motor representations in semantic sensibility judgments. Journal of Memory and Language 28: 56–77.
Lakoff, G., and M. Johnson. 1999. Philosophy in the Flesh: The Embodied Mind and Its Challenge to Western Thought. New York: Basic Books.
Matlock, T. 2004. Fictive motion as simulation. Memory & Cognition 32: 1389–1400.
Nichols, S., ed. 2006. The Architecture of the Imagination: New Essays on Pretense, Possibility, and Fiction. Oxford: Oxford University Press.
Wilson, N., and R. Gibbs. 2007. Real and imagined body movement primes metaphor comprehension. Cognitive Science 31: 721–31.

EMERGENTISM
A significant body of linguistic research can be situated in the
philosophical and scientific tradition known as emergentism.
This entry offers a brief overview of this work, with a focus on its
guiding principles and on the proposals it makes concerning the
nature of human language.

The Emergentist Tradition


The roots of emergentism can be traced to the work of John
Stuart Mill ([1843] 1930), who proposed that a system can have properties that amount to more than the sum of its parts. The
physical world offers many examples of this, as Mill observes
(p. 243):
The chemical combination of two substances produces, as is well
known, a third substance with properties different from those of
either of the two substances separately, or both of them taken
together. Not a trace of the properties of hydrogen or oxygen is
observable in those of their compound, water.

Mill's insight is relevant to the study of so-called complex systems, ranging from atoms to the weather, whose dynamic nonlinear behavior involves many interacting and interconnected parts. (A system is dynamic if it is constantly in flux; it
is nonlinear if effects are out of proportion to causes, as when
a neglected candle causes a fire that destroys an entire city.
See self-organizing systems.) However, the question of
whether and to what extent language is an emergent phenomenon remains controversial.

Linguistic Emergentism
Although it is widely agreed that emergentist approaches to language necessarily stand in opposition to theories of the language
faculty that posit an innate universal grammar, other tenets
of linguistic emergentism are less well defined, and there is no
consensus within the field as to how precisely the standard problems of linguistic analysis should be confronted. Nonetheless,
the starting point for a substantial portion of emergentist work
seems to involve a commitment to the emergentist thesis for
language:
The phenomena of language are best explained by reference to
more basic nonlinguistic (i.e., nongrammatical) factors and
their interaction.

An appealing tag line for linguistic emergentism comes from Elizabeth Bates and Brian MacWhinney (1988, 147): language, they say, is "a new machine built out of old parts." While there
is no general agreement concerning just what those parts might
be, the list is relatively short, ranging from features of physiology
and perception, to processing and working memory, to pragmatics and social interaction, to properties of the input and of
the learning mechanisms.
A significant amount of emergentist work within linguistics adopts the techniques of connectionism, an approach to
the study of the mind that seeks to model learning and cognition in terms of networks of (assumedly) neuron-like units. In its
more extreme forms, connectionism rejects the existence of the
sorts of symbolic representations (including syntactic structure)
that have played a central role in explanatory work on human
language. Gary Marcus (1998, 2001) and Kevin R. Gregg (2003)
offer a critique of this sort of eliminativist program while Paul
Smolensky (1999) and Mark Steedman (1999) discuss ways to
reconcile it with traditional symbolic approaches to language,
including the possibility that representations might be
abstract, higher-level descriptions that approximate the patterns
of neuronal activation that connectionist approaches seek to
model.
Although connectionist modeling provides a useful way to
test various predictions about language acquisition, processing, change, and evolution, the eliminativist position is far from universally accepted within emergentism. Symbolic representations
of one form or another are evident in the work of many emergentists (e.g., Goldberg 1999; Tomasello 2003; O'Grady 2001, 2005),
who nonetheless reject the view that the properties of those representations should be attributed to innate grammatical principles (see innateness and innatism).

Language Acquisition
To date, emergentist work within linguistics has focused most
strongly on the question of how language is acquired (see, e.g.,
the many papers in MacWhinney 1999). The impetus for this
focus stems from opposition to the central claim of grammatical nativism, which is that the principles underlying a good deal
of linguistic knowledge are underdetermined by experience and
must therefore be innate.
Emergentism is not opposed to nativism per se; the fact that
the brain is innately structured in various ways is not a matter of
dispute. However, there is opposition to representational nativism, the view that there is direct innate structuring of particular
grammatical principles and constraints (Elman et al. 1996, 369 ff;
Bates et al. 1998), as implied by many of the proposals associated
with universal grammar.
Contemporary emergentism often includes a commitment
to explaining linguistic development by reference to the operation of simple learning mechanisms (essentially, inductive generalization) that extract statistical regularities from experience.
It is interesting that there is as yet no consensus as to what form
the resulting knowledge might take: local associations and memorized chunks (Ellis 2002), constructions (Goldberg 1999; Tomasello 2003), or computational routines (O'Grady 2001,
2005). In addition, there is variation with respect to the exact
relationship that is assumed to hold between learning and relative frequency in the input. Some work implies a quite direct
relationship (e.g., Ellis 2002), but other work suggests something
less direct (e.g., Elman 2002).
Emergentist work on language acquisition often makes use
of computer modeling to test hypotheses about development.
Jeffrey Elman and his colleagues (e.g., Elman 2002) have been
able to show that a simple recurrent network (SRN) can achieve
at least some of the milestones associated with language acquisition in children, including the identification of category-like
classes of words, the formation of patterns not observed in the
input, retreat from overgeneralizations, and the mastery of subject-verb agreement. (An SRN learns to produce output of its own
by processing sentences in its input; it is specifically designed to
take note of local co-occurrence relationships, or transitional probabilities: given the word X, what's the likelihood that the next word will be Y?)
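The transitional probabilities in question can be illustrated with a far simpler device than an SRN: a bigram table estimated from raw co-occurrence counts. (A minimal sketch; the toy corpus and function name are invented for illustration, and an actual SRN extracts such regularities implicitly in its distributed representations rather than storing an explicit table.)

```python
from collections import Counter, defaultdict

def transition_probs(sentences):
    """Estimate P(next word | current word) from bigram counts."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.lower().split()
        for current, nxt in zip(words, words[1:]):
            counts[current][nxt] += 1
    # Normalize each row of counts into conditional probabilities.
    return {
        w: {nxt: c / sum(following.values()) for nxt, c in following.items()}
        for w, following in counts.items()
    }

# Toy corpus: "the" is followed by "dog" twice and "cat" once,
# so P(dog | the) = 2/3 and P(cat | the) = 1/3.
probs = transition_probs(["the dog barks", "the dog sleeps", "the cat sleeps"])
```

Given the word X, such a table answers "what's the likelihood that the next word will be Y?" directly; the SRN arrives at comparable predictions by processing sequential input.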
Emergentist modeling has yielded impressive results, but it
raises the question of why the particular statistical regularities
exploited by the SRN are in the input in the first place. In other
words, why does language have the particular properties that it
does? Why, for example, are there languages (such as English) in
which verbs agree only with subjects, but no language in which
verbs agree only with direct objects?
Networks provide no answer to this sort of question. In fact,
if presented with data in which verbs agree with direct objects

rather than subjects, an SRN would no doubt learn just this sort of pattern, even though it is not found in any known human
language.
There is clearly something missing here. Humans don't just
learn language; they shape it. Moreover, these two facts are surely
related in some fundamental way, which is why hypotheses about
how linguistic systems are acquired need to be embedded within
a more comprehensive theory of why those systems (and therefore the input) have the particular properties that they do. There
is, simply put, a need for an emergentist theory of grammar.

Emergentist Approaches to Grammatical Theory


A substantial amount of analytic work has addressed the traditional concerns of linguistic analysis, including core phenomena in the major areas of traditional grammatical theory.
SYNTAX. It is possible to identify several strands of emergentist
work on syntax, each devoted to explaining the structural properties of sentences without reference to inborn grammatical principles. Differing views have been put forward by MacWhinney
(2005) and William O'Grady (2001, 2005), both of whom address
a series of issues that lie at the heart of contemporary syntactic
analysis: the design of phrase structure, coreference, agreement, the syntax-phonology interface, and constraints
on long-distance dependencies. MacWhinney seeks to explain
these phenomena in terms of pragmatics, arguing that grammar
emerges from conversation as a way to facilitate accurate tracking and switching of perspective. In contrast, O'Grady holds that
syntactic phenomena are best understood in terms of the operation of a linear, efficiency-driven processor that seeks to reduce
the burden on working memory in the course of sentence formation and interpretation.
Still other work, such as that done within construction
grammar, seeks to reduce syntax to stored pairings of form and
function (constructions). Some of this work has a strong emergentist orientation (e.g., Goldberg 1999; Tomasello 2003), but
some retains a commitment to universal grammar (Goldberg
and Jackendoff 2004, 563).
MORPHOLOGY. Very early connectionist work on morphology called into question the existence of morphological rules
and representations, even for phenomena such as regular past
tense inflection. Instead, it was suggested, a pattern-associator
network learns the relationship between the phonological form
of stems and that of past tense forms (run~ran, walk~walked,
etc.), gradually establishing associations (connections) of different strengths and levels of generality between the two sets of
elements, the most general and strongest involving the -ed past
tense form. James McClelland and Karalyn Patterson (2002) offer
a succinct overview of this perspective.
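The pattern-associator idea can be sketched as a toy delta-rule network. Everything below is invented for illustration (the feature set, the verbs, and the single output unit coding regular versus irregular); it is a minimal caricature, not the architecture of the models McClelland and Patterson survey.

```python
# Toy pattern associator: a single unit learns, via the delta rule, connection
# weights between (invented) phonological features of verb stems and the
# choice of past-tense pattern (1 = regular -ed, 0 = irregular vowel change).
FEATURES = ["ends_nasal", "vowel_i", "ends_k", "ends_stop", "monosyllabic"]

def encode(feats):
    # Binary feature vector plus a constant bias input.
    return [1.0 if f in feats else 0.0 for f in FEATURES] + [1.0]

TRAINING = [  # stem features -> past-tense pattern
    (encode({"vowel_i", "ends_nasal", "monosyllabic"}), 0.0),  # ring~rang
    (encode({"vowel_i", "ends_nasal", "monosyllabic"}), 0.0),  # sing~sang
    (encode({"ends_k", "ends_stop", "monosyllabic"}), 1.0),    # walk~walked
    (encode({"ends_stop", "monosyllabic"}), 1.0),              # jump~jumped
    (encode(set()), 1.0),                                      # open~opened
]

def train(data, epochs=500, lr=0.1):
    w = [0.0] * (len(FEATURES) + 1)
    for _ in range(epochs):
        for x, target in data:
            out = max(0.0, min(1.0, sum(wi * xi for wi, xi in zip(w, x))))
            err = target - out                      # delta-rule error
            w = [wi + lr * err * xi for wi, xi in zip(w, x)]
    return w

def strength(w, feats):
    """Net activation of the -ed response for a stem with these features."""
    return sum(wi * xi for wi, xi in zip(w, encode(feats)))

w = train(TRAINING)
# The -ed connection ends up strongest and most general, while the
# i-plus-nasal features acquire inhibitory weights (the ring~rang class).
assert strength(w, {"ends_stop", "monosyllabic"}) > 0.5          # regular
assert strength(w, {"vowel_i", "ends_nasal", "monosyllabic"}) < 0.5  # irregular
```

Because a novel stem that shares no irregular features is carried by the dominant -ed connections, such a network generalizes the regular pattern without any symbolic rule, which is the point of the connectionist argument.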
More recent work has raised important questions about the
nature of morphemes in general. A key claim of this research
is that morphological structure emerges from statistical regularities in the form–meaning relationships between words. (Hay and
Baayen 2005 offers an excellent review of this research.)
Intriguing experimental work by Jennifer Hay (2003) suggests
that the internal structure of an affixed word is gradient rather
than categorical, reflecting its relative frequency compared to that of its base. The words inadequate and inaudible are a case in
point. Because adequate is more frequent than the affixed form
inadequate, its presence in the derived word is relatively salient,
leading to a high native-speaker rating for structural complexity. In contrast, inaudible, which is more frequent (and therefore
more salient) than audible, receives a low rating for structural
complexity.
If this is right, then morphological structure exists but not
in the categorical form commonly assumed. Rather, what we
think of as morpheme boundaries emerge to varying degrees
of strength from the interaction of more basic factors, such as
frequency, semantic transparency, and even phonotactics.
(The low-probability sequence in inhumane creates a sharper
morpheme boundary than the high-probability sequence in
insincere.)
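Hay's relative-frequency idea reduces to a one-line computation. The frequency counts below are invented, and the strength measure is only a stand-in for her gradient boundary notion.

```python
# Invented corpus frequencies (illustrative only, not real counts).
freq = {"adequate": 2200, "inadequate": 1500, "audible": 300, "inaudible": 900}

def boundary_strength(derived, base):
    # If the base is more frequent than the derived word, the base stays
    # salient inside it, yielding a sharper morpheme boundary (> 0.5).
    return freq[base] / (freq[base] + freq[derived])

assert boundary_strength("inadequate", "adequate") > 0.5  # decomposed
assert boundary_strength("inaudible", "audible") < 0.5    # whole-word-like
```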
THE LEXICON. There have been various attempts to develop an
emergentist approach to the lexicon, which is traditionally seen
as the repository of information about morphemes and words.
One possibility, suggested by Joan Bybee (1998), among others,
is that the lexicon emerges from the way in which (by hypothesis)
the brain responds to and stores linguistic experience by creating units whose strength and productivity are determined largely
by frequency of occurrence. Some of these units correspond to
words, as in a traditional lexicon, but many are phrases and other
larger units of organization, including possibly abstract constructions (see usage-based theory).
Elman (2004) also argues against a pre-structured lexicon, proposing instead that lexical knowledge is implicit in the effects that words have on the mind's internal states, as represented
in the activation patterns created by an SRN. Because an SRN
focuses on co-occurrence relationships (see above), these effects
are modulated by context: a word's meaning, like its syntactic
category, emerges from the contexts in which it is used rather
than from an a priori vocabulary of linguistic primitives.
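The claim that meaning emerges from contexts of use can be illustrated with a much simpler stand-in for an SRN: raw co-occurrence vectors over a toy corpus. The corpus and window size below are invented, and count vectors are not activation patterns, but the moral is the same: words used in similar contexts end up with similar representations.

```python
from collections import Counter
from math import sqrt

corpus = ("the dog chased the cat . the dog bit the cat . "
          "the boy chased the girl . the boy saw the girl .").split()

def context_vector(word, window=2):
    # Count how often each other word appears within +/- window positions.
    vec = Counter()
    for i, w in enumerate(corpus):
        if w == word:
            for j in range(max(0, i - window), min(len(corpus), i + window + 1)):
                if j != i:
                    vec[corpus[j]] += 1
    return vec

def cosine(a, b):
    dot = sum(a[k] * b[k] for k in a)   # Counter returns 0 for missing keys
    na = sqrt(sum(v * v for v in a.values()))
    nb = sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb)

# "dog" and "boy" occur in similar contexts (both precede verbs like
# "chased"), so their vectors are more alike than "dog" and "chased".
assert cosine(context_vector("dog"), context_vector("boy")) > \
       cosine(context_vector("dog"), context_vector("chased"))
```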
PHONOLOGY. Pioneering work on emergentist phonology was
carried out by Patricia Donegan (1985), who noted the unhelpfulness to language learners of classic distributional analysis. As
she observed, it is implausible to suppose that children record
sets of phonetic representations in memory and then compare
them in the hope of determining which phonetic contrasts are
distinctive and which are predictable from context (see speech
perception in infants and speech production). Instead,
Donegan suggests, children begin with a set of processes (nasalization, devoicing, and so forth) that emerge as responses to the
physical limitations of the human vocal tract and the auditory
apparatus. (These limitations are inborn, of course, but are not
inherently linguistic in character, despite their linguistic consequences.) A language's phonemic inventory and allophonic
patterns then emerge as specific processes are suppressed in
response to experience.
A simple example involves the process that palatalizes /s/ in front of a high front vowel, giving the pronunciation [ʃi] for /si/
in many languages (e.g., Japanese). A child learning English is
forced to suppress this process upon exposure to words such
as see, which is pronounced [si], without palatalization. This, in
turn, results in the admission of /ʃ/ to the phonemic inventory of English: Because the palatalization process has been suppressed, the [ʃ] in words such as [ʃi] she must be interpreted
as a sound in its own right, rather than as a process-induced
variant of /s/.
Crucially, this conclusion is drawn without the need for comparison of minimal pairs or similar distributional analysis; the
phonemic inventory emerges in response to a much simpler
and more basic phenomenon: the suppression of processes
based on exposure to particular individual words. Boersma
(1998) and Hayes, Kirchner, and Steriade (2004) discuss a broad
range of other phonological phenomena from an emergentist
perspective.
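The suppression scenario can be caricatured in a few lines of code. The string representation and the learning loop below are invented; they merely stand in for the child's comparison of process-driven output with observed adult forms.

```python
# Sketch of Donegan-style process suppression (representation is invented).

def palatalize(form):
    # The child's initial process: /s/ becomes [ʃ] before a high front vowel.
    return form.replace("si", "ʃi")

INITIAL_PROCESSES = {"palatalization": palatalize}

def learn(observations):
    """Suppress any process whose output conflicts with an observed adult
    form. Each observation pairs an intended form with the adult surface."""
    active = dict(INITIAL_PROCESSES)
    for intended, surface in observations:
        for name, proc in list(active.items()):
            if proc(intended) != surface:
                del active[name]   # counterevidence suppresses the process
    return active

# Japanese-like input is consistent with palatalization, so it survives:
assert "palatalization" in learn([("si", "ʃi")])
# English "see" pronounced [si] forces suppression; [ʃ] must then join the
# phonemic inventory as a sound in its own right:
assert "palatalization" not in learn([("si", "si")])
```

No minimal-pair comparison appears anywhere in the loop: exposure to individual words is enough, which is the point of the emergentist account.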

Concluding Remarks
There is currently no comprehensive emergentist theory of language or its acquisition, but there are various emergentist-inspired
research programs devoted to the construction of such a theory.
For the most part, this work is based on the simple thesis that the
core properties of language are best understood by reference to
the properties of quite general cognitive mechanisms and their
interaction with one another and with experience. The viability
of this idea can and must be measured against its success in confronting the classic empirical challenges of linguistic analysis: figuring out how language works and how it is acquired.
William O'Grady
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Bates, Elizabeth, Jeffrey Elman, Mark Johnson, Annette Karmiloff-Smith,
Domenico Parisi, and Kim Plunkett. 1998. Innateness and emergentism. In A Companion to Cognitive Science, ed. W. Bechtel and G. Graham, 590–601. Oxford: Blackwell.
Bates, Elizabeth, and Judith Goodman. 1999. On the emergence of grammar from the lexicon. In The Emergence of Language, ed. B. MacWhinney, 29–79. Mahwah, NJ: Erlbaum.
Bates, Elizabeth, and Brian MacWhinney. 1988. What is functionalism? Papers and Reports on Child Language Development 27: 137–52.
Boersma, Paul. 1998. Functional Phonology: Formalizing the Interactions
between Articulatory and Perceptual Drives. The Hague: Holland
Academic Graphics.
Bybee, Joan. 1998. The emergent lexicon. In Proceedings of the 34th
Regional Meeting of the Chicago Linguistic Society: The Panels, 421–35.
Chicago: Chicago Linguistic Society. An excellent example of emergentist thinking about the lexicon.
Donegan, Patricia. 1985. How learnable is phonology? In Papers on
Natural Phonology from Eisenstadt, ed. W. Dressler and L. Tonelli,
19–31. Padua: Cooperativa Libraria Editoriale Studentesca Patavina.
Ellis, Nick. 2002. Frequency effects in language processing. Studies in
Second Language Acquisition 24: 143–88.
Elman, Jeffrey. 1993. Learning and development in neural networks: The
importance of starting small. Cognition 48: 71–99. A much-cited and
widely admired illustration of the value of computational modeling in
the study of language acquisition.
. 2002. Generalization from sparse input. In Proceedings of the
38th Regional Meeting of the Chicago Linguistic Society, 175–200.
Chicago: Chicago Linguistic Society. A highly readable summary of
several important SRN-based studies of language acquisition.
. 2004. An alternative view of the mental lexicon. Trends in
Cognitive Science 8: 301–6.
Elman, Jeffrey, Elizabeth Bates, Mark Johnson, Annette Karmiloff-Smith, Domenico Parisi, and Kim Plunkett. 1996. Rethinking Innateness: A Connectionist Perspective on Development. Cambridge, MA: MIT Press.
Goldberg, Adele. 1999. The emergence of the semantics of argument structure constructions. In The Emergence of Language, ed. B.
MacWhinney, 197–212. Mahwah, NJ: Erlbaum.
Goldberg, Adele, and Ray Jackendoff. 2004. The English resultative as a
family of constructions. Language 80: 532–68.
Gregg, Kevin R. 2003. The state of emergentism in second language
acquisition. Second Language Research 19: 42–75.
Hay, Jennifer. 2003. Causes and Consequences of Word Structure. New
York: Routledge.
Hay, Jennifer, and R. Harald Baayen. 2005. Shifting paradigms: Gradient
structure in morphology. Trends in Cognitive Science 9: 342–8. An
excellent survey of work on emergentist morphology.
Hayes, Bruce, Robert Kirchner, and Donca Steriade, eds. 2004. Phonetically Based Phonology. Cambridge: Cambridge University Press.
MacWhinney, Brian. 1998. Models of the emergence of language.
Annual Review of Psychology 49: 199–227.
. 2002. Language emergence. In An Integrated View of Language
Development: Papers in Honor of Henning Wode, ed. P. Burmeister,
T. Piske, and A. Rohde, 17–42. Trier, Germany: Wissenschaftlicher Verlag.
. 2004. A multiple process solution to the logical problem of language acquisition. Journal of Child Language 31: 883–914.
. 2005. The emergence of grammar from perspective. In Grounding
Cognition: The Role of Perception and Action in Memory, Language and
Thinking, ed. D. Pecher and R. Zwaan, 198–223. Cambridge: Cambridge
University Press.
MacWhinney, Brian, ed. 1999. The Emergence of Language. Mahwah,
NJ: Lawrence Erlbaum.
Marcus, Gary. 1998. Rethinking eliminative connectionism. Cognitive
Psychology 37: 243–82.
. 2001. The Algebraic Mind. Cambridge, MA: MIT Press.
McClelland, James, and Karalyn Patterson. 2002. Rules or connections
in past-tense inflection: What does the evidence rule out? Trends in
Cognitive Science 6: 465–72. An update and survey of connectionist
work on inflection.
Mill, John Stuart. [1843] 1930. A System of Logic Ratiocinative and
Inductive. London: Longmans, Green, and Co.
O'Grady, William. 2001. An emergentist approach to syntax. Available
online at: http://www.ling.hawaii.edu/faculty/ogrady/. A summary of
the detailed arguments for an emergentist theory of syntax found in
O'Grady (2005).
. 2005. Syntactic Carpentry: An Emergentist Approach to Syntax.
Mahwah, NJ: Erlbaum. An emergentist approach to syntax that seeks
an understanding of many of the classic problems of syntactic theory
in terms of processing.
Palmer-Brown, Dominic, Jonathan Tepper, and Heather Powell. 2002. Connectionist natural language parsing. Trends in Cognitive Science 6: 437–42.
Smolensky, Paul. 1999. Grammar-based connectionist approaches to language. Cognitive Science 23: 589–613.
Steedman, Mark. 1999. Connectionist sentence processing in perspective. Cognitive Science 23: 615–34.
Tomasello, Michael. 2003. Constructing a Language: A Usage-Based
Theory of Language Acquisition. Cambridge, MA: Harvard University
Press. A widely cited example of a usage-based approach to language
acquisition.

EMERGENT STRUCTURE
This term is used in many fields, including scientific and social
disciplines, where it has been applied to a variety of adaptive
self-organizing systems, from termite mounds to grocery checkout lines. The term emergent refers especially to an open-ended process in which systematicity is partial and incomplete
and in which a system is in a constant course of (re)formation.
In the study of language, the expression emergent grammar
was coined by Paul Hopper (1987) as a methodological proposal for approaching the relationship between grammar and
the local structure of natural discourse. Logically prior, fixed
grammar was, Hopper argued, inconsistent with the kinds of ad
hoc linguistic decisions made by speakers. The notion of emergent grammar inverts the standard picture of grammar, as well
as the generally accepted logical priority of structure over text.
Linguistic structure is thus to be seen as a product of, rather than
a prerequisite to, discourse. Since discourse is ongoing, structure
is emergent, that is, continually in a process of formation according to the current needs of the interaction. (See Weber 1997 for
further discussion.)
The database for the study of language from this perspective is a corpus of transcribed texts, usually oral, and, recently,
conversational (Ochs, Schegloff, and Thompson 1996). In this
respect, too, emergent grammar differs from structural and cognitive grammar, in which conclusions are normally based on
isolated constructed sentences. The explanation for grammar,
according to this theory, lies in frequency (Bybee and Hopper
2001) and the associated routinization of forms (Haiman 1994).
High-frequency forms tend to become phonetically reduced
and to be restructured (Bybee 2001). Typical examples are the
English pronoun + auxiliary sequences like I'll, you're, we've, and so
on. Emergent grammar is thus relevant to the more general study
of grammaticalization.
Incipient structure (that is, looking backward at the historical origins of a structured system or forward to the predicted course of events leading to a structured system, as in the study of first language acquisition and of most varieties of cognitive linguistics) is more properly described as emerging than as emergent. The noun emergence is ambiguous in this respect.
Paul J. Hopper
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Bybee, Joan. 1998. The emergent lexicon. Papers of the Annual Meeting
of the Chicago Linguistic Society 34: 421–35.
. 2001. Phonology and Language Use. Cambridge: Cambridge
University Press.
Bybee, Joan, and Paul Hopper, eds. 2001. Frequency and the Emergence of
Linguistic Structure. Amsterdam: Benjamins.
Haiman, John. 1994. Ritualization and the development of language.
In Perspectives on Grammaticalization, ed. William Pagliuca, 328.
Amsterdam: John Benjamins.
Hopper, Paul. 1987. Emergent grammar. Papers of the Annual Meeting
of the Berkeley Linguistic Society 13: 139–57.
Lewin, Roger. 1992. Complexity: Life at the Edge of Chaos. New
York: Macmillan. A readable account of chaos theory and the emergence of structured systems.
Ochs, Elinor, Emanuel Schegloff, and Sandra Thompson, eds. 1996.
Interaction and Grammar. Cambridge: Cambridge University
Press.
Weber, Thilo. 1997. The emergence of linguistic structure: Paul
Hopper's emergent grammar hypothesis revisited. Language Sciences 19.2: 177–96.


EMOTION AND LANGUAGE


A vast domain of research on emotion and language cuts across
many disciplines, methodologies, and theoretical frameworks.
To render this topic coherent and manageable, we focus on the
current resurgence of research on emotional words. Emotional
words (e.g., flower, shit) contrast with connotatively neutral
words (e.g., toaster, being) and include subcategories such as
taboo words (insults, scatological references, and swearing or
curse words), threatening words (e.g., negative valence words
referring to menacing situations such as murder and abuse),
and some emotion words (e.g., terror, disgust). In a continuum of vocal emotional expression ranging from nonverbal
(e.g., screams) to abstract verbal (e.g., figurative language; see
idioms, irony, metaphor, verbal humor), T. B. Jay (2003)
argues, taboo words constitute the strongest form of emotional
language: Taboo words are more arousing than figurative language and yield reliable and robust emotional effects more often
than do threatening words.
We review research on emotional words from historical,
methodological, and theoretical perspectives.

Historical Perspectives
Historical perspectives illustrate the multiple domains and methodologies of research on emotional words. In the mid-1800s,
neuropsychological case studies of Hughlings Jackson (1958)
and others helped shape current ideas concerning automatic
or uncontrollable production of emotional words (see, e.g., Van
Lancker 1987). Carl Jung's (1910) work with emotional words
in free association tasks also shaped procedures for diagnosing clinical disorders such as schizophrenia (see also psychoanalysis and language). From 1950 to 1975, experimental
psychologists used classical conditioning concepts to analyze
the learning of emotional words (e.g., Staats 1968) and adopted
perceptual defense paradigms to determine whether ego-protective processes shield threatening stimuli (taboo words) from
awareness (e.g., Dixon 1971). However, both lines of research
were largely abandoned: perceptual defense because of methodological flaws and the learning of emotional words because computer metaphors dominated the study of language and cognition
and downplayed emotion during the period of 1975 to 1990
(Jay 2003).

Methodological Perspectives
RATING STUDIES. Rating studies provide a method for determining the emotional qualities of words. A classic example is
the semantic differential (Osgood, Suci, and Tannenbaum 1957),
where ratings of words on bipolar connotative scales reflect three
underlying dimensions: evaluation (the valence component,
e.g., negative–positive); activity (e.g., fast–slow); and potency (e.g., strong–weak). L. H. Wurm and D. A. Vakoch (1996) argued
that evolutionary considerations and relations between processing time data and the evaluation, activity, and potency ratings for
words indicate an affective lexicon (for avoiding threats) that differs from the general lexicon (for obtaining valuable resources).
Other rating studies involving the affective lexicon include
Bellezza, Greenwald, and Banaji (1986), Bradley and Lang
(1999), and Jay (1992). Unrepresented in current rating studies
are gender, age, psychological history, personality factors, social
context, political and religious affiliation, and cultural factors
(see culture and language), all of which powerfully influence people's perception of emotion-linked words (Jay 2000).
SELF-REPORT AND FIELD STUDIES. Field studies of taboo word use
indicate that emotional language is learned early and persists
well into old age (Jay 2000). Self-report studies suggest that punishment for cursing fails to alter the actual likelihood of swearing but nevertheless serves a function because the same people
admit that they would punish their own children for cursing (Jay,
King, and Duncan 2006).
NEUROPSYCHOLOGICAL STUDIES. Neuropsychological studies
have focused on two primary dimensions of emotions: arousal
(excitement) and valence (positive–negative). A primary neuropsychological measure of arousal and unconscious autonomic activity is the skin conductance response (SCR; see,
e.g., LaBar and Phelps 1998). For emotional words presented
to bilinguals, the SCR decreases as a function of the order in
which a language is learned (Harris, Aycicegi, and Gleason
2003). The SCR also varies with the estimated emotional force
of aversive words (Dewaele 2004) and occurs even when
presentation times are too brief for word identification (Silvert
et al. 2004).
Amygdala activity also indexes arousal: Threatening words
trigger increased amygdalar activation (Isenberg et al. 1999),
and amygdalar damage impairs recognition of arousal but not
valence characteristics of emotional words (Adolphs, Russell,
and Tranel 1999; see also Lewis et al. 2007 for the role of other
subcortical structures in arousal). Some cortical and subcortical areas respond only to valence, some respond only to arousal,
and some respond to an interaction of valence and arousal,
particularly when valence is negative (Lewis et al. 2007).
Finally, some cortical areas respond to valence per se, while
others respond selectively to either positive or negative valence
(Maddock, Garrett, and Buonocore 2003).
Relative activity in the left hemisphere (LH) versus right
hemisphere (RH) also indexes emotional processing, albeit
less consistently across studies, and the nature and scope of
emotion-linked processing in the RH is an ongoing issue (see
Borod, Bloom, and Haywood 1998). RH brain damage is associated with emotional blunting (Gainotti 1972) and difficulties in
identifying emotional words or the emotion they represent, in
matching words and emotions, in interpreting emotional content, in describing emotional autobiographical information,
in self-expression with emotional words (Borod, Bloom, and
Haywood 1998), and in comprehending and expressing humor
(Blake 2003). The corpus callosum that links the RH and LH
also plays a role in comprehending emotion-linked prosody,
humor, and figurative usages (Brown et al. 2005; Paul et al. 2003).
The frontal lobe seems to regulate or inhibit socially inappropriate uses of emotional words, with links between frontal
lobe damage and verbal aggression, such as excessive cursing
(e.g., Grafman et al. 1996).
CLINICAL AND INDIVIDUAL DIFFERENCE STUDIES. Client–patient interactions focus on emotions, and an inability to express one's emotions in words may reflect a serious psychiatric problem
known as alexithymia. Alexithymic individuals have few words
for describing their feelings and communicating emotional
distress, are unable to identify and describe subjective states,
and have difficulty interacting with others, including therapists
(Taylor, Bagby, and Parker 1997).
Clinical studies have developed strategies for facilitating therapeutic communication and emotional interactions in general,
for example, use of metaphor (see Stine 2005). Clinical studies
have also developed new ways of using emotion-linked words to
diagnose psychopathology. An example is the emotional Stroop
task where clients name the font color of words while attempting
to ignore their meaning: Longer color naming times for specific
word classes (e.g., web, spider) are associated with clinical problems such as phobias (e.g., arachnophobia), anxiety and depressive disorders, alexithymia, eating disorders, drug abuse, and a
range of other psychopathologies (see Williams, Mathews, and
MacLeod 1996 for a review).
EXPERIMENTAL STUDIES. Recent experimental studies have made
extensive explorations of the effects of emotional words on cognitive processes such as memory and attention. For example,
in a variant of the emotional Stroop task known as the taboo
Stroop (MacKay et al. 2004), people name the font color of taboo
and neutral words (equated for length, familiarity, and category
coherence) while ignoring the meaning of the words and their
screen location. They then receive a surprise memory test for the
words, the font color of particular words, or the screen location
of particular words, and the results indicate better memory for
taboo than neutral words and better memory for the font colors
and screen locations of taboo than of neutral words (see, e.g.,
MacKay et al. 2004; MacKay and Ahmetzanov 2005). These and
other results suggest that taboo words facilitate recall of contextual details in the same way as do flashbulb memories for
traumatic events such as the September 11, 2001 tragedies,
after which people vividly recall contextual details associated
with the emotion-linked event, for example, how and when
they first learned of the event, where they were, what they were
doing, and who else was present (see MacKay and Ahmetzanov
2005).
Other results indicate that taboo words impair immediate
recall of prior and subsequent neutral words in rapidly presented
mixed lists containing taboo and neutral words (e.g., MacKay,
Hadley, and Schwartz 2005), without impairing recall of neighboring words in pure, all-taboo lists (Hadley and MacKay 2006).
However, lexical decision times (the time to identify a stimulus
as a word) do not differ for taboo versus neutral words (MacKay
et al. 2004). We discuss theoretical perspectives on this pattern
of results next.

Theoretical Perspectives
Current research on emotional words illustrates a gamut of theoretical perspectives that differ in their scope and goals and in
the nature and specificity of the predictions they make. Jays
(2000) neuro-psychosocial theory of cursing summarizes likelihood estimates of various forms of cursing, based on neurological (e.g., conscious state, brain damage), psychological (e.g.,
personality, age, history), and social context (e.g., culture, class) factors. American males are more likely to curse than females both as children and as adults, although women also learn a
range of taboo words, whether they use them or not. Similarly,
Americans with high sexual anxiety but no religious training are
less likely to use sex-linked curse words than profanity or blasphemy, especially in conversations with same-sex others (see Jay
1992, 2000, 2003).
W. Bucci's (1997) multiple code theory (MCT) of emotional information processing links Freudian and connectionist concepts via the concept of referential activity (RA).
RA is an index of the ability to link primary (e.g., emotional,
unconscious) and secondary (e.g., verbal, conscious) levels
of processing within a connectionist network. Applied in the
domain of clinical psychology, MCT has provided explanations for negative psychological states, such as repression, in
terms of the nature or quality of connections between these
fundamentally linguistic versus emotional levels of processing. Under MCT, people with high versus low RA differ in their
ability to express and describe their emotions, in the structure
and organization of their narratives, and in their therapeutic
success rates.
Resource theories of emotion and attention (e.g., Wells
and Matthews 1994) perhaps provide the broadest conceptualization of emotion and cognitive processes. Under resource
theories, threatening stimuli attract limited-capacity cognitive
resources, thereby reducing resources available for processing
and responding to other stimuli, for example, font color in clinical, emotional, and taboo Stroop tasks. This hypothesis readily
describes phenomena such as the taboo Stroop effect (longer
times for naming the font color of taboo than of neutral words)
but cannot describe other phenomena, for example, superior
memory for the font color and screen location of taboo than of
neutral words (see MacKay et al. 2004).
Two exceptions to the descriptive or post hoc approach that
characterizes resource theories are noteworthy. One is arousal
theory (e.g., LeDoux 1996) as applied to emotional words (e.g.,
Kensinger and Corkin 2003). Under arousal theory, low-level
sensory aspects of emotional stimuli, such as taboo words,
directly engage an emotional reaction system (in the amygdala)
independently of other stimulus factors, such as context and
presentation rate. The emotional reaction system then triggers
enhanced skin conductance and facilitates memory consolidation for the emotional stimuli and their context of occurrence (in
the hippocampus).
What makes arousal theory attractive is its generality and testability. For example, arousal theory explains flashbulb memories
under the hypothesis that arousal tends to induce storage of perceptual images that include both the emotional stimulus and
its context of occurrence. However, arousal theory as applied to
emotional words has not fared well in recent tests: Contrary to
arousal theory, if presented in mixed taboo-neutral lists at relatively slow rates (e.g., 2,000 ms/word) or if presented in pure (alltaboo or all-neutral) lists at rapid rates (e.g., 200 ms/word), taboo
words are no better recalled than neutral words equated for
familiarity, length, and category coherence (Hadley and MacKay
2006). Also contrary to arousal theory, recent data indicate that
taboo words do not trigger imagelike memories (MacKay and
Ahmetzanov 2005).


The second notable exception to the summary-description
approach is node structure binding theory, or binding theory
for short (e.g., Hadley and MacKay 2006). Under binding theory, emotion-linked stimuli, such as taboo words, engage the
emotional reaction system, which delays activation of binding
mechanisms (located in the hippocampus) for linking concurrent neutral stimuli to their context of occurrence. As a result,
(less important) neutral stimuli only form links to their context of
occurrence after links to context for (more important) emotion-linked stimuli have been formed.
These binding theory assumptions have generated counterintuitive predictions that subsequent experimental tests have
verified. For example, unlike other theories, binding theory
correctly predicted impaired recall of neutral neighbors before
and after a taboo word if and only if mixed (taboo-neutral) word
lists are presented rapidly (Hadley and MacKay 2006). Binding
theory also correctly predicted no difference in recall of taboo
versus neutral words in pure (all-taboo or all-neutral) lists presented rapidly or slowly (Hadley and MacKay 2006). Unlike
other theories, binding theory also correctly predicted no difference in lexical decision times (the time to identify a stimulus
as a word) for taboo versus neutral words (MacKay et al. 2004).

Conclusion
Both historical and contemporary research on emotional
words reflects a wide variety of theoretical and methodological
approaches in fields ranging from neuroscience to psycholinguistics to cognitive and clinical psychology. Further research is
required to piece together these multiple domains and to develop
a general understanding of emotional words and their relation to
other cognitive processes. However, emotional words currently
seem poised to resume their central position in the language sciences and related disciplines.
Kristin L. Janschewitz and Donald G. MacKay
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Adolphs, R., J. A. Russell, and D. Tranel. 1999. A role for the human
amygdala in recognizing emotional arousal. Psychological Science
10: 167–71.
Bellezza, F. S., A. G. Greenwald, and M. R. Banaji. 1986. Words high and low in pleasantness as rated by male and female college students. Behavior Research Methods, Instruments, and Computers 18: 299–303.
Blake, M. L. 2003. Affective language and humor appreciation after
right hemisphere brain damage. Seminars in Speech and Language
24: 107–19.
Borod, J. C., R. L. Bloom, and C. S. Haywood. 1998. Verbal aspects
in emotional communication. In Right Hemisphere Language
Comprehension: Perspectives from Cognitive Neuroscience, ed. M.
Beeman and C. Chiarello, 285–307. Mahwah, NJ: Erlbaum.
Bradley, M. M., and P. J. Lang. 1999. Affective Norms for English Words
(ANEW): Instruction Manual and Affective Ratings. Technical Report
C-1, The Center for Research in Psychophysiology, University of Florida.
Brown, W. S., L. K. Paul, M. Symington, and R. Dietrich. 2005.
Comprehension of humor in primary agenesis of the corpus callosum. Neuropsychologia 43: 906–16.
Bucci, W. 1997. Symptoms and symbols: A multiple code theory of somatization. Psychoanalytic Inquiry 17: 151–72.

280

Dewaele, J. 2004. The emotional force of swearwords and taboo words in the speech of multilinguals. Journal of Multilingual and Multicultural Development 25: 204–22.
Dixon, N. F. 1971. Subliminal Perception: The Nature of a Controversy.
London: McGraw-Hill.
Gainotti, G. 1972. Emotional behavior and the hemispheric side of the
lesion. Cortex 8: 41–55.
Grafman, J., K. Schwab, D. Warden, A. Pridgen, H. R. Brown, and
A. M. Salazar. 1996. Frontal lobe injuries, violence, and aggression: A
report of the Vietnam Head Injury Study. Neurology 46: 1231–8.
Hadley, C. B., and D. G. MacKay. 2006. Does emotion help or hinder
immediate memory? Arousal versus priority-binding mechanisms.
Journal of Experimental Psychology: Learning, Memory, and Cognition
32: 79–88.
Harris, C. L., A. Aycicegi, and J. B. Gleason. 2003. Taboo words and reprimands elicit greater autonomic reactivity in a first language than in a
second language. Applied Psycholinguistics 24: 561–79.
Isenberg, N., D. Silbersweig, A. Engelien, S. Emmerich, K. Malavade, B.
Beattie, A. C. Leon, and E. Stern. 1999. Linguistic threat activates the
human amygdala. Proceedings of the National Academy of Science
96: 10456–9.
Jackson, H. 1958. Selected Writings of Hughlings Jackson. Vol. 2. New
York: Basic Books.
Jay, T. B. 1992. Cursing in America. Philadelphia: John Benjamins.
. 2000. Why We Curse. Philadelphia: John Benjamins.
. 2003. Psychology of Language. Upper Saddle River, NJ: Prentice-Hall.
Jay, T. B., K. King, and T. Duncan. 2006. Memories of punishment for
cursing. Sex Roles 32: 123–33.
Jung, C. G. 1910. The association method. American Journal of
Psychology 31: 219–69.
Kensinger, E. A., and S. Corkin. 2003. Memory enhancement for emotional words: Are emotional words more vividly remembered than
neutral words? Memory and Cognition 31: 1169–80.
LaBar, K., and E. Phelps. 1998. Arousal-mediated memory consolidation: Role of the medial temporal lobe in humans. Psychological
Science 9: 4903.
LeDoux, J. 1996. The emotional brain: The mysterious underpinnings of
emotional life. New York: Simon and Schuster.
Lewis, P. A., H. D. Critchley, P. Rotshtein, and R. J. Dolan. 2007. Neural
correlates of processing valence and arousal in affective words.
Cerebral Cortex 17: 7428.
MacKay, D. G., and M. V. Ahmetzanov. 2005. Emotion, memory, and
attention in the taboo Stroop paradigm: An experimental analog of
flashbulb memories. Psychological Science 16: 2532.
MacKay, D. G., C. B. Hadley, and J. H. Schwartz. 2005. Relations between
emotion, illusory word perception, and orthographic repetition blindness: Tests of binding theory. Quarterly Journal of Experimental
Psychology 8: 151433.
MacKay, D. G., M. Shafto, J. K. Taylor, D. Marian, L. Abrams, and J. Dyer
2004. Relations between emotion, memory and attention: Evidence
from taboo Stroop, lexical decision, and immediate memory tasks.
Memory and Cognition 32: 47488.
Maddock, R. J., A. S. Garrett, and M. H. Buonocore. 2003. Posterior cingulate cortex activation by emotional words: fMRI evidence from a
valence decision task. Human Brain Mapping 18: 3041.
Osgood, C. E., G. J. Suci, and P. H. Tannenbaum. 1957. The Measurement
of Meaning. Urbana: University of Illinois Press.
Paul, L. K., D. Van Lancker-Sidtis, B. Schieffer, R. Dietrick, and W. S.
Brown. 2003. Communicative deficits in agenesis of the corpus callosum: Nonliteral language and affective prosody. Brain and Language
85: 31324.

Emotion, Speech, and Writing


Silvert, L., S. Delplanque, H. Bouwalerh, C. Verpoort, and H. Sequeira.
2004. Autonomic responding to aversive words without conscious
valence discrimination. International Journal of Psychophysiology
53: 13545.
Staats, A. W. 1968. Language, Learning, and Cognition. New York: Holt,
Rinehart and Winston.
Stine, J. J. 2005. The use of metaphors in the service of the therapeutic
alliance and therapeutic communication. Journal of the American
Academy of Psychoanalysis and Dynamic Psychiatry 33: 53145.
Taylor, G. J., R. M. Bagby, and J. D. A. Parker. 1997. Disorders of
Affect Regulation: Alexithymia in Medical and Psychiatric Illness.
Cambridge: Cambridge University Press.
Van Lancker, D. 1987. Nonpropositional speech: Neurolinguistic studies. In Progress in the Psychology of Language, ed. A.W. Ellis, 49118.
London: Erlbaum.
Wells, A., and G. Matthews. 1994. Attention and Emotion: A Clinical
Perspective. Hove, UK: Lawrence Erlbaum.
Williams, J. M. G., A. Mathews, and C. MacLeod. 1996. The emotional
Stroop task and psychopathology. Psychological Bulletin 120: 324.
Wurm, L. H., and D. A. Vakoch. 1996. Dimensions of speech perception: Semantic associations in the affective lexicon. Cognition and
Emotion 10: 40923.

EMOTION, SPEECH, AND WRITING


In our everyday life, we are frequently exposed to expressions such
as "a thousand words cannot express a single emotion," "what I
feel is something that is beyond words," and so on. This kind of
utterance, reflecting the difficulty of expressing emotions, evokes
special interest: Can speech and writing really express emotions?
In today's world, with the increasing awareness of emotion as
part of the self and of the importance of expressing emotion as
part of human communication, language becomes vital to the
understanding and analysis of emotions. Lexical choices reflect
how people experience the world around them and thus constitute
mediators between individuals' emotions, which are internal and
subjective, and external entities, such as society and environment.
Psychologists and psychoanalysts (Freud and his followers)
recognize that, in spite of the importance of nonverbal behavior,
words are the natural way of exteriorizing the inner emotional
world (see psychoanalysis and language). Theorists of emotion
(e.g., Ortony, Clore, and Foss 1987) likewise stress that language
offers the most convenient access for researching emotions and
that emotion words are the best way of reflecting emotional
experiences. Linguists such as N. J. Enfield and A. Wierzbicka
(2002) went further, stating that it would be impossible to examine
people's emotions without putting language at the center, both as
the object of the research and as the research tool.
One reason for the complexity of such studies, according to several researchers, is linguistic usage that confuses emotion terms.
Criticism of psychological research into emotions focuses mainly
on the fact that most research in this field relies largely on linguistic
labels rather than on direct measurement of the emotion itself. If
this is indeed the case, it is particularly important to investigate
the language of emotions as a discrete issue, with tools exterior to
the emotion itself, such as those of linguistics.

In her critical essay, J. T. Irvine (1990) wrote that many linguists
tend to "get cold feet" when it comes to considering how emotions
are expressed verbally. Accordingly, in the terminology of
Ferdinand de Saussure, we can say that emotion is accepted as
integral to parole, which is held to be linguistically less meaningful
than langue, language in its broadest sense. Thus, examination of
emotion is pushed to the periphery of linguistics. Irvine also
remarks that though there are languages with phonological and
morphological units that indicate emotional states in speech,
linguists frequently tend to combine such elements with general
descriptions of grammar rather than emphasizing such verbal
expressions of emotion. However, she notes two important
linguistic texts that do deal with emotion in language, namely,
Edward Sapir's treatment of the lexicon of emotions as mirroring
culture and Roman Jakobson's work on the emotive function of
language.
C. Caffi and R. W. Janney (1994) examined the rhetorical
strategies for expressing emotion by comparing psychological
categories of emotions with linguistic categories. They define
emotional communication as directed strategies for imparting
emotional information in speech or writing, insisting that such
expressions must be analyzed linguistically, because language,
spoken or written, is the means for conveying emotion. Their
model comprises linguistic markers, including specific emotion
words, obligatory words, syntax markers, and spoken-language
mechanisms (e.g., tone and intonation, prosody, syllable length).
Caffi and Janney's writings imply that there are significant
connections between textual linguistic usage and emotions, as
evidenced also in diaries, letters, and other autobiographical
writings. For instance, language was used to measure emotion in
a study by G. Collier, D. Kuiken, and M. E. Enzle (1982). The
researchers noted that when describing negative emotions, people
use more complex constructions than when describing positive
emotions, and the same applies to expressions of negative as
opposed to positive personal qualities; the positive is always more
clearly expressed. Assessing the positivity or negativity of
described emotions or traits confirmed the pattern: the more
complex the description, the more likely it was to be negative.
An earlier study by C. E. Osgood (1958) dealt with the connection
between emotion and language, establishing a link between the
linguistic characteristics of a text and the motivational state of its
author when writing it. The research studied suicide notes, written
under the influence of very strong emotion (the last letter), as
compared with ordinary correspondence with family or close
friends.
In her chapter "How and why is emotion communicated?"
S. Planalp (1999) writes that verbal expression of feelings is
endemic to the process of communication, even though people
do not always use words. They do not, as a rule, announce "I'm
angry" or "I'm feeling depressed at the moment," but there are
other verbal indications, like swearing or extravagant outbursts
such as "I could kill him!"
J. W. Pennebaker and M. E. Francis (1996) analyzed personal texts in which first-year college students described their thoughts about beginning their studies. Linguistic and cognitive
parameters were classified according to specific verbal categories. This included classifying emotion words used by the subjects (in particular, positive and/or negative expressions), while
the cognitive parameters included clarity, accessibility to the
reader, and schematic organization of the text. The connection
between these linguistic and cognitive variables and mental
health was then examined, as were the academic achievements
of the subjects in their ongoing studies.
A. Boals and K. Klein (2005) examined how the words used
in a narrative can convey stress or distress in regard to levels of
pain after a negative emotional event. Their subjects were more
than 200 students who had undergone a romantic crisis or the
breaking up of a relationship. The students were asked to write
about both the relationship and the effects of the separation.
The researchers found conspicuously more use of negative
emotion expressions, physical words, and first-person utterances
in the accounts of the separation, as compared with descriptions
of the relationship before the breakup. It was possible to pinpoint
linguistic differences between rejection/repression and intensive
internalization of an experience. The rejecters tended to use more
casual language, negative emotion words, and the first person
singular, as well as pronouns when referring to others, but also
used fewer cognitive words. Cognitive expressions imply an
active search for meaning and for comprehension of a stressful
event, so that using them is characteristic of people who
thoroughly work through such an event.
Psychotherapy also offers sources for researching emotions via
language. Therapists' diagnoses and therapeutic methods are
frequently based on patients' language: choice of words, slips of
the tongue, the narratives in their stories, repetitions when
describing a trauma, and other markers (see, for example, Bucci
2001).
D. D. Danner, D. A. Snowdon, and W. V. Friesen (2001) examined autobiographies of nuns as part of research known as the
Nun Study. They examined the link between writings about
positive emotional incidents, expressed in positive terms, and
the life span of the writers. Emotional stability, measured according to the use of emotion words, was found to have a significant
positive effect.
Nonautobiographical texts not produced under laboratory conditions can be derived from Internet chat, blogs, and e-mails
in various frameworks. For example, M. A. Cohn, M. R. Mehl, and
J. W. Pennebaker (2004) examined changes in the language use
of American citizens after September 11, 2001. The researchers
studied randomly selected blogs over a period of four months:
two months prior to and two months after the traumatic event.
Linguistic markers of psychological differences were studied.
Two weeks after the attack, the writers had returned to baseline
in regard to use of lexical expressions of emotion. Pennebaker,
who collaborated with Francis in researching academic
achievement according to linguistic parameters and with Cohn
on linguistic markers in blogs written after 9/11, developed a tool
for categorizing various types of text according to style, called
LIWC (Linguistic Inquiry and Word Count), which
is available online (Pennebaker 2007). Pennebaker maintains
that his computer program collates words from various categories and translates them, according to their relative number
in the text, to psychological meanings. Main categories include,
for example, words relating to the self, social words directed
to others, words expressing positive or negative emotion, cognitive
words, long words (more than six letters), and others.
He believes that constant use of positive expressions indicates
optimism, whereas negative expressions indicate depression.
Cognitive terms ("In my opinion," "It seems to me," "I think,"
etc.) indicate that the writer does a lot of considering and preparation when writing and is thus more thoughtful and self-aware.
Constant use of long words suggests that the writer is alienated,
keeping his/her distance.
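The counting scheme Pennebaker describes can be sketched in a few lines of code. The sketch below is illustrative only: the category word lists and the analyze function are invented stand-ins, not LIWC's actual (proprietary) dictionaries or interface.

```python
# Minimal sketch of LIWC-style text analysis: count words per category
# and report each category as a percentage of total words. The tiny
# category lists here are invented for illustration; LIWC's real
# dictionaries are far larger and proprietary.

import re

CATEGORIES = {
    "positive_emotion": {"happy", "love", "good", "hope"},
    "negative_emotion": {"sad", "angry", "hate", "afraid"},
    "cognitive": {"think", "because", "know", "consider"},
    "self": {"i", "me", "my", "mine"},
}

def analyze(text: str) -> dict:
    """Return each category's share of the total word count, in percent."""
    words = re.findall(r"[a-z']+", text.lower())
    total = len(words) or 1
    counts = {name: 0 for name in CATEGORIES}
    long_words = 0
    for w in words:
        if len(w) > 6:  # LIWC-style "long words" (more than six letters)
            long_words += 1
        for name, vocab in CATEGORIES.items():
            if w in vocab:
                counts[name] += 1
    pct = {name: 100.0 * n / total for name, n in counts.items()}
    pct["long_words"] = 100.0 * long_words / total
    return pct

result = analyze("I think I am happy because I know my hope is good.")
```

For the sample sentence, the sketch reports 25 percent positive-emotion words and about 33 percent self-referential words, echoing the kind of relative-frequency profile the entry describes.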
Another group of researchers (Pennebaker, Slatcher, and
Chung 2005) attempted to learn about the personalities and emotions of U.S. presidential candidates from their speeches. It was
shown that in spite of advice received by political candidates from
their advisors about using words correctly (e.g., using the first
person plural instead of singular), they sometimes speak more
freely, revealing more about their personalities. Pennebaker
and his colleagues emphasized the use of functional words that
indicate the ability to absorb and organize thoughts and ideas.
Using the program developed for this study, they also examined
positive and negative expressions of emotion, cognitive words,
exclusives, singular and plural expressions, etc.
From all these diverse studies, we learn that language, the
dominant aspect of intercommunication between individuals,
is the simplest method of revealing a different system, one that
has its own attributes and influences all aspects of our lives, that
is, the emotional system. One can, as a rule, consciously control
the content of the story one tells, but it is more difficult to control
the exact choice of each word. When it comes to what to write
or say, we are aware of what we are doing, but this is not always
the case with how we do it. Even the most practiced speaker finds
it difficult to monitor all the words he or she selects in order to
communicate. Thus, linguistic markers are, in fact, the building
blocks that must be used as the foundation for researching emotion in language.
Osnat Argaman
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Boals, A., and K. Klein. 2005. Word use in emotional narratives about failed romantic relationships and subsequent mental health. Journal of Language and Social Psychology 24: 252–68.
Bucci, W. 2001. Pathways of emotional communication. Psychoanalytic Inquiry 21.1: 40–70.
Caffi, C., and R. W. Janney. 1994. Toward a pragmatics of emotive communication. Journal of Pragmatics 22: 325–73.
Cohn, M. A., M. R. Mehl, and J. W. Pennebaker. 2004. Linguistic markers of psychological change surrounding September 11, 2001. Psychological Science 15: 687–93.
Collier, G., D. Kuiken, and M. E. Enzle. 1982. The role of grammatical qualification in the expression and perception of emotion. Journal of Psycholinguistic Research 11: 631–50.
Danner, D. D., D. A. Snowdon, and W. V. Friesen. 2001. Positive emotions in early life and longevity: Findings from the Nun Study. Journal of Personality and Social Psychology 80: 804–13.
Enfield, N. J., and A. Wierzbicka. 2002. Introduction: The body in description of emotion. Pragmatics and Cognition 10.1/2: 1–25.
Irvine, J. T. 1990. Registering affect: Heteroglossia in the linguistic expression of emotion. In Language and the Politics of Emotion, ed. C. A. Lutz and L. Abu-Lughod, 126–61. Cambridge: Cambridge University Press.
Ortony, A., G. L. Clore, and M. A. Foss. 1987. The referential structure of the affective lexicon. Cognitive Science 11: 341–64.
Osgood, C. E. 1958. Some effects of motivation on style of encoding. In Style in Language, ed. T. A. Sebeok, 293–306. Cambridge, MA: MIT Press.
Pennebaker, J. W. 2007. The world of words. Available online at: http://www.liwc.net/.
Pennebaker, J. W., and M. E. Francis. 1996. Cognitive, emotional and language processes in disclosure. Cognition and Emotion 10: 601–26.
Pennebaker, J. W., R. B. Slatcher, and C. K. Chung. 2005. Linguistic markers of psychological state through media interviews: John Kerry and John Edwards in 2004, Al Gore in 2000. Analyses of Social Issues and Public Policy 5: 197–204.
Planalp, S. 1999. Communicating Emotion: Social, Moral and Cultural Processes. Cambridge: Cambridge University Press.

EMOTION WORDS
What Counts as an Emotion Word?
Languages differ in the size and range of their emotion vocabularies. There are, for example, more than 500 words in English,
750 in Taiwanese Chinese (Russell 1991), and 256 in Filipino
(Church, Katigbak, and Reyes 1996). In addition, translation
equivalents often cover overlapping but not identical semantic
space (Wierzbicka 1999). Clearly, the investigation of the emotion lexicon requires the careful delimitation of what counts as
an emotion word.
Empirical approaches to this question are driven by
prototype theory (Fehr and Russell 1984; Rosch 1978), according to which semantic categories are recognized not by lists of
necessary and sufficient features but in terms of a gestalt
or configurational whole. This approach suggests that emotion
is a fuzzy category, and emotion words fit the category in a
graded manner.
A number of taxonomies have been proposed. G. L. Clore,
A. Ortony, and M. A. Foss (1987) distinguished eight categories
in English: 1) pure affective states (e.g., happy), 2) affective-behavioral states (e.g., cheerful), 3) affective-cognitive states (e.g.,
encouraged), 4) cognitive states (e.g., certain), 5) cognitive-behavioral states (e.g., cautious), 6) bodily states (e.g., sleepy),
7) subjective evaluations of character (e.g., attractive), and 8)
objective conditions (e.g., abandoned). Analyses of prototypicality ratings of 585 candidate emotion words confirmed the
empirical discriminability of the eight categories, and words in
the first three (affective) categories had the highest typicality
ratings.
Phillip Shaver and colleagues (1987) used cluster analysis of
prototypicality ratings of English emotion words to display a prototype hierarchy, with two superordinate categories encompassing positive versus negative terms and five basic-level terms: love,
joy, anger, sadness, and fear. The rest of the terms are subordinates under these basic terms (Shaver et al. 1987; Storm and
Storm 1987). It is interesting to note that negative emotion words
generally outnumber positive emotion words, perhaps explained
by the greater cognitive processing required by negative events
in comparison with positive events (Schrauf and Sanchez 2004).
The Indonesian emotion lexicon has the same overall structure
(Shaver, Murdaya, and Fraley 2001), but in the Chinese lexicon,
a love category does not emerge separate from happiness-related
words (Shaver, Wu, and Schwartz 1992). Recent studies
on the Italian (Zammuner 1998) and the French (Niedenthal et
al. 2004) emotion lexicons suggest that prototypicality ratings are
driven by valence, intensity, duration, familiarity, age of acquisition, and frequency in the corpus.

How Are Emotion Words Represented in the Mind?


psycholinguistics distinguishes abstract and concrete words
as separate classes, and recent work suggests that emotion words
may form yet a third class. In general, concrete words are easier
to imagine, more quickly recalled, and more easily recognized
than abstract words. In addition, concrete words
are more easily associated with a context, perhaps because of
prior association with those contexts (Schwanenflugel, Akin, and
Luh 1992). When approached as a third class of words, emotion
words are rated as less concrete and lower in context availability
than abstract and concrete words. Nevertheless, they are rated
higher in imageability than abstract words, perhaps because
of some connection to scripts or typical situations in which
they are experienced. Further, when participants give the first
word that comes to mind in response to concrete, emotion, and
abstract words, emotion words garner the highest number of
different associates (Altarriba and Bauer 2004). If associates are
stored together (as an associative model of memory suggests),
then emotion words would seem to be linked to a richer conceptual base than either of the other two word types. It is interesting that when Spanish-English bilinguals perform these tasks in
Spanish, ratings of context availability in Spanish are higher than
in English. This raises the possibility that emotion words might
be encoded in language-specific ways (Altarriba 2006).
Robert W. Schrauf
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Altarriba, J. 2006. Cognitive approaches to the study of emotion-laden and emotion words in monolingual and bilingual memory. In Bilingual Minds: Emotional Experience, Expression, and Representation, ed. A. Pavlenko, 232–56. Clevedon, UK: Multilingual Matters.
Altarriba, J., and L. M. Bauer. 2004. The distinctiveness of emotion concepts: A comparison between emotion, abstract, and concrete words. American Journal of Psychology 117: 389–410.
Church, A. T., M. S. Katigbak, and J. A. S. Reyes. 1996. Toward a taxonomy of trait adjectives in Filipino: Comparing personality lexicons across cultures. European Journal of Personality 10: 3–24.
Clore, G. L., A. Ortony, and M. A. Foss. 1987. The psychological foundations of the affective lexicon. Journal of Personality and Social Psychology 53: 751–66.
Fehr, B., and J. A. Russell. 1984. Concept of emotion viewed from a prototype perspective. Journal of Experimental Psychology: General 113: 464–86.
Niedenthal, P. M., C. Auxiette, A. Nugier, N. Dalle, P. Bonin, and M. Fayol. 2004. A prototype analysis of the French category "emotion." Cognition and Emotion 18.3: 289–312.
Rosch, E. 1978. Principles of categorization. In Cognition and Categorization, ed. E. Rosch and B. B. Lloyd, 27–48. Hillsdale, NJ: Lawrence Erlbaum.
Russell, James A. 1991. Culture and the categorization of emotions. Psychological Bulletin 110.3: 426–50.
Schrauf, R. W., and J. Sanchez. 2004. The preponderance of negative emotion words in the emotion lexicon: A cross-generational and cross-linguistic study. Journal of Multilingual and Multicultural Development 25.2/3: 266–84.
Schwanenflugel, P. J., C. Akin, and W. Luh. 1992. Context availability and the recall of abstract and concrete words. Memory and Cognition 20: 96–104.
Shaver, P., U. Murdaya, and R. C. Fraley. 2001. Structure of the Indonesian emotion lexicon. Asian Journal of Social Psychology 4: 201–24.
Shaver, P. R., S. Wu, and J. C. Schwartz. 1992. Cross-cultural similarities and differences in emotion and its representation: A prototype approach. In Review of Personality and Social Psychology. Vol. 13: Emotion. Ed. M. S. Clark, 175–212. Newbury Park, CA: Sage.
Shaver, Phillip, Judith Schwartz, Donald Kirson, and Cary O'Connor. 1987. Emotion knowledge: Further exploration of a prototype approach. Journal of Personality and Social Psychology 52: 1061–86.
Storm, C., and T. Storm. 1987. A taxonomic study of the vocabulary of emotions. Journal of Personality and Social Psychology 53: 805–16.
Wierzbicka, A. 1999. Emotions Across Languages and Across Cultures: Diversity and Universals. Cambridge: Cambridge University Press.
Zammuner, V. L. 1998. Concepts of emotion: "Emotionness" and dimensional ratings of Italian emotion words. Cognition and Emotion 12: 243–72.

EMPLOTMENT
Emplotment is the organization of events into a narrative. The
concept was developed most influentially by Hayden White in
his treatment of historiography. White distinguishes five levels
of conceptualization in the writing of history (1973, 5). The first
is the chronicle, a simple listing of events in the order of their
occurrence. The second is the formation of these events into a
basic causal sequence or story. The third, emplotment proper,
is their further elaboration into a narrative with a point. (The
fourth and fifth levels, mode of argument and mode of ideological
implication, go beyond emplotment and thus beyond the main
concerns of this entry.) According to White, different historians
commonly organize even the same sequence of events into divergent histories, reflecting different strategies of emplotment. The
same point applies beyond writers on history. Everyone emplots
events, from political figures shaping public policy to ordinary
people in conversational storytelling.
We might consider the events of September 11, 2001, by way
of illustration. A chronicle would simply list the events of the
day. A basic story would set out the causal relations: the organization of the conspirators, their practice, their final execution
of their plans, and so on. It should be clear that, even here, there
are different ways in which events may be selected and grouped
together and different ways in which causal links may be posited. For example, in the lead-up to the invasion of Iraq, some
commentators suggested that various actions of the Iraqi government were part of the September 11 causal sequence; others
denied the connection, arguing that this was not even a plausible part of the basic story. The third level, emplotment, embeds
the causal sequence in a more elaborated structure. In the case
of the Bush administration, that structure was a war narrative in
which the events of September 11 constituted an act of war. For
many others, that structure was a crime narrative in which the
events were a (massive) criminal violation. As these cases suggest, differences at the level of causal interpretation and differences at the level of emplotment not only manifest intellectual
disagreements but also entail highly significant practical divergences.


For White, the formation of the story level is roughly
Aristotelian, the shaping of a beginning, middle, and end.
Emplotment proper follows Northrop Frye's modes (see Frye
1957). Drawing on more recent work in cognition, we might
preserve White's (and Frye's) insights, while understanding the
precise structures and operations of those structures slightly
differently.

Emplotment and Cognition


The sort of emplotment discussed by White is part of our ordinary causal thought. Indeed, our everyday thought about everything, from our personal lives to larger social patterns, is bound
up with emplotment in roughly White's sense. Thus, we might
consider the more professional forms of emplotment alongside
more ordinary forms in order to better understand both.
Historiography and everyday causal thought share several
salient tendencies and constraints. First, they tend to be concerned with particularity. Although we try to isolate general
principles for any sort of explanation, history and daily life are
unlike paradigmatic natural sciences, for in history and daily life,
generalities are most often a means of understanding particulars rather than the reverse. In addition, our concerns in history
and everyday life are not subject to repetition in controlled circumstances where we can manipulate variables. As a result, our
causal accounts in these cases must range over a vast number
of possible causal factors. We tend to choose the factors that are
important by a more or less loose comparison across sequences
that we have grouped together as parallel not experimentally
but conceptually. For example, in ordinary life, I may categorize
several failed friendships together and infer their common properties, and thus why the most recent friendship failed. This may occur
self-consciously or implicitly. Similarly, a presidential advisor
might categorize several failed foreign policy initiatives together
in order to infer what led to the failure of the most recent policy
or to avoid such failure in a current policy.
It is worth pausing over this point for a moment. In ordinary
cognition, our grouping together of (putatively parallel) event
sequences is almost invariably bound up with prototype formation. A prototype results from a weighted averaging across
instances of a category. Weighting is determined by several factors, prominent among them salience and distinctiveness for the
category (cf. Tversky 1977; Barsalou 1983, 212; and Kahneman
and Miller 1986, 143, on contrast). For example, our prototype
for a man will result from averaging across individual men, but
this is not a pure statistical average. Men to whom we pay more
attention (e.g., heroes and villains in movies) will be weighted
more heavily than men of whom we are only peripherally aware.
Moreover, for individual men, distinctive characteristics (e.g.,
facial hair, as a distinctive difference from both boys and women)
will weigh more heavily than nondistinctive characteristics.
Thus, our prototypical man is more manly than the statistical average. Finally, once established, even in a minimal form,
our prototypes guide categorization. They do so by directing our
attentional focus to distinctive (thus putatively identifying) characteristics of individual men. Given that this is part of the general
operation of the human mind, it presumably occurs with other
sorts of prototype as well, including prototypes for categories of
event sequences, that is, narrative prototypes.
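The weighted averaging described above can be made concrete with a small numerical sketch. All features, instances, and weights below are invented for illustration; the weights simply stand in for salience and distinctiveness.

```python
# Illustrative sketch of prototype formation as a weighted average.
# Each instance is a feature vector; a weight stands in for how much
# attention (salience/distinctiveness) that instance receives.

def prototype(instances, weights):
    """Weighted average of feature vectors: sum(w_i * x_i) / sum(w_i)."""
    total_w = sum(weights)
    n_features = len(instances[0])
    return [
        sum(w * inst[j] for inst, w in zip(instances, weights)) / total_w
        for j in range(n_features)
    ]

# Three instances rated on one invented distinctive feature (scale 0-1).
# The highly salient instance (weight 3.0) pulls the prototype above
# the plain statistical mean.
instances = [[0.9], [0.4], [0.5]]
weights = [3.0, 1.0, 1.0]

proto = prototype(instances, weights)               # weighted mean
mean = [sum(x[0] for x in instances) / len(instances)]  # unweighted mean
```

Here the heavily weighted instance yields a prototype value of 0.72 against an unweighted mean of 0.6, mirroring the entry's claim that the prototype is "more manly" than the statistical average.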
We may broadly distinguish, then, between two types of causal
understanding. The first sort, found in what might be called the
general sciences, is experimental, based on the isolation of causal
features by the controlled manipulation of variables. The second sort, found in ordinary life and in what might be called the
particularistic sciences, is prototypical, based on the formation
of distinctive, (loosely) statistical structures. Of course, there
are intermediate cases. Moreover, there are different degrees to
which statistical derivations may be made explicit and rigorous.
For example, there are areas of economic history where we might
achieve relatively high levels of explicitness and rigor. In other
cases, however, it is very difficult to make the statistical process at all scientific, for the selection of a comparison set (which
guides causal inferences) is already so thoroughly imbued with
the implicit prototypes of the researcher.
This division in types of causal understanding is, of course,
connected with the orientation of the particularistic sciences to
the explanation of particulars. But there are many particulars.
Just how does our interest in certain particulars arise? In both
particularistic and general sciences, we attend to the explanation of individual objects or events when we care about them.
We care about something when it has an emotional impact,
which is to say, when it engages some emotion system. Thus, an
understanding of emotion systems is crucial for understanding
our explanatory aims in particularistic study. As it turns out, an
understanding of emotion systems also gives us a way of understanding White's first and second levels of conceptualization.
Specifically, both a chronicle and a basic story involve three
fundamental cognitive operations: selection, segmentation,
and structuration. (On these processes, see Hogan 2003, 38–40.)
There are countless aspects of any given sequence of events and
countless construals of those events and their components.
Even a chronicle selects certain aspects while ignoring others,
clusters those aspects together into the events that compose
the chronology, and gives those aspects at least some degree of
internal structure. For example, in a chronology of the events of
September 11, we might include a statement that the hijackers
took over one airplane at such and such a time. That statement
selects various aspects of the situation and organizes them into
a brief causal moment. Moreover, in going beyond a chronicle
and telling the basic story, we might begin with the first plane
crash, or we might begin with the conspiracy of the hijackers, or
we might begin with various aspects of U.S. policy in the Muslim
world (seen by the hijackers as justification for the September 11
attacks). The questions that arise here concern just why we select
certain aspects and construals over others and how we come to
decide that events begin and end at certain points.
The simple answer to these questions is that our initial selection and construal (as manifest in a chronicle) are the product of
our emotion systems. (For a more technical discussion of these
issues, and for research supporting this analysis, see Hogan
2008 and Chapter 4 of Hogan 2009.) We are emotionally sensitive to certain sorts of properties, conditions, alterations, and so
on. These draw our attentional focus. Our sense of a beginning
and an ending (thus, our fashioning of a basic story) is equally
guided by our emotional responses. The beginning of the story is the point at which our emotion systems are engaged. The end
of the story is the point at which our emotion systems return to
their normal state. This is why Americans tend to view the conspiracy of the hijackers as the beginning of the story. With limited
exceptions, they lack emotional interest in what preceded and
motivated the attacks.
Of course, things do not end with this level of selection, and so
on. Whenever we isolate aspects of a particular event sequence
due to their emotional force, we simultaneously activate cognitive
structures for understanding and responding to that sequence.
These structures crucially involve prototypes, which supplement
our emotional responses in elaborating interpretations, explanations, expectations, directing attentional focus, and so on. In the
case of event sequences, these are crucially narrative prototypes,
including subprototypes bearing on actions and on agents. This,
then, leads us to the level of emplotment proper.
The narrative prototypes that guide emplotments undoubtedly include the broad structures of narrative universals.
For example, the hijackers may have emplotted their actions in
terms of a sacrificial narrative in which the suffering of the home
society will be relieved by God due to the voluntary death of a
member of that society. In contrast, the U.S. government emplotted these same actions as the foreign aggression component of a
heroic plot. Beyond these cross-cultural patterns, emplotments
also derive from more culturally specific narrative structures,
including structures related to culturally defined practices, such
as those of legal systems (as in the emplotment of the September
11 attacks as criminal acts).
Both universal and culturally specific narratives are bound up
not only with emotions but also with values related to those emotions. Thus, it is unsurprising that different emplotments tend to
import different social agendas and different political attitudes
into the interpretation of the event sequence. (In White's system, this appears in the fifth level of conceptualization, "mode of ideological implication" [1973, 5].) It is also unsurprising that
they tend to be points of consequential political contestation.

A Note on Emplotment and Grammar


The study of narrative is a consequential part of linguistic
discourse analysis and sociolinguistics. In this way,
emplotment necessarily has an important place in the language
sciences. However, it is worth mentioning that emplotment may
be related to more narrowly grammatical issues as well. One
might argue that thematic roles are, first of all, narrative positions that have grammatical consequences. Of course, one might
also see thematic roles as orienting our emplotments by way of an
initial operation in grammar. Similarly, one might argue that the
different causal relations encoded in causative constructions are a function of our tendency to emplot experience, or at least that the linguistic propensity realized in some languages
gives us a clue as to the diversity of causal sequences that broadly
constrain our emplotments. Finally, the grammatical encoding of
event individuation in some languages (see Kroeger 2004, 23–35)
may indicate the dependency of certain grammatical features on
a prior (implicit) emplotment, or it may point us toward a further
area of research that will help us understand event individuation
and its relation to emplotment. In any case, there are reasons to
believe that emplotment is closely related not only to broad issues in discourse but to more narrowly grammatical concerns as well.
The most radical view of this relation would be that emplotment
is cognitively fundamental to certain aspects of grammar (a point
suggested by authors such as Mark Turner [1996]). Alternatively,
it may simply be that certain features recur in grammar and narrative, due to shared cognitive sources (an account that may be
suggested by certain aspects of frame semantics) or due to the
effects of grammar on emplotment.
Patrick Colm Hogan
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Barsalou, Lawrence. 1983. Ad hoc categories. Memory and Cognition 11.3: 211–27.
Frye, Northrop. 1957. Anatomy of Criticism: Four Essays. Princeton, NJ: Princeton University Press.
Hogan, Patrick Colm. 2003. Cognitive Science, Literature, and the Arts: A Guide for Humanists. New York: Routledge.
———. 2008. Stories, wars, and emotions: The absoluteness of narrative beginnings. In Narrative Beginnings, ed. Brian Richardson, 44–62. Lincoln: University of Nebraska Press.
———. 2009. Understanding Nationalism: Narrative, Identity, and Cognitive Science. Columbus: Ohio State University Press.
Kahneman, Daniel, and Dale Miller. 1986. Norm theory: Comparing reality to its alternatives. Psychological Review 93.2: 136–53.
Kroeger, Paul R. 2004. Analyzing Syntax: A Lexical-Functional Approach. Cambridge: Cambridge University Press.
Turner, Mark. 1996. The Literary Mind. Oxford: Oxford University Press.
Tversky, Amos. 1977. Features of similarity. Psychological Review 84: 327–52.
White, Hayden. 1973. Metahistory: The Historical Imagination in Nineteenth-Century Europe. Baltimore: Johns Hopkins University Press.

ENCODING
Encoding refers both to the process of laying down information in memory as the result of exposure to certain stimuli and
to the organization of that information once it has been laid
down. From the viewpoint of language, this encompasses both
the encoding of linguistic knowledge and the encoding of verbal
experience (Francis 1999). The former refers to what we think of
as knowing a language and the latter refers to knowing things
in language (e.g., recalling a conversation). In either case, most
of what we know about encoding comes from studying its interaction with retrieval.

Encoding of Linguistic Knowledge


Linguistic knowledge includes the phonological forms, morphosyntactic patterns, lexico-semantic items, pragmatics, and
so on, that are held in long-term memory. It is information that
we usually produce automatically without attention to processing and for which we have no memory of the specific contexts in
which we acquired the individual items or skills. Studies in first
language acquisition focus on how individuals come to learn
(encode) all of the linguistic knowledge necessary to be competent speakers of a particular language (Gass and Selinker 2001;
Ritchie and Bhatia 1998). Such encoding involves complex interactions between environmental input and physiological maturation. For instance, infants 4–6 months old can encode phonetic distinctions in sounds from any language, but those 10–12 months old can only distinguish sounds that are meaningful in their own
language (Stager and Werker 1997). Reacquisition of the ability
to encode sounds from other languages can be difficult later in
life. On the other hand, the stagelike encoding of complex lexical and grammatical information over the course of childhood
probably has as much to do with the nature of language learning as with developing cognitive maturity (Snedeker, Geren, and
Shafto 2007). In the same vein, proponents of the controversial
critical period hypothesis suggest that ultimate proficiency
in a second language is a function of earlier age at acquisition,
due in part to maturational abilities and exposure to the language
(DeKeyser and Larson-Hall 2005; see also second language
acquisition).
Investigations of the encoding of linguistic knowledge have
relied primarily (though not exclusively) on priming paradigms
in which some language knowledge (phonological, lexical,
semantic) stored in long-term memory is activated, and then
its effect is measured on some task that relies on that implicit
activation (e.g., associative priming with lexical decision as the
task; see also spreading activation).

Encoding of Verbal Experience


A great deal of our world knowledge is initially learned verbally,
and language scholars have been particularly interested in the
extent to which linguistically encoded information retains its
linguistic form at retrieval. Experimental work in the laboratory often focuses on new information learned as lists of words,
sentences, or paragraphs. Participants are then tested for their
memory of both the information (the conceptual information)
and any accompanying linguistic detail in which it was presented (words, phrases, sentences, etc.). Bilingualism provides an ideal test case in this regard because the language of both encoding and retrieval can be experimentally manipulated. An influential theory in this field is encoding specificity,
which suggests that successful retrieval is premised on a match
between information in the retrieval cue and information stored
in the encoded memory trace (Tulving 1983). In this case, the
language used at the time of encoding the information is putatively a feature of the mnemonic trace and may be reactivated at
the time of retrieval. For instance, in their research on bilingual
recall for word lists, J. Altarriba and E. G. Soltano (1996) suggest that as bilinguals store concepts across languages, they
also associate language-tags with the concepts that correspond to the language in which the concepts were presented in
the lists. Important recent work suggests that language-specific
information is deeply embedded at the level of semantic representations, and new methods of investigation may be needed to
explore how such encoding takes place and how the information is reactivated at retrieval (Pavlenko 2008).
At higher levels of complexity (beyond memory for information
in words or phrases), memory for narratively organized personal
events, or autobiographical memories, also seems to be linguistically tagged. Thus, research in this area shows that bilinguals
recall events from their personal past in the language in which
these events were encoded (Marian and Neisser 2000; Schrauf 2000; Schrauf and Durazo-Arvizu 2006). These results may also
be explained by the principle of encoding specificity because the language in which an event took place (spoken, heard, written, read) constitutes a feature of the mnemonic trace and predisposes the individual to recall the event in that same language.
Robert W. Schrauf
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Altarriba, J., and E. G. Soltano. 1996. Repetition blindness and bilingual memory: Token individuation for translation equivalents. Memory and Cognition 24: 700–11.
DeKeyser, R., and J. Larson-Hall. 2005. What does the critical period really mean? In Handbook of Bilingualism: Psycholinguistic Approaches, ed. J. F. Kroll and A. M. B. de Groot, 88–108. New York: Oxford University Press.
Francis, W. S. 1999. Cognitive integration of language and memory in bilinguals: Semantic representations. Psychological Bulletin 125: 193–222.
Gass, S. M., and L. Selinker. 2001. Second Language Acquisition: An Introductory Course. Mahwah, NJ: LEA.
Marian, V., and U. Neisser. 2000. Language-dependent recall of autobiographical memories. Journal of Experimental Psychology (General) 129: 361–8.
Pavlenko, A. 2008. Emotion and emotion-laden words in the bilingual mental lexicon. Bilingualism: Language and Cognition 11.2: 147–64.
Ritchie, W. C., and T. K. Bhatia, eds. 1998. Handbook of Child Language Acquisition. San Diego, CA: Academic Press.
Schrauf, R. W. 2000. Bilingual autobiographical memory: Experimental studies and clinical cases. Culture and Psychology 6: 387–417.
Schrauf, R. W., and R. Durazo-Arvizu. 2006. Bilingual autobiographical memory and emotion: Theory and methods. In Bilingual Minds: Emotional Experience, Expression, and Representation, ed. Aneta Pavlenko, 284–311. Clevedon, UK: Multilingual Matters.
Snedeker, J., J. Geren, and C. L. Shafto. 2007. Starting over: International adoption as a natural experiment in language development. Psychological Science 18.1: 79–87.
Stager, C. L., and J. F. Werker. 1997. Infants listen for more phonetic detail in speech perception than in word-learning tasks. Nature 388: 381–2.
Tulving, E. 1983. Elements of Episodic Memory. Oxford: Clarendon Press.

ÉNONCÉ/STATEMENT (FOUCAULT)

The term énoncé or statement was modified and developed by Michel Foucault to describe what for him constituted the smallest element within a discursive structure (Foucault [1969] 1972). Foucault had some heated debates with others about the meaning of the word statement. In some of these discussions, some critics asserted that the statement was the same as the speech-act, as developed by John Austin and John Searle (see Dreyfus and Rabinow 1982, 44–9, for a fuller discussion). However, statements (the most fundamental building blocks of discourse) do seem to differ from speech-acts in important ways. Statements are those utterances or parts of text that have an effect. Statements are not the same as sentences but are those utterances that can be seen to be grouped around one particular effect. Thus, when a judge says, "I sentence you to three years' imprisonment," there are a number of these effects. The judge is institutionally sanctioned and, therefore, the force of her/his pronouncement is to transform the accused into a criminal and to enforce a particular sentence on that person. Thus, "I sentence you" can be regarded both as a statement and as part of a discourse, since such a statement can only have effect if it is uttered within the context of other utterances (i.e., if certain procedures have been adhered to) and if it takes place within an institutional setting (i.e., within a courtroom, by an appointed judge). Whereas Searle's and Austin's definition of speech-acts stressed the performative nature of such utterances (the fact that they achieved something in the real world), Foucault's emphasis is much more on the fact that statements bring about something because of their position within an institution and because of their interrelationship with other discursive structures.

For Foucault, the main reason for conducting an analysis of statements is to discover the material and conceptual supports that allow them to be said and that keep them in place. These support mechanisms are both intrinsic to discourse itself and extradiscursive, in the sense that they are sociocultural and institutional. Foucault is concerned to set statements in their discursive frameworks; thus, statements do not exist in isolation, since there is a set of structures that makes those statements make sense and gives them their force. Thus, entry into discourse is seen to be inextricably linked to questions of authority and legitimacy. Each discursive act maps out the possible uses that can be made of the statement (although, of course, that is not necessarily what happens to it). Each statement leads to others, and in a sense, it has to have embedded within it the parameters of the possible ways in which future statements can be made.
Sara Mills
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Austin, J. 1962. How to Do Things with Words. Oxford: Clarendon Press.
Dreyfus, H., and P. Rabinow. 1982. Michel Foucault: Beyond Structuralism and Hermeneutics. Brighton: Harvester.
Foucault, M. [1969] 1972. Archaeology of Knowledge. Trans. A. M. Sheridan-Smith. London: Tavistock.
Searle, J. 1979. Speech Acts. Cambridge: Cambridge University Press.

ESSENTIALISM AND MEANING

What Is Essentialism?
The idea of essences has been important in Western philosophy
since at least the time of Plato. Discussion of essences flourished
in classical and medieval philosophy and has been revived in
recent decades by philosophers such as Saul Kripke, due primarily to work in modal logic, which is to say, the formalization of
necessity and possibility. In the humanities, essentialism is often
used to refer to any view that is not historicist. In philosophy (including the semantics of formal logic), the term is used more narrowly for the belief that objects have definitive features
and incidental features. The incidental features may change without the identity of the object changing. However, if a definitive
feature changes, then the object's identity changes. For example,
in sufficient quantities and with the right light, water appears to
be blue, but if we take a glass of water from a larger (blue) body,
it appears colorless. In scooping the glass out of the water, we
have altered the color of the water, yet we would not say that it
is a different thing than it was before. An incidental property of
appearance has changed, but the stuff itself remains the same.
In contrast, suppose we take that glass of water and induce a
chemical change so that it continues to appear clear but is no

longer H2O. Its appearance would not have changed. However,
we would be inclined to say that the stuff is not the same. In other
words, its essence would have altered.
Essentialism is connected with semantics through natural
kind terms. These are terms that (putatively) refer to some naturally delimited set of objects (including substances). Natural
kinds are distinguished from sets of objects that are merely
selected by human choice. In this view, a word such as "water" would be a natural kind term.
Semantic essentialism is a form of meaning externalism
in which our natural kind terms are defined (in part) by essences,
thus entities external to the minds of speakers. Crucially, this
holds even in cases where we do not know the essence in question. In perhaps the most famous example of such a case, Hilary
Putnam (1975) imagined a Twin Earth that is identical with
our earth, right down to our brain states when we use the word "water." However, there is one key difference. The actual chemical composition of the stuff referred to as "water" on Twin Earth is not H2O, but something else (call it XYZ). According to Putnam, "water" (as we use it) does not refer to the waterlike substance on Twin Earth, and it never did, even hundreds of years ago, when we did not realize that our water is H2O. In other words, the natural kind term "water" always referred to H2O and nothing else
because, as a natural kind term, it was always defined (in part)
by the essence of its referents. (I say "in part" because essentialist theories of meaning commonly allow for various semantic components. For example, in setting out the meaning of "water," Putnam [1975, 269] includes syntactic markers, such as "noun," semantic markers, such as "liquid," and stereotypical properties, such as being colorless. But none of these determines the extension of the term, thus what the term refers to [see reference and extension].)
At least two sets of issues arise in connection with the relation
between essentialism and meaning. The first set concerns the
essences. The second set concerns the words for which essences
determine the referents. We consider each in turn.

Essences, Causes, and Possible Worlds


Saying that essences are definitive properties works well enough
as a way of introducing the general notion. But it can hardly
stand as an ontology. Yes, we think that the stuff in the glass has
changed its identity after the chemical reaction, but not previously when it was scooped out of the pool. But what does that
tell us?
Here we might subdivide the problem into whether or not
essences exist, just what essences there are, and how we might
access these essences. Modal logic, with its associated possible
worlds theory, serves as a preliminary way of approaching all
three issues. Again, modal logic treats relations of necessity and
possibility. For example, suppose it is necessary that p entails q.
Suppose also that it is necessary that p. It follows that it is necessary that q. Thus, suppose it is necessary that if substance w is
water, then substance w is H2O. Suppose also that substance w is
necessarily water (i.e., w could not possibly be just this substance
and not be water). It follows that substance w is necessarily H2O.
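The inference relied on here can be put schematically; the rendering in terms of the distribution axiom (K) of normal modal logics is an illustrative gloss, not part of the entry itself:

```latex
% Distribution axiom (K): necessity distributes over implication.
\[
\Box(p \rightarrow q) \rightarrow (\Box p \rightarrow \Box q)
\]
% Instantiated for the example, with w a particular substance,
% p = "w is water" and q = "w is H2O":
\[
\Box\bigl(\mathit{Water}(w) \rightarrow \mathrm{H_2O}(w)\bigr),\quad
\Box\,\mathit{Water}(w) \;\vdash\; \Box\,\mathrm{H_2O}(w)
\]
```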
Here, the question arises as to what differentiates this necessity
from ordinary truth. Suppose everyone in a particular town in
Minnesota (call it New Oslo) speaks Norwegian. It is then true that "If someone lives in New Oslo, he or she speaks Norwegian."


But this is merely a contingent truth. After all, it could happen
that a monoglot Swedish speaker moves to New Oslo. The situation is different with water. Some bit of non-H2O could not
simply be added to the set of things that constitute water. The
idea of possible worlds is a way of capturing this notion. Put simply, there are possible worlds in which not all residents of New
Oslo speak Norwegian. However, there are no possible worlds in
which water is not H2O. Thus, it is necessary that water is H2O. Thus, H2O is the essence of water. Thus, "water" is a natural kind term and its meaning is (partially) determined by the essence,
H2O. (Alternatively, it refers directly to H2O because that is the
essence of water.)
Insofar as we accept that there is a difference between necessary and contingent implications, and insofar as we accept
that in at least some of these cases the necessary implications
have normative bearing on semantic relations (such as the relation between the word "water," the things we call "water," and the
essence, H2O), we seem to have committed ourselves to the existence of essences and to some form of semantic essentialism.
Moreover, insofar as we accept the relation of all this to modal
logic and possible worlds theory, we seem to have found a way
of determining what the essences are and how we might know
them. Of course, possible worlds theory does not tell us that the
chemical composition of water is H2O. That is learned empirically. What it (purportedly) tells us is that once we know the
chemical composition, we thereby know the essence, because
that is what is unchanging across possible worlds.
Nevertheless, on reflection, it may be that things are not that
clear. Perhaps we have simply noticed that identity is preserved
under a construal (as G. E. M. Anscombe [1963] might have put
it). If I scoop out a glass of water and it stops being blue, then it is still the same under the construal "water." However, it is not the same under the construal "blue." In other words, we would not say that the set of blue things now includes something clear. Note that the point holds even if we don't quite know what water is (in terms of chemical composition) or what being blue is (in terms of light reflection). Moreover, the point is not confined to terms such as "water" and "blue." It appears to extend across the board.
Somewhat reminiscent of the problems faced by Platonic essentialism in the Parmenides, these points may seem to suggest a
promiscuous multiplication of essences. This would effectively
undermine any reason for isolating essences in the first place.
(Conversely, these are just the sorts of phenomena that a nonessentialist approach to semantics might lead us to expect; see, for
example, meaning and stipulation.)
On the other hand, none of this really counts for much, either
way. In all these cases, we are relying on intuition. (Kripke is
explicit about the role of intuition in essentialism; see, for example, [1972] 1980, 10–12, 39, and 42; see also Putnam 1975, 271).
The rules for modal entailment are fixed formally. However,
the truth or falsity of our premises is not fixed either formally or
through empirical study. It is fixed only by our intuitions. But
just what do our intuitions tell us? Do they tell us about metaphysical possibility and necessity? Or do they tell us something
about the way our minds operate in construing possibility and necessity? It seems much clearer that our intuitions bear on the
latter (whether or not they bear on the former). Indeed, from

the perspective of cognitive neuroscience and evolutionary
psychology, our intuitions in these matters are unsurprising. At an early age, we begin to attribute essences (or "hidden structures," as Putnam would say) to certain objects (see, for example, Boyer 2001, 106–20, and citations). That attribution is
simply a form of semantic organization that operates in the usual
adaptive way (see adaptation). Specifically, it is a simplified mechanism that has adaptive value because it approximates
a function (for more on this distinction, see Hogan 2007). The
function, in this case, is causal inference. Causal inference can
be a slow, complex process. Attributing essences to kinds (e.g., to
water and to tigers) allows us to draw causal inferences quickly
and with a great deal of accuracy. Specifically, it facilitates the
exploitation of opportunities (e.g., for quenching thirst) and the
avoidance of threats (e.g., of being eaten).
Perhaps, then, we should incorporate causality into our understanding of essences; perhaps we should say that the essence of
an object is whatever property explains its other properties. Of
course, not every object or substance has such a causal nexus.
Natural kinds, however, do. Thus, water has a range of properties that may be explained by its being H2O. Its being colorless
and its function in quenching thirst, for example, are explained
by its chemical composition. Conversely, its being colorless
and its function in quenching thirst do not explain its chemical
composition.
This causal criterion turns us away from speculation on possible worlds toward actual empirical science. Our conclusions
might still be framed in terms of modal logic and possible worlds.
However, it is not clear that this will add anything to our understanding. Specifically, in order to make the connection between
empirical science and possible worlds, we have to rely once
more on intuitions or we have to hold real-world causality constant across all possible worlds, which merely makes the modal
logic a translation of our empirical causal analysis. For example,
consider again Putnam's case of Twin Earth XYZ. If H2O and XYZ have no causally distinct consequences, even in chemical tests, then how do we decide if "water" on our Earth means/refers to something different from what "water" means/refers to on Twin
Earth? We have only intuitions based on what is in effect a form
of Cartesian doubt. Ex hypothesi, we couldn't know or even come
to suspect that there is a difference based on evidence. On the
other hand, if there are causally differentiating consequences,
then distinguishing between the two is merely making a causal
and empirical division for which modal logic and possible worlds
seem superfluous.
This raises a further question. If we are simply seeking causally crucial properties, just what is added to this by the term "essence"? What reason do we have to assert that a causally crucial property (even a uniquely causally crucial property) determines the identity of a substance? Isn't this something we have merely stipulated, reasonably perhaps, but without any necessity beyond facilitating the achievement of certain practical tasks
(such as doing certain sorts of things with water)? Again, a wide
range of properties of a given substance derive from the putatively definitive property. But shouldn't we still be free to pick
out one of those other properties as definitive, depending on our
interests or specific tasks? The point has semantic consequences
as well as ontological ones. For example, can't we call anything "water" that looks, tastes, and quenches thirst like H2O, even if it
has a different chemical composition? This brings us to our second topic.

Words and Meanings


What, then, prevents us from grouping together all the things
that we can drink, or all the things that we can see through, rather
than all the things that have the same chemical composition
(say, H2O)? Well, in fact, nothing prevents us from doing that.
We refer to things that we can drink as "beverages" and things that we can see through as "clear." An essentialist can respond to this by saying that it is irrelevant. Only natural kind terms are linked with essences; "beverage," "clear," "treat," and so on, simply do not refer to natural kinds.
Given the preceding causal account of essences, it would
seem that a natural kind term is any term used to refer to an
object or substance that has some central causally consequential
property (or perhaps a small number of such properties). Thus,
"water" is a natural kind term because its molecular composition has unique causal importance, explaining a wide range of other properties. But objects are clear due to specifiable properties as well. Although the molecular compositions of water and glass are different, they share properties that allow the passage of light. So, by this criterion, it would seem that "clear" should count as a natural kind term. The same point could be made even about such extreme cases as "treat" (roughly, something that someone likes a
lot but experiences only rarely), so long as being a treat is open to
causal explanation.
As the importance of causal relations may suggest, these
issues bear on particulars as well as classes. (After all, real causal
sequences are themselves particular.) Indeed, from the start, we
have implied that in essentialist theory, particulars have essences,
as when we said that a bit of water is the same before and after
we scoop it out. But this returns us to our earlier question about
considering objects under a certain construal. Is there any issue
of this bit of stuff really being or not being the same individual?
Or is the only issue whether or not it is the same under the construal "water"?
These questions are related to the problem of just how we
fix the relation between a word and a referent. In the context of
standard essentialist semantics, this is to say that it is related to
Kripke's idea of rigid designation. In this view, certain sorts of terms (names and natural kind terms) rigidly designate their referents. A term rigidly designates its referent if it designates that referent in all possible worlds (Kripke [1972] 1980, 48). Thus "water" designates H2O in all possible worlds. The same point holds for names. "Al Gore" refers to the same person in all possible worlds. In contrast, definite descriptions are not rigid designators. "The actual winner of the 2000 presidential election" refers to George W. Bush in some possible worlds. In Kripke's view, almost any property of an individual can be altered. "Al Gore might really have lost the 2000 presidential election" is a perfectly plausible counterfactual, unlike "Al Gore might not have been Al Gore."
But what does it mean to say that Al Gore is the same individual
in all possible worlds?
Kripke explains this by explaining the fixing of reference.
For Kripke, reference is fixed by a causal sequence, that is, a
sequence of transmission leading back to an initial linking of


a term with an object. "Water" names H2O because its use leads back to a link with a certain substance having a certain essence. "Al Gore" names Al Gore because its use leads back to a certain individual, tracing a chain of transmission in reverse. (The chain of transmission presumably went from Gore's parents to their friends ["Bob, Trudy, this is baby Al"], and so on.) This is
called the causal theory of reference. (This intentional or semantic causality is, of course, different from the physical causality
of putatively essential properties, such as chemical composition.
In order to keep the two distinct, I refer to the latter as "physical causality" in the remainder of this entry.)
As with natural kinds, however, there has to be some limit on just what we can vary about Al Gore in order to guarantee that he is the same person across possible worlds. It is not clear that there is any property that has the sort of physical causal force that chemical composition has for water. Relying on intuitions about possible worlds, Kripke concludes that the earliest physical causal factors are the crucial ones here. "It seems to me," he writes, "that anything coming from a different origin would not be this object" ([1972] 1980, 113). But here again we run into the issue of just what intuition tells us. Put differently, do we learn anything about identity here, or do we only learn something about the general importance of physical causality for the way in which we think about the world, and thus the way we draw intuitive inferences about identity? In other words, does this tell us something epistemological and ontological, or simply something psychological and evolutionary?

Conclusion
If they are valid, the preceding arguments may suggest that, ultimately, there is no real issue of essences (or essential identity) in semantics. (For further discussion of these issues, see Hogan [1996] 2008, Preface, Chapter 2, and Chapter 3, particularly ix–xi and 63–70.) However, there are important issues of physical causal analysis (issues to which the active development of essentialist theories has helped to draw our attention). More exactly, there is no issue of metaphysics and the foundations of epistemology (requiring guidance by possible worlds theorization). Rather, there is only 1) an issue of empirical science regarding the physical causal properties of objects (or substances) and 2) the related, also empirical, issue of how our brains organize the world in terms of physical causality, identity relations, and so on. These forms of empirical study, then, may impact our descriptive account of the way in which meaning operates. They should also have consequences, of a more limited sort, for our normative definitions of terms, particularly in scientific contexts where precise physical causal analysis is paramount.
On the other hand, not everyone agrees with these arguments; far from it, in fact. Essentialism is an important and highly influential position, and advocates of essentialist semantics have responses to the preceding claims. For example, some writers would insist that we should not take causality as a primitive notion. Rather, we need to explain causality in terms of possible worlds (see Lewis 1986, 157–269), leading us back to modality and, presumably, essences.
Patrick Colm Hogan


WORKS CITED AND SUGGESTIONS FOR FURTHER READING


Anscombe, G. E. M. 1963. Intention. 2d ed. Ithaca, NY: Cornell University Press.
Boyer, Pascal. 2001. Religion Explained: The Evolutionary Origins of Religious Thought. New York: Basic Books.
Hogan, Patrick Colm. [1996] 2008. On Interpretation: Meaning and Inference in Law, Psychoanalysis, and Literature. 2d ed. Athens: University of Georgia Press.
———. 2007. Laughing brains: On the cognitive mechanisms and reproductive functions of mirth. Semiotica 165.1/4: 391–408.
Kripke, Saul. [1972] 1980. Naming and Necessity. Cambridge: Harvard University Press.
Lewis, David. 1986. Philosophical Papers. Vol. 2. Oxford: Oxford University Press.
Putnam, Hilary. 1975. Mind, Language, and Reality. Cambridge: Cambridge University Press.
Salmon, Nathan. 1981. Reference and Essence. Princeton, NJ: Princeton University Press. An important, relatively early critical analysis of referential essentialism.
Sosa, Ernest, and Enrique Villanueva, eds. 2006. Philosophy of language. Philosophical Issues 16 (Special Issue). Includes recent essays treating the extensive literature on these issues.

ETHICS AND LANGUAGE


This topic quickly threatens to become unmanageably broad, because every ethical theory that regards value judgments as subject to rational evaluation (for example, utilitarianism, Habermasian discourse ethics, Rawlsianism, virtue ethics, Scanlon's contractarianism, and neo-Kantianism) contains an account of the meaning of moral terms. But an account of all of those theories would require a whole volume and would take us away from issues specifically about language. Instead, I focus on an issue that, impinging as it does on the field of lexical semantics on the one hand and the truth-valuedness of propositions on the other, is certainly linguistic and has been central in both the philosophy of language and metaethics for more than a century: the issue of cognitivism versus noncognitivism. I describe how one form of noncognitivism, logical positivism, became influential, especially in the social sciences, and how the logical positivists' arguments came to be disputed. But first, of course, we need to define our terms.
A cognitivist (with respect to ethical discourse) holds that at least some ethical statements (e.g., "George did a good thing when he saved that child") are true. A noncognitivist holds that such statements are not truth-apt; no ethical statement is either true or false. (The logical positivists used to say that such statements are "cognitively meaningless.") The following is an exceptionally aggressive statement of the noncognitivist position: "All statements belonging to Metaphysics, regulative Ethics, and (metaphysical) Epistemology have this defect, are in fact unverifiable and, therefore, unscientific. In the Viennese Circle, we are accustomed to describe such statements as nonsense" (Carnap 1934, 26).
Although most cognitivists regard ethical statements as true or false sans phrase, there are "quasi-realist" positions (Blackburn 1984), according to which it is linguistically appropriate to predicate "true" of an ethical assertion, but the word "true" doesn't have the same function as it does when we say of a scientific claim that it is true. ("True" doesn't ascribe realist truth when applied to an ethical statement.) The position of Bernard Williams (1985), according to which scientific statements aspire to absolute truth while ethical statements can be true in a particular community's conceptual scheme but not absolutely true, has a close relation to this quasi-realism. Evidently, such positions are noncognitivist in spirit, even though writers who adopt such positions acknowledge that the word "true" can be used in ethical discourse.

Logical Positivism

By calling ethical assertions "nonsense," Rudolf Carnap meant not only that they lack truth value but that they are outside the sphere of rational argument altogether. The real-world influence of this doctrine was enormous. For example, Lionel Robbins (1932), one of the most influential economists of the 1930s, enthusiastically endorsed it, as did Milton Friedman and Paul Samuelson. And the idea of value-free science obviously influenced other social sciences as well.
Even a critic of logical positivism must grant that it had one enormous virtue, and that was its capacity for self-criticism. If logical positivists seemed to the economists just mentioned to be logicians of science who had discovered how to demarcate the cognitively meaningful from nonsense, the positivists themselves were dissatisfied with their formulations of the supposed demarcation principle and constantly revised it (Hempel 1963; see also Putnam 2002, 7–27).
Besides criticisms faced from within the movement itself, the attempts to formulate a criterion of cognitive meaningfulness encountered criticisms from W. V. Quine, a lifelong friend of Carnap's who shared his admiration for science and symbolic logic and his noncognitivism with respect to ethics. Basically, all the logical positivist formulations presupposed 1) that all meaningful language except for pure mathematics and logic could be reduced to observation terms (it was supposed to be clear which these are) and 2) that mathematics and logic are analytic or tautologous. Quine ([1951] 1961) famously demolished these "two dogmas of empiricism," as he called them, and urged, to the satisfaction of almost all philosophers of science and philosophers of mathematics, that neither the program of reducing all meaningful language to the positivists' observation vocabulary nor the idea that mathematics consists of tautologies (or, alternatively, truths by convention) is defensible. The pillars on which the positivist criterion of demarcation rested fell in 1950.
Just as it was a friend of Carnap's who rebutted many of Carnap's claims, so it was a friend of Quine's, Morton White, who pointed out that, with the positivists' criterion of cognitive significance demolished, the whole basis for the positivists' claim that ethical sentences are nonsensical was also gone. First of all, the notion of an observation is extremely unclear: Why isn't "I saw X steal Y's wallet" an observation sentence? (Incidentally, it would seem that stealing is a fairly clear notion by comparison to being an observable predicate [White 1956, 109].) Secondly, after pointing out that once we accept the holistic view of confirmation originally proposed by Pierre Duhem, according to which scientific explanation and prediction puts to the test a whole body of beliefs rather than the one that is ostensibly under test alone, and which Quine generalized so that it becomes evident that not only other scientific principles "but also the logic and mathematics we use in our explanatory and predictive reasoning are implicated" (ibid., 255), White concluded that "we may say that just as Duhem's view, when pressed to the extreme, makes it difficult to maintain a radical separation between the analytic and the synthetic and the method of establishing logical as opposed to empirical truth, so the view we have advocated will break down the remaining dualism between logic-cum-empirical science and ethics" (ibid., 256; see also Walsh 1987).

Expressivism and Thick Ethical Concepts


Carnap's claim that ethical sentences are nonsense is simply not believable if "nonsense" is supposed to have the meaning it normally has. And Carnap continues in a way that makes it even more unbelievable. Conceding that there is some sense in which sentences of metaphysics and ethics (and poetry!) are meaningful, he writes: "We do not intend to assert the impossibility of associating any conceptions or images with these logically invalid statements. Conceptions can be associated with any arbitrarily compounded series of words; and metaphysical statements are richly evocative of associations and feelings both in authors and readers" (1934, 26). Obviously, the lines of "Jabberwocky" are also richly evocative of associations and feelings, but "All slithy were the borogoves" and "John is a cruel parent" are linguistically very different indeed!
A year later, Carnap is a bit more sophisticated: "[A] value statement is nothing else than a command in misleading grammatical form" (1935, 25). But there are many differences between imperatives and ethical statements. Carnap has failed to distinguish assertions of very different kinds.
More sophisticated attempts by logical positivists and their allies to explain the sense in which ethical sentences are meaningful were soon made. The most important for a long time were those of Alfred Jules Ayer (1936) and Charles Stevenson (1944). Ayer held that the function of ethical sentences is to express emotions (hence, the term "emotivism" for this version of noncognitivism). Stevenson identified the function of ethical sentences with expressing and influencing attitudes. He further claimed that the disagreements that occur in science, history, and biography are "disagreements in belief," whereas it is "disagreements in attitude" that chiefly distinguish ethical issues from those of science (1944, 13). The task of explaining just what an attitude is could be left to psychology, Stevenson thought.
The family of noncognitivist positions that regards ethical assertions as having the function of expressing attitudes is today known as expressivism; the most sophisticated contemporary statement of this position comes from Allan Gibbard (1990). Another family of noncognitivist positions (foreshadowed by Carnap's description of ethical statements as commands in "misleading grammatical form") holds that ethical statements have a basically imperative function. The most famous statement of prescriptivism is by R. M. Hare (1952; see also Reichenbach 1951).
It will be noted that these positions concern the function of ethical sentences as wholes. But ethical sentences have parts; in particular, they contain ethical predicates. And cognitivists in ethics, in addition to attacking the logical positivist roots of emotivism and prescriptivism, stress the fact that ethical words have descriptive as well as evaluative functions, thereby attacking emotivism and prescriptivism as inadequate accounts of the lexical semantics of ethical sentences. The idea that certain concepts used to describe events, people, and actions in ethical discourse have simultaneously evaluative and descriptive functions attracted wide philosophical attention after the appearance of Williams (1985), who referred to such concepts as "thick ethical concepts." Williams himself says he first encountered the notion in a seminar given by Philippa Foot and Iris Murdoch in the 1940s; Murdoch's The Sovereignty of Good Over Other Concepts (1967) is partly about such concepts, although she doesn't use this terminology (see also Putnam 2002). These authors argue that to master the use of "cruel," "pert," "deceit," "propaganda," "brave," "reasonable," and other "thick" words, one has to be able to identify, at least in imagination, with an ethical point of view. Although Stanley Cavell does not use the term thick ethical concept, he does argue that the use of such words also requires the acquisition of a number of practices, such as apologizing, explaining why one did something when there is an ethical challenge, and offering excuses (1979, Part III). In short, the picture of these terms as merely expressing attitudes is naive.
The noncognitivist response has been to claim that the meaning of a thick ethical term like "cruel" can be factored into two components: "causes deep suffering" (this would be the descriptive component) and an attitude of moral disapproval (this would be the evaluative component) (Hare 1981, 72). Hilary Putnam (1981, 203–5) and John McDowell ([1981] 1998, 201–2) argue that there is no reason to believe that such a "disentangling manoeuvre" (as McDowell calls it) is in general possible. (This issue obviously impinges directly on the concerns of linguists, as well as moral philosophers.) Last but not least, the final chapter of Paul Ziff's Semantic Analysis (1960) contains an interesting argument that expressivist and prescriptivist analyses of the paradigm "thin" (or purely evaluative) term "good" are unacceptable on purely linguistic grounds.
Hilary Putnam
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Ayer, Alfred Jules. 1936. Language, Truth and Logic. London: Victor Gollancz.
Blackburn, Simon. 1984. Spreading the Word. Oxford: Oxford University Press.
———. 1993. Essays in Quasi-Realism. Oxford: Oxford University Press.
Carnap, Rudolf. 1934. The Unity of Science. London: Kegan Paul, Trench, Trubner, & Co.
———. 1935. The Logical Syntax of Language. London: Kegan Paul, Trench, Trubner & Co.
Cavell, Stanley. 1979. The Claim of Reason. Oxford: Clarendon Press.
Gibbard, Allan. 1990. Wise Choices, Apt Feelings. Cambridge: Harvard University Press.
Hare, R. M. 1952. The Language of Morals. Oxford: Clarendon Press.
———. 1981. Moral Thinking: Its Levels, Method, and Point. Oxford: Oxford University Press.
Hempel, C. G. 1963. Implications of Carnap's work for the philosophy of science. In The Philosophy of Rudolf Carnap, ed. P. A. Schilpp, 685–710. La Salle, IL: Open Court; London: Cambridge University Press.
Horgan, Terry, and Mark Timmons. 2006. Metaethics after Moore. Oxford: Oxford University Press.


McDowell, John. [1981] 1998. Non-cognitivism and rule-following. In Mind, Value and Reality, 198–218. Cambridge: Harvard University Press. Originally published in Wittgenstein: To Follow a Rule, ed. Stephen H. Holtzman and Christopher M. Leich, 141–72. London: Routledge.
Murdoch, Iris. 1967. The Sovereignty of Good Over Other Concepts. Cambridge: Cambridge University Press.
Putnam, Hilary. 1981. Reason, Truth and History. Cambridge: Cambridge University Press.
———. 2002. The Collapse of the Fact/Value Dichotomy and Other Essays. Cambridge: Harvard University Press.
Quine, W. V. O. [1951] 1961. Two dogmas of empiricism. In From a Logical Point of View, 2d ed., 20–46. Cambridge: Harvard University Press.
Reichenbach, Hans. 1951. The Rise of Scientific Philosophy. Berkeley: University of California Press.
Robbins, Lionel. 1932. On the Nature and Significance of Economic Science. London: Macmillan.
Stevenson, Charles. 1944. Ethics and Language. New Haven, CT: Yale University Press.
Walsh, Vivian. 1987. Philosophy and economics. In The New Palgrave: A Dictionary of Economics, vol. 3, ed. J. Eatwell, M. Milgate, and P. Newman, 861–9. London: Macmillan.
White, Morton. 1956. Toward Reunion in Philosophy. Cambridge: Harvard University Press.
Williams, Bernard. 1985. Ethics and the Limits of Philosophy. Cambridge: Harvard University Press.
Ziff, Paul. 1960. Semantic Analysis. Ithaca, NY: Cornell University Press.

ETHNOLINGUISTIC IDENTITY
According to Epicurus (Letter to Herodotus), the different languages of the world arose historically from the differences in feelings and sensory perception among peoples: "[M]en's natures according to their different nationalities [ethnē] had their own peculiar feelings and received their peculiar impressions, and so each in their own way emitted air formed into shape by each of these feelings and impressions, according to the differences made in the different nations by the places of their abode as well" (Bailey 1926, 75–6). The idea of a strict linkage between a people and its language is also found in the Book of Genesis (10:5), when, after the Flood, Noah's descendants spread out over the earth, "every one after his tongue, after their families, in their nations" (King James Version).
The Bible is the most direct source of the modern (post-Renaissance) conception of the nation as a people linked by birth, language, and culture and belonging to a particular place. This had not been the general European way of thinking prior to the Renaissance, when religious belonging provided a first division among peoples and dynastic rule a second. "Language" meant Latin, which was pan-European and largely insulated from the vernacular dialects spoken by most people going about their daily lives. These vernaculars were not thought of as language or as having any importance beyond the practical needs of communication, whereas Latin was the sacred vehicle of divine rites and divine knowledge.
As the Reformation increased access to the Bible, the sense of national belonging and the nation-language nexus spread (see nationalism and language). Concern arose for vernaculars to be raised to the status of the language, making them "eloquent," able to fulfill some of the functions previously reserved for Latin. This would come to be perceived as a duty to the nation. The biblical-cum-modern conception of nation and language remains powerful today, despite having been weakened by various attempts to overthrow it. Among these, marxism, with its internationalist aims, was the most potent. Research into ethnolinguistic identity is at the heart of a broader program of inquiry into language and identity and forms a key aspect of the understanding of nationalism. According to social identity theory, national and ethnic identities are grounded in the knowledge that individuals have of membership in a social in-group. Anyone we do not perceive as a member gets classified into an out-group, which can come to represent not just the Other but the Threat, the Enemy. (See also stereotypes.)
Taken to extremes, ethnolinguistic identity always becomes oppressive, but kept within bounds, it is a positive force, helping to give people a sense of who they are, anchoring their lives, and helping them avoid feelings of alienation. Since language and nation are conceptually so closely bound together, it is not surprising that the politics of language choice rarely depends on purely functional criteria, such as which language will be most widely understood. The symbolic and emotional dimensions of ethnolinguistic identity are powerful enough that language policies that ignore them are likely to prove dysfunctional in the long run.
John E. Joseph
WORK CITED
Bailey, C., ed. and trans. 1926. Epicurus: The Extant Remains.
Oxford: Clarendon.

EVENT STRUCTURE AND GRAMMAR


The concept of event structure is prominent in many disciplines, such as cognitive science, computer science, linguistics, and philosophy. Within grammar studies, event structure concerns the level of linguistic representation of a basic unit or organization of thought corresponding to individual acts or occurrences in the world. Speakers attempt to conceptualize and express this unit or organization (the event structure) by the use of natural language elements, such as words, phrases, and sentences.

Event Structure, Tense, and Aspect


An understanding of event structure requires a differentiation between it and such related terms as tense and aspect.
Consider the following sentences that capture various situations:
(a) The cook melted the butter.
(b) The farmer pushed the wheelbarrow.
(c) The cook was melting the butter.
(d) The farmer was pushing the wheelbarrow.
Sentences (a) and (b) both encode events that took place in the past and are thus marked by the past tense forms of the verbs, melted and pushed. But there is a crucial difference between the two in terms of the nature of the event. The melting has an end point, but the pushing could continue forever. Events with an end point are termed telic, while those without an end point are atelic events. Event structure is also distinct from viewpoint or grammatical aspect. The event in (a) is telic, and in the speaker's viewpoint the action is ended. In grammatical aspect, this is a perfective aspect. In (c), as in (a), the event is telic, as it will end in a change of state and a boundary reached in the melting process. However, the speaker views the process as incomplete. This is an imperfective aspect, while in event structure this is a telic, bounded process. It is interesting that (d) is seen both as an imperfective aspect and as an atelic event. The action is not completed in the speaker's viewpoint presentation, and, given the nature of the object or theme, the object of pushing, unlike the object of melting in (a) and (c), will not change state; the event is thus unbounded and atelic.

Typology of Events
Event types are often distinguished on the basis of Z. Vendler's (1957) classic typology of lexical aspect or Aktionsarten, a German term for "kinds of action," which groups verbs into subclasses based on their temporal features. These subclasses or event types are processes (e.g., activities such as walk and run), accomplishments (events that culminate, as shown by their compatibility with temporal adverbials, e.g., build or cook in an hour), and achievements (instantaneous events that finish in a short time period, e.g., win and find). Event itself is part of a larger notion called situations, divided into two categories: events and states (e.g., know and love) (Mani, Pustejovsky, and Gaizauskas 2005). Hence, the term situation aspect is often used in preference to the term lexical aspect.
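The typology can be sketched as a small lookup table (illustrative Python; the example verbs are those cited above, while the telic/instantaneous feature values are standard assumptions rather than quotations from the text):

```python
# Vendler-style situation types with the example verbs cited above.
SITUATION_TYPES = {
    "state":          {"examples": ["know", "love"],  "telic": False, "instantaneous": False},
    "process":        {"examples": ["walk", "run"],   "telic": False, "instantaneous": False},
    "accomplishment": {"examples": ["build", "cook"], "telic": True,  "instantaneous": False},
    "achievement":    {"examples": ["win", "find"],   "telic": True,  "instantaneous": True},
}

def classify(verb: str) -> str:
    """Look up the situation type of a verb in this toy lexicon."""
    for situation_type, features in SITUATION_TYPES.items():
        if verb in features["examples"]:
            return situation_type
    raise KeyError(f"{verb!r} not in the toy lexicon")

print(classify("run"))    # process
print(classify("win"))    # achievement
```

A lexicon-based table like this corresponds to the lexical approaches discussed in the next section, where telicity is treated as a property stored with the verb.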

Main Theoretical Approaches


Event structure has become an important part of grammar studies, especially in the debate about the exact nature of the relationship between syntax and semantics. Two of many questions are often posed: At what level of the grammar should we represent event structure, and how should we represent this notion? Researchers differ on answers to these questions. The following are the main formal theoretical approaches:
(i) Lexical and decompositional approaches: Under lexical approaches, event structure, in particular telicity, is an inherent property of lexical items and is represented in the lexicon (e.g., Vendler 1957; Levin 1999). A subset of lexical approaches includes decompositional approaches, in which event structure is computed on the basis of a set of semantic primitives, which are then used to characterize the meaning of every word in the language (e.g., Schank 1975; Jackendoff 1991). For instance, in R. Jackendoff's decompositional approach, an accomplishment event like X closes Y is represented on the basis of primitives such as CAUSE and BECOME: X closes Y is decomposed as X CAUSE Y TO BECOME NOT OPEN and is represented as CAUSE (X, BECOME (NOT (OPEN (Y)))).
(ii) Compositional approaches: Event structure is represented in the lexical structure, where telicity is a lexical property and can be computed from the lexical entry on the basis of accompanying material in the verb phrase (e.g., Folli 2001; Pustejovsky 1991; Goldberg 2006). For instance, in J. Pustejovsky's event composition approach, sometimes referred to as a generative approach, an accomplishment event like X closes Y is represented as a preparatory stage of ACT (X, Y) & NOT (CLOSED (Y)) and a result state of CLOSED (Y).
(iii) Semantic approaches: These approaches, represented in works such as Krifka (1998) and Filip (2000), are quite distinct from lexical semantic approaches like Jackendoff's. These logical semanticists rely more on the truth-conditional resources of words and sentences to capture the semantics of events.
(iv) Syntactic approaches: These approaches, represented by works such as Ritter and Rosen (1998, 2000), Travis (2000), Butt and Ramchand (2001), and Borer (2005), take the position that the event is nonlexical and argue that the event type is read off of the clausal functional projections.
These lexical, decompositional, compositional, semantic, and syntactic approaches may overlap and are thus not to be seen as clearly delineated alternatives. The lexical approaches, while intuitive, impose a heavy burden on the lexicon. Decompositional approaches are also intuitive, but a problem is that there is usually no general agreement on which primitives to posit. The compositional, semantic, and syntactic approaches may have weaknesses of their own, but they are among the most promising.
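As an illustration, the two representations given for X closes Y under approaches (i) and (ii) can be modeled as small data structures (a hypothetical Python sketch, not drawn from the cited works):

```python
from dataclasses import dataclass

# 1) A Jackendoff-style decomposition as nested (OPERATOR, args...) tuples,
#    rendered in the CAUSE (X, BECOME (...)) notation used above.
def render(term):
    if not isinstance(term, tuple):      # atomic variables such as "X", "Y"
        return term
    op, *args = term
    return f"{op} ({', '.join(render(a) for a in args)})"

x_closes_y = ("CAUSE", "X", ("BECOME", ("NOT", ("OPEN", "Y"))))
print(render(x_closes_y))   # CAUSE (X, BECOME (NOT (OPEN (Y))))

# 2) A Pustejovsky-style event composition: the same accomplishment as a
#    transition from a preparatory process to a result state.
@dataclass
class Transition:
    process: str   # preparatory stage
    state: str     # result state

close_event = Transition(process="ACT (X, Y) & NOT (CLOSED (Y))",
                         state="CLOSED (Y)")
print(f"[{close_event.process}] -> [{close_event.state}]")
```

The contrast is visible in the data structures themselves: the decompositional representation is a single nested term built from primitives, while the event-composition representation splits the event into ordered subevents.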

Future Trends
Beyond the current state, two main trends may be noted. First, event structure is currently studied with reference to words and sentences, mostly in isolation. In the future, we need to study it in context, such as in speech and texts. D. Townsend and colleagues (2003) lead this trend. Second, current research mostly studies how different types of objects and other functions influence telicity. But we also ought to look at how event structure in complex verbal constructions, such as serial verbs, is computed. lexical-functional grammar analyses like Bodomo (1993, 1997) and Alsina, Bresnan, and Sells (1997) lead this trend.
Tenny and Pustejovsky (2000), Mani, Pustejovsky, and Gaizauskas (2005), and Dölling, Heyde-Zybatow, and Schäfer (2007) are further recent book-length readings that put most of these issues in perspective.
Adams Bodomo
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Alsina, A., Joan Bresnan, and Peter Sells. 1997. Complex Predicates. Stanford, CA: CSLI.
Bodomo, A. 1993. Complex predicates and event structure: An integrated analysis of serial verb constructions in the Mabia languages of West Africa. Working Papers in Linguistics 20, Trondheim, Norway.
———. 1997. A conceptual mapping theory for serial verbs. In On-line Proceedings of LFG97, ed. M. Butt and T. King, CSLI, Stanford University. Available online at: http://www-csli.stanford.edu/publications/LFG2/bodomo-lfg97.html.
Borer, H. 2005. Structuring Sense. Vol. 2. The Normal Course of Events. Oxford: Oxford University Press.
Butt, Miriam, and Gillian Ramchand. 2001. Complex aspectual structure in Hindi/Urdu. Oxford Working Papers in Linguistics, Philosophy and Phonetics 6: 1–30. Ed. M. Liakata, B. Jensen, and D. Maillat.
Dölling, Johannes, Tatyana Heyde-Zybatow, and Martin Schäfer. 2007. Event Structures in Linguistic Form and Interpretation. Berlin: Walter de Gruyter.
Filip, Hana. 2000. The quantization puzzle. In Events as Grammatical Objects: The Converging Perspectives of Lexical Semantics and Syntax, ed. Carol Tenny and James Pustejovsky, 39–93. Stanford, CA: CSLI Publications.
Folli, R. 2001. Constructing telicity in English and Italian. Ph.D. diss., Oxford University.
Goldberg, Adele. 2006. Constructions at Work: The Nature of Generalization in Language. Oxford: Oxford University Press.
Jackendoff, R. 1991. Parts and boundaries. Cognition 41: 9–45.
Krifka, Manfred. 1998. The origins of telicity. In Events and Grammar, ed. Susan Rothstein, 197–235. Dordrecht: Kluwer.
Levin, B. 1999. Objecthood: An event structure perspective. In Proceedings of CLS 35: Part 1: The Main Session, 223–47. Chicago: Chicago Linguistic Society.
Mani, Inderjeet, James Pustejovsky, and Robert Gaizauskas. 2005. The Language of Time: A Reader. Oxford: Oxford University Press.
Pustejovsky, J. 1991. The syntax of event structure. Cognition 41: 47–81.
Ritter, E., and S. T. Rosen. 1998. Delimiting events in syntax. In The Projection of Arguments: Lexical and Compositional Factors, ed. M. Butt and W. Geuder, 135–64. Stanford, CA: CSLI Publications.
———. 2000. Event structure and ergativity. In Events as Grammatical Objects, ed. C. Tenny and J. Pustejovsky, 187–238. Stanford, CA: CSLI Publications.
Schank, R. C. 1975. Conceptual Information Processing. Amsterdam: North-Holland.
Tenny, C., and J. Pustejovsky, eds. 2000. Events as Grammatical Objects: The Converging Perspectives of Lexical Semantics and Syntax. Stanford, CA: CSLI Publications.
Townsend, D., M. Seegmiller, R. Folli, H. Harley, and T. Bever. 2003. Processing Events in Sentences and Texts. Upper Montclair, NJ: Montclair State University Press.
Travis, L. 2000. Event structure in syntax. In Events as Grammatical Objects, ed. C. Tenny and J. Pustejovsky. Stanford, CA: CSLI Publications.
Vendler, Z. 1957. Verbs and times. Philosophical Review 66.2: 143–60.

EVIDENTIALITY
This is a grammatical category that has source of information as its primary meaning: whether the narrator actually saw what is being described, made inferences about it based on some evidence, or was told about it, and so on. Tariana, an Arawak language from Brazil, has five evidentials marked on the verb. If I saw José play football, I will say "José is playing-naka," using the visual evidential. If I heard the noise of the play (but didn't see it), I will say "José is playing-mahka," using the nonvisual. If all I see is that José's football boots are gone and so is the ball, I will say "José is playing-nihka," using the inferential. If it is Sunday and José is not home, the thing to say is "José is playing-sika," since my statement is based on the assumption and general knowledge that José usually plays football on Sundays. And if the information was reported to me by someone else, I will say "José is playing-pidaka," using the reported marker. Omitting an evidential results in an ungrammatical and highly unnatural sentence.
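The five-way Tariana paradigm just described amounts to an obligatory mapping from information source to verbal suffix, which can be pictured in a small sketch (illustrative Python; the suffix forms are those cited above, while the category labels are informal glosses):

```python
# The five Tariana evidential suffixes, keyed by the kind of information source.
TARIANA_EVIDENTIALS = {
    "visual":    "-naka",    # I saw it happen
    "nonvisual": "-mahka",   # I heard (but did not see) it
    "inferred":  "-nihka",   # I infer it from visible evidence
    "assumed":   "-sika",    # I assume it from general knowledge
    "reported":  "-pidaka",  # someone told me
}

def mark(clause: str, source: str) -> str:
    """Attach the evidential required by the information source.

    Since omitting an evidential yields an ungrammatical sentence,
    an unknown source raises an error rather than defaulting."""
    return clause + TARIANA_EVIDENTIALS[source]

print(mark("José is playing", "visual"))     # José is playing-naka
print(mark("José is playing", "reported"))   # José is playing-pidaka
```

The point of the obligatory lookup is that a Tariana speaker cannot leave the information source unexpressed, in contrast to English, where evidential qualification is optional and lexical.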
About a quarter of the world's languages have some grammatical marking of information source. The systems vary in their complexity. Some distinguish just two terms. An eyewitness versus non-eyewitness distinction is found in Turkic and Iranian languages. Other languages mark only nonfirsthand information, for example, Abkhaz, a Northwest Caucasian language. Numerous languages express only reported, or hearsay, information, for example, Estonian. Quechua languages have three evidentiality specifications: direct evidence, conjectural, and reported.
Systems with more than four terms have just two sensory
evidentials and a number of evidentials based on inference and
assumption of different kinds; these include Nambiquara languages, from Brazil, and Foe and Fasu, of the Kutubuan family
spoken in the Southern Highlands of Papua New Guinea.
The terms verificational and validational are sometimes used
in place of evidential. French linguists employ the term médiatif
(Guentchêva 1996). A summary of work on recognizing this
category, and naming it, is in Jacobsen (1986) and Aikhenvald
(2004).
Evidentiality does not bear any straightforward relationship
to truth, the validity of a statement, or the speaker's responsibility. The truth value of an evidential may be different from
that of the verb in its clause. Evidentials can be manipulated to
tell a lie: One can give a correct information source and wrong
information, as in saying "He is dead-reported" when you were
told that he is alive, or correct information and a wrong information source, as in saying "He is alive-visual" when, in fact, you
were told that he is alive but did not see this. The ways in which
semantic extensions of evidentials overlap with modalities
and such meanings as probability or possibility depend on the
system and on the semantics of each individual evidential
term. In many languages (e.g., Quechua, Shipibo-Konibo, or
Tariana, all from South America), markers of hypothetical and
irrealis modality can occur in conjunction with evidentials on
one verb or in one clause. This further corroborates their status
as distinct categories.
Nonvisual and reported evidentials used with the first person
often refer to uncontrolled spontaneous action or have overtones
of surprise, known as mirative.
Every language has some lexical way of referring to information source, for example, English reportedly or allegedly. Such
lexical expressions may become grammaticalized as evidential
markers. Nonevidential categories may acquire a secondary
meaning relating to information source. Conditionals and other
nondeclarative moods may acquire overtones of uncertain information obtained from some other source for which the speaker
does not take any responsibility; the best-known example is the
French conditional. Past tense and perfect aspect acquire
nuances of nonfirsthand information in many Iranian and
Turkic languages, and so do resultative nominalizations and
passives. The choice of a complementizer, or a type of complement clause, may serve to express meanings related to the way
in which one knows a particular fact. In English, different complement clauses distinguish an auditory and a hearsay meaning
of the verb hear: Saying "I heard Brazil beating France" implies
actual listening, whereas "I heard that Brazil beat France" implies
a verbal report of the result. These evidential-like extensions are
known as evidentiality strategies. Historically, they may give rise
to grammatical evidentials.
The maximal number of evidentials is distinguished in statements. The only evidential possible in commands is the reported,
to express a command on behalf of someone else: "eat-reported!"
means "eat following someone's command!" Evidentials often
come from grammaticalized verbs. The verb of saying is
a frequent source for reported and quotative evidentials, and
the verbs feel, think, hear can give rise to a nonvisual evidential.
Closed word classes, such as deictics (see deixis) and locatives,
may give rise to evidentials, both in small and in large systems.
Evidentials vary in their semantic extensions, depending on
the system. Reported information often has overtones of probability or unreliability, while visual evidentials may develop
meanings of certainty. They can be extended to denote the
direct participation, control, and volitionality of the speaker.
Morphemes marking tense, aspect, mood, modality, and evidentiality may occur in the same slot in the structure of a highly
synthetic language.
Evidentiality is a property of a significant number of linguistic
areas, including the Balkans, the Baltic area, India, and a variety
of locations in Amazonia. Evidentials may make their way into
contact languages, as they have into Andean Spanish. The text's
genre may determine the choice of an evidential. Traditional
stories are typically cast in reported evidential. Evidentials can
be manipulated in discourse as a stylistic device. Switching
from a reported to a direct (or visual) evidential creates the effect
of the speaker's participation and confidence. Switching to a
nonfirsthand evidential often implies a backgrounded aside.
Evidentiality is interlinked with conventionalized attitudes to
information and precision in stating its source.
Alexandra Aikhenvald
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Aikhenvald, Alexandra Y. 2004. Evidentiality. Oxford: Oxford University
Press.
Aikhenvald, Alexandra Y., and R. M. W. Dixon, eds. 2003. Studies in
Evidentiality. Amsterdam: John Benjamins.
Barnes, J. 1984. Evidentials in the Tuyuca verb. International Journal of
American Linguistics 50: 255–71.
Guentchêva, Z., ed. 1996. L'énonciation médiatisée. Louvain–Paris: Éditions Peeters.
Jacobsen, William H., Jr. 1986. The heterogeneity of evidentials in
Makah. In Evidentiality: The Linguistic Coding of Epistemology, ed.
Wallace L. Chafe and Johanna Nichols, 3–28. Norwood, NJ: Ablex. See
other papers therein.

EVOLUTIONARY PSYCHOLOGY
This term is used in several different, related senses. Among
behavioral, social, and cognitive scientists, it properly refers to
a new scientific paradigm or framework, together with the discipline that has grown up around this framework, and the body
of knowledge produced by the researchers working within that
framework. Some scholars outside the field, as well as many
journalists and lay people, use it more loosely to refer to any finding, speculation, or discussion that links evolution and behavior,
whether well informed or not. Evolutionary psychology as both
a research framework and a discipline is organized around the
proposition that the design features of the mechanisms comprising a species' psychology reflect the character of the adaptive
problems they evolved to solve. This proposition was uncontroversial when applied by biologists to other species (e.g., Williams
1966). However, it generated significant debate and opposition
once it began to be applied to humans, who, because of culture,
intelligence, language, and complexly variable social systems,
appear notably different from other species (Sahlins 1977).
The field shares some tenets with early Chomskyan proposals that the human mind contains numerous mental organs
specialized for carrying out different cognitive tasks, such as a
language acquisition device (Chomsky 1965). The antifunctionalist strain in Chomsky's thinking led him to largely set
aside natural selection for communicative functions in his discussions of language (Chomsky 1972). In contrast, evolutionary
psychologists such as the psycholinguist Steven Pinker (1994)
argue that the existence of mental organs can only be explained
as the consequence of natural selection. This is because selection
is the only process known to science that builds complex functional systems into the designs of organisms (Williams 1966). By
this standard, the intricate functional interdependence of the
various cognitive mechanisms underlying language provides
very strong evidence for the organizing role of natural selection
in constructing such mechanisms (Pinker and Bloom 1992).
Evolutionary psychology began to emerge in the 1970s and
1980s when a small number of researchers tried to synthesize
several distinct research orientations in a mutually consistent
way (Tooby and Cosmides 1992). The most important of these
orientations were cognitive science, with its commitment to
information-processing descriptions of psychological mechanisms; modern primatology, hunter-gatherer studies, and
paleoanthropology, which together offered the prospect of characterizing the conditions in which humans evolved; evolutionary
biology (including behavioral ecology, sociobiology, ethology,
and evolutionary game theory); and neuroscience, with its
prospect of discovering the physical implementation of cognitive mechanisms. Evolutionary psychologists argued that cognitive mechanisms were, ipso facto, biological adaptations,
a proposition that inevitably connected cognitive science to
evolutionary biology. If cognitive mechanisms are adaptations,
they then must exhibit an evolved organization, have an evolutionary history, and have been naturally engineered to carry out
evolved functions. Most importantly, the identification of cognitive mechanisms with adaptations allowed the entire technical
apparatus developed within biology concerning adaptations to
be imported and validly applied to cognitive science.
Evolutionary psychologists start from the premise that
the brain, like our other organs, is the product of evolution.
Specifically, the brain is viewed as an information-processing
organ that evolved over evolutionary time in order to regulate
behavior in an adaptively successful way. In a world filled with
the disordering force of entropy, biologists and physicists recognize that natural selection is the only known natural physical process that can push the designs of organisms uphill into
functionally organized systems. It follows that whatever functional organization there is to be found in the design of the brain
reflects the history of selection that acted ancestrally on the
species. Evolutionary psychologists use the cause-and-effect
relationships between ancestral selection pressures and the
resulting functional architectures of the brains mechanisms as
one powerful new tool to guide scientific discovery. On this view,
the structure of each psychological mechanism should reflect
the actions of the selection pressures that built it. Consequently,
by considering ancestral adaptive problems, evolutionary
psychologists believe that they can more reliably, rapidly, and
effectively derive and test hypotheses about the functional organization of mental mechanisms than would be possible otherwise. They argue that many major wrong turns in the history of
the behavioral sciences (for example, many important aspects
of the Freudian, Skinnerian, or Piagetian paradigms) would not
have been made if their core propositions had been scrutinized
for consistency with the kinds of outcomes that natural selection
could plausibly have produced. The practice of using models of
ancestral-selection pressures as a guide to discovering previously
unknown psychological mechanisms renders them untroubled
by critics' accusations that evolutionary analysis inevitably consists of concocting post hoc just-so stories. To use general principles to derive predictions, and then to use these predictions
to discover something previously unknown, demonstrates that
such explanations are not concocted post hoc.
The primary research goals of evolutionary psychology are
a) the discovery and progressive mapping of each of the evolved
mechanisms of the human brain (or the brains of other species
of interest) and b) the exploration of the systematic behavioral
regularities and population-level phenomena that these evolved
mechanisms generate in different social and cultural environments. So, for example, evolutionary psychologists claim to have
discovered and mapped the information-processing structure
of an evolved program in the human psychological architecture
whose function is to detect the individuals who are close genetic
relatives, and then to generate greater sexual aversion and
greater altruism toward these individuals compared to others
(Lieberman, Tooby, and Cosmides 2007). This evolved program
was predicted to be a part of our species-typical psychological
design, and is believed to explain some of the patterns involving
family sentiments found across cultures (such as disgust at the
prospect of incest with one's sibling).
Similarly, all human societies (and no nonhuman societies)
have complex languages and use them as the primary means of
communication. Evolutionary psychologists view languages as
the population-level expression of a suite of evolved species-typical programs tailored by natural selection to facilitate communication, especially of propositions (Pinker 1994). Although the
evolutionary origins of language are obscure, evolutionary psychologists consider it inevitable that the present design
of the cognitive mechanisms underlying language competence
was naturally selected to function in a linguistic environment
that is normal for our species. In consequence, a) they should be
selected to assume the presence of a linguistic environment that
conforms to human language universals, and b) they should be
designed to exploit the presence of these regularities to accomplish the functions of acquisition, comprehension, and production (as they appear to; Musso et al. 2003). Natural selection
thus provides a causal explanation for Chomsky's assertion that
strategies employed by the language acquisition device reflect
abstract uniformities across human languages (see universal
grammar).
One central element that distinguishes evolutionary psychology from other approaches is its focus on integrating what
is known about evolution into the research process, rather than
ignoring this knowledge. Applying information about ancestral conditions and selection pressures allows evolutionary
psychologists to derive hypotheses about the design of human
information-processing mechanisms from the large preexisting body of theories already developed and empirically tested
within modern evolutionary biology. For example, evolutionary biologists know that for organisms like humans, mating with
close relatives causes genetic defects to express themselves at far
higher rates in the incestuously produced children. This has led
evolutionary psychologists a) to the general prediction that natural selection had built a program in humans designed to identify close genetic relatives; b) to detailed predictions about the
cues that the program would use to identify genetic relatives; and
c) to detailed predictions about how this kin detection program
would be coupled to increased sexual aversion to individuals
it identified as genetic relatives (as well as increased altruism,
as predicted by kin selection theory). The analysis of ancestral
selection pressures and hunter-gatherer conditions made it possible to design studies that could test (and did confirm) these
propositions. These studies, in turn, mapped the information-processing architecture of these functionally specialized programs (Lieberman, Tooby, and Cosmides 2007). In contrast, the
disregard by sociocultural anthropologists (and Freudians) of the
selection pressures that select strongly against incest prevented
them from discovering the existence of these evolved mechanisms. Once a mechanism is mapped, its population-level social
and cultural expressions can also be analyzed, such as moral
attitudes about incest in the case of kin detection and human linguistic variation in the case of language.
Evolutionary psychology originally emerged among anthropologists, cognitive scientists, biologists, and psychologists,
although it has subsequently diffused into many other disciplines. Evolutionary psychology is not a subfield of psychology,
and it is not devoted to the study of a specific class of phenomena. Rather, it is an approach to the behavioral, social, cognitive,
and neural sciences that can be applied to any of the topics they
deal with. Originally reacting against the mutually contradictory
claims about the mind and human nature advanced in different
disciplines, evolutionary psychologists constructed what they
argue is a logically integrated scientific framework that attempts
to reconcile into a single body of knowledge the results drawn
from all relevant fields. Its advocates view it as an interdisciplinary nucleus around which a single unified theoretical and
empirical behavioral science is being crystallized. Of course, not
everyone in behavioral science agrees, with disagreements ranging from disputes over specific analyses to broader rejection of
the program, often in favor of culturalist and social constructionist views.
A second feature that distinguishes evolutionary psychology
is the importance it places on achieving information-processing
descriptions of the designs of evolved mechanisms, rather than
stopping at behavioral or neuroscience descriptions. Along with
most cognitive scientists, evolutionary psychologists believe that
the brain, like any other computational system, can usefully be
mapped both in physical terms (which, for the brain, means in
neurophysiological and neuroanatomical terms) and also complementarily in information-processing terms. Evolutionary
psychologists go on to stress that the brain and its subsystems
evolved as an organ (or set of organs) of computation: The brain
evolved in order to regulate behavior and physiology adaptively
based on information it is exposed to. Because the evolved function of a neural (or psychological) mechanism is inherently
computational (i.e., as a program mapping informational inputs
to outputs), the only form of description that can accurately
characterize how its organization solves its adaptive problem
is an information-processing description. Physical descriptions of brain subsystems cannot, by their nature, fully capture
the information-processing interrelationships that embody the
function of an evolved program (mechanism, adaptation, etc.).
So, for example, however interesting it is to identify the brain
regions implicated in various aspects of language processing, it
is still important to develop a parallel account in terms of computational steps (data structures, operations, etc.). Similarly,
simply observing that humans behaviorally tend to avoid incest
inside the nuclear family is very different from having mapped
the information-processing steps in the evolved programs that
take prespecified cues to kinship as input, compute from them
magnitudes that capture estimated genetic relatedness, and then
pass these magnitudes into the sexual-choice motivational subsystem, where they generate sexual disgust at mating with those
it identifies as genetic relatives.
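The contrast drawn here between a behavioral observation and an information-processing map can be made concrete with a schematic sketch. The cue names, weights, and the simple linear combination below are hypothetical placeholders, not the model published by Lieberman, Tooby, and Cosmides (2007).

```python
# Schematic sketch of a kin-detection pipeline: prespecified cues are
# mapped to a relatedness magnitude, which is then passed to downstream
# motivational systems. All cue names and weights are hypothetical.
def estimate_relatedness(cues):
    """Combine kinship cues into a relatedness magnitude in [0, 1]."""
    weights = {"coresidence_years": 0.04, "seen_mother_care": 0.5}
    score = sum(weights.get(cue, 0.0) * value for cue, value in cues.items())
    return min(score, 1.0)

def motivational_outputs(relatedness):
    """Feed the magnitude to the sexual-choice and altruism subsystems."""
    return {
        "sexual_aversion": relatedness,        # more disgust at incest
        "altruistic_motivation": relatedness,  # more help for close kin
    }

sibling_cues = {"coresidence_years": 15, "seen_mother_care": 1}
outputs = motivational_outputs(estimate_relatedness(sibling_cues))
```

The sketch shows why an information-processing description carries more content than the behavioral summary "humans avoid incest": it specifies the inputs, the intermediate magnitude, and the systems that consume it.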
A third difference in perspective between evolutionary psychologists and most other behavioral scientists is in how numerous and functionally specialized they expect the psychological
mechanisms of a species to be. For most of the last century, the
majority view among learning theorists, cognitive scientists, and
neuroscientists has been that the psychological mechanisms
that operate on experience to produce knowledge are likely to
be small in number, and to be primarily content independent
and general purpose (Pinker 2002; Tooby and Cosmides 1992).
Content independence means that a cognitive procedure (such
as association formation in connectionism) operates in the
same way regardless of the content it is processing. Hence, on
this view, the same cognitive procedures are expected to operate on all contents uniformly, whether language, fighting, eating,
sex, family interactions, or intergroup conflict. This blank slate or
environmentalist view can be expressed by comparing the operation of learning mechanisms or cognitive mechanisms to the
operation of a tape recorder that processes all sounds uniformly,
regardless of their meaning: The content that ends up on the tape
reflects only the content present in the environment, and nothing in the tape-recording machinery itself introduces content of
its own that was not present in the environment.
From a selectionist perspective, however, such a blank-slate
viewpoint seems extremely implausible, as well as inconsistent
with what is known about the cognitive architectures of nonhumans (Gallistel 1990). Mutations for specialized design features
that exploit the rich recurrent structure of particular problem
domains should spread by natural selection whenever they cost-effectively improve the organism's propensity to solve important
adaptive problems in a fitness-promoting way. That is, if there is
a particular set of cues that solves the problem of kin detection,
then the mind could evolve a specialization that is designed to
take only those cues as input. For a problem-solving strategy to
be applied generally across contents, it cannot employ problem-solving shortcuts that work only on particular problem subsets,
such as grammar acquisition, depth perception, kin detection,
or mate selection. Hence, evolutionary psychologists consider it
likely that the mind solves the diverse computational problems
posed by stereopsis, color vision, echolocation, face recognition, object mechanics, navigation, and reasoning about social
exchange by using at least some principles and operations that
are particular to each respective domain. Evolutionary psychologists argue that evolved specializations that are activated only by
certain content domains or adaptive problems seem virtually
inevitable, rather than implausible or exceptional outcomes of
the evolutionary process. This is because selection inherently
favors efficiency and puts no weight per se on uniformity or simplicity (Tooby and Cosmides 1992).
Moreover, unlike a tape recorder, the designs of such evolved
psychological mechanisms might be expected to regularly introduce particular contents, motivations, interpretations, and
conceptual primitives into the human mind that are not simply
derived from the environment. From an engineering perspective, it is easy to see how such reliably developing contents could
enhance adaptive performance. For example, the environmental
regularity of venomous snakes posed an evolutionarily long-enduring adaptive problem. This regularity appears to have selected
for an evolved computational device implemented in the brains
of African primates (including humans). This adaptation contains a psychophysical specification of snakes linked to a system
that motivates snake avoidance. Additionally, this avoidance is
up-regulated to the extent that the individual is exposed to conspecifics who display fear toward snakes (Öhman and Mineka
2001). This depends on mental content about snakes being built
into the mechanism. The human mind is suspected to contain
neurocomputational versions of what philosophers would once
have called innate ideas, such as snake, spider, mother, predator,
food, word, verb, agency, object, and patient (Tooby, Cosmides,
and Barrett 2005). By augmenting the cognitive architecture in
such a fashion, natural selection could supercharge perceiving,
learning, reasoning, and decision making in evolutionarily consequential domains.
At a minimum, evolutionary psychologists expect that in addition to whatever general-purpose cognitive machinery humans
have, we should also be expected to have a wide array of domain-specific mechanisms, including specialized learning mechanisms. So, for example, although the snake phobia system, the
kin detection mechanism, and the language acquisition system
are all learning mechanisms, they are each specialized only for
their particular type of content (snakes linked to fear intensity,
kinship cues linked to incest aversion and altruistic motivation,
and language inputs linked to linguistic competence). For this
reason, evolutionary psychologists do not regard learning as constituting an alternative explanation for the claim that a particular kind of behavioral output was shaped by evolution. Evidence
that something is learned is not in the least inconsistent with the
claim that much of the knowledge produced was supplied by
specialized learning mechanisms permeated with evolved content. Critics of evolutionary psychology view its multiplication
of hypothesized cognitive mechanisms (e.g., specializations for
language acquisition, kin detection, mate selection, and so on)
to be unparsimonious. Evolutionary psychologists respond that
although parsimony may have been a useful principle in physics,
evolutionarily engineered systems are not designed to be simple
but, rather, to be adaptively effective.
Evolutionary psychology has grown rapidly in numbers and
acceptance over the last three decades, and it is now presented
in many sources alongside Freudianism, behaviorism, cognitive science, and neuroscience as one of the basic approaches to
psychology. In that time, evolutionary psychologists have used
evolutionarily derived predictions to discover scores of previously unknown mechanisms and design features in the human
psychological architecture (Buss 2005). Nevertheless, it remains
significantly more controversial than other young fields, such as
cognitive neuroscience, and is still a minority viewpoint whose
specifics are vigorously disputed. Indeed, many researchers
who are reluctant to associate themselves with the controversies
surrounding evolutionary psychology have nonetheless quietly
adopted many of its core principles, so that claims of evolved
functional specializations and evolutionary origins are far more
common and unabashed in the behavioral sciences than they
were even a decade ago. For example, the modularist tradition
in cognitive development adopts what is largely an evolutionary psychological stance: Various specialized competences the
theory of mind module, intuitive physics, and intuitive biology
are viewed as evolved, reliably developing, domain specific, and
designed to reflect the special task demands posed by the adaptive problems special to each domain (Hirschfeld and Gelman
1994).
Some controversies over evolutionary psychology are generated by misunderstandings, while others concern unsettled
theoretical and empirical issues (e.g., how can neural plasticity
be reconciled with the existence of evolved specializations in the
brain?). However, heated resistance is perhaps attributable to
the sensitivity of applying evolutionary theories broadly across
human experience. For example, cognitive science originated
in philosophy and linguistics, and as a result tends to focus on
reflective issues, such as knowledge acquisition and speech comprehension, which have only limited intrinsic personal or social
meaning. In contrast, evolutionary psychologists' ambitions
extend to characterizing the mechanisms underlying all human
action. These include social interactions such as aggression, sexual attraction, exploitation, and cooperation. Evolutionary biology provides rich theories about these domains, but analysis of
the causes of these phenomena inevitably triggers strongly felt
personal and ideological reactions.
Language is commonly viewed by evolutionary psychologists as the expression of a set of reliably developing cognitive
mechanisms that evolved to convey propositional information
through a serial channel (Pinker 1994). The high degree of functional elaboration in language suggests that it has been shaped
by selection over long expanses of evolutionary time. Although
it seems likely that many mechanisms involved in language are
general in that they are used in other cognitive tasks, it is difficult from an evolutionary psychological perspective to see how
such an important activity would not have strongly selected for
the emergence of proprietary cognitive specializations designed
to solve language's constituent subtasks with special efficiency.
Several lines of evidence argue that at least some (if indeed not
most) of the cognitive mechanisms underlying language are
adaptations designed by natural selection for language. The
competing hypothesis is that language is a by-product of general intelligence, symbolic capacity, the capacity for culture,
neo-associationistic mechanisms, or other general-purpose
alternatives (Pinker 1994). First, computationally intricate linguistic capacities develop precociously, far earlier than comparable cognitive achievements in other domains. Second, genetic
and developmental conditions can doubly dissociate language
and general intelligence (i.e., one can speak well with low intelligence and be unable to speak but have otherwise unimpaired
intelligence). Third, underneath linguistic variability are design
features like linear order, constituency (see constituent
structure), predicate-argument structure, case markers,
morphophonemic rules, and phonological rules that are a) universal and b) well designed to communicate propositional information, such as who did what to whom, but poorly designed for
many other cognitive tasks, such as statistical induction, imagery, face recognition, and so on (see phonology, universals of; morphology, universals of; syntax, universals of; semantics, universals of).
Finally, some evolutionary psychologists propose that
language was a critical ingredient allowing humans to enter
their peculiar adaptive mode, the cognitive niche. On this view,
the cognitive niche is a way of life in which massive amounts of
contingent information are generated and used for the regulation of improvised behavior that is successfully tailored to local
conditions (Tooby and DeVore 1987; Pinker 1994). Essential to
increasing the supply of useful propositional information was
dramatically lowering the cost of its acquisition from others.
Language appears admirably designed to accomplish this
task.
Daniel Sznycer, John Tooby, and Leda Cosmides
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Buss, D. M., ed. 2005. The Handbook of Evolutionary Psychology.
Hoboken, NJ: Wiley.
Chomsky, N. 1965. Aspects of the Theory of Syntax. Cambridge, MA: MIT
Press.
. 1972. Language and Mind. New York: Harcourt, Brace, and
World.
Gallistel, C. R. 1990. The Organization of Learning. Cambridge, MA: MIT
Press.
Hirschfeld, Lawrence A., and Susan A. Gelman, eds. 1994. Mapping
the Mind: Domain Specificity in Cognition and Culture. New York:
Cambridge University Press.
Lieberman, D., J. Tooby, and L. Cosmides. 2007. The architecture of
human kin detection. Nature 445.7129: 727–31.
Musso, M., A. Moro, V. Glauche, M. Rijntjes, J. Reichenbach, C. Büchel,
and C. Weiller. 2003. Broca's area and the language instinct. Nature
Neuroscience 6: 774–81.
Öhman, A., and S. Mineka. 2001. Fears, phobias, and preparedness.
Psychological Review 108: 483–522.
Pinker, Steven. 1994. The Language Instinct. New York: Morrow.
. 2002. The Blank Slate. New York: Viking.
Pinker, Steven, and Paul Bloom. 1992. Natural language and natural
selection. In The Adapted Mind: Evolutionary Psychology and the
Generation of Culture, ed. J. Barkow, L. Cosmides, and J. Tooby, 451–93.
New York: Oxford University Press.
Sahlins, Marshall. 1977. The Use and Abuse of Biology. Ann Arbor: The
University of Michigan Press.
Tooby, John, and L. Cosmides. 1992. The psychological foundations
of culture. In The Adapted Mind: Evolutionary Psychology and the
Generation of Culture, ed. J. Barkow, L. Cosmides, and J. Tooby, 19–136.
New York: Oxford University Press.
Tooby, J., L. Cosmides, and H. C. Barrett. 2005. Resolving the debate
on innate ideas: Learnability constraints and the evolved interpenetration of motivational and conceptual functions. In The Innate
Mind: Structure and Content, ed. P. Carruthers, S. Laurence, and S.
Stich, 305–37. New York: Oxford University Press.
Tooby, John, and I. DeVore. 1987. The reconstruction of hominid
behavioral evolution through strategic modeling. In The Evolution of
Human Behavior: Primate Models, ed. Warren Kinzey, 183–237. New
York: SUNY Press.
Williams, George C. 1966. Adaptation and Natural Selection: A Critique
of Some Current Evolutionary Thought. Princeton, NJ: Princeton
University Press.

EXEMPLAR
This term figures importantly in research and theorization on category identification, recognition, categorization, and learning. It
is used interchangeably with the terms instance or item across
various strands of research, including psychology, religion, and
history.
Within the context of category learning, for instance, the term
exemplar refers to a specific instance, such as a specific cat to
which a parent points when teaching a child the concept and
name of cat. Alternatively, during remediation of language skills
in children with severe disabilities, researchers have utilized various exemplars of graphical symbols to improve communication
(Schlosser 2003). In studies examining category relearning in
individuals who have suffered brain damage, training in naming
of a subset of exemplars results in improved naming of untrained
exemplars within the category (Kiran 2007).
Within the topic of categorization of semantic concepts,
the term's specific usage comes in the context of exemplar
theory. Briefly, this theory suggests that a category is represented by a collection of members (exemplars) that have been
previously encountered, experienced, and stored as unique
and individual memory traces. A new object/item is judged as a
member of a given category provided that it is sufficiently similar
to the stored exemplars (Komatsu 1992). This specific interpretation of exemplar is at odds with an alternate view of categorization, namely, the prototype theory, which suggests that a
category is represented in terms of a single summary representation (i.e., a prototype).
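The contrast between the two accounts can be made concrete in a toy simulation (a sketch only; the feature vectors, category labels, and exponential similarity function are invented for illustration and drawn from no cited study):

```python
import math

# Toy stimuli: each stored exemplar is a feature vector; values are invented.
bird_exemplars = [(1.0, 0.9), (0.8, 1.0), (0.9, 0.8)]
fish_exemplars = [(0.1, 0.2), (0.0, 0.1), (0.2, 0.0)]

def similarity(a, b, c=2.0):
    """Exponential-decay similarity, a common choice in categorization models."""
    return math.exp(-c * math.dist(a, b))

def exemplar_score(item, exemplars):
    # Exemplar theory: sum the item's similarity to EVERY stored instance.
    return sum(similarity(item, e) for e in exemplars)

def prototype_score(item, exemplars):
    # Prototype theory: compare only to a single summary (here, the mean vector).
    n = len(exemplars)
    proto = tuple(sum(dim) / n for dim in zip(*exemplars))
    return similarity(item, proto)

new_item = (0.85, 0.95)  # much more bird-like than fish-like
assert exemplar_score(new_item, bird_exemplars) > exemplar_score(new_item, fish_exemplars)
assert prototype_score(new_item, bird_exemplars) > prototype_score(new_item, fish_exemplars)
```

On data like these the two models agree; they come apart when an item is highly similar to one stored exemplar but far from the category's average, which is exactly where exemplar effects are sought experimentally.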
Not all theorists agree that exemplar and prototype models
are competitors; there is yet another class of models according
to which categorization decisions are made using exemplars,
although the effect of using exemplars necessitates the creation
of abstractions that can be later applied to novel exemplars (Ross
and Makin 1999). Similarly, some connectionist networks
assume that a category is represented by summary information
across the entire network and, depending upon the input provided, specific connection strengths in the network have greater
influence on the overall activation (Knapp and Anderson 1984).
Finally, the interpretation of the term exemplar can also be
influenced by the level of category structure. As Edward Smith
and Douglas Medin (1999) argue, the term can refer to a specific instance of the concept (e.g., your favorite blue jeans
in the category clothing) or to a subset of the concept (blue
jeans). Further, whereas experimental investigations of exemplars typically refer to them as basic-level concepts (e.g.,
apple), exemplars can also refer to an individual entity such as
Macintosh apple, which is a subordinate concept.
Swathi Kiran
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Kiran, Swathi. 2007. Semantic complexity in the treatment of naming
deficits. American Journal of Speech Language Pathology 16: 1–12.
Knapp, A., and J. Anderson. 1984. Theory of categorization based
on distributed memory storage. Journal of Experimental
Psychology: Learning, Memory and Cognition 10: 616–37.
Komatsu, Lloyd. 1992. Recent views of conceptual structure.
Psychological Bulletin 112: 500–26.
Ross, Brian, and Valarie Makin. 1999. Prototype versus exemplar models in cognition. In The Nature of Cognition, ed. R. Sternberg, 205–42.
Cambridge, MA: MIT Press.
Schlosser, Ralf. 2003. The Efficacy of Augmentative and Alternative
Communication: Toward Evidence-Based Practice. Amsterdam and
Boston: Elsevier.
Smith, Edward, and Douglas Medin. 1999. The exemplar view. In
Concepts: Core Readings, ed. E. Margolis and S. Laurence, 207–21.
Cambridge, MA: MIT Press.

EXEMPLAR THEORY
An important goal of linguistic theory has been to develop explicit
approaches to describing, modeling, and explaining linguistic
behavior. Certainly, most familiar to linguists are the rule-based
models that derive abstract rules from linguistic exemplars, then
use those rules to predict linguistic behavior. Such rule systems
are readily shown to be empirically inadequate, both diachronically, as in the shift from digged to dug as the past tense of dig,
and synchronically (see synchrony and diachrony), as in
the overgeneralization (see overregularizations) of glew
as the past tense of glow in place of glowed. Thus, rule-based
models of language behavior must also incorporate some sort of
component that can account for analogical interactions among
linguistic items.
Generally speaking, there are two broad categories of analogical models under investigation in current linguistic research.
One group consists of approaches that use linguistic exemplars
to derive an analogical system but which then do not consult
those individual exemplars of linguistic experience further in
predicting linguistic behavior. Best known among this category of analogical models are the connectionist approaches
to language. Connectionist models typically pool their training
input into schematic, prototype-like representations of a category that do not retain individualizing information about the
exemplars used to train the models. Such representations, however, make the models empirically inadequate in two respects.
First, connectionist models incorrectly predict behaviors such
as categorization and response times in terms of similarity to
the prototype encoded in the network, rather than in terms of
similarity to individual exemplars. Second, there is abundant,
and growing, evidence that memories for individual linguistic experiences do influence subsequent linguistic behaviors.
Research has derived clear evidence of specific exemplars
subsequently influencing phonetic and phonological output (Pierrehumbert 2001), lexical and morphological output
(Goldinger 1997 and Bybee 2002), and children's manipulations of syntactic structures (Tomasello 2003; see also syntax,
acquisition of).
Exemplars are not prototypes. They are individual instances of
linguistic usage retained in memory. Given the empirical necessity for incorporating memory for exemplars into models of language behavior, the crucial research question in exemplar-based
approaches becomes whether they alone can account for the
spectrum of linguistic behaviors or whether there remains independent empirical justification for the rule-based components.
Exemplar-based models of language are founded on the simple notion that in language use, speakers will compare a current
linguistic expression and its context (linguistic and nonlinguistic)
with their personal collections of memories for similar expressions and then choose at least one of the tokens in memory (an
exemplar) as the basis for deciding how to interpret or otherwise operate on that expression. Usually, the token(s) selected
will be similar to the input currently being considered and its
context. Such models imply that the brain stores vast inventories of memories for individual episodes of linguistic experience
and that it employs some procedure for comparing the features
of the new input or current context to the features of those
remembered exemplars, and then has some basis for choosing
one of those exemplars as the model for an analogical response
(or interpretation).
Given the empirical evidence that memories for individual
exemplars of previous linguistic behavior influence current
linguistic behavior, we restrict our discussion in this entry to
explicitly defined exemplar-based approaches that use actual
exemplars (typically gleaned from linguistic corpora) to predict
linguistic behavior. The approaches discussed here are all computationally based and have actually been tested against real
linguistic behavior. The algorithms are also publicly available to
researchers.
In the three approaches that follow, the exemplars are
retained and directly used to predict linguistic behavior. A data
set of relevant exemplars is constructed from actual linguistic
corpora; then an algorithm is applied that compares the new
input to the exemplars in the data set and selects certain of those
exemplars while lessening or even zeroing out the chances of
other exemplars in the data set being used. Typically, the exemplars in the data sets are composed of outcomes associated with
various linguistic variables or features. A prediction is then made
for the outcome defined by an input set of variables (the given
context). Normally, exemplars that are in some sense closer to
the given context have a higher chance of being selected, but
sometimes the algorithm may select more distant exemplars. In
other words, these approaches sometimes allow exemplars that
are not nearest neighbors (that is, not the most similar) to be used.

The Generalized Context Model (GCM)


The generalized context model (Nosofsky 1992) was developed
primarily as a model of concept learning, choice behavior, and
categorization. The GCM has been tested most extensively against
nonlinguistic behavior, but has also been tested on morphological processes, such as predicting the plural forms of German
nouns (Nakisa and Hahn 1996). The GCM determines the conditional probability of assigning a given linguistic form (say, the
base form of a noun) to a particular form class, for example, a
particular plural form. It does so by comparing the features of the
test form with the weighted sum of those features in all the exemplars of one response category, divided by the weighted sum of
those features across the exemplars of all the possible response
categories. The model also factors in a response bias value for the
different categories. Thus, it arrives at a conditional probability
for choosing any one response over the alternatives.
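The choice rule just described can be sketched as follows. This is a simplified illustration, not Nosofsky's full model: the feature vectors, attention weights, bias values, and category labels below are all hypothetical, and similarity is computed here as an exponential function of weighted city-block distance, one common choice in GCM work:

```python
import math

def gcm_probability(item, categories, weights, biases, c=1.0):
    """Conditional probability of each response category for `item`:
    bias-weighted summed similarity to a category's exemplars, divided
    by the same quantity summed over all candidate categories."""
    def sim(a, b):
        # Weighted city-block distance mapped to similarity.
        d = sum(w * abs(x - y) for w, x, y in zip(weights, a, b))
        return math.exp(-c * d)
    strengths = {
        name: biases[name] * sum(sim(item, ex) for ex in exemplars)
        for name, exemplars in categories.items()
    }
    total = sum(strengths.values())
    return {name: s / total for name, s in strengths.items()}

# Hypothetical data: two plural-class categories over invented binary features.
categories = {
    "-en": [(1, 0, 1), (1, 1, 1)],
    "-s":  [(0, 1, 0), (0, 0, 0)],
}
probs = gcm_probability((1, 0, 1), categories,
                        weights=(1.0, 0.5, 1.0),
                        biases={"-en": 0.5, "-s": 0.5})
assert abs(sum(probs.values()) - 1.0) < 1e-9
assert probs["-en"] > probs["-s"]
```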
In the application of the GCM to nonlinguistic data, the
accuracy of the model's predictions depends crucially upon
the weightings assigned to the different stimulus features and
the response biases for the alternative categories. Typically,
both are determined ahead of time by constructing a confusion
matrix for the exemplars to be used. The resulting weightings
(said to account for the effects of selective attention during training) and response biases then apply only to the given data set
of exemplars. The feature weightings determined ahead of time
for a given data set are equivalent to the information gain values
described for the memory based learning model that follows and
are subject, therefore, to the same theoretical criticisms in that
they must be calculated ahead of time for a given data set and do
not generalize to a new data set. In applying the GCM to natural
language data, Ramin Nakisa and Ulrike Hahn (1996), of course,
were not able to obtain feature weightings for German nouns in
native speakers of the language, and the model therefore did not
perform as well as a competing connectionist model.

Memory Based Learning (MBL)


Memory based learning (Daelemans and van den Bosch 2005)
is a nearest neighbor model developed specifically for predicting language behavior. Pure nearest neighbor approaches count
each variable, or feature, as equally important for comparing an
input item to the stored exemplars and identifying one or more
of the nearest neighbors, that is, most similar exemplars, as
the basis for predicting an analogical response. However, as is
widely recognized, simple nearest neighbor models are empirically inadequate for predicting actual language behavior. Real
people often give responses that clearly are not traceable to the
most similar exemplar already known.
Daelemans and his colleagues have addressed this empirical shortcoming by determining from the database ahead of
time the overall significance of each individual variable to be
used in predicting outcomes. In this way, the distance of various neighbors from a particular input context can be adjusted
according to the importance of each variable to a particular task.
The researchers have developed a number of similarity or distance metrics for determining the significance of the individual
variables. Among the more important of these are information gain (IG), gain ratio (GR), the chi-square statistic (χ2), and
shared variance (SV). Depending on the particular data set and
its behavior, one gets different rates of correct prediction, but the
predictions are almost always better than providing no weighting
of the variables at all. Indeed, without such adjustments in the
weighting of features, the nearest neighbor approaches cannot
predict actual language behavior accurately. Unfortunately, one
cannot know in advance which measure of similarity will provide
the best results for a particular task and a particular data set, and
although the differences among them are usually not great, there
appears to be no principled basis for choosing one measure over
another.
One important contribution of the MBL studies to exemplar-based modeling theory is that reducing the size of the data set by
omitting very low frequency exemplars, redundant exemplars,
and very rare but exceptional exemplars actually reduces the
level of correct predictability. Thus, MBL researchers now recognize the need to construct large, complete data sets in order to
maximize overall predictability.

Analogical Modeling (AM)


Analogical modeling (Skousen 1989) is not a nearest neighbor
approach. While it includes nearest neighbors in its predictions,
it also regularly uses non-nearest neighbors to predict behavior.
In certain cases, the nearest neighbor model simply makes the
wrong prediction. (For an explicit example of where AM correctly rejects the nearest neighbors in predicting behavior, see
chapter 2 in Skousen, Lonsdale, and Parkinson 2002.)
AM is an explicit model of analogy. Non-neighbors can be
used, but only under a well-defined condition of homogeneity.
AM uses a simple decision principle to determine homogeneity: namely, never allow the analysis to increase uncertainty,
which means that no analogical analysis will ever allow any
unnecessary loss of information.
Unlike connectionist models, no training stage occurs in
AM, except in the trivial sense that one must collect exemplars
in order to make predictions. There is no setting of parameters
nor any prior determination of variable significance (see principles and parameters theory). The significance of any
combination of variables is always determined in terms of the
given context for which we seek a predicted outcome.
The resulting probability of using a particular exemplar
depends upon three factors: 1) proximity: the closer the exemplar to the given context, the greater its chances of being selected
as the analogical model; 2) gang effect: when a group of exemplars in the same space behave alike, the chances of one of those
exemplars being selected is multiplied; and 3) heterogeneity: the
chances of an exemplar being used is zero whenever there is any
intervening exemplar closer to the given context that behaves
differently (that is, has a different outcome).
Analogical modeling can be reinterpreted in terms of rules, as
follows: 1) Every possible true rule exists, and 2) the probability of
using a true rule is proportional to its frequency squared. A true
rule is a rule whose context is homogeneous in behavior. Despite
this equivalence, AM is not like regular rule approaches. Since all
of the true rules are said to exist, there will be overlapping rules,
redundant rules, and rules based on as little as one exemplar.
These equivalent true rules, when considered from the perspective of AM, are created on the fly; they are not stored somewhere, waiting to be used. In fact, until an outcome is selected,
all the true rules are constructed in a kind of superpositioning
and are processed simultaneously and by the same reversible
procedures. This approach allows AM to be implemented as a
system of quantum computation.
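These ideas can be sketched in a drastically simplified form. The sketch below enumerates every generalization ("supracontext") of the given context, keeps only those whose matching exemplars all behave alike (reducing homogeneity to determinism and omitting AM's fuller treatment of heterogeneity and imperfect memory), and weights each such rule by its frequency squared, per the reformulation above; the data are invented:

```python
from itertools import combinations
from collections import Counter

def am_predict(data, given):
    """Simplified analogical modeling: vote over all homogeneous
    generalizations of `given`, each weighted by frequency squared."""
    n = len(given)
    votes = Counter()
    for k in range(n + 1):
        for kept in combinations(range(n), k):
            # Exemplars matching `given` on every retained position;
            # the dropped positions act as wildcards.
            matches = [y for x, y in data
                       if all(x[i] == given[i] for i in kept)]
            if len(set(matches)) == 1:          # homogeneous: one behavior
                votes[matches[0]] += len(matches) ** 2
    total = sum(votes.values())
    return {outcome: v / total for outcome, v in votes.items()}

# Invented exemplars over three binary variables.
data = [((1, 1, 0), "A"), ((1, 0, 0), "A"),
        ((0, 1, 1), "B"), ((0, 0, 1), "B")]
probs = am_predict(data, (1, 1, 1))
assert probs["A"] > 0 and probs["B"] > 0   # non-nearest neighbors contribute
```

Note that the "B" exemplars, which differ from the given context in two of three positions, still receive a share of the prediction through the homogeneous supracontext defined by the third variable, illustrating how AM can recruit non-neighbors.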
Analogical modeling allows for imperfect memory. In fact, in
order to model the variability of language properly, it is necessary
to assume that access to exemplars is not always available and, in
general, can be considered a random phenomenon.
One important result from AM is that one cannot assume in
advance which variables are significant and thus ignore the others. Often, the potential value of a variable remains latent until
the model is required to predict the outcome for an appropriate given context. This kind of result can occur when gangs of
non-neighbors are called upon to predict the behavior of a given
context.
Royal Skousen and Steve Chandler
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Bybee, Joan. 2002. Phonological evidence for exemplar storage of multiword sequences. Studies in Second Language Acquisition 24: 215–21.
Daelemans, Walter, and Antal van den Bosch. 2005. Memory-Based
Language Processing. Cambridge: Cambridge University Press.
Goldinger, Stephen D. 1997. Words and voices: Perception and production in an episodic lexicon. In Talker Variability in Speech Processing,
ed. K. Johnson and J. W. Mullennix, 33–65. San Diego, CA: Academic
Press.
Nakisa, Ramin, and Ulrike Hahn. 1996. Where defaults don't help: The
case of the German plural system. In The Proceedings of the 18th
Annual Conference of the Cognitive Science Society, 177–82. Hillsdale,
NJ: Lawrence Erlbaum.
Nosofsky, Robert M. 1992. Exemplar-based approach to relating categorization, identification, and recognition. In Multidimensional
Models of Perception and Cognition, ed. F. G. Ashby, 363–93. Hillsdale,
NJ: Lawrence Erlbaum.
Pierrehumbert, Janet B. 2001. Exemplar dynamics: Word frequency, lenition and contrast. In Frequency Effects and Emergent Grammar, ed.
J. Bybee and P. Hopper, 1–19. Amsterdam: John Benjamins.
Skousen, Royal. 1989. Analogical Modeling of Language. Dordrecht:
Kluwer.
Skousen, Royal, Deryle Lonsdale, and Dilworth S. Parkinson. 2002.
Analogical Modeling: An Exemplar-Based Approach to Language.
Amsterdam: John Benjamins.
Tomasello, Michael. 2003. Constructing a Language: A Usage-Based
Theory of Language Acquisition. Cambridge, MA: Harvard University Press.

EXTINCTION OF LANGUAGES
An increasing number of books, scholarly articles, and media
reports have predicted that 50–90 percent of the world's approximately 6,900 languages may be at risk of extinction within
the next hundred years (see, for example, Krauss 1992; Nettle and
Romaine 2000; Crystal 2000; Abley 2003). This alarming figure does
not include dialects because no one knows exactly how many
languages and dialects there are, and there are no clear criteria for
distinguishing between language and dialect (see Wolfram and
Schilling-Estes 1998 for discussion of dialect endangerment).
Estimates of the number of languages in danger of extinction
vary depending on the criteria used to assess risk. UNESCOs
World Atlas of the World's Languages in Danger of Disappearing
(2001) estimates that 50 percent of languages may be in various
degrees of endangerment while Michael Krauss (1992) believes
that up to 90 percent may be threatened. More research is needed
in order to understand the role of various factors, such as size (i.e.,
number of speakers), status, function, and so on, in supporting
or not supporting languages. Most languages are unwritten, not
recognized officially, restricted to local community and home
functions, and spoken by very small groups of people (see diglossia and language policy). Languages are most obviously at
risk when they are no longer transmitted naturally to children in
the home by parents or other caretakers. UNESCO suggests that
languages being learned by fewer than 30 percent of the younger
generation may be at risk, yet there is very little information about
the number of languages no longer being transmitted.
Most projections of the scale of the problem rely on size as a
proxy for degree of endangerment, despite lack of agreement on
how many speakers are thought necessary for a language to be
viable. A large language could be endangered if the external pressures on it were great (e.g., Quechua with some millions of speakers) whereas a very small language could be perfectly safe so long
as the community was functional and the environment stable (e.g.,
Icelandic with fewer than 300,000). However, small languages can
disappear much more rapidly than large ones, and forces such as
the spread of farming, colonization, industrialization, and globalization have propelled a few languages all Eurasian in origin to
spread over the last few centuries (see modern world-system,
language and the; colonialism and language).
Manx, for instance, was spoken on the Isle of Man for about 1,500
years. Ned Maddrell, the last-known speaker, died in 1974. Not long
before his birth in 1877, nearly a third of the island (around 12,000
people) still spoke Manx. Today, all the remaining Celtic languages
such as Breton, Scots Gaelic, Irish Gaelic, and Welsh, and so on,
are threatened to various degrees by the spread of English and/or
French. Marie Smith Jones (d. 2008) was the last person who spoke
Eyak, one of Alaska's twenty or so native languages. Only two,
Siberian Yupik (spoken in two villages on St. Lawrence Island)
and Central Yupik (spoken in seventeen villages in southwestern
Alaska) are being transmitted to children as the first language of the
home. No children are learning any of the nearly hundred native
languages in what is now the state of California.
Based on estimates from the Ethnologue database compiled
by the Summer Institute of Linguistics (Gordon 2005), Table 1
displays the percentage of languages in different continents with
fewer than some number of speakers. The median number of
speakers for the languages of the world is only 5,000 to 6,000, and
nearly 85 percent of languages have fewer than 100,000. Languages
in Australia, the Pacific, and the Americas are mainly very small;
over 20 percent have fewer than 150 speakers, and nearly all have
fewer than 100,000, which is Krauss's (1992) threshold for viability.
Africa, Asia, and Europe, however, have a fair number of medium
sized languages with 100,000 to 1 million speakers, in addition to
some giant languages. Such languages are probably safer in the
short term at least. Even if the viability threshold is set at the lower
level of 10,000 speakers, 60 percent of all languages may already
be endangered. The situation is slightly better in Africa (33%), Asia
(53%), and Europe (30%), but much worse in North and South
America (78% and 77%) and Australia and the Pacific (93%).
The issue of language extinction cannot be separated from
people, their identities, their cultural heritage, and their rights.
Maintaining cultural and linguistic diversity is a matter of social
justice because distinctiveness in culture and language has
formed the basis for defining human identities (see ethnolinguistic identity; identity, language and). Because
language plays a crucial role in the acquisition, accumulation,

Table 1. Percentages of languages according to continent of origin
having fewer than indicated number of speakers

Continent/Region     < 150    < 1000    < 10,000    < 100,000    < 1 Million
Africa                 1.7       7.5       32.6        72.5          94.2
Asia                   5.5      21.4       52.8        81.0          93.8
Europe                 1.9       9.9       30.2        46.9          71.6
North America         22.6      41.6       77.8        96.3         100
Central America        6.1      12.1       36.4        89.4         100
South America         27.8      51.8       76.5        89.1          95.2
Australia/Pacific     22.9      60.4       93.8        99.5         100
World                 11.5      30.1       59.4        83.8          94.1

Source: From Nettle and Romaine (2000, 40).

maintenance, and transmission of human knowledge, the prospect of extinction raises critical issues about the survival of this
knowledge. Loss of linguistic diversity also threatens the scientific study of language by diminishing the range of structures
for constructing hypotheses about universals.
Suzanne Romaine
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Abley, Mark. 2003. Spoken Here: Travels among Threatened Languages.
Toronto: Random House of Canada.
Crystal, David. 2000. Language Death. Cambridge: Cambridge University
Press.
Gordon, Raymond G., Jr., ed. 2005. Ethnologue: Languages of the World.
15th ed. Dallas: SIL International. Available online at: http://www.
ethnologue.com/.
Krauss, Michael. 1992. The world's languages in crisis. Language
68: 4–10.
Nettle, Daniel, and Suzanne Romaine. 2000. Vanishing Voices: The
Extinction of the World's Languages. Oxford: Oxford University Press.
UNESCO. 2001. World Atlas of the World's Languages in Danger of
Disappearing. Paris: UNESCO.
Wolfram, Walt, and Natalie Schilling-Estes. 1998. Endangered dialects: A
neglected situation in the endangerment canon. Southwest Journal of
Linguistics 14: 117–31.

F
FAMILY RESEMBLANCE
Ludwig Wittgenstein's Philosophical Investigations ([1953] 1958,
hereafter PI) is one of the most influential and well-known
texts of twentieth-century philosophy. A number of terms
have been introduced into the philosophical lexicon by this
work. Alongside language-game, the related term family

resemblance is the most prominent. However, it is crucial to be


clear about the nature and purpose of such terms in PI; they are
offered by Wittgenstein as merely purpose-relative, perspicuous
(ibid., §122), ways of presenting certain aspects of our language
use, proposed neither as elements of a theory of language nor as
methodological devices to be applied mechanically elsewhere.
Understanding the role and purpose of such terms is dependent
on understanding Wittgensteins vision of philosophy.
Wittgensteins vision of philosophy was that it is a therapeutic
activity, the purpose of which is the dissolution of philosophical problems through the achievement of clarity. Such clarity is
not to be achieved by the production of philosophical theses, for
philosophical problems and the consequent desire to produce
theories arise through our misunderstanding of the logic of our
language (1922, 3). Therapeutic dialogue brings to the fore unacknowledged or unconscious commitments to certain pictures
of the way things must be; once conscious, once acknowledged,
such commitments lose their thought-constraining grip.
One thought-constraining commitment that Wittgenstein
considered pervasive in philosophy was the commitment to
language as essentially representative: that words name objects,
the objects being their meaning ([1953] 1958, §1); as he wrote
on the first page of the Blue Book (BB): "we are up against one
of the great sources of philosophical bewilderment: a substantive makes us look for a thing that corresponds to it" (1958, 1; but
also see [1953] 1958, §§40–45 and 89–90). It is not that such
a view is explicitly espoused by philosophers but, rather, that it
often operates as an unconscious or unacknowledged picture
of the way language must operate. Much of the opening of PI is
an attempt to bring to consciousness this hitherto unconscious
commitment, so that we might recognize it as nonobligatory; in
recognizing it as such, the thought-constraining grip of this picture will be loosened, and the philosophical problems and myths
to which it might lead, as well as the consequent desire to propound theories, will dissipate.
One strategy Wittgenstein employs is to suggest that we might
look at the use of a word when one wants to know its meaning.
This is designed to wean one away from the desire to look for (or
theorize into existence) the object for which one assumes a substantive must stand. As a prophylactic, he suggests that we might
see language use and the practices in which it is embedded as a
game ([1953] 1958, §7). Identification of the language-game being
played such as the language-game of describing, that of ordering, that of predicting, that of naming, and so on is suggested
as a procedure for identifying the use and thus the meaning of an
utterance. Having suggested this strategy, Wittgenstein responds
to an anticipated objection (ibid., §65) that in talking of language
in terms of different language-games, he leaves himself open to
the charge of avoiding the question: What is the essence of a
language-game and thus of language as a whole? His response
to this anticipated objection is to suggest that there is nothing, of
necessity, common to all the activities that he is calling language-games, as there is nothing, of necessity, common to all uses of the
word game. Rather, when we look and see, different uses of
the word game do not have anything common to all, but rather
a whole series of overlapping similarities and relationships (ibid.,
§66). The notion of a family resemblance is a perspicuous way of
characterizing such similarities ([1953] 1958, §67).

The suggestion of family resemblance as a way of understanding the similarities between different language-games or different
uses of words helps bring to consciousness our unacknowledged
enthrallment to a picture of language-as-necessarily-having-an-essence, that is, a picture of something being essential to
all instances of language use and to all uses of a word (such as
game). Family resemblance, as with all the terms introduced
by Wittgenstein in PI, serves, and should stay subservient to,
the therapeutic task (pace authors such as R. Bambrough and E.
Rosch and C. B. Mervis). Methodological readings and employments of the term, as in Bambrough (1960) and in Rosch and
Mervis (1975) (see prototype), therefore, fundamentally misunderstand the language-game in which family resemblance has
its home in Wittgensteins work.
Phil Hutchinson
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Baker, Gordon. 2004. Wittgensteins Method: Neglected Aspects.
Oxford: Blackwell. Hugely important text on Wittgensteins vision of
philosophy.
Bambrough, R. 1960. Universals and family resemblances. Proceedings
of the Aristotelian Society 61: 207–22.
Rosch, E., and C. B. Mervis. 1975. Family resemblances: Studies in the
internal structure of categories. Cognitive Psychology 7: 573–605.
Wittgenstein, Ludwig. 1922. Tractatus Logico-Philosophicus. London:
Routledge.
———. [1953] 1958. Philosophical Investigations. Oxford: Blackwell.
———. 1958. The Blue and Brown Books: Preliminary Studies for the
Philosophical Investigations. Oxford: Blackwell.

FEATURE ANALYSIS
Feature analysis extends to the cognitive domain the interest of
early Greek philosophers such as Democritus and Plato, in identifying the fundamental building blocks (atoms) of the physical
world (Greenberg 1967). Cognitive feature analysis starts from
the observation that animals appear to organize their perceptual
worlds in terms of finite sets of discrete abstract elements rather
than simply storing and manipulating unanalyzed streams of
continuous gradient stimuli. Evidence for the existence of discrete representational features governed by abstract combinatorial principles (as opposed to concrete [e.g., phonetic]
categories emerging from lower-level gradient processes) comes
primarily from areas related to phonology but also from other
domains of human cognition, such as vision (Pylyshyn, Blaser,
and Holcombe 2000); computation of object similarity (Tversky
1977), induction (Sloman 1993), typicality, asymmetry, diversity (Heit 1997), speech perception (Stevens 2002; Poeppel,
Idsardi, and van Wassenhove 2007), sign language phonologies, morphology, semantics, alphabet processing, and
vision and object perception (see Morgan 2003 and Vaux 2008
for references).
First explored formally by A. M. Bell (1867) for alphabets,
Alfred Kroeber (1909) for kinship systems, and A. G. Bell (1911)
for speech, feature theory was most famously developed for
phonology by Roman Jakobson, Serge Karcevsky, and Nikolaj
Trubetzkoy (1928 and [1939] 1958), with significant extensions
by Jakobson, Gunnar Fant, and Morris Halle ([1952] 1963),
Jakobson and Halle (1956), and Noam Chomsky and Halle
(1968). (It should be noted, though, that before the 1960s most
phonologists maintained that features and phonemes were not
necessarily psychological entities; cf. Sapir 1929.) The distinctive
feature theory developed in these works, which maintains that
phonemes are composed of bundles of abstract features, such
as [round], [nasal], and [high], stands in opposition to attempts
by connectionists and most phoneticians to deny the existence of features and other higher-order symbolic categories
in human linguistic cognition (cf. Shattuck-Hufnagel and Klatt
1979; Soli and Arabie 1979; Lisker 1985; and much work in articulatory phonology [Browman and Goldstein 1989]).

Evidence for Features


Auditory illusions such as the phonemic restoration effect
(Warren and Obusek 1971) crucially involve reference to higher-level phonological representations. Evidence for these abstract
phonological representations being composed of features comes
from a wide variety of sources (see Vaux 2008 for a review of the
literature). Most frequently cited by linguists (cf. Tatham 1999;
Poeppel, Idsardi, and van Wassenhove 2007) is the patterning of
phonemes in natural classes with regard to synchronic alternations and phonotactics, diachronic sound changes, and phenomena of language acquisition and loss (see synchrony and
diachrony).
Class behavior of this sort, the reasoning goes, is efficiently
captured by assuming that the linguistic processes in question
operate on features rather than phonemes. For example, the
pin-pen merger of /ɪ/ and /ɛ/ before nasal consonants in some
varieties of English (Labov, Ash, and Boberg 2006) makes reference to the distinctive feature [+nasal] rather than the individual
nasal phonemes of English, {m n ŋ}. Were the latter the case, we
would incorrectly predict the existence of similar neutralization
rules before arbitrary collections of segments such as {m s h}.
By requiring that phonological generalizations refer to feature
sets, on the other hand, we bring the inventory of possible phonological rules significantly closer to what is actually attested.
(But see Flemming 2005 for a critique of this reasoning. For the
pin-pen merger in particular, a phonetician might respond that
the restriction to nasals can be explained phonetically, without
recourse to phonological features, by the fact that nasalization
renders formant structure less prominent and hence more confusable. Some phonologists, such as Nick Clements, would reply
that phonetics underdetermines the attested range of phonological patterns.)
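The reasoning about natural classes can be made concrete with a small sketch. The feature assignments below are simplified, hypothetical values chosen for illustration (not a full analysis of English), with "ng" standing in for the velar nasal /ŋ/:

```python
# Toy illustration: phonemes as bundles of binary distinctive features.
# Feature values here are simplified assumptions for the example only.
PHONEMES = {
    "m":  {"nasal": True,  "voice": True,  "continuant": False},
    "n":  {"nasal": True,  "voice": True,  "continuant": False},
    "ng": {"nasal": True,  "voice": True,  "continuant": False},
    "s":  {"nasal": False, "voice": False, "continuant": True},
    "h":  {"nasal": False, "voice": False, "continuant": True},
    "b":  {"nasal": False, "voice": True,  "continuant": False},
}

def natural_class(**features):
    """Return the set of phonemes matching every given feature value."""
    return {p for p, fs in PHONEMES.items()
            if all(fs.get(k) == v for k, v in features.items())}

# A rule stated over [+nasal] picks out exactly the nasals ...
assert natural_class(nasal=True) == {"m", "n", "ng"}
# ... whereas an arbitrary list like {m, s, h} shares no single feature
# value and so cannot be the target of one feature-based rule.
```

A single specification such as [+nasal] thus defines a coherent class, while a set like {m s h} does not, which is the sense in which feature-based rules restrict the space of possible phonological processes.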
Acquisition studies also support the idea that phonological generalizations target feature-based natural classes rather
than arbitrary lists of segments. Jenny Saffran and Erik Thiessen
(2003), for instance, show that infants can extract feature-based
generalizations better than other generalizations, and Anne
Pycha and colleagues (2003) demonstrate that adult learners
acquire rules that manipulate a single feature faster and more
accurately than rules that manipulate two features. D. Swingley
and R. Aslin (2002) and Christopher Fennell and Janet Werker
(2003) show that humans are already sensitive to phonological
feature distinctions in the representation of familiar words by the
age of 14 months (see also speech perception in infants
and phonology, acquisition of).

Feature Analysis
Similar feature-based perceptual distinctions have been
found to take place in the auditory cortex of adults (Phillips,
Pellathy, and Marantz 2000) and in lexical access (Marslen-Wilson and Warren 1994) and masked phoneme identification
(Miller and Nicely 1955). The fact that humans analyze speech
signals in terms of distinctive features may be connected to the
quantal nature of auditory responses to sound, such as responses
to acoustic discontinuities and closely spaced spectral prominences (Chistovich and Lublinskaya 1979; Delgutte and Kiang
1984; Stevens 1972, 1989, 2002; Clements and Ridouane 2006).
The same features have been implicated in speech production as well, notably in studies of speech errors (Fromkin 1973;
Goldrick 2004).
In addition to the natural class patterns discussed, phonological evidence for distinctive feature theory comes from considerations of economy: Languages appear to organize their feature
systems so as to minimize the number of features employed to
distinguish among both consonants and vowels (Archangeli
and Pulleyblank 1994; Clements 2008; Poeppel, Idsardi, and van
Wassenhove 2007).

Content of Feature Theories


Feature theories tend not to strive for maximal economy, though,
preferring to balance their feature inventories in accordance with
the following principles (from Tatham 1999):
- The inventory should be able to characterize all contrasting segments in human languages;
- It should be able to capture natural classes in a clear fashion;
- It should be transparent with regard to phonetic correlates.
As a rule, Jakobson prioritized simplicity and generality and therefore had fewer features (e.g., 12–15 in Jakobson, Fant, and Halle [1952] 1963); Trubetzkoy ([1939] 1958) and Chomsky and Halle (1968) proposed much larger inventories, being more interested in capturing phonetic detail and phonological generalizations, respectively.
Phonologists also differ as to whether features are:
- binary or equipollent (+/−; Jakobson, Fant, and Halle [1952] 1963; Chomsky and Halle 1968);
- privative/unary (present vs. absent, as with Trubetzkoy's [1939] 1958 analysis of bilateral oral/nasal vowel contrasts; cf. also dependency phonology [Anderson and Ewen 1987], modified contrastive specification [Avery and Rice 1989], and, with gestures instead of features, articulatory phonology [Browman and Goldstein 1989]);
- ternary (+/−/absent, as in theories that use archiphonemes [Trubetzkoy [1939] 1958] or underspecification [Dresher, Piggott, and Rice 1994]);
- multivalued (= the gradual oppositions of Trubetzkoy [1939] 1958; cf. also Ladefoged 1971; Lindau 1975; Williamson 1977);
- variable (i.e., allowing different numbers of values for a given feature in different contexts; Trubetzkoy [1939] 1958); or
- a combination of privative features and binary features (Sagey 1986; Steriade 1995).

Though the lion's share of phonologists currently appear to prefer privative features, there are reasons to believe that features can be ternary (q.v. Kim 2002). One needs both [+] and [−]
specifications, for instance, to account efficiently for exchange
rules such as height inversion in Brussels Flemish (Zonneveld
1976; Fitzpatrick, Nevins, and Vaux 2004). A third, un(der)specified value appears necessary to derive i) ternary patterns such
as Turkish voicing alternations (Inkelas, Orgun, and Zoll 1997),
ii) phonetic interpolation effects (cf. Keating 1988; Cohn 1990;
Anderson 1999), iii) permanent underspecification (cf. Odden
2005 on tone in consonants), and iv) phonetic vacillation in
un(der)specified segments (Vaux and Samuels 2005; Hale and
Kissock 2007).
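One way to picture the three-valued option is as a feature slot that may be +, −, or left unspecified, with context filling in the missing value. The sketch below is purely illustrative; the segment labels and the fill-in rule are assumptions loosely modeled on the ternary voicing pattern mentioned above, not an analysis of any particular language:

```python
# Sketch of ternary feature values: +, -, or un(der)specified (None).
from typing import Optional

def voice_value(segment: dict) -> Optional[bool]:
    """A segment's voicing, or None if un(der)specified."""
    return segment.get("voice")

d = {"voice": True}    # lexically [+voice]: always voiced
t = {"voice": False}   # lexically [-voice]: always voiceless
T = {}                 # underspecified: voicing supplied by context

def realize(segment: dict, context_voice: bool) -> bool:
    """Specified segments keep their value; unspecified ones alternate."""
    v = voice_value(segment)
    return context_voice if v is None else v

assert realize(d, context_voice=False) is True   # resists context
assert realize(t, context_voice=True) is False   # resists context
assert realize(T, context_voice=True) is True    # alternates
```

The point of the sketch is simply that two binary values cannot distinguish three lexical behaviors (always voiced, always voiceless, alternating); a third, unspecified value can.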

Issues in Feature Theory


In terms of the substance of the features themselves, a number of outstanding issues remain:
- Are the features pure abstractions (as in classical structuralist phonology and Hale and Reiss 2000; cf. Meillet 1903 on the content of phonological reconstructions) or phonetically based (Jakobson, Fant, and Halle [1952] 1963 et seq.)? David Odden (1991, 364) observes on this point that a theory in which phonological features [i.e., pure abstractions] are arbitrarily mapped onto phonetic features is more powerful than one in which phonological and phonetic features are the same, since the former properly includes the latter.
- If the content of features is phonetic, is it acoustic (Jakobson, Fant, and Halle [1952] 1963; Flemming 1995), articulatory (Chomsky and Halle 1968; Vaux 2008), or both interchangeably (Stevens 2003)? An articulatory basis for features makes sense in light of increasingly robust evidence (pace Ohala 1996) that humans cognitively model relevant actions and events in terms of the physical activities necessary to execute them (motor theory; Ribot 1890; Taylor 1962; Liberman et al. 1963; Tettamanti et al. 2005). The claims of motor theory have recently been bolstered by imaging studies of the activity of mirror neurons in the premotor area of the monkey brain, which are activated by both execution and observation of manual and oral actions by both first and third person agents (Fogassi and Ferrari 2004; see also mirror systems, imitation, and language).
- Are features the smallest units of discrete linguistic representations, or are they in turn composed of smaller elements, either muscular (Halle 1983a) or acoustic (Kingston 2003)?
- Are features universal (Chomsky and Halle 1968; Stevens 1972; Kuhl 2000), language-specific (Pulleyblank 2001; Pierrehumbert 2003), or drawn from a universal inventory but only on the basis of observed phonological contrasts (the Toronto School [e.g., Dresher, Piggott, and Rice 1994])?
- Are features organized in hierarchical trees (Clements 1985; Halle, Vaux, and Wolfe 2000), classes (Padgett 2002), or not at all (Reiss 2003a)?
- Do vowels employ some features that consonants lack (Trubetzkoy [1939] 1958; Clements 1991)?
- Do features encode markedness relations? Is it the case, in other words, that + values of features are marked and − values are unmarked? This seems to be the position taken
by Chomsky and Halle (1968). The Toronto School employs
essentially the same system, but with privative rather than
binary features; as a result, the more features a segment contains, the greater its degree of markedness. (See Steriade 1995,
Reiss 2003b, and Clements 2008 for further discussion.)
Bert Vaux
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Anderson, John, and Colin Ewen. 1987. Principles of Dependency
Phonology. Cambridge: Cambridge University Press.
Anderson, Stephen. 1999. The nature of phonetic representations. Talk
given at Keio University. Available online at: http://bloch.ling.yale.
edu/Public/Phonetic_Reps.pdf.
Archangeli, Diana, and Douglas Pulleyblank. 1994. Grounded Phonology.
Cambridge, MA: MIT Press.
Avery, Peter, and Keren Rice. 1989. Segment structure and coronal
underspecification. Phonology 6: 179–200.
Baltaxe, Christiane. 1978. Foundations of Distinctive Feature Theory.
Baltimore: University Park Press.
Bell, Alexander Graham. 1911. The Mechanics of Speech. New York and
London: Funk and Wagnalls.
Bell, Alexander Melville. 1867. Visible Speech: The Science of Universal
Alphabetics. London: Simkin, Marshall.
Browman, Catherine, and Louis Goldstein. 1989. Articulatory gestures
as phonological units. Phonology 6: 201–51.
Chistovich, L., and V. Lublinskaya. 1979. The center of gravity effect in vowel spectra and critical distance between the formants: Psychoacoustical study of the perception of vowel-like stimuli.
Hearing Research 1: 185–95.
Chomsky, Noam, and Morris Halle. 1968. The Sound Pattern of English.
New York: Harper and Row. See especially Chapter 7.
Clements, G. N. 1985. The geometry of phonological features. Phonology
Yearbook 2: 223–52.
. 1991. Vowel height assimilation in Bantu languages. Working
Papers of the Cornell Phonetics Laboratory 5: 77–123. Ithaca, NY: Cornell
University.
. 2008. The role of features in phonological inventories.
In Contemporary Views on Architecture and Representations in
Phonological Theory, ed. Eric Raimy and Charles Cairns. Cambridge,
MA: MIT Press.
Clements, G. N., and Rachid Ridouane. 2006. Quantal phonetics and distinctive features: A review. Proceedings of ISCA Tutorial and Research
Workshop on Experimental Linguistics, 28–30 August 2006, ed. Antonis
Botinis, 17–24. Athens: University of Athens Press.
Cohn, Abigail. 1990. Phonetic and Phonological Rules of Nasalization.
Ph.D. diss., University of California at Los Angeles.
Delgutte, B., and N. Kiang. 1984. Speech coding in the auditory nerve: IV.
Sounds with consonant-like dynamic characteristics. Journal of the
Acoustical Society of America 75: 897–907.
Dresher, B. E., G. L. Piggott, and K. Rice. 1994. Contrast in phonology: Overview. Toronto Working Papers in Linguistics 13.1: iii-xvii.
Fennell, Christopher, and Janet Werker. 2003. Early word learners' ability to access phonetic detail in well-known words. Language and
Speech 46.2/3: 245–64.
Fitzpatrick, Justin, Andrew Nevins, and Bert Vaux. 2004. Exchange rules
and feature-value variables. Presented at The 3rd North American
Phonology Conference, Concordia University, Montréal, Québec.
Flemming, Edward. 1995. Auditory representations in phonology.
Ph.D. diss., University of California at Los Angeles.
. 2005. Deriving natural classes in phonology. Lingua
115: 287–309.
Fogassi, Leonardo, and Pier Francesco Ferrari. 2004. Mirror neurons, gestures and language evolution. Interaction Studies 5.3: 345–63.
Fromkin, Victoria. 1973. Speech Errors as Linguistic Evidence. The
Hague: Mouton.
Goldrick, Matthew. 2004. Phonological features and phonotactic constraints in speech production. Journal of Memory and Language
51: 586–603.
Greenberg, Joseph. 1967. The first (and perhaps only) non-linguistic distinctive feature analysis. Word 23.1: 214–20.
Hale, Mark, and Madelyn Kissock. 2007. The phonetics-phonology
interface and the acquisition of perseverant underspecification. In
The Oxford Handbook of Linguistic Interfaces, ed. Gillian Ramchand
and Charles Reiss, 81–102. Oxford: Oxford University Press.
Hale, Mark, and Charles Reiss. 2000. Phonology as cognition. In
Phonological Knowledge: Conceptual and Empirical Foundations,
ed. N. Burton-Roberts, Philip Carr, and Gerard Docherty, 161–84.
Oxford: Oxford University Press.
Hall, T. Alan, ed. 2001. Distinctive Feature Theory. Berlin: Mouton de
Gruyter.
Halle, Morris. 1983a. On distinctive features and their articulatory
implementation. Natural Language & Linguistic Theory 1: 91–105.
. 1983b. On the origins of the distinctive features. International
Journal of Slavic Linguistics and Poetics 27: 77–86.
Halle, Morris, Bert Vaux, and Andrew Wolfe. 2000. On feature spreading and the representation of place of articulation. Linguistic Inquiry
31: 387–444.
Heit, Evan. 1997. Features of similarity and category-based induction.
In Proceedings of the Interdisciplinary Workshop on Categorization and
Similarity, University of Edinburgh, 115–21.
Inkelas, Sharon, Orhan Orgun, and Cheryl Zoll. 1997. Exceptions and
static phonological patterns: Cophonologies vs. prespecification.
In Derivations and Constraints in Phonology, ed. Iggy Roca, 393–418.
Oxford: Oxford University Press.
Jakobson, Roman, Gunnar Fant, and Morris Halle. [1952] 1963.
Preliminaries to Speech Analysis. Cambridge, MA: MIT Press.
Jakobson, Roman, and Morris Halle. 1956. Fundamentals of Language.
The Hague: Mouton.
Jakobson, Roman, Serge Karcevsky, and Nikolaj Trubetzkoy. [1928] 1971.
Quelles sont les méthodes les mieux appropriées à un exposé complet et pratique d'une langue quelconque. In Selected Writings (2d
expanded ed.), ed. Jakobson, 3–6. The Hague: Mouton.
Keating, Patricia. 1988. Underspecification in phonetics. Phonology
5: 275–92.
Kim, Yuni. 2002. Phonological features: Privative or equipollent? B.A.
thesis, Harvard University.
Kingston, John. 2003. Learning foreign vowels. Language and Speech
46: 295–349.
Kroeber, Alfred. 1909. Classificatory systems of relationship. Journal
of the Royal Anthropological Institute of Great Britain and Ireland
39: 77–84.
Kuhl, Patricia. 2000. Language, mind, and brain: Experience alters perception. In The New Cognitive Neurosciences (2d ed.), ed. Michael
Gazzaniga, 99–115. Cambridge, MA: MIT Press.
Labov, William, Sharon Ash, and Charles Boberg. 2006. The Atlas of North
American English. Berlin: Mouton de Gruyter.
Ladefoged, Peter. 1971. Preliminaries to Linguistic Phonetics.
Chicago: University of Chicago Press.
Liberman, Alvin, F. Cooper, K. Harris, and P. MacNeilage. 1963. A
motor theory of speech perception. In Proceedings of the Symposium
on Speech Communication Seminar, Royal Institute of Technology,
Stockholm. Paper D3, Vol. 2.
Lindau, Mona. 1975. [Features] for vowels. UCLA Working Papers in
Phonetics 30. Los Angeles: Department of Linguistics, UCLA.

Lisker, Leigh. 1985. The pursuit of invariance in speech signals. Journal
of the Acoustical Society of America 77: 1199–1202.
Marslen-Wilson, William, and Paul Warren. 1994. Levels of perceptual
representation and process in lexical access: Words, phonemes, and
features. Psychological Review 101: 653–75.
Meillet, Antoine. 1903. Introduction à l'étude comparative des langues
indo-européennes. Paris: Hachette.
Miller, George, and Patricia Nicely. 1955. An analysis of perceptual confusions among some English consonants. Journal of the Acoustical
Society of America 27.2: 338–52.
Morgan, Michael. 2003. Feature analysis. In The Handbook of Brain Theory
and Neural Networks, ed. M. Arbib, 44–47. Cambridge, MA: MIT Press.
Odden, David. 1991. Review of generative and non-linear phonology, by
Jacques Durand. Language 67.2: 363–7.
. 2005. Introducing Phonology. Cambridge: Cambridge University
Press. Chapter 6, Feature theory, is particularly relevant.
Ohala, John. 1996. Speech perception is hearing sounds, not tongues.
Journal of the Acoustical Society of America 99: 1718–25.
Padgett, Jaye. 2002. Feature classes in phonology. Language
78.1: 81–110.
Phillips, Colin, Tom Pellathy, and Alec Marantz. 2000. Phonological
feature representations in auditory cortex. Manuscript, University
of Delaware. Available online at http://www.ling.udel.edu/colin/
research/papers/feature_mmf.pdf.
Pierrehumbert, Janet. 2003. Phonetic diversity, statistical learning, and
acquisition of phonology. Language and Speech 46: 115–54.
Poeppel, David, William Idsardi, and V. van Wassenhove. 2007. Speech
perception at the interface of neurobiology and linguistics. In
Philosophical Transactions of the Royal Society B 363: 1071–86.
Pulleyblank, Douglas. 2001. Defining features and constraints in
terms of complex systems: Is UG too complex? Paper presented at
the Workshop on Early Phonological Acquisition, Carry-le-Rouet,
Marseilles, France.
Pycha, Anne, Pawel Nowak, Eurie Shin, and Ryan Shosted. 2003.
Phonological rule-learning and its implications for a theory of vowel
harmony. In Proceedings of WCCFL 22, ed. M. Tsujimura and G.
Garding, 101–14. Somerville, MA: Cascadilla Press.
Pylyshyn, Zenon, E. Blaser, and A. Holcombe. 2000. Tracking an object
through feature space. Nature 408: 196–9.
Reiss, Charles. 2003a. Quantification in structural descriptions: Attested
and unattested patterns. Linguistic Review 20: 305–38.
. 2003b. Accepting markedlessness: How non-phonological
symbolic computation shapes trends in attested phonological systems. In Proceedings of the 29th Annual Meeting of the Berkeley
Linguistics Society, Parasession on Phonetic Sources of Phonological
Patterns: Synchronic and Diachronic Explanations, 569–81.
Ribot, Théodule. 1890. Psychologie de l'attention. Paris: Alcan.
Saffran, Jenny, and Erik Thiessen. 2003. Pattern induction by infant language learners. Developmental Psychology 39: 484–94.
Sagey, Elizabeth. 1986. The representation of features and relations
in non-linear phonology. Ph.D. diss., Massachusetts Institute of
Technology.
Sapir, Edward. 1929. The status of linguistics as a science. Language
5.4: 207–14.
Shattuck-Hufnagel, S., and Dennis Klatt. 1979. The limited use of distinctive features and markedness in speech production: Evidence from
speech error data. Journal of Verbal Learning and Verbal Behavior
18: 41–55.
Sloman, S. 1993. Feature-based induction. Cognitive Psychology
25: 231–80.
Soli, S., and P. Arabie. 1979. Auditory versus phonetic accounts of
observed confusions between consonant phonemes. Journal of the
Acoustical Society of America 66.1: 46–59.

Steriade, Donca. 1995. Underspecification and markedness. In
Handbook of Phonological Theory, ed. John Goldsmith, 114–74.
Oxford: Blackwell.
Stevens, Kenneth. 1972. The quantal nature of speech: Evidence from
articulatory-acoustic data. In Human Communication: A Unified
View, ed. P. Denes and E. David, 51–66. New York: McGraw-Hill.
. 1989. On the quantal nature of speech. Journal of Phonetics
17: 3–46.
. 2002. Toward a model for lexical access based on acoustic landmarks and distinctive features. Journal of the Acoustical Society of
America 111.4: 1872–91.
. 2003. Acoustic and perceptual evidence for universal phonological features. In Proceedings of the XVth International Congress of
Phonetic Sciences, Barcelona, 33–8.
Swingley, D., and R. Aslin. 2002. Lexical neighborhoods and the word-form representations of 14-month-olds. Psychological Science
13: 480–4.
Tatham, Mark. 1999. Distinctive feature theory. Available online at
http://www.essex.ac.uk/speech/teaching-01/documents/df-theory.
html.
Taylor, James. 1962. The Behavioral Theory of Perception. New Haven,
CT: Yale University Press.
Tettamanti, M., G. Buccino, M. Saccuman, V. Gallese, M. Danna, P.
Scifo, F. Fazio, G. Rizzolatti, S. Cappa, and D. Perani. 2005. Listening
to action-related sentences activates fronto-parietal motor circuits.
Journal of Cognitive Neuroscience 17.2: 273–81.
Trubetzkoy, Nikolay. [1939] 1958. Grundzüge der Phonologie.
Göttingen: Vandenhoeck and Ruprecht.
Tversky, Amos. 1977. Features of similarity. Psychological Review
84: 327–52.
Vaux, Bert. 2008. The role of features in a symbolic theory of phonology. In Contemporary Views on Architecture and Representations in
Phonological Theory, ed. Charles Cairns and Eric Raimy. Cambridge,
MA: MIT Press.
Vaux, Bert, and Bridget Samuels. 2005. Laryngeal markedness and aspiration. Phonology 22: 395–436.
Warren, Richard, and C. Obusek. 1971. Speech perception and phonemic restorations. Perception and Psychophysics 9: 358–62.
Williamson, Kay. 1977. Multivalued features for consonants. Language
53.4: 843–71.
Zonneveld, Wim. 1976. A phonological exchange rule in Flemish
Brussels. Linguistic Analysis 2: 109–14.

FELICITY CONDITIONS
The distinction between felicity conditions and truth conditions was introduced by J. L. Austin as the basis for distinguishing between two types of utterances, performative
and constative, respectively. As such, it is central to the view
within pragmatics that utterances perform actions, an idea
that gave rise to the theory of speech-acts. Constatives such
as "Snow is white" have truth conditions, that is, they assign truth
or falsity to the proposition expressed ("Snow is white" is true if
and only if snow is white). Performatives, on the other hand,
have felicity conditions (i.e., pragmatic conditions that must
be met if the utterance is to achieve its intended goal) and will
only function felicitously or happily if these conditions of use
are met.
So, for an utterance such as "I hereby pronounce you husband
and wife" to be felicitous, the following conditions must apply
(Austin 1962, 15):

(1) a) There must exist an accepted conventional procedure
having a certain conventional effect (in our example, changing the wording to "I reckon you're man and wife" would make
it infelicitous); and b) the particular persons and circumstances in a given case must be appropriate for the invocation
of the particular procedure invoked.
(2) The procedure should be executed by all participants correctly and completely. Stopping halfway through a marriage
ceremony, for example, would result in its being null and void.
(3) When the participants, as is often the case, are required
to have specific thoughts or feelings or when any subsequent
conduct is specified, the participants involved should actually (intend to) have these requisite thoughts and feelings and
conduct themselves appropriately.
Austin later abandoned the distinction between constatives and
performatives in favor of a model in which all utterances have
felicity conditions.
Acts that fail to meet criteria 1 and 2 are labeled "misfires," and
those that fail to meet criterion 3 "abuses." This is an important
distinction: Misfires simply result in the intended act not being
realized because of some deficiency in the procedure (e.g., the
speaker not having the requisite authority), whereas abuses
involve a conscious decision on the part of either the speaker (S)
or the hearer (H) to ignore or manipulate the action (e.g., when S
makes a promise that he/she does not intend to keep).
It should be noted that for Austin, the so-called uptake of the
action is an integral part of its being felicitous. So, for instance,
when S offers H something, as in "Have a cookie," not only must S
have the requisite intention of wanting to do something for H (as
stipulated in condition 3a), but H must actually act accordingly,
that is, perform the appropriate uptake and respond to the offer
by either accepting or refusing. This notion is closely related to
the perlocution associated with the action, which is thus an
integral part of performing a speech-act happily, an aspect often
neglected by later speech-act theorists.
J. R. Searle (1969) used various classes of felicity conditions
as the main criterion for distinguishing between different types
of illocutionary acts. First of all, there are restrictions on the
propositional content expressed through the speech-act, which,
depending on the type of illocutionary force involved,
must be of a certain type, such as, for instance, a future act by S
(promises) or a past act by H (complaints). Secondly, there are
preparatory conditions associated with speech-act types, that
is, specific social preconditions; for a command to be felicitous,
for instance, it is a precondition that S is in a position of authority over H. Otherwise, the command will be ineffective. Thirdly,
Searle's sincerity condition refers to the fact that S is supposed
to be sincere when performing a speech-act in order for it to be
felicitous, which crucially depends on the speaker having the
appropriate beliefs or feelings (as Austin had stated already). So,
when uttering a promise, S must sincerely intend to carry out
the future act. The fourth and most crucial criterion Searle calls
the essential condition, which summarizes the appropriate illocutionary force of the utterance. So, for a promise, the speaker
assumes an obligation to commit to the carrying out of the action
expressed in the promise. It is the essential condition that determines all the others: If something is a request, that is, an attempt
to get H to do something, it follows that this must be a future act
(propositional content), that S really wants H to carry out this
future act (sincerity condition), and that H would not have done
so spontaneously (preparatory condition). In addition to these
conditions, there are general felicity conditions applying to any
speech-act: Both S and H must be able to speak and understand
the language, they must have no pathological conditions, and
the act should not occur in a parasitical context, such as a joke
or a play.
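Searle's conditions for a promise can be pictured as a checklist an utterance must pass. The sketch below is a loose illustration only: the field names, and the reduction of each condition to a Boolean, are simplifying assumptions rather than Searle's own formalism.

```python
# Minimal sketch of Searle-style felicity conditions for a promise.
# Field names and the Boolean reduction are illustrative assumptions.
from dataclasses import dataclass

@dataclass
class Utterance:
    act_type: str                  # e.g. "promise"
    future_act_by_s: bool          # propositional content condition
    h_wants_act: bool              # preparatory condition
    s_sincere: bool                # sincerity condition
    s_undertakes_obligation: bool  # essential condition

def felicitous_promise(u: Utterance) -> bool:
    """A promise is felicitous only if every condition holds."""
    return (u.act_type == "promise"
            and u.future_act_by_s
            and u.h_wants_act
            and u.s_sincere
            and u.s_undertakes_obligation)

ok = Utterance("promise", True, True, True, True)
insincere = Utterance("promise", True, True, False, True)  # an "abuse"
assert felicitous_promise(ok)
assert not felicitous_promise(insincere)
```

The point of the sketch is the dependency noted in the text: the essential condition fixes the act type, and the other conditions follow from it, so a failure of any one renders the promise infelicitous.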
On the basis of these criteria, Searle (1975) distinguishes five
classes of acts: assertives or representatives (which describe
a state of affairs), directives (which attempt to get H to do
something), commissives (which commit S to some future
course of action), expressives (which express Ss psychological
state), and declarations (which effect institutional changes). As
S. C. Levinson points out, however, these classes are not really
built in any systematic way on felicity conditions (1983, 240).
The question remains how hearers determine the speaker's
intended illocutionary force of the utterance in the absence of
any linguistic devices that mark it (such as performative verbs),
as is the case in indirect speech-acts.
Ronald Geluykens
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Austin, J. L. 1962. How to Do Things with Words. Oxford: Oxford University
Press.
Levinson, S. C. 1983. Pragmatics. Cambridge: Cambridge University
Press.
Searle, J. R. 1969. Speech Acts: An Essay in the Philosophy of Language.
Cambridge: Cambridge University Press.
. 1975. A taxonomy of illocutionary acts. In Language, Mind,
and Knowledge, ed. K. Gunderson, 344–69. Minneapolis: University of
Minnesota Press.

FIELD (BOURDIEU)
Field is one of Pierre Bourdieu's two fundamental concepts, the other being habitus, and is addressed in several contexts
in his work, with precise definitions given. For example:
A field may be defined as a network, or a configuration, of objective relations between positions. These positions are objectively
defined, in their existence and in the determinations they impose
upon their occupants, agents or institutions, by their present and
potential situation (situs) in the structure of the distribution of
species of power (or capital) whose possession commands access
to the specific profits that are at stake in the field, as well as by
their objective relation to other positions (domination, subordination, homology, etc.). (Bourdieu 1992, 97)

In fact, the notion of field appears throughout Bourdieu's work. For example, his early studies were very much inspired
by personal experiences in the field, in this case, Algeria and
his home environment in the Béarn, southwest France (see
Bourdieu [1958] 1961 and 1962). Both of these works included
studies of language (see Grenfell 2006). However, the concept of
field as an analytic tool developed slowly in the course of further
studies on education (see Bourdieu [1970] 1977 and [1964] 1979)
and culture (see Bourdieu [1965] 1990 and [1966] 1990) in the
1960s. Increasingly, field became Bourdieu's main theoretical
tool in analyzing a wide range of social phenomena.
Bourdieus early work was developed in opposition to two
salient intellectual traditions, both of which were highly influential during his formative years (the 1950s): existentialism and
structuralism. Existentialism is best represented by the
work of the French philosopher Jean-Paul Sartre, with its philosophy of personal liberation through the subjective choices
we make in defining our lives. Structuralism is best represented
by the work of the anthropologist Claude Lévi-Strauss and his
study of the objective rules that can be found across cultures
and which govern human behavior: taboos, myths, and so on.
There is a philosophy of language at the base of both of these
traditions, albeit from an objective and subjective point of view.
Bourdieu referred to the divide between objectivism and subjectivism in the social sciences as "the most fundamental, and the most ruinous" ([1980] 1990, 25). His entire theory of practice can be seen as an attempt to bridge this divide. He defined his approach as "a science of the dialectical relations between objective structures and the subjective dispositions within which these structures are actualised and which tend to reproduce them" ([1972] 1977, 3), and the relationship between field and habitus as one of "ontological complicity" (1982a, 47). The
same complicity can be seen in the relation between langue and
parole.
The notion of structure affords just such a reconciliation. Both
habitus and field are structured. In other words, social spaces
need to be understood as differentiated and thus structural
in essence. Similarly, individual cognition arises from, generates, and is generated by mental structures that are also essentially structured because of their systems of differentiation. In
a seminal paper in 1966 Intellectual Field and Creative Project
([1966] 1971) Bourdieu builds on the discovery of the historian
E. Panofsky that there was a link between Gothic art, for example in the design of cathedral architecture, and the mental habits of those involved. In other words, each was symptomatic of
the other. Bourdieu used this principle to argue that there was
a structural homology between subjective thought and objective surroundings, the latter most noticeable in forms of social
organization. Such homologies exist because they are both generated by and generate the logic of practice of the field, itself
defined in terms of its substantive raison d'être. For Bourdieu,
therefore, social and mental structures were both structured
and structuring, concrete and dynamic. Similarly, language is
both structured and structuring.
Much of Bourdieu's own work subsequently brought this
understanding and methodology to a series of field studies: for example, the academic field ([1984] 1988); the artistic
field ([1987] 1993); the religious field (1982b); the judicial field
([1986] 1987); the bureaucratic field ([1989] 1996); the scientific
field (1975); the cultural field (1993); the literary field ([1992]
1996); the economic field ([2000] 2005); and the political field
(1981). Even the academic discipline of applied linguistics can
be regarded as a field (see Grenfell 2004a). Many of these are
shown to be defined in terms of overarching power structures
in society: class, the state, economic interests. However, there
are also fields within fields, microcosms that exhibit the
same features as macro fields but in local contexts. For example, although all fields are in some ways interlocking, they all possess
a degree of independence or autonomy. They are also bounded
with strict rules of entry. Moreover, they are ruled by specific
values, logics, and behaviors (mostly implicit but also explicit),
which determine the legitimate ways of thinking and doing
things within the particular field. Such legitimation is a necessary part of the functioning of the field according to its own
logic and purpose, as well as representing the interests of those
who hold position within it. Such social features also define the orthodoxy of the field, the doxa, against which other, nonorthodox forms (heterodoxa) oppose themselves. Similarly, fields
have a specific orthodox language, which can be expressed at
any level: phonetic and syntactic, as well as gesture and
expression.
Fields are dynamic and in a constant state of flux; whole
sections of society can be understood historically as the move
from one field structure and logic to another, for example, in
the way the ruling class in France moved from a situation of
basing their power on inherited money and industrial wealth
to one where intellectual and academic qualifications became
the main form of social legitimation. For Bourdieu, the currency of fields was capital: social, economic, and cultural.
Different configurations are held by individuals and groups
in the field and can be used to buy social positioning. Fields
are therefore also the sites of struggle and conflict, a process
by which they evolve according to the salient sociostructural
forces of society at large.
Fields can be studied internally or externally. In both cases,
position is all-important. Bourdieu lists this method of field analysis (1992, 104–7) as follows:
(1) Analyze the position of the field vis--vis the field of
power.
(2) Map out the objective structural relations between the
positions occupied by those in the field.
(3) Analyze the habitus of those involved.
In this way, both the logic of practice of the field and its procedures can be rendered visible, making what is normally misrecognized open to public scrutiny. Studying language in this
way allows for a focus on linguistic features, while at the same
time relating them to field position and the position of the fields
within the overall network of fields (see Encrevé 1983; Fehlen
2004; Grenfell 2004b; and Vann 2004). Field positions and positioning can depend on economic power but also on symbolic
power acquired from taste, style, and language, capital that
buys social prestige. Ultimately, field language articulates and
expresses the power order found within it. In social fields, this is
the language of the state and dominant social classes.
Michael Grenfell
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Bourdieu, Pierre. [1958] 1961. Sociologie de l'Algérie. New rev. ed.
Paris: Que Sais-je.
. 1962. The Algerians. Trans. A. C. M. Ross. Boston: Beacon.
, with Jean-Claude Passeron. [1964] 1979. The Inheritors: French
Students and their Relation to Culture. Trans. R. Nice. Chicago: University
of Chicago Press.

, with Luc Boltanski, Robert Castel, Jean Claude Chamboredon,
and Dominique Schnapper. [1965] 1990. Photography: A Middle-Brow
Art. Trans. S. Whiteside. Oxford: Polity.
. 1966. Champ intellectuel et projet créateur. Les Temps
Modernes 246: 865–906.
. [1966] 1971. Intellectual field and creative project. In
Knowledge and Control: New Directions for the Sociology of Education,
ed. M. F. D. Young, 161–88. London: Collier Macmillan.
, with Alain Darbel and Dominique Schnapper. [1966] 1990. The
Love of Art: European Art Museums and Their Public. Trans. C. Beattie
and N. Merriman. Oxford: Polity.
. with Jean-Claude Passeron. [1970] 1977. Reproduction in
Education, Society and Culture. Trans. R. Nice. London: Sage.
. [1972] 1977. Outline of a Theory of Practice. Trans. R. Nice.
Cambridge: Cambridge University Press.
. 1975. The specificity of the scientific field and the social conditions
of the progress of reason. Social Science Information 14.6: 19–47.
. [1980] 1990. The Logic of Practice. Trans. R. Nice. Oxford: Polity.
. 1981. La représentation politique: Éléments pour une théorie du champ politique. Actes de la recherche en sciences sociales
37: 3–24.
. 1982a. Leçon sur une leçon. Paris: Les Editions de Minuit.
, with Monique de Saint Martin. 1982b. La sainte famille:
L'épiscopat français dans le champ de pouvoir. Actes de la recherche
en sciences sociales 44/45: 2–53.
. [1984] 1988. Homo Academicus. Trans. P. Collier. Oxford: Polity.
. [1986] 1987. The force of law: Toward a sociology of the judicial
field. Hastings Law Journal 38: 209–48.
. [1987] 1993. Manet and the institutionalisation of anomie. In The
Field of Cultural Production, ed. and introd. Randall Johnson, 238–53.
Oxford: Polity.
. [1989] 1996. The State Nobility: Elite Schools in the Field of Power.
Trans. L. C. Clough. Oxford: Polity.
, with Loïc Wacquant. 1992. An Invitation to Reflexive Sociology.
Trans. L. Wacquant. Oxford: Polity.
. [1992] 1996. The Rules of Art. Trans. S. Emanuel. Oxford: Polity.
. 1993. The Field of Cultural Production. Oxford: Polity.
. [2000] 2005. The Social Structures of the Economy. Oxford: Polity.
Dreyfus, Hubert, and Paul Rabinow. 1993. Can there be a science
of existential structure and social meaning? In Bourdieu: Critical
Perspectives, ed. C. Calhoun, E. LiPuma, and M. Postone, 35–44.
Oxford: Polity.
Encrevé, Pierre. 1983. Le sens en pratique: Construction de la référence
et structure sociale de l'interaction dans le couple question/réponse.
Actes de la recherche en sciences sociales 46: 3–30.
Fehlen, Fernand. 2004. Pre-eminent role of linguistic capital in
the reproduction of the social space in Luxembourg. In Pierre
Bourdieu: Language, Culture and Education, ed. M. Grenfell and M.
Kelly, 61–72. Bern: Peter Lang.
Grenfell, Michael. 1993. The linguistic market of Orléans. In
France: Nation and Regions, ed. M. Kelly and R. Bock, 72–99.
Southampton: ASM & CF.
. 2004a. Language: Construction of an object of research. In
Pierre Bourdieu: Language, Culture and Education, ed. M. Grenfell and
M. Kelly, 27–40. Bern: Peter Lang.
. 2004b. Bourdieu in the classroom. In Culture and Learning: Access
and Opportunity in the Curriculum, ed. M. Olssen, 49–72. Westport,
CT: Greenwood.
. 2006. Bourdieu in the field: From the Béarn to Algeria, a timely
response. French Cultural Studies 17: 223–39.
Grenfell, Michael, and Cheryl Hardy. 2003. Field manoeuvres: Bourdieu
and the young British artists. Space and Culture 6.1: 19–34.
. 2007. Art Rules. London: Berg.

Grenfell, Michael, and David James. 1998. Bourdieu and Education: Acts
of Practical Theory. London: Falmer.
Swartz, David. 1997. Fields of struggle for power. In Culture and
Power: The Sociology of Pierre Bourdieu, 117–42. Chicago: University of
Chicago Press.
Vann, Robert. 2004. An empirical perspective on practice: Operationalising Bourdieu's notions of linguistic habitus. In
Pierre Bourdieu: Language, Culture and Education, ed. M. Grenfell and
M. Kelly, 73–84. Bern: Peter Lang.

FILM AND LANGUAGE


The thought that film is structured like a language is pervasive in
film studies. It even has echoes in ordinary talk, such as when we
speak of the "grammar" of film, of film as "text," and of "reading" a
film. An early formulation is due to the Soviet director and theorist
Vsevolod Pudovkin (1958), who argued that each shot is akin to a
word, that the sequence of shots is like a sentence, and that the
editing relations constitute a syntax. A more sophisticated version of the claim was developed by Sergei Eisenstein (1949), who
held films to be like hieroglyphic languages and ideograms.
The idea of film as a language became dominant in film studies through the influence of semiotics in the 1960s, particularly through the work of Christian Metz, who provided the most
nuanced defense of the view. Metz rejects Pudovkins analogy of
the shot with the word since a shot gives information, as does a
sentence but unlike a word. Indeed, he holds that "the word, which
is the unit of language, is missing; the sentence, which is the unit of
speech, is supreme" (Metz 1974, 69). The main structural parallel
to language lies in the editing relations, which constitute a grammar of film. For instance, the alternate syntagma (parallel editing)
by which two scenes are intercut with each other in an A-B-A-B
sequence is said by Metz to be a code denoting simultaneity.
Although the view of film as a language is pervasive within
film studies, some film scholars are critics of the view, notably
Stephen Prince (1993), who points out that the empirical evidence is at odds with the claim that film images have a merely
conventional relation to their denotata. Moreover, analytic
philosophers of film have almost universally rejected the claim
that film is structured like a language (for instance, Currie 1995,
113–37; Harman 1977).
The reasons for this rejection are straightforward. A language
comprises a vocabulary and a syntax. The vocabulary must be
finite so that it can be learned, and the lexical units that comprise it refer to objects and properties by virtue of conventions.
The syntax consists of a finite set of rules for combining lexical units together. From this finite basis, a potentially infinite
number of meaningful sentences can be generated by recursive
procedures (see recursion, iteration, and metarepresentation), and the meaning of these sentences derives systematically from the way that vocabulary is combined by syntax
(see compositionality).
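The contrast can be made concrete with a toy sketch (the grammar and lexicon below are invented for illustration, not drawn from any of the cited works): a finite vocabulary plus a single recursive rule generates sentences of unbounded length.

```python
# Toy illustration: a finite lexicon plus one recursive rule
# (S -> NP V NP | NP V C S) yields unboundedly many sentences.
lexicon = {"NP1": "the cat", "NP2": "the dog", "V": "saw", "C": "that"}

def sentence(depth):
    """Expand S; each recursive step embeds another sentence under 'that'."""
    if depth == 0:
        return f"{lexicon['NP1']} {lexicon['V']} {lexicon['NP2']}"
    return f"{lexicon['NP1']} {lexicon['V']} {lexicon['C']} {sentence(depth - 1)}"

print(sentence(0))  # the cat saw the dog
print(sentence(2))  # the cat saw that the cat saw that the cat saw the dog
```

Nothing in a photograph plays the role of this finite, conventional, recursively combinable basis.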
In contrast, films are composed of photographs, which have
a causal, and therefore nonconventional, relation to their denotata. Vocabulary is finite, but there is no limit to the number
of photographs that can be taken. And there is nothing like a
minimal lexical unit in a photograph: Each part of a photograph
of a cat denotes a part of the cat, down to the limits of visual indiscriminability, but if the cat is called "Felix," this name is a minimal unit, since its parts, such as "lix," do not denote parts of the
cat. Films are composed of pictures, but pictures do not have a
merely conventional relation to the world. From acquaintance
with at most a small number of pictures, one can go on to interpret correctly any other picture in the same style, a feature that
Flint Schier (1986) called natural generativity, grounded on the
fact that we use our object-recognitional capacities to recognize
pictures of those objects. In contrast, acquaintance with a small
number of words in a language does not allow one correctly to
interpret other words in that language: Language lacks natural
generativity.
Nor should we hold that the shot is analogous to the sentence.
Indeed, Metz's version of the claim is, taken strictly, incoherent
since he denies that there is anything corresponding to words
in cinema; yet a sentence is by definition composed of words.
Moreover, there is nothing in a photograph corresponding to
subject-predicate structure. A photograph of a black cat lacks
parts that pick out separately the cat and blackness, unlike the
sentence "the cat is black."
The claim that films have a syntax is also untenable: If films have
nothing corresponding to words, they cannot have syntax since
syntax is what couples words together. Film structure does involve
conventions, such as the alternate syntagma. But not all communication conventions are structured linguistically: Pointing is a
convention, but pointing, though it is a referential device, is not
linguistic. And, in general, communication need not be linguistic,
as is shown by animal communication. Nor does the alternate
syntagma denote simultaneity in the way that a linguistic phrase
such as "at the same time" does. The alternate syntagma joins
shots together; the linguistic phrase joins sentences or clauses
together. However, as just noted, shots are not like sentences (or
clauses), and Metz's view that shots are like sentences that are not
composed of words is incoherent. So, though it is a convention,
the alternate syntagma is not a linguistic convention.
Philosophers of film have given good reasons to reject the film-as-language hypothesis. Film is a form of communication that, in
its visual dimension, communicates as pictures do: In terms of
Peirce's semiotics, films are composed of icons, not symbols (see
icon, index, and symbol).
Berys Gaut
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Currie, Gregory. 1995. Image and Mind: Film, Philosophy, and Cognitive
Science. Cambridge: Cambridge University Press.
Eisenstein, Sergei. 1949. The cinematographic principle and the ideogram. In Film Form, trans. Jay Leyda, 38–44. San Diego, CA: Harcourt
Brace Jovanovich.
Harman, Gilbert. 1977. Semiotics and the cinema: Metz and Wollen.
Quarterly Review of Film Studies 2: 15–24.
Metz, Christian. 1974. Film Language: A Semiotics of the Cinema. Trans.
Michael Taylor. New York: Oxford University Press.
Prince, Stephen. 1993. The discourse of pictures: Iconicity and film studies. Film Quarterly 47.1: 16–28.
Pudovkin, V. I. 1958. Film Acting and Film Technique. Trans. Ivor
Montagu. London: Vision.
Schier, Flint. 1986. Deeper into Pictures: An Essay on Pictorial
Representation. Cambridge: Cambridge University Press.

FILTERS
Filters are output constraints of the form *xy that rule out illicit
sequences generated by a computational system. Filters can be
added to any computational system, allowing hardware to
be kept maximally general. Filters play different roles in different theories, ranging from statements ensuring satisfaction of a
universal property (the case filter [Chomsky 1981; Rouveret and
Vergnaud 1980], which requires an (overtly realized) noun phrase
(NP) argument to be associated with a case configuration), to the
ranked violable output constraints of optimality theory.
Many named filters, covering diverse phenomena, date from
the 1970s and 1980s. Although they appear to be reflections of
the underlying computational system and architecture, theoretical understanding must await future research.
The doubly-filled C filter (Chomsky and Lasnik 1977) excludes
the co-occurrence of a pronounced wh-phrase and complementizer in Spec (specifier/subject position), CP (complementizer
phrase) and C (complementizer) (I wonder who (*that/*whether)
she saw, *the man [who that] you saw). This filter relates to the
more general question of how the distribution of overt and covert
material over hierarchical structures is determined.
The that-t filter (Perlmutter 1971) prohibits an extraction site
next to the C (that, for): who do you think (*that) t left. That-t
reflects a very general, and theoretically not understood, problem with subject extraction.
The person case constraint (Perlmutter 1971) expresses the
interaction of case forms and person marking: It bans first and
second person accusative clitics or agreement markers in the
presence of a (third person) dative clitic or agreement marker.
(French: le(ACC)-lui(DAT), but *me(ACC)-lui(DAT), *me(ACC)-te(DAT)).
It is widely assumed to follow from universal person and case
hierarchies, either structurally hardwired or encoded as soft
constraints.
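The logic of a filter as an output constraint can be sketched in a few lines (a toy illustration; the generate-then-filter setup and the clitic labels are invented for the example, not taken from the cited analyses): a maximally general generator proposes candidate sequences, and a *XY-style statement rules out the illicit ones afterward.

```python
# Toy illustration: generate clitic clusters freely, then filter.
from itertools import product

accusatives = ["1.ACC", "2.ACC", "3.ACC"]
datives = ["3.DAT"]
candidates = list(product(accusatives, datives))

def person_case_constraint(acc, dat):
    """Ban 1st/2nd person accusatives in the presence of a dative clitic."""
    return not (acc[0] in "12" and dat.endswith("DAT"))

grammatical = [(a, d) for a, d in candidates if person_case_constraint(a, d)]
print(grammatical)  # [('3.ACC', '3.DAT')] -- le-lui survives; *me-lui is filtered out
```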
Some filters deal with restrictions on recursion. The doubl-ing
filter (Ross 1972) restricts (complement) recursion of -ing forms
in English. A verb like continue can combine with an -ing complement (it continued raining), unless continue is the complement
of an -ing selecting head (*its continuing raining). The head-final
filter (Williams 1982) restricts recursion on the nonrecursive side
of the head (a man [afraid of dogs], *an [afraid of dogs] man).
There is no generally accepted theoretical understanding of how
restrictions on recursion emerge from the derivation (but see
Koopman 2002).
Hilda Koopman
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Bresnan, Joan, Shipra Dingare, and Christopher D. Manning. 2001. Soft
constraints mirror hard constraints: Voice and person in English and
Lummi. Proceedings of the LFG 01 Conference. Stanford, CA: CSLI
Publications.
Chomsky, Noam. 1981. Lectures on Government and Binding. Dordrecht,
the Netherlands: Foris.
. 1995. The Minimalist Program. Cambridge, MA: MIT Press.
Chomsky, Noam, and Howard Lasnik. 1977. Filters and control.
Linguistic Inquiry 8: 425–504.
Koopman, Hilda. 2002. Derivations and complexity filters. In Dimensions of
Movement, ed. A. Alexiadou et al., 151–89. Amsterdam: John Benjamins.

Perlmutter, David. 1971. Deep and Surface Constraints in Syntax. New
York: Holt, Rinehart and Winston.
Ross, John. 1972. Doubl-ing. Linguistic Inquiry 3: 61–86.
Rouveret, Alain, and Jean-Roger Vergnaud. 1980. Specifying reference
to the subject: French causatives and conditions on representations.
Linguistic Inquiry 11: 97–202.
Williams, Edwin. 1982. Another argument that passive is transformational. Linguistic Inquiry 13: 160–63.

FOCUS
Focus refers to a constituent within a sentence that is highlighted
or emphasized by grammatical means. English sentences with
marked accent patterns provide prototypical examples, like (1)
(focus on the direct object):
(1) They ordered COffeeF at the bar.

Focus realization typically has a high pitch accent, a local maximum of the voice's fundamental frequency, on the syllable
co (indicated by capitals). As important as, or perhaps even more
important than, the pitch accent on the focus itself is the lack of
pitch accents following it, for example, on bar.
The privative syntactic feature F is a common means of focus
representation in the syntax. The F-markers in the tree (its focus
structure) mediate between semantic interpretation and prosodic realization.
The pragmatic functions of focus seem wide and varied, and
are often characterized in vague terms such as "speaker's highlighting," "most important information," "evoking alternatives," and
so on. More formal approaches toward focus interpretation start
from certain rather solid facts about the discourse distribution
of focus: In an answer to a constituent question, the phrase corresponding to the question phrase is focused: (1) can answer the
question "What did they order at the bar?" but not, for example,
"Where did they order coffee?" "Who ordered coffee?" or "What
happened?" Similarly, in corrections, the item that differs from
the corrected sentence is focused: (1) can be a correction to
"They ordered beer at the bar" but not, for example, "They spilled
coffee at the bar."
To predict such facts, the notion of alternatives proves useful.
Focused coffee in (1) has, among others, beer, milk, steak, and so
on as alternatives; nonfocused elements (the background) don't
introduce alternatives. By pointwise combination, the whole
sentence gets assigned an alternative set: statements of the form
"they ordered X at the bar," where X ranges over the alternatives
to coffee. Roughly, the alternative set of an answer must correspond to other conceivable answers to the question; the alternative set of a correction must include the meaning of the utterance
corrected.
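The pointwise construction can be sketched as follows (an invented illustration built on example (1), not a formalism from the cited literature): the focused item contributes a set of alternatives, and combining each alternative with the unchanged background yields the sentence's alternative set.

```python
# Toy illustration of alternative semantics: pointwise combination
# of a background frame with the alternatives to the focused item.
def alternative_set(background, alternatives):
    """Substitute each alternative into the background frame."""
    return {background.format(x) for x in alternatives}

alts = alternative_set("they ordered {} at the bar",
                       ["coffee", "beer", "milk", "steak"])

# As a correction, the alternative set must include the corrected utterance:
print("they ordered beer at the bar" in alts)  # True
# As an answer, it corresponds to conceivable answers to
# "What did they order at the bar?" -- but not to "Where did they order coffee?"
```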
Alternative semantics, as it is called, also accounts well for
cases in which focus influences truth conditional content,
in association with particles such as only: (2) excludes ordering
juice, beer, and so on at the bar (i.e., other members of the alternative set), but not ordering coffee elsewhere, or ordering sausages at the grill:
(2) They only ordered COffeeF at the bar.

If one accepts these uses of focus as definitional, one can explore focus realization beyond pitch accents. Many languages
use cleftlike sentences or, more generally, marked constituent
order in corrections or answers (often in addition to intonation);
also common are special morphemes, as well as prosodic phrase
boundaries, to mark the edge of a focus.
Daniel Büring
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Beaver, David, and Brady Clark. 2008. Sense and Sensitivity: How Focus
Determines Meaning. Malden, MA: Blackwell.
Büring, Daniel. 2007. Intonation, semantics and information structure.
In The Oxford Handbook of Linguistic Interfaces, ed. Gillian Ramchand
and Charles Reiss, 445–73. Oxford: Oxford University Press.
Kadmon, Nirit. 2001. Formal Pragmatics: Semantics, Pragmatics,
Presupposition, and Focus. Malden, MA: Blackwell.
Rooth, Mats. 1996. Focus. In The Handbook of Contemporary Semantic
Theory, ed. Shalom Lappin, 271–97. Oxford: Blackwell.
Schwarzschild, Roger. 1999. GIVENness, AvoidF and other constraints
on the placement of accent. Natural Language Semantics 7: 141–77.

FOREGROUNDING
Foregrounding is the patterned deviation from anticipated language, which is an important characteristic of literariness. In
general usage, the term sometimes describes the many conventional means that language provides for certain information to
be made prominent in discourse, such as by variation of intensity in speech or by given/new organization in syntax. In literary analysis, however, foregrounding refers specifically to the
effects achieved by textual devices that interrupt the automatic
processes of linguistic understanding. Foregrounding shifts a
readers or listeners attention away from linguistic meaning to
linguistic form so as to impart "the sensation of things as they are
perceived and not as they are known," in the Russian formalist
Viktor Shklovsky's words ([1917] 1988, 20).
At all levels of linguistic organization, perceptual processes
entail anticipation. Foregrounding devices defeat the processing benefits of anticipation either by creating unconventional
patterns or by deviating from established patterns. Sonic-level devices, such as alliteration, rhyme, assonance (see
rhyme and assonance), and meter, are typically adduced as
examples of foregrounding because the repetition of individual
phones or regular metrical structures falls outside of language's
conventionalized means of sense making. Once these unconventional patterns are detected, some processing resources
are diverted to construing their significance. Devices operating
at more complex levels of structure must work differently, as
ellipsis, metaphor, and irony, for example, regularly occur
in normal speech. Foregrounding with these devices is said to
be achieved by an unusual degree of difficulty, which requires
the reader consciously to attend to construal. Deviation from
expected patterns can also be achieved at any level of linguistic
organization. Creative punctuation, neologism, oxymoron, conversational infelicity, and unusual narrative structuring are just a
few of the many devices that have been recognized as deautomatizing linguistic perception.
If foregrounding involves the creation of new patterns or
deviation from expected patterns, one must ask what gives rise
to the expectations against which foregrounding is recognized.

In most cases, expectations arise from experience with everyday language, such as those evidenced by garden-path phenomena in syntactic processing (see psycholinguistics),
but expectations can also arise from genre-specific knowledge.
Emily Dickinson's verse, for example, often metrically evokes the
religious hymns that would have been familiar to her community. The masculine end-rhyme patterns of these hymns establish an expectation against which the slant rhymes of her verse
stand out.
Foregrounding devices are not unique to literary texts but can
be found in traditionally nonliterary genres, such as advertising
copy or political speech. In literary texts, however, they usually
participate in larger patterns of meaning or coherence, such as
those established by thematization or iconicity (see literariness for a discussion of the latter two terms). Empirical research
has established some support for the contribution of foregrounding devices to the literariness of a text. In studies, segments of
short stories with a higher degree of foregrounding were read at
a slower rate and were rated as more striking and more emotionally evocative than segments with low foregrounding (Miall and
Kuiken 1994).
Claiborne Rice
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Miall, David S., and Don Kuiken. 1994. Foregrounding, defamiliarization,
and affect: Response to literary stories. Poetics 22: 389–407.
Shklovsky, Viktor. [1917] 1988. Art as technique. Trans. Lee T. Lemon
and Marion J. Reis. In Modern Criticism and Theory, ed. David Lodge,
16–30. London: Longman.
Van Peer, Willie. 1986. Stylistics and Psychology. London: Croom Helm.

FORENSIC LINGUISTICS
Forensic linguistics refers to the use of linguistic expert evidence
in legal proceedings, and more broadly, to linguistic research in
legal contexts. Most forensic linguistic work published in English
pertains to the common law adversarial legal system of the United
Kingdom, the United States, Canada, and Australia. Any area of
linguistics can have a forensic application. In order to do forensic linguistics, a person must qualify as a linguist, specializing in
a particular area, such as phonetics or sociolinguistics.
There are a few graduate programs in forensic linguistics, but
most practitioners are linguists with a doctorate in their specialization who apply this expertise to legal questions and contexts.
Forensic phonetics is used mostly in disputes over transcription of incriminating recorded speech, and over the identification of individuals who have committed a language crime such
as a threat, bribe, or hoax by means of a recorded voice message. acoustic phonetics is often combined with articulatory phonetics. Analysis of anonymous voice recordings
can be compared with recordings of suspects. Similarly, linguistic discourse analysis of a written text may help to identify
an author, for example, of a so-called suicide letter in a case
where police have grounds for suspecting murder, rather than
suicide. For example, if discourse analysis finds striking similarities between a suicide letter and previous (uncontested) correspondence of the spouse of the deceased, together with a striking lack of similarity to earlier correspondence by the deceased, this analysis may be used to support a murder charge, in conjunction with nonlinguistic evidence. Most forensic linguists believe
that their evidence should not be used for positive identification
(to identify a particular person as the perpetrator of the language
crime), but either for negative identification (to exclude one or
more persons) or to provide supporting evidence for either a
positive or negative identification.
Another forensic application of discourse analysis is in the
examination of covertly recorded conversations between undercover agents and suspects, as in drug cases. Topic analysis can
trace the extent to which a suspect initiates talk of criminal plans,
for example, or is merely an interlocutor in a conversation where
such planning is initiated by the agent.
Forensic linguists also analyze comprehension difficulties
in police interviews, mostly of suspects who speak English as a
second language or second dialect. This is of particular relevance to questions involving suspects' understanding of their
rights, including the right to silence and to a lawyer. Until the
widespread introduction of recording of police interviews with
suspects, discourse analysis had also been used to examine
claims of fabricated confessions to police. These included the
notorious UK cases of the Guildford Four and Birmingham Six, in
which forensic discourse analysis was part of the complex legal
cases that resulted in overturned convictions of people found
to be falsely accused of involvement in bombings by the Irish
Republican Army.
The analysis of complex legal language often involves examination of morphology, syntax, and semantics in arguments
over legal interpretation (including the comprehensibility
of instructions to the jury) or trademarks. A famous U.S. example
of the latter concerned the use of the Mc prefix to connote budget
quality in the name McSleep Inns for a new hotel chain. While
forensic linguists analyzed its generic use as a productive morpheme, the McDonald's corporation argued that it
alone owned the use of this prefix with this meaning.
Forensic linguists who testify in court often face the contradiction between good scholarship, which entails openness to competing theories and explanations, and the adversarial nature of
the legal system, which can make such openness problematic.
Research on courtroom hearings shows how rules of evidence control and construe the contributions of witnesses in a
highly regulated speech event. The syntactic form of questions,
combined with metapragmatic rules that prevent witnesses from
asking questions or introducing their own topics, results in a
highly asymmetrical interaction, where most of a witness's story
is told by lawyers. lexical semantics can uncover strategic
ways in which word choice constructs opposing views of reality.
critical discourse analysis of cross-examination in rape
cases reveals the central role of linguistic choices in constructing defendants as passive participants without responsibility or
agency, for example through the use of nominalizations such as
"the fondling of the breasts," and agentless sentences such as
"there wasn't any major sexual activity."
A particular concern of sociolinguists has been the participation of vulnerable witnesses, especially children, second language, and second dialect speakers. The language addressed
to child victim-witnesses in abuse cases is often complex and
confusing, and compromises their ability to tell their story. Second
language speakers rely on interpreters, whose role is often misunderstood by legal professionals and witnesses. Microanalysis
demonstrates linguistic challenges involved in courtroom interpreting, finding that it is much easier for interpreters to accurately interpret propositional meaning than pragmatic meaning.
Second dialect speakers are disadvantaged by widespread ignorance about subtle dialectal differences. Where there are also
important cultural differences in communicative style, as with
Australian Aboriginal English speakers, the possibilities for miscommunication are disturbing. In Australia, forensic linguistics
has contributed to positive developments in delivery of justice
generally for Aboriginal people, and specifically in legal cases of
some individuals.
Forensic linguistics is also concerned with the language of
lawyer-client interviews and alternative legal processes, such as
mediation. For example, conversation analysis investigates
the extent to which mediators' talk is neutral, as well as the ways
in which gender affects talk between lawyers and their clients.
Another recent focus is on the transformation of oral narratives
into written legal documents (see intertextuality). While
most forensic linguistics is concerned with criminal or civil law,
linguists concerns have recently been extended to immigration
law, notably immigration officials' use of untrained analysts who
examine the speech patterns of asylum seekers to determine
their national origin.
Diana Eades
SUGGESTIONS FOR FURTHER READING
Eades, Diana. 2010. Sociolinguistics and the Legal Process. Bristol:
Multilingual Matters. Comprehensive examination of sociolinguistic
research, examining how language works and does not work in the
legal process.
Gibbons, John. 2003. Forensic Linguistics: An Introduction to Language
in the Justice System. Oxford: Blackwell. Accessible introduction to the
field.
Tiersma, Peter. 1999. Legal Language. Chicago: University of Chicago
Press. Authoritative textbook on written and spoken language in the
legal process.

FORMAL SEMANTICS
Formal semantics is an approach to semantics, the study of
meaning, with roots in logic, the philosophy of language, and
linguistics, and since the 1980s a core area of linguistic theory.
Characteristics of formal semantics treated in this entry include
the following: Formal semanticists treat meaning as mind-independent (though abstract), contrasting with the view of meanings as concepts in the head (see i-language and e-language and meaning externalism and internalism); formal
semanticists distinguish semantics from knowledge of semantics
(Lewis 1975, Cresswell 1978), which has consequences for the
notion of semantic competence. A central part of the meaning
of a sentence on this approach is its truth conditions, and
most, though not all, formal semantics is model-theoretic, relating linguistic expressions to model-theoretically constructed semantic values cast in terms of truth, reference, and possible worlds. This sets formal semantics apart from approaches that
view semantics as relating a sentence just to a representation
on another linguistic level (logical form) or a representation
in an innate language of thought. The formal semanticist
could accept such representations as an aspect of semantics but
would insist on asking what the model-theoretic semantic interpretation of the given representation-language is (Lewis 1970).
Formal semantics is centrally concerned with compositionality at the syntax-semantics interface, how the meanings of
larger constituents are built up from the meanings of their parts
on the basis of their syntactic structure, and with the relation
between compositional sentence meaning and meaning in
discourse.

The History of Formal Semantics


Formal semantics developed out of the work of Richard
Montague (1930–71) (see montague grammar), with important contributions by other philosophers, logicians, and linguists. Montague built on Alfred Tarski's recursively defined model-theoretic semantic interpretation for logical formulas based on
their recursively defined syntax. Donald Davidson, whose own
approach to formal semantics is not model theoretic, was also
influential in urging a truth conditional semantics for
natural language, arguing from learnability that a finitely
specifiable compositional semantics for natural languages must
be possible (1967). Montague grammar evolved into formal
semantics through the work of philosophers and linguists, developing a variety of approaches (Partee 1996; Partee with Hendriks
1997; Portner and Partee 2002).
Gottlob Frege, whose ideas were part of the foundation of
Tarski's and Montague's work, took an antipsychologistic view
of meanings, and so did many other logicians and philosophers
(including Bertrand Russell and Rudolf Carnap). But formal
semanticists who are linguists are very much concerned with
human semantic competence. The history of formal semantics has
been colored by tension between the Fregean antipsychologistic
tradition and the Chomskyan tradition of focusing on linguistic
competence, a tension only partially resolved. One step toward
a resolution has come from recognizing a distinction between
meaning, which may exist outside the head (E-language),
and knowledge of meaning, or semantic competence, which is
very much inside the head (but should probably not be called
I-language, a term better reserved for a semantic representation
language in approaches that include one).
What is semantic competence? The answers to this question will naturally be different with respect to different semantic frameworks. For formal semanticists, it is common to take
the fundamental characterization of semantic competence to
involve the knowledge of truth conditions: Given a sentence in
a context, and given idealized omniscience about the facts concerning some possible situation, a competent speaker can judge
whether the sentence is true or false in that situation. From that
basic competence, allowing idealizations about computational
capacity, it follows that a competent native speaker can also
make judgments about entailment relations between sentences.
So semantic competence is widely considered to consist in
knowledge of truth conditions and entailment relations of sentences of the language.


The Current State of the Field


At the heart of formal semantics are the principle that truth
conditions form a core aspect of meaning and the methodologically central principle of compositionality. Differences among
approaches can often be traced to three crucial theory-dependent terms in the principle of compositionality: "The meaning of a whole is a function of the meanings of its parts."
MEANINGS. David Lewis provided a famous strategy for thinking
about what meanings are: "In order to say what a meaning is, we may first ask what a meaning does, and then find something that does that" (1970, 22). There are different proposals about what
to count as meanings (or as the linguistically relevant aspects of
meanings) within formal semantics. Montague formalized intensions of sentences as functions from possible worlds and variable
assignments to truth values (see intension and extension; proposition). R. Stalnaker (1976), David Kaplan (1979), Hans
Kamp (1981), and Irene Heim (1982) emphasized the importance
of context-dependent expressions; Kaplan introduced the character of an expression, a function from contexts to intensions.
Kamp and Heim introduced a more dynamic semantics, treating
meaning as a function from contexts to contexts. Jon Barwise and
John Perry (1983) and Angelika Kratzer (1989) argued for replacing possible worlds by (possible) situations, which for Kratzer are
parts of possible worlds, enabling more fine-grained analysis of
meanings (see Kratzer 2007).
IS A FUNCTION OF. How are meanings put together? How does
the compositional mapping from syntax to semantics work? The
question of the sorts of functions that are used to put meanings
of parts together is inextricably linked to the questions of what
meanings are and what count as syntactic parts. Frege (1892)
took the basic semantic combining operation to be function-argument application: Some meanings are construed as functions
that apply to other meanings (see compositionality). With a
syntax such as categorial grammar providing the relevant
part–whole structure, Frege's function-argument principle could
be enough; with other kinds of syntax, other operations may be
needed as well.
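Informally, function-argument application can be mimicked with curried functions over a toy model. The sketch below is our illustration only, not anything from the entry: the entities and the one-pair "fed" relation are hypothetical.

```python
# Frege-style function-argument application, with meanings modeled
# as Python values and curried functions over a toy model.
# All names (fido, rex, the "fed" relation) are illustrative assumptions.

fido, rex = "Fido", "Rex"          # type e (entities)

# A transitive verb: type e -> (e -> t), curried.
fed = lambda obj: lambda subj: (subj, obj) in {("Rex", "Fido")}

# Composition is just application: the meaning of "fed Fido"
# is the function fed applied to the meaning of "Fido".
fed_fido = fed(fido)               # type e -> t (a property of entities)

print(fed_fido(rex))   # True: in this model, Rex fed Fido
print(fed_fido(fido))  # False: Fido did not feed himself
```

On a categorial-grammar syntax, this single mode of combination suffices; richer syntaxes would need further operations, as the entry notes.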
Formal semanticists also differ on whether a level of semantic
representation is hypothesized to mediate between syntax
and model-theoretic interpretation. Montague's own work exemplified both direct model-theoretic interpretation (Montague
1970) and two-stage interpretation via a language of intensional
logic (Montague 1973). Many linguists work with some intermediate semantic representation. Either approach can be compositional: A two-stage interpretation procedure is compositional if
the syntax-to-semantic-representation mapping rules are compositional and the model-theoretic semantics of the representation language is also compositional. When those conditions are
met, the intermediate language is, from a model-theoretic perspective, in principle eliminable. But linguists hypothesize that it
may have some psychological reality: It may represent an aspect
of the means by which humans compute the mapping between
sentences and their meanings. But it is a major challenge to find
empirical evidence for or against such a hypothesis.
It is worth noting that it is possible to advocate direct model-theoretic interpretation without being antipsychologistic,

via the notion of mental models (Johnson-Laird 1983). But


approaches using mentally represented formulas (logical forms,
conceptual representations) and computations on such formulas, as advocated by Ray Jackendoff for many years, are preferred
by many Chomskyan linguists.
Kamp's (1981) discourse representation theory (DRT) uses
the discourse representation structure (DRS) as a noneliminable
intermediate level of representation, with claimed psychological reality: Kamp hypothesized that his DRS could be a common
medium playing a role in language and as objects of propositional attitudes. Kamp argued against full compositionality;
he was challenged by Jeroen Groenendijk and Martin Stokhof
(1991), who argued that a fully compositional dynamic semantics could accomplish what Kamp could do with DRT. Reinhard
Muskens (1993) proposed a reconciliation with his compositional discourse representation theory.
PARTS (SYNTAX). The relation between the preceding issues and
syntax shows up clearly in debates about direct compositionality: Some linguists argue that a direct compositional model-theoretic semantics can apply to nonabstract surface structures
(Barker and Jacobson 2007), without abstract syntactic representations, movement rules, or a level of logical form. Advocates
of direct compositionality use an enriched arsenal of semantic
combining rules, including not only function-argument application but also function composition and a number of type-shifting operators. There may or may not be an inevitable trade-off
between optimizing syntax and optimizing semantics; it is a sign
of progress that many linguists work on syntax and semantics
with equal concern for both.

An Example
We illustrate the methods of formal semantics without formal
details by considering one aspect of the analysis of restrictive relative clauses like who fed Fido in (1a) and (1b).
(1) a. I saw a boy who fed Fido.
    b. I saw every boy who fed Fido.

In the 1960s, there were debates about whether the relative clause combines with the noun boy, as in structure (2), or with the phrases a boy, every boy, as in structure (3).
(2) I saw [a / every [boy [who fed Fido ]]]
(3) I saw [[ a / every boy] [who fed Fido ]]

There were also debates about the semantics of the relative clause, with some arguing that in (1a) who fed Fido means and he fed Fido, whereas in (1b) it means if he fed Fido, creating tension between the uniform surface structure of who fed Fido in (1a) and (1b) and the very different underlying semantic interpretations posited for them (see Stockwell, Schachter, and Partee 1973). The formal semantics perspective suggests searching for
a unitary syntax and meaning for who fed Fido and locating the
semantic difference between (1a) and (1b) in the semantics of
a and every. The solution (due to Quine [1960] and Montague
[1973]) requires structure (2): The noun and relative clause
denote sets, and their combination denotes the intersection of
those two sets. Then the phrase boy who fed Fido denotes the set {x: x is a boy and x fed Fido}. Different theories of the semantics
of determiners give different technical implementations of the
rest of the solution, but that first step settles both the syntactic
question and the core of the semantics. Sentence (1a) asserts
that the set of boys who fed Fido and the set of boys that I saw
overlap; (1b) says that the set of boys I saw is a subset of the set of
boys who fed Fido. See Partee (1995) for a fuller argument, and
Barwise and Cooper (1981), Heim (1982), and Kamp (1981) for
treatments of the semantics of determiners.
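The set-based analysis just sketched can be made concrete with set operations, following the generalized-quantifier treatment of determiners in Barwise and Cooper (1981). The particular individuals below are hypothetical, chosen only to make (1a) true and (1b) false.

```python
# Noun + relative clause as set intersection; determiners as
# relations between sets. The model (Al, Bo, Cy, Dee) is invented.
boys     = {"Al", "Bo", "Cy"}
fed_fido = {"Bo", "Cy", "Dee"}          # individuals who fed Fido
i_saw    = {"Cy"}                       # individuals I saw

# boy who fed Fido: the intersection {x: x is a boy and x fed Fido}
boy_who_fed_fido = boys & fed_fido      # {"Bo", "Cy"}

# "a": the two sets overlap; "every": the first is a subset of the second.
a     = lambda A, B: len(A & B) > 0
every = lambda A, B: A <= B

print(a(boy_who_fed_fido, i_saw))       # True: (1a) holds in this model
print(every(boy_who_fed_fido, i_saw))   # False: (1b) fails in this model
```

The single intersective meaning for the relative clause, plus distinct meanings for a and every, reproduces the overlap/subset truth conditions stated in the entry.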

A Second Example
Also informally, we illustrate the beginnings of the more dynamic
semantics of Kamp (1981) and Heim (1982). Consider the contrasting minidiscourses in (4) and (5):
(4) A baby was crying. It was hungry.
(5) Every baby was crying. #It was hungry. (# means anomalous.)

On the Kamp-Heim theory, an indefinite noun phrase (NP) like a baby in (4) introduces a novel discourse referent into the context, and the pronoun it in the second sentence of (4) can be
indexed to that same discourse referent, whose lifespan can
be a whole discourse, not only a single sentence. The discourse
referent introduced by an essentially quantificational NP like
every baby in (5), however, cannot extend beyond its clause, so
the pronoun it in (5) is anomalous. The Kamp-Heim theory also
includes an account of the famous donkey-sentences of Peter
Geach (1962), variants of which are given in (6a–b).
(6) a. If a farmer owns a donkey, he always beats it.
b. Every farmer who owns a donkey beats it.

These sentences had previously resisted compositional analysis, even with the tools of Montague grammar and its early extensions. On Kamp's and Heim's theories, the indefinite a donkey introduces a discourse referent into the local context, but has no
introduces a discourse referent into the local context, but has no
quantificational force of its own; it ends up being bound by the
unselective quantifiers always in (6a) and every in (6b). The
theories thus involve the interdependent areas of quantification and anaphora, and relate sentence semantics to discourse
semantics and pragmatics, giving rise to much new work in these
areas.
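The contrast between (4) and (5) can be caricatured in a few lines of code. This is a toy illustration of the Kamp-Heim idea, not DRT proper: contexts are lists of discourse referents, the referent names are invented, and pronoun resolution is reduced to picking up the most recent accessible referent.

```python
# Toy dynamic semantics: indefinites add a referent to the main
# context; "every"-NPs confine their referent to a local sub-context.

def indefinite(context, referent):
    """'A baby ...': add a referent with discourse-long lifespan."""
    return context + [referent]

def universal(context, referent):
    """'Every baby ...': the referent is accessible only clause-internally
    (in context + [referent]); the main context is returned unchanged."""
    return context

def pronoun(context):
    """'It ...': resolve to the most recent accessible referent, if any."""
    return context[-1] if context else None   # None marks anomaly (#)

ctx = indefinite([], "x1_baby")   # (4) A baby was crying.
print(pronoun(ctx))               # 'x1_baby' -> "It was hungry." is fine

ctx = universal([], "x2_baby")    # (5) Every baby was crying.
print(pronoun(ctx))               # None -> "#It was hungry."
```

The asymmetry in what the two constructions contribute to the ongoing context, rather than any difference in the pronoun itself, is what predicts the anomaly in (5).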

Formal Semantics and Cognitive Science


Formal semantics as developed within linguistics is as much a
part of cognitive science as any other part of linguistics, but there
are a number of misunderstandings and controversies surrounding this issue. This concluding section touches on a number of
broader issues and controversies.
In semantics, possibly unlike syntax or phonology, one's idiolect is not determined solely by one's own semantic knowledge. Meanings are conventional social constructs (Lewis
1975), and as Hilary Putnam (1975) observed, his inability
to distinguish beeches from elms does not render the words
beech and elm synonymous or vague in his language, since the
meanings of our words and the truth conditions of our sentences depend in significant part on the conventions of our
language community and only in part (for instance, in using vague words with particular less vague meanings) on what we have in mind when we speak (division of linguistic
labor). How the word elm in Putnam's mouth can refer to elms is a major topic in the philosophy of language (see natural kind terms). Such issues distinguishing E-language from
I-language are especially acute for proper names (Kripke 1972).
Formal semanticists emphasize that human semantic competence includes the higher-order intention to use words with
their shared conventional meanings, unlike Humpty Dumpty
in Through the Looking-Glass ("When I use a word," Humpty Dumpty said, in a rather scornful tone, "it means just what I choose it to mean – neither more nor less.") At the same time,
formal semantics and pragmatics make room for the fact that
in a discourse context, conversational partners may mutually
understand possibly mistaken speakers' meanings, so that a false sentence can sometimes be used to express a true proposition. For instance, "That beech is dying" might successfully
communicate the fact that a certain tree that is in fact an elm
is dying, since the demonstrative that may be enough to enable
the hearer to recognize which tree is meant.
Some argue that formal semantics does not belong within
linguistics if formal semantics is concerned with truth, since
linguistics is concerned only with mental representations. Such
an attitude reflects both a methodological decision (Noam
Chomsky's position excludes model-theoretic semantics because
it extends beyond autonomously linguistic levels of representation) and a misunderstanding, since formal semanticists are in
fact not interested in truth but in truth conditions, and one can
very well know what a situation must be like for a sentence to be
true in it, without knowing what the facts are and so not knowing the actual truth value. For most of us, this is the case with a
sentence like "There is now a white cat sitting in a window of an apartment in Baltimore in which a piano teacher gave piano lessons in the 1950s."
The compositionality principle is arguably one aspect of
universal semantic competence that distinguishes human language from all animal languages (see compositionality).
cognitive linguists sometimes charge that compositionality is incompatible with the prevalence of metaphor. Formal
semanticists reply that metaphor principally involves shifts in
lexical meanings, and that the interpretation of metaphor is
driven in part by compositionality, that is, by the need to find
suitable meanings for the parts so that they can be combined
into a meaningful whole. Compositionality is also a driving force
in children's acquisition of lexical meanings: When context supplies a plausible meaning for the whole and syntax supplies a
suitable structure, the child can use knowledge of the meaning
of the whole and of the other parts to solve for the meaning
of the unknown part. (Cf. the carefully structured context for the learning of novel color words in experiments reported in Carey [1978, 271]: "Bring me the chromium tray, not the blue one, the chromium one.")
A word about the relation between word meaning (see
also lexical semantics) and compositional meaning: Some
semantic theories have focused largely on word meaning, while
formal semantics has focused on compositional meaning, but
in principle every semantic theory must address both, as well
as their integration. Formal semanticists give special attention to the semantic types of lexical items, since that is crucial to the combining principles. A classic work on lexical meaning and
word-formation rules is Dowty (1979); see also Partee (1995).
As a final observation, we note that in contemporary work,
attention to context dependence has reached such a point that
formal semantics and formal pragmatics are close to becoming a
single discipline, and the role of context in determining meaning
is so important as to require a richly context-dependent notion of
meaning (semantics-pragmatics interaction).
Barbara H. Partee
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Bach, Emmon. 1989. Informal Lectures on Formal Semantics. New
York: State University of New York Press. A user-friendly nontechnical
introduction.
Barker, Chris, and Pauline Jacobson, eds. 2007. Direct Compositionality.
Oxford Studies in Theoretical Linguistics. Oxford: Oxford University
Press.
Barwise, Jon, and Robin Cooper. 1981. Generalized quantifiers and natural language. Linguistics and Philosophy 4: 159–219.
Barwise, Jon, and John Perry. 1983. Situations and Attitudes. Cambridge,
MA: MIT Press.
Carey, Susan. 1978. The child as word learner. In Linguistic Theory and
Psychological Reality, ed. M. Halle et al., 264–93. Cambridge, MA: MIT
Press.
Chierchia, Gennaro, and Sally McConnell-Ginet. 1999. Meaning and
Grammar: An Introduction to Semantics. Cambridge, MA: MIT Press.
An accessible introduction to formal semantics and pragmatics.
Cresswell, M. J. 1978. Semantic competence. In Meaning and
Translation: Philosophical and Linguistic Approaches, ed. F. Guenthner
and M. Guenthner-Reutter, 9–43. London: Duckworth.
Davidson, Donald. 1967. Truth and meaning. Synthese 17: 304–23.
Dodgson, Charles (Lewis Carroll). 1871. Through the Looking-Glass and
What Alice Found There. London: Macmillan and Co.
Dowty, David. 1979. Word Meaning and Montague Grammar: The
Semantics of Verbs and Times in Generative Semantics and in
Montague's PTQ. Dordrecht, the Netherlands: Reidel.
Frege, Gottlob. 1892. Über Sinn und Bedeutung. Zeitschrift für Philosophie und philosophische Kritik C: 25–50.
Gamut, L. T. F. 1991. Logic, Language, and Meaning. Vol. 2. Intensional
Logic and Logical Grammar. Chicago: University of Chicago Press. An
excellent introduction to formal semantics and its logical foundations;
requires some prior logic.
Geach, Peter. 1962. Reference and Generality. Ithaca, NY: Cornell
University Press.
Groenendijk, Jeroen, and Martin Stokhof. 1991. Dynamic predicate
logic. Linguistics and Philosophy 14: 39–100.
Heim, Irene. 1982. The semantics of definite and indefinite noun
phrases. Ph.D. diss., University of Massachusetts.
Heim, Irene, and Angelika Kratzer. 1998. Semantics in Generative
Grammar. Oxford: Blackwell. A systematic introduction to selected
topics in semantics that lays solid foundations for constructing and
arguing for formal semantic analyses.
Johnson-Laird, Philip N. 1983. Mental Models. Cambridge: Cambridge
University Press.
Kamp, Hans. 1981. A theory of truth and semantic representation. In
Formal Methods in the Study of Language: Mathematical Centre Tracts
135, ed. J. A. G. Groenendijk et al., 277–322. Amsterdam: Mathematical
Centre.
Kaplan, David. 1979. On the logic of demonstratives. Journal of
Philosophical Logic 8: 81–98.

Kratzer, Angelika. 1989. An investigation of the lumps of thought. Linguistics and Philosophy 12: 607–53.
. 2007. Situations in natural language semantics. In The Stanford Encyclopedia of Philosophy (Spring ed.), ed. Edward N. Zalta. Available online at: http://plato.stanford.edu/archives/spr2007/entries/situations-semantics/.
Lappin, Shalom, ed. 1996. The Handbook of Contemporary Semantic
Theory. Oxford: Blackwell. Contains articles by leading semanticists on a wide spectrum of topics, representing both formal semantics and other
approaches.
Lewis, David. 1970. General semantics. Synthese 22: 18–67.
. 1975. Language and languages. In Language, Mind, and
Knowledge, ed. K. Gunderson, 3–35. Minneapolis: University of
Minnesota Press.
Montague, Richard. 1970. English as a formal language. In Linguaggi nella Società e nella Tecnica, ed. Bruno Visentini et al., 189–224. Milan: Edizioni di Comunità.
. 1973. The proper treatment of quantification in ordinary English.
In Approaches to Natural Language, ed. K. J. J. Hintikka et al., 221–42.
Dordrecht, the Netherlands: Reidel.
Muskens, Reinhard. 1993. A compositional discourse representation
theory. In Proceedings of the 9th Amsterdam Colloquium, ed. P. Dekker
and M. Stokhof, 467–86. Amsterdam: ILLC, University of Amsterdam
Press.
Partee, Barbara. 1995. Lexical semantics and compositionality. In An
Invitation to Cognitive Science. Vol. 1: Language. Ed. L. Gleitman and M. Liberman, 311–60. Cambridge, MA: MIT Press.
. 1996. The development of formal semantics in linguistic theory.
In The Handbook of Contemporary Semantic Theory, ed. S. Lappin,
11–38. Oxford: Blackwell.
Partee, Barbara H., with Herman L. W. Hendriks. 1997. Montague grammar. In Handbook of Logic and Language, ed. J. van Benthem and A. ter Meulen, 5–91. Amsterdam and Cambridge, MA: Elsevier and MIT
Press.
Portner, Paul, and Barbara H. Partee, eds. 2002. Formal Semantics: The
Essential Readings. Oxford: Blackwell. A collection of classic papers
from the beginnings of the field to the late 1980s, with an introductory
overview essay.
Putnam, Hilary. 1975. The meaning of "meaning." In Minnesota Studies in the Philosophy of Science. Vol. 7: Language, Mind, and Knowledge. Ed. Keith Gunderson, 131–93. Minneapolis: University of Minnesota Press.
Quine, Willard van Orman. 1960. Word and Object. Cambridge, MA: MIT
Press.
Stalnaker, R. 1976. Propositions. In Issues in the Philosophy of Language, ed.
A. Mackay and D. Merrill, 79–99. New Haven, CT: Yale University Press.
Stockwell, Robert P., Paul Schachter, and Barbara H. Partee. 1973. The
Major Syntactic Structures of English. New York: Holt, Rinehart and
Winston.

FORMS OF LIFE
Forms of life, an expression associated with Ludwig
Wittgenstein, is not one he made that much use of, employing it
only five times in his Philosophical Investigations. It is not clear
from his usage exactly what he had in mind. Here are statements
in which the phrase occurs:
"It is what human beings say that is true or false; and they agree in the language they use. That is not agreement in opinions, but in forms of life" (1958, 38), and
"What has to be accepted, the given, is – so one could say – forms of life" (1958, 226), as well as

"I would like to regard this certainty, not as something akin to hastiness or superficiality, but as a form of life. (That is very badly expressed and probably badly thought as well.)" (1969, 46)
Its use may nonetheless be associated with Wittgenstein's attempt to humanize matters – especially logic – that philosophers
had been prone to idealize to the extent of making them seem to
surpass all possibility of creation by merely human beings, meaning that their objectivity must surely reside elsewhere than in the
contingent facts and incidental variety of human life. He seeks to
reassert the connection between seemingly immutable necessities (of logic and mathematics especially) and the contingencies
of human life. To counteract the idealization of logic into something almost superhuman, Wittgenstein is prepared to say that "I want to conceive of it [logic] as something that lies beyond being justified or unjustified; as it were, something animal" (1969, 47).

Arguments over Interpretation


From the previous quotations, it can be seen that Wittgenstein
does not enumerate the kinds of things that he has in mind, gives no examples of forms of life, nor specifies what makes them
such. Inevitably, there is disagreement over what he meant by
forms of life. Is it that human beings, of every time and place,
have the same forms of life or, alternatively, that cultural variation amongst them is possible? (Emmett 1990, 213).
Gertrude Conway (1989, 423) lists four rival interpretations: those
(1) equating forms of life with language games,
(2) interpreting forms of life on an organic model,
(3) equating forms of life with cultural systems, and
(4) presenting human nature as a form of life.
Among the sort of things that forms of life might cover, then,
are a) the human creature, which is a form of life when compared
with other forms of life (such as plants or fish), and b) activities
characteristic of a kind of creature: Periodically going to sleep
is one form of life in which humans engage, digging burrows one characteristic of rabbits. Wittgenstein certainly does not want to
lose sight of the fact that human beings are animals but, equally,
is aware that human beings are a distinctive sort of animal, a
language-using social one. In consequence, c) groups of people
may differ in cultural forms, as rural dwellers might have very
different forms of life than urban ones, or d) may differ in relatively specific ways of acting, including ways of speaking, within
a cultural community – their ways of measuring, for example.

The Natural History of Human Beings


The development of language is connected with human biology
(but it is the physiognomy of the organism and the organization
of its life that Wittgenstein has in mind, not the physiology of its
nervous system). Language is made possible by characteristic
human capacities and responses, but the language developed is
not a direct outgrowth of those characteristics. The development
of language is interwoven with the formation of practical life,
for this involves the creation of ways of speaking integral to the
practical activity. The remark that "to imagine a language means to imagine a form of life" (Wittgenstein 1958, par. 19) can be understood as a challenge: Try to imagine understanding a language without knowing something about how those who use the
language live their lives, some of the different sorts of things they
do. The language as a totality is an entirely contingent assemblage, accumulated ways of speaking that come from different
parts of life (as add, subtract, divide come from calculation, or
swing, putt, birdie from golf).
The direction in which natural human responses can be cultivated varies greatly, according to the multifarious ways in which
circumstances impact upon them, but Wittgenstein emphasized
both that and the importance of the natural, animal responses
of human beings. For example, consider the way in which the
human animal naturally responds to a pointing gesture (following the direction of the finger) when other animals (cats, cows)
do not, and the role that gesture can play when embedded in a
parent–child relation in associating a color name with a color
sample. This is an effective way of teaching words because children naturally respond to the rulelike connection made between
word and sample, and after a very few examples, can make the
same association for themselves. Such natural, and common,
reactions are what enable practices, as well as the development
of their associated language forms, to find purchase as standard
practices: Agreement in the way color samples are extended to
cases is the basis for the possession of standard color names.
On that same basis, of course, varied color vocabularies have
arisen.

Necessities and Certainties


Forms of life can be considered as points of reference for two key
philosophical concerns, necessities and certainties. The observations on such forms of life as a child's responses to teaching
are not contributions to some general and systematic account of
the nature of human life and its practices, for they are taken by
Wittgenstein to be utterly obvious. They are the sort of incontestable truths that cause philosophical confusion because they are
apt to be left out of account when someone moves into theorizing mode. Wittgenstein's common method is to draw attention
back to these apparent matters to show that what seems deeply
puzzling when considered in isolation from the hurly-burly of
ordinary lives may cease to seem puzzling when set against the
background of practical activities from which it has been cut off.

Necessities
There can be a strong temptation to think that our institutions, practices, and ways of acting can (sometimes at least) be
explained in terms of necessities. Logic was especially important
to Wittgenstein's reflections throughout his life. It was widely assumed that the power of logical forms derived from the fact that they reflected necessities intrinsic to the way things are – logic could be no other way than it is because it corresponds to unalterable features of reality. Wittgenstein's was no attempt to
do away with logical necessities altogether, only to dereify them,
and to persuade, more generally, that our institutions and practices do not follow from external necessities. Rather, our idea
of necessity flows from our institutions, practices, and so on.
Getting things the right way round is, as Peter Winch summarized it, a matter of understanding that logical relations among propositions "themselves depend on social relations between men" (1958, 126). Wittgenstein suggested that when it seemed
that things could not be conceived of as being otherwise, this
was, if not an illusion, at most a function of the part that they
played in our ways of acting, not a sign of their essential relation
to any necessitating external reality. The temptation to appeal to
such externally imposed necessities could be alleviated by drawing attention to the degree to which seemingly transcendental
necessity is connected with some familiar, unobtrusive feature of
the human organism or its life, by imagining the consequences
of some gross change in the character of the human organism,
or by imagining a cultural practice significantly different from
our own. Thus, one need only imagine children not naturally
responsive to the rulelike link between instruction and sample to
quickly realize that color vocabularies and arithmetic (for example) depend on the trainable susceptibilities of children for their
existence as stable practices. Alternatively, imagine how another
people might have different practices from ours in counting or
measuring, and then appreciate that the specific forms that serve
these tasks in our society do not comprise the unique, eternal, or
universal essence of counting or measuring, and work as well as
they do only because they have connections with the practical
needs of our distinctive way of life.

Certainties
Certainty, too, is to be connected to natural human reactions,
to the way in which we commonly act without doubts or hesitations – we go about many of our activities in a way in which doubt is simply absent; it does not occur to us. The philosopher's stock
idea of certainty is something that we are entitled to only after
we have arrived at some final, unquestionable justification, have
given grounds for our practices that allow no logical possibility
of error. Rather than accepting the skeptic's query as to whether
it is logically possible to doubt, Wittgenstein prefers, instead,
to reflect on where we do doubt. We do not possess certainties
because we have arrived at them through critical reflections, but
acquire them as an integral part of coming to mastery in various
activities. Is this not, however, simply a complete concession to
the skeptic? All our practices lack justification; we cannot identify any ultimate, unquestionable, self-evident truth that underpins them. No, it is a counterskeptical move: The certainties that
are incorporated into our learning and thus into our practices
are ones that serve as conditions of intelligibility, that shape our
capacity to understand what a doubt could be in an actual case.
These certainties are not held in place by an utterly uncritical attitude toward them but by their connection with our ways of doing
practical things, in which they act as the setting within which the
notions of doubt and justification can have content.
The doubts that the skeptic tries to induce are pretend ones
only, ones that by their very design are empty of empirical
content, and are, as genuine doubts, unintelligible. Skepticism
is right only insofar as its continuing pressure for further justification until a rock-bottom certainty is reached leaves us (quite
soon) in a position where we have no further justifications. It is
misguided to think that this shows that we have been forced to
admit our inability to respond to legitimate demands for justification. Rather, we run out of justifications at the same point at
which the skeptic's demands for justification themselves pass
the limits of what can make sense as a doubt. We can settle the doubts about the length of a table by recommending the use of a
tape measure as a dependable way of precisely determining this,
but what sorts of serious doubts are there about a tape measure's
dependability in such a case? Our notion of reliable measurement is very much tied up with the ways we use tape measures,
but, at the same time, we should not think that our practices of measurement are the only ones that could possibly make sense; other people might have ways of measuring for which our tape measures would be useless.
Wes Sharrock
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Conway, Gertrude. 1989. Wittgenstein on Foundations. New Jersey: Humanities Press International.
Emmett, Kathleen. 1990. Forms of life. Philosophical Investigations 13 (July): 213–31.
Garver, Newton. 1990. Form of life in Wittgenstein's later works. Dialectica 44.1/2: 175–201.
Hanfling, Oswald. 2002. Wittgenstein and the Human Form of Life. London: Routledge.
Hunter, John. 1968. Forms of life in Wittgenstein's Philosophical Investigations. American Philosophical Quarterly 5: 233–43.
Winch, Peter. 1958. The Idea of a Social Science. London: Routledge.
Wittgenstein, Ludwig. 1958. Philosophical Investigations. 2d ed. Oxford: Blackwell.
———. 1969. On Certainty. Oxford: Blackwell.

FRAME SEMANTICS
This term refers to a wide variety of approaches to the systematic description of natural language meanings. The one common feature of all these approaches (which, however, does not sufficiently distinguish frame semantics from other frameworks of semantic description) is the following slogan from Charles Fillmore (1977a):
Meanings are relativized to scenes.

According to this slogan, meanings have internal structure that is determined relative to a background frame or a scene. The easiest way to understand this thesis is by way of example. The following one is from Fillmore (1977c):
Suppose that two identical twins, Mark and Mike, are both in a hospital, sitting on the edge of their beds in exactly the same position. When a nurse walks by Mark's room, she says: "I see that Mark is able to sit up now," and when she walks by Mike's room she remarks: "I see that Mike is able to sit down now." Drawing on what we know about hospitals (our hospital background scenes or frames), we will interpret the two remarks of the nurse rather differently, thereby relativizing the meanings of her remarks to the relevant scenes.
Another often-cited example from Fillmore (1977c) that clearly demonstrates this thesis is the difference in meaning between the following two sentences:
(1) I spent three hours on land this afternoon.
(2) I spent three hours on the ground this afternoon.

The background scene for the first sentence is a sea voyage, while
the second sentence refers to an interruption of air travel. This illustrates Fillmore's use of the term frame as an idealization of a "coherent individuatable perception, memory, experience, action, or object" (1977c).
In order to understand frame semantics, it is helpful to begin
with a brief history. From here we turn to an overview of the most
important theoretical concepts. Next, the relationship of frame
semantics to one specific version of construction grammar
is introduced and some examples analyzed. The entry ends with a
short summary of applications of frame semantics and a note on
formalization. Usually, frame semantics is taken to be a very informal approach to meaning, but nevertheless, some approaches
relating frame semantics to formal semantics exist.

History
There are at least two historical roots of frame semantics; the first
is linguistic syntax and semantics, especially Fillmore's case
grammar; the second is artificial intelligence (AI) and the notion
of frame introduced by M. Minsky (1975) in this field of study.
A case frame in case grammar was taken to characterize a
small abstract scene that identifies (at least) the participants of
the scene and thus the arguments of predicates and sentences
describing the scene. In order to understand a sentence, the language user is supposed to have mental access to such schematized scenes.
The other historical root of frame semantics is more difficult to
describe. It relates to the notion of frame-based systems of knowledge representations in AI. This is a highly structured approach
to knowledge representation that collects together information
about particular objects and events and arranges them into a taxonomic hierarchy familiar from biological taxonomies. However,
the specific formalism suggested in the aforementioned paper by
Minsky was not considered successful in AI.

Some Basic Theoretical Principles


The central theoretical concepts characterizing frame semantics originated with Fillmore and have not changed much since his
first writings on this approach. In order to understand the most
important notions of frame semantics, let us briefly consider a
typical example of a frame, the commercial transaction frame that
demonstrates the origin of frame semantics from Fillmore's case frames as well. In Table 1, the concept frame is applied to verbs like buy with the intention of representing the relationship between syntax and semantics. The verb buy, according to Table 1, obligatorily requires a buyer and goods, and optionally a seller and a price.
Verbs with related meanings, such as sell, are expected to have
the same meaning slots but in a syntactically different order. This
clearly shows the relation to Fillmores case frames. Combining
these frames results in the commercial transaction frame about
which Table 2 provides partial information. Of course, the PLACE feature just marks the beginning of an open-ended list, since every event in Table 2 can be further specified, for instance, with respect to time. Moreover, the collection of frames forms an ordered structure. For instance, the commercial transaction frame itself is part
of the more general transaction frame prototypically expressed by
the ditransitive verb give. This indicates that the system of dependencies between frames forms an intricate hierarchical structure.
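The mapping from verbs to frame roles and their grammatical realizations, as laid out in Tables 1 and 2, can be sketched as a small data structure. The following Python encoding is a hypothetical illustration of the idea only, not FrameNet's actual data format; the role names follow Table 2.

```python
# A minimal sketch of the commercial transaction frame as a data
# structure. Role names follow Table 2; the encoding itself is a
# hypothetical illustration, not FrameNet's actual data format.

COMMERCIAL_TRANSACTION = {
    # verb -> {frame role: grammatical realization}
    "buy":   {"BUYER": "subject", "GOODS": "object",
              "SELLER": "from-PP", "MONEY": "for-PP", "PLACE": "at-PP"},
    "sell":  {"BUYER": "to-PP", "GOODS": "object",
              "SELLER": "subject", "MONEY": "for-PP", "PLACE": "at-PP"},
    "cost":  {"BUYER": "indirect object", "GOODS": "subject",
              "MONEY": "object", "PLACE": "at-PP"},
    "spend": {"BUYER": "subject", "GOODS": "on-PP",
              "MONEY": "object", "PLACE": "at-PP"},
}

def realization(verb, role):
    """How a frame role is realized syntactically with a given verb
    (None if the verb does not express that role)."""
    return COMMERCIAL_TRANSACTION[verb].get(role)

print(realization("buy", "SELLER"))   # -> from-PP
print(realization("cost", "BUYER"))   # -> indirect object
```

On this view, verbs with related meanings share one frame and differ only in how they distribute its roles over grammatical functions, which is exactly what the tables display.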
The concept prototype is one of the most important concepts of frame semantics. Frames should be understood as


Table 1.
BUYER      buy       GOODS       (SELLER)        (PRICE)
subject              object      from            for
Angela     bought    the owl     from Pete       for $10
Eddy       bought    them                        for $1
Penny      bought    a bicycle   from Stephen
Table 2.
VERB     BUYER              GOODS      SELLER     MONEY      PLACE
buy      subject            object     from       for        at
sell     to                 object     subject    for        at
cost     indirect object    subject               object     at
spend    subject            on                    object     at

prototypical descriptions of scenes. A prototype has the advantage that it does not have to cover all possible aspects of the
meaning of a phrase; in other words, a prototype does not have to
provide necessary and sufficient conditions for the correct use of
a phrase. Fillmore (1977b) illustrates the use of prototypes within
frame semantics by an analysis of the concept widow. The word
widow is specified with respect to a background scene in which people marry as adults, they marry one person, and their lives are affected by their partner's death, and perhaps other properties. The advantage of a theory of meaning based on the prototype
concept, compared to a theory that insists on stating necessary
and sufficient conditions for the meaning of a phrase, is that it
does not have to care about certain boundary conditions; that is,
it does not have to provide answers to questions like "Would you call a woman a widow who has lost two of her three husbands but who had one living one left?" (Fillmore 1977b). In a case like this,
whether the noun widow applies or not is unclear since certain
properties of the background frame for this concept are missing.
The concept prototype is not unproblematic either, however.
Note that Fillmore does not use this concept with respect to words
but with respect to frames or scenes. Some words like bird certainly have prototypes, but others may not have a corresponding
prototype. What is a prototypical vegetable, for instance, or a prototype corresponding to the adjective small? Moreover, applications of prototype theory often involve two different measures for
category membership. A penguin, for example, is certainly not a
prototypical bird, but nobody hesitates to judge it as a bird. The
other measure of category membership is typically used in the
analysis of vague predicates, for instance, color adjectives. It may
sometimes be hard or even impossible to assign a given object to
the category of pink or red entities.
Another central notion within frame semantics is the concept
profiling. R. Langacker (1987) uses the example of hypotenuse for
explaining this concept. One can easily draw a mental picture
of the concept hypotenuse. The interesting question concerning
this mental picture is this: Can you imagine what a hypotenuse is
without imagining the whole right triangle? The answer is clearly
no. The triangle and the plane in which it is included constitute a frame, and the terms hypotenuse and right triangle are interpreted with respect to this frame, but they profile different parts of the frame.
The following example taken from Goldberg (1995) illustrates
lexical profiling of participants. Consider the following differences between the closely related verbs rob and steal.
(3) a. Jesse robbed the rich (of all their money).
b. *Jesse robbed a million dollars (from the rich).
(4) a. Jesse stole money (from the rich).
b. *Jesse stole the rich (of money).

These distributional facts can be explained by a semantic difference in profiling. In the case of rob, the victim and the agent (the thief) are profiled; in the case of steal, the agent and the valuables are profiled. Representing profiled participants in capitals (boldface in the original), A. Goldberg proposes the following argument structures for rob versus steal:
rob <THIEF TARGET goods>
steal <THIEF target GOODS>
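Goldberg's profiling contrast can be rendered as a small executable sketch. The rule tested below (profiled roles are realized as direct grammatical relations such as subject or object, while unprofiled roles surface only obliquely) is a simplification of her account, and the role names are paraphrases for illustration:

```python
# Sketch of Goldberg-style lexical profiling for rob/steal. The role
# names and the rule tested (profiled roles surface as direct
# grammatical relations; unprofiled ones only as obliques) are a
# simplification invented for illustration.

PROFILED = {
    "rob":   {"thief", "target"},   # agent + victim profiled
    "steal": {"thief", "goods"},    # agent + valuables profiled
}

def direct_ok(verb, role):
    """Is this role profiled, i.e. realizable as a direct grammatical
    relation (subject or object) rather than only as an oblique?"""
    return role in PROFILED[verb]

print(direct_ok("rob", "target"))    # 'rob the rich'              -> True
print(direct_ok("rob", "goods"))     # only oblique: '(of money)'  -> False
print(direct_ok("steal", "goods"))   # 'steal money'               -> True
print(direct_ok("steal", "target"))  # only oblique: '(from ...)'  -> False
```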

However, Goldberg's main concern is with constructions, for which she uses frame semantics in order to provide highly structured, rich meanings.

Construction Grammar: A Closely Related Framework


What are constructions? Here is Goldberg's definition: "A construction is defined to be a pairing of form with meaning/use such that some aspect of the form or some aspect of the meaning/use is not strictly predictable from the component parts or from other constructions already established to exist in the language" (1995, 68).
There is no doubt that constructions exist. Morphemes, for instance, satisfy Goldberg's definition. But do constructions different from morphemes exist? This is, of course, what defenders of construction grammar try to show. Here, we take the existence
of constructions other than morphemes simply for granted.
Consider the following examples:
(5) Margaret baked Peter some cookies.
(6) Martin sneezed the napkin off the nightstand.

The peculiarity of example (5) is due to the fact that the verb
bake, which normally has two arguments, is used with three
arguments here. Peculiar as this sentence is, we nevertheless can make sense of it: Margaret baked some cookies with the intention of giving them to Peter. Note that this interpretation helps us
to make sense of the recipient role, which is not provided by the
verb bake; that is, we think of this sentence as an instance of the
ditransitive construction of which a more standard example is
(7) John gave Mary a present.

The crucial claim of construction grammar is that this is not due to different basic meanings of the verb bake but due to the
integration of this verb plus its meaning into the ditransitive construction, which has a meaning of its own. Therefore, construction grammar distinguishes the semantics of argument structure
constructions from the semantics of the verbs that instantiate
them. An advantage of this approach is that it accounts for novel uses of verbs in specific constructions. In (6), the intransitive verb sneeze has to be integrated into the caused motion construction and, therefore, is forced to be interpreted as some kind of
action.
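The division of labor between verb meaning and construction meaning can be sketched as a toy composition. The role names and the fusion rule (the construction contributes any role the verb does not itself supply) are simplifications invented for illustration, not Goldberg's formal notation:

```python
# Toy sketch of verb/construction fusion in the spirit of Goldberg
# (1995). Role names and the fusion rule (the construction contributes
# any role the verb does not itself supply) are simplifications
# invented for illustration.

DITRANSITIVE_ROLES = ("agent", "recipient", "patient")  # "X causes Y to receive Z"

VERBS = {
    "give": {"agent": "giver", "recipient": "recipient", "patient": "gift"},
    "bake": {"agent": "baker", "patient": "baked goods"},  # no recipient role
}

def fuse(verb, construction_roles=DITRANSITIVE_ROLES):
    """Return the verb's own role fillers plus the construction roles
    the verb leaves unfilled (these the construction itself adds)."""
    supplied = VERBS[verb]
    contributed = [r for r in construction_roles if r not in supplied]
    return supplied, contributed

# 'bake' lacks a recipient, so the ditransitive construction adds one:
print(fuse("bake")[1])   # -> ['recipient']
print(fuse("give")[1])   # -> []
```

The point of the sketch is only that the recipient in (5) comes from the construction, not from the lexical entry of bake.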
Both verbs and constructions are associated with frame
semantic meanings. However, in contrast to the rich frame
semantic representations of verbs, the basic constructions are
associated with a more abstract semantics. These basic constructions and their frames are supposed to be independent of
a particular language. They are cross-cultural structures that are
deeply entrenched in human experience. This is the content of
Goldberg's scene encoding hypothesis.
Scene Encoding Hypothesis: Constructions that correspond to
basic sentence types encode as their central senses event types
that are basic to human experience. (Goldberg 1995)

Applications
Frame semantics has a wide range of applications, reaching from subfields of linguistic theorizing, such as morphology, to typology, linguistic discourse analysis, and language acquisition. However, the central and most successful application seems to be (computational) lexicography. In a
frame-based lexicon, the frame accounts for related senses of a
single word and its semantic relations to other words. A frame-based lexicon, therefore, offers more comprehensive information than the traditional lexicon. An example is Petruck (1986), a
study of the vocabulary of the body frame in modern Hebrew. An
example of computational lexicography is the FrameNet-System
(see Boas 2002).

Formalization
Although frame semantics does not lend itself easily to formalization, there is an early approach by J. M. Gawron (1983) in which basic insights of frame semantics were formalized by notations like those of the LISP programming language, in combination with situation semantics. A more recent approach is presented in van Lambalgen and Hamm (2005), in which scenarios (a concept closely related to the frame concept) are formalized as certain kinds of logic programs. An explicit formalization of the combination of frame semantics and construction grammar based on this work can be found in Andrade-Lotero (2006).
Fritz Hamm
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Aitchison, J. 1994. Words in the Mind. Malden, MA: Blackwell.
Andrade-Lotero, E. 2006. Meaning and Form in the Event Calculus. M.A. thesis, University of Amsterdam, MOL-200601.
Atkins, B. T. S. 1995. The role of the example in a frame semantics dictionary. In Essays in Semantics and Pragmatics, ed. M. Shibatani and S. Thompson, 25–42. Amsterdam: John Benjamins.
Boas, H. 2002. Bilingual FrameNet dictionaries for machine translation. In Proceedings of the Third International Conference on Language Resources and Evaluation. Vol. 4. Ed. M. G. Rodriguez and C. P. S. Araujo, 1364–71. Las Palmas de Gran Canaria, Spain: University of Las Palmas de Gran Canaria.
Fillmore, C. 1977a. The case for case reopened. In Syntax and Semantics 8: Grammatical Relations, ed. P. Cole, 59–81. New York: Academic Press.
———. 1977b. Scenes-and-frames semantics. In Linguistic Structure Processing, ed. A. Zampolli, 55–82. Amsterdam: North Holland.
———. 1977c. Topics in lexical semantics. In Current Issues in Linguistic Theory, ed. R. W. Cole, 76–138. Bloomington: Indiana University Press.
Fillmore, C., and C. Baker. 2000. FrameNet. Available online at: http://www.icsi.berkeley.edu/framenet.
Gawron, J. M. 1983. Lexical representation and the semantics of complementation. Ph.D. diss., University of California, Berkeley.
Goldberg, A. 1995. Constructions: A Construction Grammar Approach to Argument Structure. Chicago: University of Chicago Press.
Hamm, F., H. Kamp, and M. van Lambalgen. 2006. There is no opposition between formal and cognitive semantics. Theoretical Linguistics 32: 1–40.
Langacker, R. 1987. Foundations of Cognitive Grammar. Vol. 1, Theoretical Prerequisites. Stanford, CA: Stanford University Press.
Minsky, M. 1975. A framework for representing knowledge. In The Psychology of Computer Vision, ed. P. H. Winston, 211–77. New York: McGraw-Hill.
Petruck, M. 1986. Body part terminology in Hebrew. Ph.D. diss., University of California, Berkeley.
van Lambalgen, M., and F. Hamm. 2005. The Proper Treatment of Events. Malden, MA: Blackwell.
For more information, the interested reader is advised to consult the Frame Semantics Bibliography drawn up by Jean Mark Gawron.

FRAMING EFFECTS
A framing effect is usually said to occur when equivalent descriptions of a decision problem lead to systematically different
decisions. Framing has been a major topic of research in the psychology of judgment and decision making and is widely viewed
as carrying significant implications for the rationality debate
(e.g., Shafir and LeBoeuf 2002). Framing effects are commonly
taken as evidence for incoherence in human decision making
and for the empirical inapplicability of the rational actor models
used by economists and other social scientists. The first part of
this entry presents a brief review of the empirical phenomena;
the second part describes the standard normative interpretation
of these empirical effects. Although the literature has not typically focused on the structure of human conversational environments, framing effects involve utterances selected by a speaker
for a listener. A final section considers the possible implications
of communicative factors for a normative and descriptive understanding of framing effects.

Empirical Review
In this section, we follow I. P. Levin, S. L. Schneider, and G. J. Gaeth's
(1998) taxonomy of framing effects into three categories: attribute
framing, risky choice framing, and goal framing.
In attribute framing, a single attribute of a single object is
described in terms of either a positively valenced proportion
or an equivalent negatively valenced proportion. The subject
is then required to provide some evaluation of the object thus
described. The typical finding is a valence-consistent shift (Levin,
Schneider, and Gaeth 1998): Objects described in terms of a
positively valenced proportion are generally evaluated more
favorably than objects described in terms of the corresponding negatively valenced proportion. For example, in one study, beef described as "75% lean" was given higher ratings than beef described as "25% fat" (Levin and Gaeth 1988); similarly, research
and development (R&D) teams are allocated more funds when
their performance rates are framed in terms of successes rather
than failures (Duchon, Dunegan, and Barton 1989). The valence-consistent shift in attribute framing is a robust effect, observed in
a large range of experimental environments, with obvious implications for marketing and persuasion.
In risky choice framing, subjects are presented with two
options in a forced-choice task. The two options are typically
gambles that can be described in terms of proportions and probabilities of gains or losses. Usually, one of these options is a sure
thing (in which an intermediate outcome is specified as certain),
while the other is a risky gamble (in which extreme good and bad
values are both assigned non-zero probabilities). The gamble and
sure thing are both described either in terms of gain outcomes
and probabilities or in terms of equivalent loss outcomes and
probabilities. The two options are usually equated in expected
value (i.e., the mean outcome expected over many repeated trials), enabling the framing researcher to interpret observed patterns of preference in terms of subjects' risk attitudes. Within this
rubric, preferences for the sure thing indicate risk aversion, and
preferences for the gamble indicate risk seeking. The best-known
risky choice framing problem is the so-called Asian Disease
Problem (Tversky and Kahneman 1981). In it, subjects first read
the following background blurb:
Imagine that the U.S. is preparing for the outbreak of an unusual
Asian disease, which is expected to kill 600 people. Two possible
programs to combat the disease have been proposed. Assume
that the exact scientific estimates of the consequences of these
programs are as follows:

Some subjects are then presented with options A and B:


(A) If program A is adopted, 200 people will be saved.
(B) If program B is adopted, there is a one-third probability
that 600 people will be saved and a two-thirds probability that
no people will be saved.
Other subjects are presented with options C and D:
(C) If program C is adopted, 400 people will die.
(D) If program D is adopted, there is a one-third probability that nobody will die and a two-thirds probability that 600
people will die.
The robust experimental finding is that subjects tend to prefer
the sure thing when given options A and B but tend to prefer the
gamble when given options C and D. Note, however, that options
A and C are equivalent, as are options B and D. Subjects thus
appear to be risk averse for gains and risk seeking for losses, a
central tenet of prospect theory (Kahneman and Tversky 1979).
In prospect theory, it is the decision maker's private framing of
the problem in terms of gains or losses that determines his or
her evaluation of the options; the framing manipulation is thus
viewed as a public tool for influencing this private frame.
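That options A/C and B/D really describe the same prospects can be checked by direct computation. A quick sketch, using exact rational arithmetic to avoid floating-point noise:

```python
# Check that the gain and loss framings of the Asian Disease Problem
# describe identical prospects: A is the same option as C, and B the
# same as D, measured here by expected deaths out of the 600 at risk.
from fractions import Fraction

TOTAL = 600

def expected_deaths(outcomes):
    """outcomes: list of (probability, number of deaths) pairs."""
    return sum(p * d for p, d in outcomes)

opt_A = expected_deaths([(Fraction(1), TOTAL - 200)])                    # "200 saved"
opt_B = expected_deaths([(Fraction(1, 3), 0), (Fraction(2, 3), TOTAL)])  # gain gamble
opt_C = expected_deaths([(Fraction(1), 400)])                            # "400 die"
opt_D = expected_deaths([(Fraction(1, 3), 0), (Fraction(2, 3), TOTAL)])  # loss gamble

assert opt_A == opt_C == 400   # the sure things are identical
assert opt_B == opt_D == 400   # the gambles are identical, and equal in EV
```

Since all four options have the same expected value, any systematic preference reversal between the two presentations must be attributed to the framing itself.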
In goal framing, subjects are urged to engage in some activity (e.g., wearing seatbelts). This plea involves a description
of either the advantages of participating in the activity or the corresponding disadvantages of not participating. The most
common result is that subjects are more likely to engage in the
activity when the disadvantages of not engaging, rather than the
advantages of engaging, are emphasized (Levin, Schneider, and
Gaeth 1998).

Normative Analysis
Risky-choice framing effects have been put forward as positive
evidence for prospect theory (Kahneman and Tversky 1979),
a theory of choice that aims to be both formally tractable and
cognitively realistic. However, the focus in the framing literature has been largely on the negative evidence that framing
effects allegedly raise against classical expected utility theory
and other so-called rational actor models. The literature on
attribute framing, in particular, is concerned almost exclusively
with the normative and practical implications of the empirical effects. Framing effects, D. Kahneman has noted, are "less significant for their contribution to psychology than for their importance in the real world and for the challenge they raise to the foundations of a rational model of decision making" (2000, xv). This raises important questions: Are framing effects always counternormative? And if so, what norm or
norms do they violate?
In an important paper, A. Tversky and Kahneman argued that framing effects violate a bedrock normative condition of description invariance, "[a]n essential condition for a theory of choice that claims normative status" that is "so basic that it is tacitly assumed in the characterization of options rather than explicitly stated as a testable axiom" (1986, S253). Any theory of rational choice, they
argued, must stipulate that the same problem will be evaluated
in the same way, regardless of how the problem is described; thus, equivalent descriptions should lead to identical decisions.
Expected utility theory, for example, satisfies this principle: it
evaluates choice options strictly as a function of probability and
outcome, with no specification of probability-outcome framing. This reducibility of decision problems to a canonical form
is clearly a theoretical convenience; the principle of description
invariance states that it is also a normative requirement. Because
the framing phenomena observed both in the laboratory and in
real-world situations violate the description invariance principle, these effects are taken to imply that no theory of choice
can be both normatively adequate and descriptively accurate
(ibid., S251).

Framing, Communication, and Rational Norms


Although framing effects are investigated mainly in relation to
normative choice models, such effects are clearly bound up with
human language, and closely related phenomena have been
investigated by language scholars. For example, markedness
theorists have documented subtly different information conveyed by opposing polar adjectives. The school of cognitive
linguistics has drawn on a more general notion of frame in
its treatment of fundamental issues in semantics (see frame
semantics). Framing, in the broad sense, enters crucially into
many processes of communication and can only be fully
understood in the context of those processes.
Experimental framing effects involve utterances selected by
speakers for listeners, but the standard normative analysis,

described in the previous section, applies to listener effects without any consideration of associated speaker phenomena (i.e.,
regularities in how speakers choose frames in typical linguistic
environments). Researchers have tended to interpret the experimental effects as if the experimenter had somehow surgically
implanted a framing of the decision problem into the subject's
brain. However, because linguistic utterances are employed,
regularities in speaker behavior may be relevant to the normative and descriptive understanding of listener behavior: If speakers tend to choose different frames as a function of background
conditions, then listeners may reasonably draw inferences from
the speaker's choice of frame. If knowledge of these background conditions is relevant to the listener's choice, then the frames,
while logically equivalent, would not be information equivalent.
S. Sher and C. R. M. McKenzie (2006; cf. McKenzie and Nelson
2003) argued that the frames studied in the attribute framing
literature are commonly information nonequivalent, because
speakers tend to frame options in terms of attributes that are
relatively salient. For example, a generally impressive R&D team
is more likely to be described in terms of its success rate than a
generally incompetent team with the same success/failure rate.
A positive frame thus highlights the salience of the positive attribute in the speaker's conception of the option, information relevant to its evaluation.
Experiments convey information to subjects in framed statements, and researchers have generally assumed that the only
information content is logical information content. The framing
of the logical content is assumed not to convey information but
simply to influence the listener's construal of the logical content.
In this way, the usual normative analysis of framing experiments
leans on an implicit assumption of the information equivalence
of logically equivalent frames. However, while the logical equivalence of a pair of frames can usually be determined on inspection
(though see Jou, Shanteau, and Harris 1996), a determination of
information equivalence requires empirical study of the human
communicative environments in which speakers typically frame
objects and options. At least in the domain of attribute framing,
logical equivalence does not imply information equivalence.
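This argument can be illustrated with a toy Bayesian listener. All probabilities below are invented for illustration; the point shown is only qualitative: if speakers favor the positive frame when describing impressive teams, then the frame choice itself carries evidence about the team.

```python
# Toy model of "information leakage" from a speaker's frame choice.
# All probabilities are invented for illustration; the claim shown is
# only qualitative: hearing the positive ("success rate") frame raises
# the listener's belief that the team is impressive.

P_IMPRESSIVE = 0.5                       # listener's prior
P_POSITIVE_FRAME = {"impressive": 0.8,   # speakers tend to use the
                    "incompetent": 0.3}  # success frame for good teams

def posterior_impressive(frame_is_positive):
    """Listener's updated belief after the frame choice (Bayes' rule)."""
    like_imp = P_POSITIVE_FRAME["impressive"]
    like_inc = P_POSITIVE_FRAME["incompetent"]
    if not frame_is_positive:
        like_imp, like_inc = 1 - like_imp, 1 - like_inc
    joint_imp = like_imp * P_IMPRESSIVE
    joint_inc = like_inc * (1 - P_IMPRESSIVE)
    return joint_imp / (joint_imp + joint_inc)

print(round(posterior_impressive(True), 3))   # -> 0.727
print(round(posterior_impressive(False), 3))  # -> 0.222
```

Under these assumptions the two logically equivalent frames license different posterior beliefs, which is what it means for them to be information nonequivalent.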
Whether the study of communicative environments will have
similar implications for traditional normative conclusions drawn
in risky choice and goal framing is an open question. There also
remain important questions about how, and how flexibly, listeners use subtle information that is, in principle, available in particular framing experiments. However, an analysis of speaker
regularities in human communicative environments is likely to
be of some significance in any research area in which information presented to experimental subjects is evaluated against a
normative standard of information equivalence (cf. Hilton 1995;
McKenzie 2004; Sher and McKenzie 2008; Schwarz 1996).
Shlomi Sher and Craig R. M. McKenzie
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Duchon, D., K. J. Dunegan, and S. L. Barton. 1989. Framing the problem
and making decisions: The facts are not enough. IEEE Transactions on
Engineering Management (February): 257.
Hilton, D. J. 1995. The social context of reasoning: Conversational inference and rational judgment. Psychological Bulletin 118: 24871.

323

Frontal Lobe
Jou, J., J. Shanteau, and R. J. Harris. 1996. An information processing
view of framing effects: The role of causal schemas in decision making. Memory and Cognition 24: 115.
Kahneman, D. 2000. Preface. In Choices, Values, and Frames, ed.
D. Kahneman and A. Tversky, ixxvii. Cambridge: Cambridge
University Press.
Kahneman, D., and A. Tversky. 1979. Prospect theory: An analysis of
decision under risk. Econometrica 47: 26391.
Levin, I. P., and G. J. Gaeth. 1988. How consumers are affected by the
framing of attribute information before and after consuming the product. Journal of Consumer Research 15: 3748.
Levin, I. P., S. L. Schneider, and G. J. Gaeth. 1998. All frames are not
created equal: A typology and critical analysis of framing effects.
Organizational Behavior and Human Decision Processes 76: 14988. A
thorough taxonomy, review, and analysis of attribute, risky choice, and
goal framing effects.
McKenzie, C. R. M. 2004. Framing effects in inference tasks and
why they are normatively defensible. Memory and Cognition 32:
87485.
McKenzie, C. R. M., and J. D. Nelson. 2003. What a speakers choice of
frame reveals: Reference points, frame selection, and framing effects.
Psychonomic Bulletin and Review 10: 596602.
Schwarz, N. 1996. Cognition and Communication: Judgmental
Biases, Research Methods, and the Logic of Conversation. Mahwah,
NJ: Erlbaum. A summary of empirical evidence indicating that many
supposed shortcomings of human judgment are due to experimental subjects going beyond the literal meaning of the information
provided by the researcher and drawing on the pragmatic meaning.
Shafir, E., and R. A. LeBoeuf. 2002. Rationality. Annual Review of
Psychology 53: 491517.
Sher, S., and C. R. M. McKenzie. 2006. Information leakage from logically equivalent frames. Cognition 101: 46794.
. 2008. Framing effects and rationality. In The Probabilistic
Mind: Prospects for Bayesian Cognitive Science, ed. N. Chater and M.
Oaksford, 7996. Oxford: Oxford University Press. A summary of problems of information nonequivalence in the framing literature and of
the application of information equivalence to other areas of psychological research.
Tversky, A., and D. Kahneman. 1981. The framing of decisions and the
psychology of choice. Science 211: 453–8.
———. 1986. Rational choice and the framing of decisions. Journal of
Business 59: S251–78. An influential discussion of risky choice framing
effects and their implications for models of rational choice.

FRONTAL LOBE
The evolution of the frontal lobes in humans is strongly associated with the advent of language, the primary behavior distinguishing humans from other primate species. Although not
much is known about how the frontal lobes evolved, we do know
that there was a marked expansion 3.3 to 2.5 million years ago
in Australopithecus africanus, an early human ancestor. This
frontal lobe development, coupled with growth of the temporal
lobes also associated with language function, coincides with the
creation and use of tools. Tool use may have given rise to the
development of hand preference and eventual lateralization of
language functions. Verbal communication likely evolved from
hand signals to vocalizations, which progressively came to signify common referents (i.e., words) used in spoken language.
Thus, the genesis of a linguistic symbolic communication system influenced, and in turn was influenced by, brain growth and reorganization. The primary role of the frontal lobes has developed from behavior regulation through language, giving rise to
abstract thinking and the ability to plan and execute complicated
actions. It may very well be that the development of language
and the frontal lobes propelled humans from a hunting and
predatory species to a social species, capable of modifying and
manipulating their environments.

Neuroanatomical and Neurophysiological Overview


In humans, the frontal lobes are the largest lobes of the brain
and comprise about one-third of the cerebral cortex. As seen in
Figure 1, the frontal lobes lie anterior to the central (Rolandic)
sulcus and superior to the lateral (Sylvian) fissure. Anatomically,
there are several important landmark sulci and gyri in the frontal lobe. The precentral sulcus lies anterior and parallel to the central sulcus; between the two lies the vertically placed precentral gyrus.
The superior and inferior frontal sulci, extending from the precentral sulcus, separate the lateral surface of the frontal lobe into
three horizontally parallel gyri: the superior, middle, and inferior frontal gyrus. The inferior frontal gyrus is divided into three
parts: The pars orbitalis, the pars triangularis, and pars opercularis constitute brocas area, an important region for speech
and language.
Functionally, the frontal lobe can be divided into motor areas
and prefrontal cortex. The motor areas include the primary
motor cortex, the premotor cortex, the supplementary motor
cortex, frontal eye field, and Brocas area. The primary motor
cortex (Brodmann's area, BA 4), also called the motor strip, is
located anterior to the central sulcus, in the adjacent portion of
the precentral gyrus. The primary motor cortex executes skilled,
voluntary movement on the contralateral (opposite) side of the
body. That is, the right hemisphere controls the left side of the
body and the left hemisphere controls the right side. The entire
human body is represented in the primary motor cortex, and
these representations are arranged somatotopically, such that
each area of the body is related to a specific area on the motor
cortex (see Figure 2). This somatotopic mapping, also known as
the homunculus (Latin for "little man"), is mapped according to
the amount of brain matter devoted to each particular body part,
rather than by the proportional body-part size. Thus, there is a
relatively larger portion of the motor cortex dedicated to laryngeal, tongue, and finger areas that involve fine and complex
movements, and a relatively smaller portion of the motor cortex
dedicated to the trunk and limbs that have relatively gross and
simpler movements.
The premotor cortex lies anterior and parallel to the primary
motor cortex. The premotor cortex stores motor schemata (elementary motor acts) and selects movements to be executed for
externally cued movements. Together with the prefrontal cortex, the premotor cortex prepares, initiates, selects, and learns
movement patterns. For language, thus, the premotor cortex
determines how humans are going to move their articulators
to generate speech. There is also a supplementary motor area
located in the mesial prolongation of the premotor area. This
supplementary area helps to execute spontaneous or self-initiated movements, and damage to this area causes reduced spontaneous speech and apraxia of speech.


Figure 1. Frontal Lobe (left hemisphere).


Source: Adapted from Gray, Henry. Anatomy of the
Human Body. Edited by Warren H. Lewis. Philadelphia: Lea
& Febiger, 1918. New York: Bartleby.com, 2000.

Figure 2. The motor homunculus.


Source: Penfield, W., and T. Rasmussen. The Cerebral Cortex of
Man. New York: Macmillan, 1950.

Broca's area consists of the foot of the third frontal convolution (the inferior frontal gyrus), just in front of the motor strip.
Traditionally thought to be responsible for the production of language, it has also been implicated in comprehension of syntax.
As with any of the perisylvian regions involved in language, it
appears to contribute to lexical access as well.
The second major portion of the frontal lobe, in addition to
the motor areas just discussed, is the prefrontal cortex, located
in the foremost part of the frontal lobe, in front of the motor
area. The ratio of the prefrontal cortex to the rest of the cortex is
larger in humans than in any other species. The prefrontal cortex
is divided into three major regions: dorsolateral, orbitofrontal,
and medial prefrontal cortex. Although there is some controversy, currently many researchers associate different behavioral
changes and cognitive disturbances with each of these three
prefrontal regions. Dorsolateral prefrontal cortex is suggested to
have a role in working memory, attention, planning, and problem solving. Lesions in this area cause an inability to plan correctly and difficulty with multistep tasks. The orbitofrontal cortex
is associated with emotion and the initiation of behavior. Lesions
of this area lead to disinhibition, socially inappropriate behavior, and change of affect. Finally, the medial/cingulate prefrontal cortex is related to motivation and drive. Lesions in this area
result in apathy, loss of interest in one's life, and reduced spontaneous speech and movement.
The prefrontal cortex is extensively connected to the motor,
perceptual, and limbic systems of the brain. It sends and receives
great amounts of information and controls cognitive processes
so that appropriate behaviors are carried out at the correct time
and place. It takes part in higher aspects of motor control and
various cognitive functions, such as inhibition, problem solving, abstract thinking, emotional control, and appropriate social
behavior. Neuropsychologists consider the prefrontal cortex
particularly responsible for executive function, by which they
mean the ability to plan, monitor, and make inferences, as well
as related self-governing behaviors. The old notion of a computer with a central processor that determines how the rest of
the systems computational abilities are to be used is sometimes
employed when the role that executive function plays vis-à-vis
other cognitive abilities is discussed. When distinction is made
between automatic and controlled processes, it is the controlled
processes that the prefrontal cortex controls. Cognitive flexibility
is also considered a role played by the frontal lobes. Working memory (the ability to keep a limited amount of information available for analysis) is either assumed to be a part of executive function or to be closely linked to it.
Language behaviors that involve executive function are the less
automatic ones, such as appreciating humor, making inferences
from nonexplicit phrasing, appreciating the pragmatic aspects of
communication, monitoring one's speech for speech errors and
correcting them, monitoring one's interlocutors to rephrase or
shift register if they appear not to be comprehending, selecting
the right language or mix of languages to speak in bilingual and
polyglot situations, avoiding culturally taboo words or topics that
would be inappropriate in a given context, and reanalyzing input
when garden-pathed or on other occasions when it appears not
to make sense. One language task that appears to put substantial
burden on working memory is simultaneous interpretation.


Neurological Underpinnings of Our Knowledge of Frontal Lobe Responsibilities
Lesions of the frontal lobe have resulted in a number of speech
and language deficits, as well as motor, memory, behavioral,
emotional and cognitive problems, that affect speech and language functioning. The main speech and language disorders
include dysarthria, apraxia of speech, mutism, and aphasia.
Damage to the motor areas in the frontal lobe causes weakness
or paralysis of the face and articulators, such as the larynx, pharynx, palate, tongue, lips, and jaw. Dysarthria is a speech production disorder due to impaired control of these articulatory
systems, and the patients may show abnormal acoustic and
phonetic patterns, hypophonic speech (decreased vocal volume), deviant intonation, and prosody disturbances in their
speech. Apraxia of speech, a disorder of motor planning and
programming of speech without muscle weakness or paralysis, is
characterized by an inability to execute appropriate movement
for articulation of speech. Apraxic patients' phonological
errors are inconsistent, and articulatory accuracy is better
for automatic speech than volitional speech. Also, patients with
frontal lobe damage may lack voluntary initiation of conversation and evidence marked reduction in language production.
In some cases, frontal lobe damage results in mutism, that is,
preserved comprehension but no verbal or gestural intent to
communicate with others. Patients with mutism show a lack of
frustration about their breakdown of expression, unlike patients
with Broca's aphasia.
Broca's aphasia is traditionally associated with a lesion to
Broca's area in the left frontal lobe, and characterized by relatively intact comprehension but effortful, short sentence production with morphosyntactic errors and few functors: This form of
speech production is often called telegraphic speech. However,
contemporary brain imaging studies suggest that more extensive
left frontal lobe lesions are required to cause persistent Broca's
aphasia. Also, transcortical motor aphasia, with good repetition and comprehension but impaired speech abilities, can also
result from damage of the frontal lobe.
Patients with frontal lobe damage in whom Broca's area is
spared generally do not produce grammatical errors. They do,
however, usually show a deficit in verbal fluency. Generally, verbal fluency is examined by so-called phonemic tasks (e.g., "list words beginning with the letter s") or semantic category naming tasks (e.g., "list animals"). Also, perseveration (uncontrollable repetition
of a particular word or phrase) is a frequently observed symptom
in patients with frontal lobe lesions.
Although reading and writing disorders (alexia and agraphia,
respectively) are more often associated with posterior brain
lesions, frontal lobe lesions can also manifest them. Frontal lobe
alexia, a reading disorder, involves difficulty in decoding words
by the grapheme-to-phoneme method. Frontal lobe agraphia, a
writing disorder, results in spelling errors, including inappropriate letter repetition and erroneous word selection.
Behavioral and emotional problems due to frontal lobe damage may affect language and communication function as well. The close association of personality changes and
prefrontal cortex damage is best evidenced by the famous case
of Phineas Gage. Gage was a railroad construction worker who
suffered a penetrating wound to his frontal cortex. Although he survived the serious accident, he was no longer himself. Before
the accident, Gage had been known as a hard-working, capable,
and sociable man, but his personality changed radically after the
accident. He uttered profanities, becoming impatient, obstinate,
fitful, and antisocial. Indeed, in Gage's case and similar ones,
difficulty in socializing follows frontal lobe brain damage. Thus,
while no overt language problems regarding the form of language, such as grammar, or regarding language content, such as
semantic meaning, were reported in these patients, their use of
pragmatic language skills was impaired.
In sum, the frontal lobes play a crucial role in various speech
and language processing tasks, including initiation of articulatory movement, phonetic/phonological selection, morphosyntactic production, syntactic manipulation, and integration of
pragmatic function. Depending on the location and depth of the
frontal lobe lesion, a mixture of cognitive and behavioral symptoms can appear with the disorders of these speech and language
functions as well.
JungMoon Hyun, Elizabeth Ijalba, Teresa M. Signorelli,
Peggy S. Conner, and Loraine K. Obler
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Anderson, Steven W., Hanna Damasio, Daniel Tranel, and Antonio
R. Damasio. 2000. Long-term sequelae of prefrontal cortex damage acquired in early childhood. Developmental Neuropsychology
18: 281–96.
Baddeley, Alan. 2003. Working memory and language: An overview.
Journal of Communication Disorders 36: 189–208.
Damasio, Hanna, Daniel Tranel, Thomas Grabowski, Ralph Adolphs, and
Antonio R. Damasio. 2004. Neural systems behind word and concept
retrieval. Cognition 92: 179–229.
Miller, Bruce L., and Jeffrey L. Cummings. 2006. The Human Frontal
Lobes: Functions and Disorders. New York: Guilford.
Penfield, W., and T. Rasmussen. 1950. The Cerebral Cortex of Man: A
Clinical Study of Localization of Function. New York: Macmillan.

FUNCTIONAL LINGUISTICS
Functional linguistics, broadly defined, includes a wide range of
diverse approaches that highlight the interdependence of language structure and language function. In this view, structural
features of languages have evolved and continue to develop as
a result of competing cognitive, communicative, ecological, and
social pressures. (See also grammaticalization.) Functional
research treats language neither as a purely formal object nor as a
closed autonomous system; rather, it considers language form to
be tightly integrated with both the uses and users of language (see
also usage-based theory).
Within the context of current theories of linguistics, many
functional approaches differ sharply from so-called formal theories, such as government and binding or minimalism. While
lines of demarcation between formal and functional approaches
are, of course, not always clear-cut, formal theories tend to begin
with the premise that grammar is innate, to assume the autonomy of syntax, to hold to the modularity of language from
other cognitive faculties, and to impose a distinction between
competence and performance. Functionally oriented linguistics, on the other hand, tends to question the validity of these

assumptions for a variety of empirical reasons (see papers in


Elman et al. 1996; MacWhinney 1999, and references therein).
Skepticism toward these tenets of formal linguistics has consequences for the kinds of data and research methodologies
employed in functional approaches, many of which eschew
introspective grammaticality judgments and focus instead
on more inductive methodologies, such as evidence from corpora of naturally occurring language, data gleaned from field
elicitation, experimentation, statistical correlations, and ethnographic observation (see corpus linguistics).
Widely known functional theories of grammar include systemic functional linguistics (Halliday 1994), the theory of functional grammar (Dik 1997), and the functional approach to
syntax explicated by Talmy Givón (2001), as well as role and
reference grammar. For the purposes of this entry, a functionally oriented approach is any approach that holds that
structural aspects of language (phonology, morphology,
syntax, discourse organization [see discourse analysis
(linguistic)], etc.) are motivated and constrained by functional concerns. These can be broadly divided into at least four
overlapping, closely allied factors: 1) the role of communication and discourse, 2) the centrality of meaning, 3) human
cognitive, neurological, and physiological capacities, and 4) the
social and interactional nature of human beings (see papers in
Tomasello 1998 and 2002 for a thorough overview.) To arrive at a
full understanding of the functional constraints and motivations
that act on the form of language, it is necessary to consider the
joint contribution of all of these areas.
Functional research in the early twentieth century focused
primarily on the way in which communicative pressures organize sentence structure. For example, Prague School linguists
noted regular patterns of the placement of the theme (information known by the recipient) and the rheme (information new to
the recipient). In pragmatically unmarked sentences in English,
for example, known information tends to come early in a sentence as its subject, while new information comes later, and
both information statuses tend to co-occur with certain types of
prosody. The key insight of this approach is that the functional
need to organize and differentiate known information from new
information leads to prosodic and structural consequences for
the ways in which sentences are organized an illustration of
function affecting form. Prague School work directly influenced
the approach known as functional sentence perspective (Firbas
1992), the functional theory of André Martinet (1962), and was a
precursor to the cognitively based information flow theory fully
articulated by Wallace L. Chafe (1994; see also information
structure in discourse.)
As work in this area began to develop and mature, research
expanded beyond the boundaries of simple communicative
pressure to encompass wider and more complex discourse and
pragmatic contexts. Such work has become widely known as
discourse-functional linguistics. Current discourse-functional
work seeks to describe the ways in which particular grammatical constructions are used in discourse, to propose explanations
for the way those forms may have come into being through language use, and to explicate the consequences of this research in
shaping the linguist's understanding of the nature of grammar.
For example, Sandra A. Thompson and Anthony Mulac (1991)

describe the conditions for the occurrence and nonoccurrence
of the complementizer that in a corpus of English conversation
(e.g., in sentences like "You can tell that it's going to rain" versus
"I think it's going to rain"). They find the distribution of that to
be probabilistically determined by functional factors, such as
topic management and strength of epistemic commitment. The
explanation of these findings rests in the observed blurring of
the distinction between main and subordinate clauses in these
contexts, whereby the so-called main clauses in constructions
without that are actually best analyzed as epistemic adverbial
phrases (see also Thompson 2002). These findings suggest that
the traditional binary distinction between main and subordinate clauses is not categorical, as it emerges out of specific communicative situations in natural discourse. (See Cumming and
Ono 1997 for a thorough explication of the discourse-functional
approach, including further examples and references; see also
emergent structure.)
In addition to communicative and discourse factors, functional research of the last several decades has also focused on the
role of semantics in shaping grammar (Bolinger 1977 inter alia;
Halliday 1994). In these views, meaning is central in motivating
form, and, therefore, particular grammatical constructions have
the forms they do because of the particular meanings that they
convey. Two noteworthy paradigms emerging from this tradition include cognitive linguistics, which views language
form as motivated by meaning and conceptualization, and the
cross-linguistic approach known as typological-functional linguistics, which explores the ways in which similar meanings are
expressed in the grammars of languages around the world (see
cognitive grammar and typology.)
Other strands of functional research investigate the ways in
which neurological, cognitive, and physiological faculties of the
human user motivate and constrain grammar and phonology.
For example, Sydney M. Lamb (1999) outlines a connectionist,
relational network model of language based in what is known
about neurons and connections among neurons and the neurological processes of activation and inhibition. Functional theories
of phonology likewise take seriously the conceptual, neurological, and physiological aspects of language users, for example,
articulatory phonology (Browman and Goldstein 1986), which
views the abstract phonological system as constrained by the
physical system of articulatory phonetics. More recently,
Joan L. Bybee (2001) has proposed an exemplar-based theory
of phonology, which likewise treats phonology as grounded in
the concrete physiology of phonetics and considers speech as
a neural-motor activity subject to the same processes observed
in other complex motor activities such as routinization, simplification, and entrenchment based on frequency. (See also Bybee
2006 for an extension of this approach to grammar in general.)
Recent functional research has also begun to concentrate on
ways in which the inherent social and interactional nature of language users likewise motivates and constrains language form. One
socially oriented approach has become known as interactional
linguistics, incorporating insights from sociology especially the
subfields of ethnomethodology and conversation analysis.
Interactional linguistics views language as a form of highly systematic social action. Interactional research seeks to describe the
prosodic and syntactic forms that regularly accomplish specific social actions and, conversely, to understand in what ways social
activities, such as turn-taking, conversational repair,
sequential organization, assessments, and preference structure,
are directly implicated in shaping syntax and prosody. (For an
overview, several concrete examples, and further references, see
Ford, Fox, and Thompson 2002).
Robert Englebretson
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Bolinger, Dwight L. 1977. Meaning and Form. New York: Longman.
Browman, Catherine P., and Louis Goldstein. 1986. Towards an articulatory phonology. Phonology Yearbook 3: 219–52.
Bybee, Joan L. 2001. Phonology and Language Use. New York: Cambridge
University Press.
———. 2006. From usage to grammar: The mind's response to repetition. Language 82.4: 711–33.
Chafe, Wallace L. 1994. Discourse, Consciousness, and Time: The Flow
and Displacement of Conscious Experience in Speaking and Writing.
Chicago: University of Chicago Press.
Cumming, Susanna, and Tsuyoshi Ono. 1997. Discourse and grammar.
In Discourse Studies: A Multidisciplinary Introduction, Vol. 1. Ed. Teun
A. van Dijk, 112–37. London: Sage.
Dik, Simon C. 1997. The Theory of Functional Grammar. 2 vols. 2d rev. ed.
Ed. Kees Hengeveld. New York: Mouton de Gruyter.
Elman, Jeffrey, Elizabeth Bates, Mark Johnson, Annette Karmiloff-Smith, Domenico Parisi, and Kim Plunkett, eds. 1996. Rethinking
Innateness: A Connectionist Perspective on Development. Cambridge,
MA: MIT Press.
Firbas, Jan. 1992. Functional Sentence Perspective in Written and Spoken
Communication. New York: Cambridge University Press.
Ford, Cecelia E., Barbara A. Fox, and Sandra A. Thompson. 2002. Social
interaction and grammar. In The New Psychology of Language. Vol. 2.
Ed. Michael Tomasello, 119–43. Mahwah, NJ: Lawrence Erlbaum.
Givón, Talmy. 2001. Syntax: An Introduction. 2 vols. Rev. ed.
Amsterdam: John Benjamins.
Halliday, M. A. K. 1994. An Introduction to Functional Grammar. 2d ed.
London: E. Arnold.
Lamb, Sydney M. 1999. Pathways of the Brain: The Neurocognitive Basis of
Language. Amsterdam: John Benjamins.
MacWhinney, Brian, ed. 1999. The Emergence of Language. Mahwah,
NJ: Lawrence Erlbaum.
Martinet, André. 1962. A Functional View of Language. Oxford: Clarendon
Press.
Thompson, Sandra A. 2002. Object complements and conversation: Towards a realistic account. Studies in Language 26.1: 125–64.
Thompson, Sandra A., and Anthony Mulac. 1991. The discourse conditions for the use of the complementizer that in conversational English.
Journal of Pragmatics 15: 237–51.
Tomasello, Michael, ed. 1998, 2002. The New Psychology of
Language: Cognitive and Functional Approaches to Language Structure.
2 vols. Mahwah, NJ: Lawrence Erlbaum.

G
GAMES AND LANGUAGE
Rules and Language Games
One traditional view in philosophy and linguistics is that without rules of usage common to the speaker and the listener, communication would be impossible. According to this view, every
linguistic expression has a meaning, which is determined by
the rules for its correct use. This obviously brings language and
games together, for it is in the latter that rules are explicitly given.
Here are two examples of language-games, which illustrate
in a very simple and ideal way how a communication language
could emerge out of language games. They are from the Finnish
logician Erik Stenius, who thought that they are typical examples
of the view of language advocated by Ludwig Wittgenstein in his
later period.
The Garden Game is played by a gardener A and his assistant
B. There are pieces in the game, the letters a, b, c, P, and Q and
a flower bed divided into squares as in the following figure. In
every square there is a plant.
[Figure: a flower bed divided into squares, labeled 1st day, 2d day, and 3rd day.]

The game amounts to this: Every day B writes on a piece of


paper the letters a, b, c, and to the left of any of these letters he
writes either the letter P or the letter Q, according to whether the
plant in the square corresponding for that day to the lowercase
letter is in flower or not. For instance, if in the rectangle for that
day the plant next to the path is in flower, whereas the two others
are not, B will write:

Pa Qb Qc

The teaching of the game is done by simple gestures of approval and disapproval, depending on whether B writes the correct tokens on the piece of paper.

Once the assistant masters the Garden Game, A and B move to play the Report Game. A does not need to accompany B to the flower bed any longer. A now takes part in the game only by receiving the written tokens from B. If B really follows the rules of the game, A can read off certain facts from what B has written.

It is obvious that by means of the Report Game, A and B have created a small language for communication: a, b, and c are used to denote certain squares, and so on. They get a meaning.

Stenius's language games had more of a philosophical purpose, namely, to give concrete examples of Wittgensteinian language-games. They inspired the philosopher David Lewis, who formulated them in a more precise way. In doing so, Lewis thought to respond to a challenge launched by another philosopher and logician, W. V. O. Quine, who regarded conventional views of language with distrust and doubted that one can give a coherent account of how communication takes place without already presupposing some degree of linguistic competence. Lewis formulated signaling games, that is, communication games played by two players, the sender and the receiver, the former sending messages or signals about the situation he or she is in, and the latter undertaking a certain action after receiving it. The point to be emphasized is that the messages do not have a prior meaning: Whatever meaning they are going to acquire will be the result of the interactive situation in the signaling game; in other terms, they will be optimal solutions in the game. Let us have a closer look at the game-theoretical setting.

Strategic Games and Nash Equilibria

One of the most legendary games in classical game theory is the Prisoner's dilemma: Two criminals, 1 and 2, are interrogated in separate cells. If they both confess, each of them will spend three years in prison. If only one of them confesses, he will go free while the other will be punished with four years in prison. If neither of them confesses, each will stay one year in prison. The following figure depicts the choices and payoffs of the players (payoffs are the negatives of the prison terms, so that higher payoffs are better):

            D          C
   D    (-1,-1)    (-4, 0)
   C    ( 0,-4)    (-3,-3)

D stands for "don't confess" and C stands for "confess."

As we see, a complete description of a strategic game with two players requires a list of the players' action repertoires A1 and A2, and a specification of their utility functions u1 and u2. u1 specifies, for each possible sequence of choices (a,b) (action profile), player 1's payoff. In our example, we have: u1(D,D) = -1, u1(D,C) = -4, u2(D,C) = 0, and so on.

Many of the games studied in classical game theory are strictly competitive (zero-sum games): The players' payoffs are strictly opposed; that is, for each action profile (a,b), u1(a,b) + u2(a,b) = 0. Matching pennies is a typical example of such a game: Each player chooses either Head or Tail. Player 2 pays player 1 one euro if the two choices match; otherwise, player 1 pays player 2 one euro:

            H          T
   H    ( 1,-1)    (-1, 1)
   T    (-1, 1)    ( 1,-1)

Given a strategic game, we are interested in optimal plays of the game, that is, in every player's action being the best response to the actions of his opponents. For simplicity, we consider the Prisoner's dilemma game, but the points we make are general. Consider an arbitrary action profile (a,b). It is a Nash equilibrium in the strategic game if none of the players would have been better off by making a different choice:

u1(a,b) ≥ u1(c,b), for any c in A1
u2(a,b) ≥ u2(a,d), for any d in A2.

Thus, in the Prisoner's dilemma game, (C,C) is a Nash equilibrium, but there is no Nash equilibrium in the Matching pennies game. There are games that have two Nash equilibria, like the following one, where both (L,L) and (R,R) are Nash equilibria:

            L          R
   L    ( 1, 1)    (-1,-1)
   R    (-1,-1)    ( 1, 1)

In this case, the problem is to devise a criterion for selecting between them. One such criterion is Pareto optimality:

The action profile (a,b) is Pareto more optimal than (a',b') if u1(a,b) > u1(a',b') and u2(a,b) > u2(a',b').
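The definitions just given can be checked mechanically. The following sketch (an illustration added here, not part of the entry) tests every action profile of a two-player game against the two best-response inequalities; the payoff tables encode the Prisoner's dilemma (with payoffs taken as negated prison terms, so higher is better), Matching pennies, and the two-equilibria coordination game.

```python
# A minimal sketch: brute-force search for pure-strategy Nash equilibria
# in a two-player strategic game given as payoff dictionaries.
from itertools import product

def pure_nash_equilibria(actions1, actions2, u1, u2):
    """Return all profiles (a, b) with u1(a,b) >= u1(c,b) for every c
    and u2(a,b) >= u2(a,d) for every d, i.e., no profitable deviation."""
    equilibria = []
    for a, b in product(actions1, actions2):
        best1 = all(u1[(a, b)] >= u1[(c, b)] for c in actions1)
        best2 = all(u2[(a, b)] >= u2[(a, d)] for d in actions2)
        if best1 and best2:
            equilibria.append((a, b))
    return equilibria

# Prisoner's dilemma: payoffs are negated prison terms.
u1_pd = {('D', 'D'): -1, ('D', 'C'): -4, ('C', 'D'): 0, ('C', 'C'): -3}
u2_pd = {(a, b): u1_pd[(b, a)] for (a, b) in u1_pd}  # symmetric game
print(pure_nash_equilibria(['D', 'C'], ['D', 'C'], u1_pd, u2_pd))

# Matching pennies: zero-sum, so u2 = -u1; no pure equilibrium exists.
u1_mp = {('H', 'H'): 1, ('H', 'T'): -1, ('T', 'H'): -1, ('T', 'T'): 1}
u2_mp = {k: -v for k, v in u1_mp.items()}
print(pure_nash_equilibria(['H', 'T'], ['H', 'T'], u1_mp, u2_mp))

# Coordination game: two pure equilibria, (L,L) and (R,R).
u1_co = {('L', 'L'): 1, ('L', 'R'): -1, ('R', 'L'): -1, ('R', 'R'): 1}
u2_co = dict(u1_co)
print(pure_nash_equilibria(['L', 'R'], ['L', 'R'], u1_co, u2_co))
```

Running the search confirms the claims in the text: the Prisoner's dilemma has the single equilibrium (C,C), Matching pennies has none in pure strategies, and the coordination game has both (L,L) and (R,R).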


Lewis's Signaling Games


Lewis modeled signaling games using the game-theoretical
apparatus just described. The game models an ideal communicative situation, which involves a sender and a receiver, the former
sending messages that the latter tries to interpret. More precisely,
whenever he or she is in a state t (one of the many from the set
T of possible states), the sender selects a message or form f from
a set F that he or she sends to the receiver. The receivers task is
one of interpretation; that is, whenever a message f is received,
he or she will associate it with a state in T. The signaling games
are cooperative games; that is, both players try to achieve a common goal, communication. For this reason, whenever in state t
the sender sends f and the receiver interprets f as t, both players
receive an equally rewarding payoff. As in Stenius's Report
Game, the receiver does not have full knowledge about the state
the sender is in. The situation becomes even more complicated
when, for instance, the sender may use more than one form in
the same state. The optimal situation is the one in which communication is achieved: The receiver associates with each message f the state t in which the sender was when he or she sent f.
The game-theoretical analysis is meant to show that the optimal
situation can be obtained as one of the solutions (Nash equilibria, Pareto optimality) to the game. We may, of course, wonder how the game can achieve this somehow miraculous result
without the receiver's having full knowledge of the situation the
sender is in. Well, in a way, the game cannot achieve it: In the
setting just described, the game will yield several solutions (Nash
equilibria); that is, there will be more than one way to pair messages with states. If we want to discriminate among them, more
information is needed. It may come in different layers.
One kind of additional information the players may have is a
prior probability distribution over the states in T: Some of them
are more probable than others. But even so, it is often the case
that there are several Nash equilibria in the game. Lewis would
say that in this case, one of those is chosen, the most salient one.
Anyhow, the existence of several equilibria shows, according to
him, the conventional character of the forms–meanings pairs.
If one does not like this form of conventionalism, then some
more discriminative information is needed. For instance, we may
assign costs to messages and assume that it is more rational for
the sender to send less expensive messages. It can be shown that
in some cases where these conditions are fulfilled, the combination of Nash equilibria and Pareto optimality leads to a unique
solution.
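A minimal sketch of this setup in Python may be helpful. The two-state, two-form game, the equiprobable states, and the success-counting common payoff are illustrative assumptions of ours; the point is only that the perfect pairings of states with forms ("signaling systems") emerge as Nash equilibria alongside uninformative pooling equilibria, which is exactly the multiplicity discussed above.

```python
from itertools import product

states = ["t1", "t2"]
forms = ["f1", "f2"]

# A sender strategy maps states to forms; a receiver strategy maps
# forms back to states. Enumerate all of them as tuples of choices.
sender_strats = list(product(forms, repeat=len(states)))    # s[i]: form sent in state i
receiver_strats = list(product(states, repeat=len(forms)))  # r[j]: state assigned to form j

def payoff(s, r):
    # Common payoff: expected communicative success over equiprobable states.
    hits = sum(1 for i, t in enumerate(states)
               if r[forms.index(s[i])] == t)
    return hits / len(states)

# Nash equilibria: neither player can raise the common payoff by
# deviating alone.
equilibria = [(s, r) for s in sender_strats for r in receiver_strats
              if all(payoff(s2, r) <= payoff(s, r) for s2 in sender_strats)
              and all(payoff(s, r2) <= payoff(s, r) for r2 in receiver_strats)]

# Among the equilibria are the two perfect "signaling systems" --
# the multiplicity that, for Lewis, underwrites conventionality.
perfect = [(s, r) for (s, r) in equilibria if payoff(s, r) == 1.0]
print(len(perfect))  # -> 2
```

Because the equilibria also include pooling pairs in which the sender uses one form for every state, extra information (prior probabilities, message costs) is needed to single out a unique solution, as the text goes on to explain.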

Signaling Games and Gricean Pragmatics


We saw that signaling games are useful for modeling communicative situations in which the players extract information from
linguistic messages according to some general principles of
rational behavior. We mentioned the case of the sender's being
forced to consider alternative expressions he or she could have
used, together with their costs, and so on. It was then assumed
that it is more rational for the sender to send less expensive
messages.
One finds similar features in the so-called Gricean
pragmatics (see conversational implicature). In this
case, one is not so much concerned with the question of how
expressions acquire their meanings, but rather with the distinction


between what is said and what is conveyed or implied. The former is more or less conventional, semantic meaning, while the latter
is something the speaker wants the hearer to understand from
what is said, although not explicitly stated. In a seminal paper,
Paul Grice tried to account for such pragmatic inferences by making use of maxims of conversation, like "Be relevant," "Always say
the truth," or "Be as informative as possible," and so on. Recently,
some attempts have been made to reduce and explicate these
maxims in terms of rational principles of communication, which
advise the speaker to say as much as he or she can to fulfill communicative goals, and to say no more than he or she must to fulfill those communicative goals. Gricean pragmatics found its way
recently into optimality theory, a linguistic theory that basically compares alternative syntactic inputs to one another and
selects as the optimal meaning the one associated with the syntactic form that expresses it in the most efficient way. It should come
as no surprise that the ranking and judging of representations and
meanings in optimality-theoretic interpretation has a structure
that resembles principles developed in strategic games.
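A sketch of how such a selection can work, using the strong version of bidirectional optimization with made-up preference costs (the "kill"/"cause to die" contrast is often used in this literature; the cost numbers and helper names are our assumptions):

```python
# Strong bidirectional optimization, sketched with illustrative
# preference costs (lower is better): the unmarked form pairs with
# the stereotypical meaning.
cost = {
    ("kill", "direct"): 1,          # short form, stereotypical meaning
    ("kill", "indirect"): 2,
    ("cause to die", "direct"): 2,
    ("cause to die", "indirect"): 3,
}
forms = ["kill", "cause to die"]
meanings = ["direct", "indirect"]

def optimal(form, meaning):
    # (form, meaning) is optimal iff no alternative form is cheaper for
    # this meaning and no alternative meaning is cheaper for this form.
    return (all(cost[(form, meaning)] <= cost[(f2, meaning)] for f2 in forms) and
            all(cost[(form, meaning)] <= cost[(form, m2)] for m2 in meanings))

print([(f, m) for f in forms for m in meanings if optimal(f, m)])
# -> [('kill', 'direct')]
```

Under the strong version only the unmarked pairing survives; the weak, recursive version discussed by Dekker and van Rooy also licenses the marked-form/marked-meaning pair.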

Hintikka's Semantical Games


Games are also used to characterize different notions of dependence in logic and language. In contrast to communication
games whose task it is to model how expressions of the language
acquire an interpretation, in semantical games associated with
natural or formal languages, it is presupposed that expressions
already have an interpretation. What we want, instead, is a way
to characterize the dependence (and independence) of certain
expressions of the language on other expressions in terms of the
interaction of the players in a semantical game. Here is a typical
example from the mathematical vernacular:
A function y = f(x) is continuous at x0 if, given a number ε however small, we can find δ such that |f(x) − f(x0)| < ε, given any x such that |x − x0| < δ.

In game-theoretical terms, "we can find" is represented by an existential player, ∃, and "given any" is represented by a universal player, ∀, both choosing individuals from the relevant universe of discourse. Thus, the property of the function f being continuous at x0 is characterized by a game in which the universal player chooses an individual ε from the universe, after which the existential player chooses an individual δ, and finally the universal player chooses an individual x. The game stops here. Unlike the strategic games, which are one-shot games, semantical games have a sequential element: There is a sequence of choices, with later choices depending on earlier ones, and so on. The crucial notion is no longer that of Nash equilibrium but that of winning strategy. In other words, the game-theoretical paradigm in this case is that of extensive games.

Extensive Games
It is customary to exhibit extensive games as a sequence
G = (N, H, P, (ui)i∈N)

where N is a collection of players, H is a set of histories, P is a function attaching to each nonmaximal history the player whose
turn it is to move, and ui is the utility function for player i, that is,
a function that associates with each maximal history in H a payoff


for player i. In other words, each maximal history represents a play of the game, at the end of which each of the players is given a payoff. Unlike communication or signaling games, semantical games are strictly competitive zero-sum games: For each maximal play, one of the players wins and the other loses.
The crucial notion is that of a strategy for a given player, a
method that gives the player the appropriate choice depending
on the elements chosen earlier in the game. Such a strategy is
codified by a mathematical function g that takes as arguments
the partial histories (a0, …, an−1) in H where it is the player's turn
to move, and gives him or her an appropriate choice g(a0, …, an−1);
g is a winning strategy for the player in question if it guarantees a
win in every maximal play in which he or she uses it.
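Whether a player has a winning strategy can be decided by backward induction over histories. The tiny game below (two moves per turn, alternating players, a parity win condition, depth three) is a toy example of our own, not one from the article:

```python
# A minimal extensive win/lose game: histories are tuples of moves,
# player() says whose turn it is, and each maximal history is a win
# for exactly one player. The win rule here is purely illustrative.

MOVES = [0, 1]
DEPTH = 3  # E moves at even-length histories, A at odd-length ones

def player(h):
    return "E" if len(h) % 2 == 0 else "A"

def wins_E(h):
    # Toy win condition at a maximal history: E wins iff the moves
    # sum to an even number.
    return sum(h) % 2 == 0

def has_winning_strategy(h=()):
    """True iff player E can force a win from history h."""
    if len(h) == DEPTH:
        return wins_E(h)
    if player(h) == "E":
        # E needs only one good continuation ...
        return any(has_winning_strategy(h + (m,)) for m in MOVES)
    # ... while A's moves must all be survivable.
    return all(has_winning_strategy(h + (m,)) for m in MOVES)

print(has_winning_strategy())  # -> True
```

Here E wins from the empty history because E moves last and can always restore the parity, whatever A has played; that is exactly a strategy guaranteeing a win in every maximal play.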

Semantical Games, Quantifiers, and Anaphora


Our informal description of the game associated with the definition of a continuous function should be sufficient to convey the idea that the game in question can be rephrased as an extensive game. A maximal play of the game is any sequence (ε, δ, x), with ε and x chosen by the universal player, and δ by the existential player. As for utilities, if the chosen elements stand in the appropriate relations, that is, if whenever |x − x0| < δ we also have |f(x) − f(x0)| < ε, then we declare the play to be a win for ∃ and a loss for ∀. Otherwise, it is a win for ∀ and a loss for ∃. But now any winning strategy of ∃ has to be a function g whose arguments are all the individuals chosen by ∀ earlier in the game. In other words, the logical priority of the quantified expression "given any ε" over "we can find a δ" is captured by the strategy g of the existential player, which is defined over any element ε chosen by ∀. And the fact that this strategy is a winning one amounts to any of the sequences of elements (ε, g(ε), x) satisfying the appropriate conditions; that is: If |x − x0| < g(ε),
then |f(x) − f(x0)| < ε.
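The existential player's winning strategy can be exhibited concretely for one particular function. In this sketch, f(x) = 2x at x0 = 0 and the strategy g(eps) = eps/2 are our illustrative choices, and the finite sampling only spot-checks the defining condition; it is not a proof:

```python
def f(x):
    return 2 * x

x0 = 0.0

def g(eps):
    # Existential player's strategy: answer the universal player's
    # epsilon with delta = eps / 2. Winning for f(x) = 2x, since
    # |f(x) - f(x0)| = 2|x - x0| < 2 * (eps / 2) = eps.
    return eps / 2

# Spot-check the winning condition on sampled plays (eps, g(eps), x):
# whenever |x - x0| < g(eps), we must have |f(x) - f(x0)| < eps.
for eps in [1.0, 0.1, 0.01]:
    delta = g(eps)
    for k in range(-9, 10):
        x = x0 + k * delta / 10  # points with |x - x0| < delta
        assert abs(f(x) - f(x0)) < eps
print("strategy g survives all sampled plays")
```

The strategy g here is precisely the Skolem function that witnesses the dependence of the "we can find" quantifier on the "given any" quantifier.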
Another dependency phenomenon modeled by semantical
games is pronominal anaphora, as in the following sentence:
1. A woman is sitting on a bench. She smiles.

We witness here a phenomenon of semantical dependence: Before an expression ("She") gets an interpretation,
another expression that is its head ("A woman") must get an
interpretation. The semantical games in this case are completely
analogous to the quantifier game (in fact, the game involves
quantifiers). The rules of the game will contain not only choices
prompted by quantified expressions but also choices prompted
by the anaphoric pronoun: "She" prompts a move by the existential player, who must now choose the same individual chosen
earlier as a possible value of the indefinite "A woman."
Gabriel Sandu
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Dekker, Paul, and Robert van Rooy. 2000. Bi-directional optimality theory: An application of game theory. Journal of Semantics 17: 217–42.
Grice, Paul. 1975. Logic and conversation. In Syntax and Semantics 3:
Speech Acts, ed. P. Cole and J. L. Morgan. New York: Academic Press.
Hintikka, Jaakko, and Gabriel Sandu. 1991. On the Methodology of
Linguistics. Oxford: Basil Blackwell.
Lewis, David. 1969. Convention. Cambridge: Harvard University Press.
Stenius, Erik. 1967. Mood and language-game. Synthese 17: 254–74.

Van Rooy, Robert. 2002. Optimality-theoretic and game-theoretic approaches to implicatures. In Stanford Encyclopedia of Philosophy. Available online at: http://plato.stanford.edu.

GENDER AND LANGUAGE


The term gender in this discussion refers to the social condition of
being a woman or a man as distinct from sex, biological femaleness or maleness. Sex may be relevant in areas of inquiry where biological mechanisms are at issue (e.g., the organization of language
in the brain), but in most research, the issue is the social differentiation of men and women. Gendered linguistic behavior arises
not because men and women are innately different but because of
the way the difference is made significant in the local organization
of social life. The forms and precise social significance of gender
can vary considerably across cultures and through time.
Gender in this sense is also distinct from the use of the term
gender to denote a grammatical category. The relationship
between linguistic and social gender across languages has been
studied extensively (e.g., Hellinger and Bussman 2001–3), but for
reasons of space, this body of work will not be considered here;
the focus will instead be on research investigating patterns of
language use linked to the gender of the user.
The relationship of gender to language has been of interest to
scholars for a variety of reasons. On the one hand, the social fact
of gender differentiation (apparently universal in human cultures) influences processes such as language variation, change,
and shift and is, therefore, relevant for linguists' understanding
of those phenomena. On the other hand, language use is part
of the process whereby gender is produced and reproduced as
a social fact. This makes language of interest to scholars whose
main interest is in the social organization of gender relations,
language and identity, or inequality.
The modern field of language and gender studies emerged in
the 1970s when the advent of second-wave feminism prompted
sympathetic researchers in language-related disciplines to undertake a systematic examination of the language used by and about
women. Adopting a broadly feminist political standpoint and a
modern sociolinguistic perspective, these researchers reacted
against the assumptions pervading previous discussions, which
had stereotyped women language users as simultaneously
exotic and inferior. Questions about sex differences in language
were reframed as questions about social identity, difference, culture, and power.
In the early phase of the fields development (roughly 1975 to
1990), researchers worked largely within a framework of interest in identifying and explaining differences between men and
women. This work continued the earlier tradition of treating
women's linguistic behavior as marked with respect to men's,
but the questions that researchers asked were different and so
were their motivations. Some gender difference studies were
animated by a desire to establish the (in)validity of sexist folk-linguistic stereotypes like "women talk incessantly" or "women
can't tell jokes." Other scholars were interested in exploring how
differences between men's and women's ways of speaking might
arise from the social reality of male dominance (Lakoff 1975).
This "dominance" current sought to raise consciousness about
the fact that in language as elsewhere, women were relegated to



second-class citizenship by the way they were socialized to speak
and write, the way they were judged as speakers and writers, and
the way they were conventionally represented in speech and
writing. An alternative, "cultural difference" current placed more
emphasis on the idea that women and men (and, importantly,
girls and boys) grew up in different social worlds in which they
learned different rules for verbal communication (Maltz and
Borker 1982).
By the end of the 1980s, however, many researchers were turning away from the gender difference paradigm and abandoning
what was increasingly seen as an unproductive quest for global
generalizations. The more that empirical findings accumulated,
the more apparent it became that women and men could not usefully be treated as internally undifferentiated populations. It was
forcefully argued (notably by Eckert 1990) that since intragroup
differences were as significant as intergroup ones, and since the
variable of gender did not exist in isolation but always interacted
with other social variables, such as class, race/ethnicity, and age,
general statements to the effect that "women do X and men do Y"
were unenlightening, if not meaningless.
The traditional focus on binary gender difference began
to yield to an approach that was more concerned with gender
diversity; in other words, with the use of linguistic variability as
a resource for producing a range of gendered styles in different
communities or contexts. Researchers followed the injunction to
"look locally" (Eckert and McConnell-Ginet 1992) at the forms
that gender identities and relations take in specific communities
of practice (CoPs), on the grounds that gender-linked patterns
of language use will emerge from the localized social practices
in which women and men are engaged. This led, among other
things, to a wave of empirical research conducted with more
socially and linguistically diverse groups of subjects, and dealing with masculinity as well as femininity (Johnson and Meinhof
1997).
The shift was also theoretical in nature, as language and gender scholars were influenced by the more general critique of gender essentialism (belief in masculinity and femininity as fixed and
invariant essences). Some adopted the performative account of
gender put forward by Judith Butler (1990) (see also sexuality
and language), while others took up alternative theoretical
approaches exemplifying the shift from essentialism to a more
radical social constructionism. One influential theoretical contribution was made by linguistic anthropologist Elinor Ochs
(1992), who used the concept of indexicality (see indexicals) to
give an account of the relationship between language and gender that could accommodate empirical observations about its
locally variable and context-embedded nature. Pointing out that
few features of languages directly and exclusively index gender,
she suggested that masculinity and femininity were most often
indexed indirectly by the use of linguistic features whose primary
meaning related to particular roles or qualities (e.g., motherhood or modesty) but which had come to connote masculinity or femininity by association.
The implication of this line of argument is that we should not
expect direct and unmediated correlations between speakers'
gender and features of their language use. The correlations are
typically indirect, the results of a process whereby features with
other primary meanings are differentially appropriated and/or


avoided by women and men. Some of the reasons for this differentiation have to do with the influence of gender norms and stereotypes. Girls may be instructed by parents and teachers that the
use of features that index modesty or deference is appropriate for
them, whereas boys may be ridiculed for using those same features; speakers in each gender group may develop an investment
in using the features to the extent that they also have an investment in being judged as gender appropriate. The approach, however, allows for the possibility that not everyone does develop
such an investment: There have been various studies of groups
whose behavior appears to be shaped by a conscious refusal of
gender appropriateness (e.g., Abe 2004; Bucholtz 1999; Okamoto
1995). There are also speakers (e.g., some transgendered or
transsexual individuals) whose investment in using features that
will index their adopted gender identity is such that they produce
extreme gender stereotypes (Kulick 1999).
Other reasons for using or avoiding features that indirectly
index gender, though, have more to do with the demands of the
activities in which speakers are engaged. Bonnie McElhinny
(1995) reports that women police officers in Pittsburgh adopt a
relatively affectless style of interaction regarded by some observers as defeminizing, but that they are quite clear that they are not
trying to talk like men; they are trying to talk like police officers.
They also believe that the style of talk required is not simply a
contingent norm reflecting the historical domination of policing
by men but is intrinsically demanded by the nature of the work.
Although some styles do have gendered connotations (and histories) that can pose problems for individuals whose gender is
stereotypically incongruent with them, it is clear that the way
men and women behave in different contexts has as much to do
with the nature of those contexts as with gender per se.
It is also clear that gender itself is not always and everywhere
indexed in similar ways because there is cross-cultural and historical variation in the social roles allotted to men and women
and the qualities ascribed to them (linguistic markers of which
become secondary indices of gender). The Japanese association
of femininity with delicacy may seem natural to Westerners,
too, but the association is not made by, for instance, the villagers of Gapun in Papua New Guinea, who characterize women's
language, like women themselves, as blunt, direct, and aggressive (Kulick 1993). Nor can it be automatically assumed that such
associations, ideologically powerful though they may be, determine the actual behavior of most speakers. For instance, recent
research analyzing the speech of working-class and rural women
in different parts of Japan points to the practical irrelevance for
many Japanese women of the idealized normative construct
"women's language" (Okamoto and Shibamoto Smith 2004).
Whereas early language and gender researchers often took
issue with prefeminist generalizations about men's and women's
language use, more recent researchers influenced by the shifts
just outlined have revisited many of the generalizations made
during the 1970s and 1980s. The classic claims of variationist
sociolinguistics about gender (that women are generally closer
than men to prestige norms and lead in change from above
because of their greater status consciousness, but are otherwise
conservative) have been substantially revised (e.g., Labov 2001).
While the new variationist orthodoxy is still a gender generalization (that women tend to lead in both change from above and



change from below), it does not permit stereotypical explanations in terms of roles or psychological dispositions shared by all
women but, on the contrary, requires an account to be given of
women's nonuniform sociolinguistic behavior. There continues
to be debate on the claim that women are by and large more
polite speakers than men (see politeness) by reason of their
subordinate social positioning (Lakoff 1975; Brown 1980), with
some researchers suggesting that this generalization still has
value (e.g., Holmes 1995), while others are more skeptical (e.g.,
Mills 2003).
There is also debate on the theoretical assumptions of the
new paradigm itself: It can be asked whether the emphasis on
"looking locally," stressing the diversity and variability of gendered behavior, is resulting in a reluctance to "think globally,"
which risks throwing the feminist baby out with the essentialist
bathwater. For some commentators, caution about treating gender as an overarching social category or making generalizations
about it is problematic, in that it obscures or downplays inequalities that, although they may not be global in the sense of universal and exceptionless, are not localized to just one community
of practice. There is also concern that some current approaches
overemphasize the agency of subjects in constructing gendered
personae while downplaying the structural and institutional factors that in reality constrain their performances. This concern is
addressed in recent research dealing with womens use of language in the workplace and in other public domains a traditional feminist research topic that is now being revisited from
newer theoretical perspectives (e.g., Baxter 2006; Holmes 2006;
Walsh 2001).
Penelope Eckert (2000) is among those researchers who
believe that looking locally can and should be combined with
thinking globally about gender. Eckert carried out research in
a suburban high school near Detroit, where identity and social
practice were organized around the contrast between "jocks"
(who embrace mainstream definitions of school success, e.g.,
participating actively in both academic and extracurricular pursuits) and "burnouts" (who reject the school's values and resist
active participation in its official culture). Affiliation in these
groups was marked linguistically as well as in other ways: Jocks
made more use of conservative vowel pronunciations, whereas
burnouts used more innovative urban variants. In both groups,
however, it was girls who were more advanced in the use of the
variants that indexed group membership. Eckert suggests that
they were symbolically claiming status as "good jocks" or "good
burnouts" and that this reflected the differing terms on which
the two sexes participated in their CoPs. Males gained status by
displaying ability (e.g., in sports or fighting), but females' status
was more dependent on appearance and personal style: They
were obliged to work harder to assert in-group status by means
of symbolic details like the styling of their jeans and the pronunciation of their vowels.
Eckert argues that the pressures to which these high school
girls were responding are not confined to adolescent subcultures. Women, as the subordinate gender, may perceive their status and legitimacy to be in question in all kinds of CoPs; making
a symbolic display of in-group credentials (pointedly presenting oneself as, say, a "real" lawyer/athlete/truck driver, or using
resources such as language that are accessible to women as well
as men) is one way to deal with this marginal social positioning. If so, it is evident that inequality, rather than just difference,
shapes the relationship of language use to gender.
A recent external development to which researchers are now
beginning to respond is the rise of scientific paradigms such
as evolutionary psychology, which, in addition to being
generally critical of feminist social constructionism, have made
specific claims (some of them empirically ill-founded; see,
e.g., Hyde 2005) about male–female linguistic differences as
the hardwired products of millennia of natural selection. The
recent resurgence of biological essentialism in both academic
and popular culture constitutes a challenge, both intellectual
and political, that language and gender researchers, in my view,
should not ignore. Yet while in the future there may well be more
discussion of the relationship between sex and gender, I think
it is unlikely that researchers will abandon the commitment to
(some variant of) the social constructionism that has proved so
productive in recent years.
Deborah Cameron
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Abe, Hideko. 2004. Lesbian bar talk in Shinjuku, Japan. In Okamoto
and Shibamoto Smith 2004, 205–21.
Baxter, Judith, ed. 2006. Speaking Out: The Female Voice in Public
Contexts. Basingstoke, UK: Palgrave.
Brown, Penelope. 1980. How and why are women more polite? In
Women and Language in Literature and Society, ed. Sally McConnell-Ginet, Ruth Borker, and Nelly Furman, 111–49. New York: Praeger.
Bucholtz, Mary. 1999. Why be normal? Language and identity practices
in a community of nerd girls. Language in Society 28: 203–23.
Butler, Judith. 1990. Gender Trouble: Feminism and the Subversion of
Identity. New York: Routledge.
Cameron, Deborah. 2006. On Language and Sexual Politics.
London: Routledge.
Eckert, Penelope. 1990. The whole woman: Sex and gender differences
in variation. Language Variation and Change 1: 245–68.
. 2000. Gender and sociolinguistic variation. In Language and
Gender: A Reader, ed. Jennifer Coates, 64–75. Oxford: Blackwell.
Eckert, Penelope, and Sally McConnell-Ginet. 1992. Think practically
and look locally: Language and gender as community-based practice.
Annual Review of Anthropology 21: 461–90.
. 2003. Language and Gender. Cambridge: Cambridge University
Press.
Hellinger, Marlis, and Hadumod Bussman, eds. 2001–3. Gender Across
Languages: The Linguistic Representation of Women and Men. 3 vols.
Amsterdam: John Benjamins.
Holmes, Janet. 1995. Women, Men and Politeness. London: Longman.
. 2006. Gendered Talk at Work. Malden, MA: Blackwell.
Holmes, Janet, and Miriam Meyerhoff, eds. 2003. The Handbook of
Language and Gender. Malden, MA: Blackwell.
Hyde, Janet Shibley. 2005. The gender similarities hypothesis. American
Psychologist 60: 581–92.
Johnson, Sally, and Ulrike H. Meinhof, eds. 1997. Language and
Masculinity. Oxford: Blackwell.
Kulick, Don. 1993. Speaking as a woman: Structure and gender in domestic arguments in a Papua New Guinea village. Cultural Anthropology
8: 510–41.
. 1999. Transgender and language. GLQ 5: 605–22.
Labov, William. 2001. Principles of Linguistic Change. Vol. 2. Social
Factors. Oxford: Blackwell.

Lakoff, Robin. 1975. Language and Woman's Place. New York: Harper
and Row.
Maltz, Daniel, and Ruth Borker. 1982. A cultural approach to male–female miscommunication. In Language and Social Identity, ed. John J.
Gumperz, 196–216. Cambridge: Cambridge University Press.
McElhinny, Bonnie. 1995. Challenging hegemonic masculinities: Female and male police officers handling domestic violence.
In Gender Articulated, ed. Kira Hall and Mary Bucholtz, 217–43.
London: Routledge.
Mills, Sara. 2003. Gender and Politeness. Cambridge: Cambridge
University Press.
Ochs, Elinor. 1992. Indexing gender. In Rethinking Context: Language
as an Interactive Phenomenon, ed. Alessandro Duranti and Charles
Goodwin, 335–58. Cambridge: Cambridge University Press.
Okamoto, Shigeko. 1995. "Tasteless" Japanese: Less "feminine" speech
among young Japanese women. In Gender Articulated, ed. Kira Hall
and Mary Bucholtz, 297–325. London: Routledge.
Okamoto, Shigeko, and Janet Shibamoto Smith, eds. 2004. Japanese
Language, Gender and Ideology. New York: Oxford University
Press.
Walsh, Clare. 2001. Gender and Discourse: Language and Power in
Politics, the Church and Organizations. London: Longman.

GENDER MARKING
Almost all languages have some grammatical means of dividing up their noun lexicon into distinct classes, with devices
or markers occurring in surface structures under specifiable
conditions and providing information about the semantic
characteristics of the referent of the nominal head of the noun
phrase. Gender marking is one such device, typically found
in languages with a fusional or agglutinating profile; other
devices are frequently grouped under the term classifiers and
are typically found in isolating languages. The term gender is
used both for the particular classes of nouns (a language may
have two or more genders, or noun classes) and for the whole
grammatical feature (a language may or may not have the feature of gender).
There is always some semantic basis to gender classification,
though it may be supplemented with additional formal (phonological and morphological) criteria. The semantic
criteria include humanness, animacy, sex, shape, form, consistency, and functional properties. A minimal gender system
consists of two genders (e.g., French), and this is the most common system; it was found in 50 languages of a sample of 256
(Corbett 2005). Three-gender systems (e.g., Russian) appear to
be roughly half as common, and larger systems are increasingly
less common. The largest system found so far is Nigerian Fula
with around twenty genders (the exact count depending on the
dialect). However, 144 of the 256 languages had no gender
system.
Semantic distinctions between classes of nouns, even lexical
derivations (e.g., the English poet versus poetess), do not in themselves make genders. This is because it is taken as the definitional
characteristic of gender that some constituent outside the noun
itself must agree in gender with the noun. Thus, gender refers
to classes of nouns within a language that are reflected in the
behavior of the associated words (Hockett 1958, 231), and a language has a gender system only if we find different agreements

dependent on nouns of different classes, regardless of whether
or not the nouns themselves bear gender markers.
Agreement in gender with the head noun can be found in
other words in the noun phrase (adjective, determiner, demonstrative, numeral, etc., even focus particle), in the predicate of the
clause, an adverb, and arguably in an anaphoric pronoun
outside the clause boundary. For example, in Polish the feminine noun skarpeta sock (the controller) requires that many
other elements (targets) in the clause agree with it in gender: ta
jedna stara porwana skarpeta, ktra leaa na pododze this.f
one.f old.f torn.f sock(f) which.f lay.f on floor. Markers of
gender often do not mark gender alone but may be portmanteau
markers that combine information about gender with number,
person, case, or other features.
If antecedent–anaphor relations are accepted as agreement,
languages in which free pronouns present the only evidence
for gender (gender distinctions being absent from noun phrase
modifiers and from predicates) can be counted as having a
(pronominal) gender system. Such languages are rare; the best-known example is English, which is typologically unusual (see
typology) in this respect (Corbett 2005); another is Defaka
(Niger-Congo).
Anna Kibort
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Corbett, Greville G. 1991. Gender. Cambridge: Cambridge University
Press.
. 2005. Number of genders. In The World Atlas of Language
Structures, ed. Martin Haspelmath, Matthew S. Dryer, David Gil, and
Bernard Comrie, 126–9. Oxford: Oxford University Press.
Hockett, Charles F. 1958. A Course in Modern Linguistics. New York:
Macmillan.

GENERATIVE GRAMMAR
The approach to linguistics known as generative grammar
(GG) was initially introduced by Noam Chomsky in the 1950s,
and though it has developed continuously ever since, the core
assumptions have remained remarkably constant. For instance,
at the highest level, GG has always maintained that a grammar
effectively constitutes a set of formal rules that recursively enumerate all, and only, the well-formed (i.e., grammatical)
sentences of a language (see recursion, iteration, and
metarepresentation). As part of the process of constructing such a model of linguistic knowledge, GG research has
consistently attempted to describe how a speaker-hearer can
generate and comprehend an infinite number of grammatical
sentences despite encountering only a finite amount of primary
linguistic data (PLD) while learning any given language. In
order to account for this apparent conundrum, GG standardly
assumes that the language capacity is a genetic endowment (see
innateness and innatism) that is distinctive to Homo sapiens and which specifies those aspects of linguistic knowledge
that are genetically determined, as opposed to those that must
be acquired via contact with PLD. As a result of this emphasis
on language acquisition, GG is often closely associated with
biolinguistics.
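The idea of a finite rule set recursively enumerating an unbounded set of sentences can be illustrated with a toy rewrite grammar. The sketch below, in Python, is our own illustration (the rules, vocabulary, and depth bound are invented for the example, not drawn from Chomsky):

```python
# A toy rewrite grammar: expanding the start symbol S by the rules
# below recursively enumerates a set of strings from a finite rule
# set, with the PP rule supplying the recursion.
RULES = {
    "S": [["NP", "VP"]],
    "NP": [["the", "N"], ["the", "N", "PP"]],
    "PP": [["near", "NP"]],
    "N": [["dog"], ["park"]],
    "VP": [["barks"]],
}

def expand(symbols, depth):
    """Yield terminal strings derivable from `symbols` in at most
    `depth` rewrites per derivation."""
    if all(s not in RULES for s in symbols):
        yield " ".join(symbols)
        return
    if depth == 0:
        return
    # Rewrite the leftmost nonterminal in every possible way.
    i = next(j for j, s in enumerate(symbols) if s in RULES)
    for rhs in RULES[symbols[i]]:
        yield from expand(symbols[:i] + rhs + symbols[i + 1:], depth - 1)

sentences = sorted(set(expand(["S"], 8)))
print(sentences[:3])
# -> ['the dog barks', 'the dog near the dog barks', 'the dog near the park barks']
```

Raising the depth bound enlarges the enumerated set without end, which is the point: a finite grammar generates an infinite language, the conundrum that the poverty of primary linguistic data poses for acquisition.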

The following sections summarize the main stages in the
development of GG from the 1950s to the present.

The Early Years


During the 1930s and 1940s, prominent linguists in North
America sought to discover, in a systematic fashion, the grammatical rules that regulated the sentential structure of utterances
in any given corpus. For instance, in his Methods in Structural
Linguistics (1951), Zellig Harris identified distributional discovery procedures that could determine the structure of a given language, specifying rules that would enable (for instance) knife and
knives to be associated with a single underlying morphophonemic sequence. Such concerns came to typify the corpus-driven
structural linguistics of the 1940s.
While still a student, Chomsky became dissatisfied with
the decompositional methodology that Harris (and others)
were advocating, and he developed an alternative approach,
transformational generative grammar (TGG), that was
designed to overcome limitations in the work of his contemporaries. Although the rudiments of TGG were summarized in
Chomsky's celebrated 1957 publication Syntactic Structures,
this monograph merely provided a high-level overview of
various techniques and theoretical assumptions that had been
presented in earlier work, especially his then-unpublished
manuscript The Logical Structure of Linguistic Theory ([1955]
1975; henceforth LSLT).
The theory presented in LSLT assumes a hierarchy of analytical linguistic levels (e.g., the phonemic level, the morphemic
level). Consequently, smaller linguistic units (e.g., morphemes)
can be combined in a rule-driven manner to create larger linguistic units (e.g., words). A fully articulated grammar of this
kind would be able explicitly to produce all the grammatical sentences in a given language, and therefore the model is generative
rather than decompositional. Eventually, TGG came to be associated with a number of distinctive and influential theoretical
stances such as the following:
• Syntax can be analyzed independently of semantics; that is, sentences such as "Colorless green ideas sleep furiously" are grammatical though they are meaningless (Chomsky [1955] 1975, 57; 1957b, 15).
• Statistical techniques (such as finite state machines and stochastic grammars) cannot generate all and only the grammatical sentences in a given language, and therefore they cannot usefully be incorporated into comprehensive linguistic theories (Chomsky 1957a).
• Linguistic theories can be developed and presented in a rigorously axiomatic-deductive framework like that standardly used by mathematicians and logicians (Chomsky [1955] 1975, 83).
The detailed arguments that Chomsky developed in order to justify such beliefs caused many of his contemporaries to claim that
TGG was a more scientific linguistic theory than any of its predecessors, and this partly explains why it was eventually received
with such enthusiasm by linguists who were keen to establish
their discipline as a scientific enterprise (see Tomalin 2006).
Focusing specifically on the syntactic level of analysis, the standard TGG framework assumes that a set of phrase structure rules (e.g., S → NP VP) generates strings of symbols (the kernel sentences of the grammar), and that a set of transformational rules subsequently operates upon these strings, modifying them in order to derive further sentences (see transformational grammar).
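The sense in which such a rule set "generates" sentences can be sketched in a few lines of code. The toy grammar below is purely illustrative (the lexical entries are the author's invention, not drawn from the entry); it shows how rewrite rules of the form S → NP VP recursively enumerate a set of sentences:

```python
import itertools

# Hypothetical toy grammar for illustration only; nonterminals mirror
# the rewrite-rule format quoted in the entry (e.g., S -> NP VP).
RULES = {
    "S":   [["NP", "VP"]],
    "NP":  [["Det", "N"]],
    "VP":  [["V", "NP"]],
    "Det": [["the"]],
    "N":   [["linguist"], ["grammar"]],
    "V":   [["studies"]],
}

def generate(symbol):
    """Recursively enumerate all terminal strings derivable from `symbol`."""
    if symbol not in RULES:          # a terminal word
        return [[symbol]]
    results = []
    for expansion in RULES[symbol]:
        # Cartesian product of the yields of each symbol in the expansion
        sub = [generate(s) for s in expansion]
        for combo in itertools.product(*sub):
            results.append([word for part in combo for word in part])
    return results

sentences = [" ".join(words) for words in generate("S")]
# This toy grammar yields 2 x 2 = 4 sentences,
# e.g. "the linguist studies the grammar".
```

With a recursive rule such as NP → NP PP added, the same procedure would enumerate an infinite set, which is the formal point behind the claim that a finite grammar can generate infinitely many sentences.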

Development and Transition


In the early 1960s, the first generation of linguists who had
encountered GG as students came to maturity, and this group
included such influential figures as John R. Ross, Paul Postal,
James McCawley, and George Lakoff. However, Chomsky continued to guide the development of GG, and a revised version
was presented in Aspects of the Theory of Syntax (1965; henceforth ATS). While various techniques from 1950s-style TGG
had been retained, there were also conspicuous differences.
For instance, the topic of language acquisition was now explicitly addressed, and Chomsky suggested that the generation
of grammatical structures was determined partly by innate
knowledge of language, thereby stressing the connection
between linguistics and cognitive psychology (see Chomsky
1965). In order to clarify this idea, he distinguished between
competence (i.e., a speaker-hearer's knowledge of the formal aspects of language) and performance (i.e., a speaker-hearer's actual use of language in concrete situations), and he
suggested that the task of linguistic research was to provide a
description of the former. While elaborating this revised perspective, Chomsky contrasted descriptive and explanatory
adequacy (see descriptive, observational, and explanatory adequacy) and argued that a valid grammatical theory
must be both descriptively and explanatorily adequate. As a
result, a theory of universal grammar (UG) became possible. Specifically, since ATS encouraged linguists to explain
how an idealized speaker-hearer eventually achieves linguistic
competence while encountering only a finite amount of PLD,
researchers began to focus more on the task of identifying those
properties that are common to all known languages, rather than
merely producing isolated grammars for specific languages.
In the ATS framework, in addition to TGG-style phrase structure rules such as
S → NP Aux VP
VP → V NP
NP → Det N
NP → N
Det → the
Aux → M
Chomsky also included subcategorization rules that contained
explicit information about sublexical features (1965, 85):
N → [+N, ±Common]
[+Common] → [±Count]
[+Count] → [±Animate]
[−Common] → [±Animate]
[+Animate] → [±Human]
[−Count] → [±Abstract]
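Read as rewrite rules, each ± specification branches into two feature bundles. The sketch below is the author's illustration only (the list representation and rule encoding are assumptions, not Chomsky's notation); it expands the six rules into the full set of feature bundles for N:

```python
# Hypothetical encoding of the subcategorization rules above:
# each pair is (triggering feature, the two +/- choices it introduces).
RULES = [
    ("+Common",  ["+Count", "-Count"]),
    ("+Count",   ["+Animate", "-Animate"]),
    ("-Common",  ["+Animate", "-Animate"]),
    ("+Animate", ["+Human", "-Human"]),
    ("-Count",   ["+Abstract", "-Abstract"]),
]

def expand(bundle):
    """Apply the first applicable rule, branching on its +/- choices."""
    for trigger, choices in RULES:
        # fire a rule only if its trigger is present and neither choice is yet
        if trigger in bundle and not any(c in bundle for c in choices):
            results = []
            for choice in choices:
                results.extend(expand(bundle + [choice]))
            return results
    return [bundle]          # no rule applies: the bundle is fully specified

# N -> [+N, +/-Common] is the entry point, so expand both branches.
bundles = expand(["+N", "+Common"]) + expand(["+N", "-Common"])
# Yields 8 fully specified bundles, e.g. [+N, +Common, +Count, +Animate, +Human].
```

The eight resulting bundles correspond to noun classes such as count animate human nouns (boy) versus mass abstract nouns (sincerity), which is what the ATS feature system was designed to capture.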


Figure 1. [Phrase structure tree generated by the ATS rules; diagram not reproduced.]

Figure 2. [The GB model: LEXICON → D-STRUCTURE → (Move-α) → S-STRUCTURE → PHONETIC FORM / LOGICAL FORM.]

Figure 3. [The MP model: LEXICON → NARROW SYNTAX → PHONETIC FORM / LOGICAL FORM.]

Such rule sets enable structures such as those found in Figure 1 to be generated. Base-generated trees of this kind constituted deep structure representations, and the transformational rules operated on them to produce surface structure representations (see underlying structure and surface structure).

Principles and Parameters


By the late 1970s, GG had started to change once again, gradually emerging in the early 1980s as the modular government and binding (GB) theory. The GB framework associates UG with a finite set of principles that are common to all languages and a finite set of parameters whose settings vary from language to language; it therefore began to be referred to as the principles and parameters (P&P) approach. In the GB formalism, UG is understood to constitute a characterization of the child's pre-linguistic initial state, and the parameters are fixed as PLD are encountered, thus creating a stable-state grammar (Chomsky 1981, 7). Schematically,
S0 + PLD → Ss

where S0 is the initial state and Ss is the resulting stable state with fixed parameter settings. Also during this period, the term
E-language began to be used to refer to actual manifestations of
language in the external world, while I-LANGUAGE referred to
the ideal speaker-hearer's internal, tacit knowledge of language.
Although it developed out of previous GG research, the GB
model certainly introduced a new framework for linguistic
analysis. For instance, while deep structure (i.e., D-structure)
and surface structure (i.e., S-structure) were retained, a single
rule, Move-α (i.e., "move anything anywhere"), was used to generate S-structures from D-structures, rather than a set of specific
movement transformations. In addition, the GB phrase structure component used x-bar theory, which posited structural
similarities between different phrasal categories, such as noun
phrase (NP) and verb phrase (VP), and, crucially, Chomsky
(1986) later extended these structural insights to functional categories as well. Consequently, the basic GB framework can be
represented as in Figure 2.
During the early 1990s, the P&P approach was reformulated as the minimalist program (MP) (see minimalism),
and in an attempt to reduce the theory to its bare essentials,
some familiar GB elements (e.g., D-structure, S-structure,
X-bar theory) were rejected in favor of a simpler, more economical methodology. Accordingly, the basic schema for the
MP framework can be presented as in Figure 3. Specifically, in
the MP, an I-language generates expressions that pair instructions for the articulatory-perceptual (A-P) system interface with instructions for the conceptual-intentional (C-I) system interface. CHL (the computational component of UG) contains
a small set of operations (e.g., Select, Merge), which manipulate lexical items (LIs). LIs are defined in terms of irreducible
features, and (crudely) CHL combines LIs in various principled ways in order to create larger syntactic objects, with all
superfluous machinery (e.g., projections, labels) being omitted. A computation converges if it converges at the A-P and
C-I interface levels, and, crucially, the MP hypothesizes that
natural language constitutes an optimal solution to the various demands imposed by the external interfaces. In other
words, the MP seeks to determine just how perfect natural
language actually is (Chomsky 1995, 9).
Chomsky was initially rather vague about the nature of this
perfection. However, he has recently attempted to define
this notion with reference to the specific requirements that
are imposed upon an I-language by the A-P and C-I interfaces.
Accordingly, it is possible to identify various degrees of the essential minimalist thesis, with the most stringent being the strong
minimalist thesis (SMT). In essence, if the SMT is correct, then
there are no elements of S0 that cannot be accounted for in terms
of interface requirements and general (nonlinguistic) computational properties; therefore, there are no inherently unexplainable aspects of S0. If this hypothesis is shown to be true, then
natural language would be, in this sense, perfect.
Marcus Tomalin
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Chomsky, N. [1955] 1975. The Logical Structure of Linguistic Theory.
Cambridge, MA: MIT Press.
. 1957a. Review of Hockett's A Manual of Phonology. International
Journal of American Linguistics 23: 223–34.
. 1957b. Syntactic Structures. The Hague: Mouton.
. 1965. Aspects of the Theory of Syntax. Cambridge, MA: MIT Press.
. 1981. Lectures on Government and Binding. Dordrecht, the
Netherlands: Foris.
. 1986. Barriers: Linguistic Inquiry Monograph 13. Cambridge,
MA: MIT Press.
. 1995. The Minimalist Program. Cambridge, MA: MIT Press.
. 2000a. New Horizons in the Study of Language and Mind.
Cambridge: Cambridge University Press.
. 2000b. Minimalist inquiries: The framework. In Step by
Step: Essays on Minimalist Syntax in Honor of Howard Lasnik, ed.
D. Michaels, J. Uriagereka, and R. Martin, 89–115. Cambridge, MA:
MIT Press.
. 2004. Beyond explanatory adequacy. In Structures and
Beyond: The Cartography of Syntactic Structures. Vol. 3. Ed. A. Belletti,
104–31. Studies in Comparative Syntax. New York and Oxford: Oxford
University Press.
Harris, R. A. 1993. The Linguistic Wars. Oxford: Oxford University Press.
Harris, Z. S. 1951. Methods in Structural Linguistics. Chicago: University
of Chicago Press.
Jackendoff, R. 1977. X-Bar Syntax. Cambridge, MA: MIT Press.
Johnson, D. E., and S. Lappin. 1997. A critique of the minimalist program. Linguistics and Philosophy 20: 273–333.
. 1999. Local Constraints vs. Economy. Monographs in Linguistics
Series. Stanford, CA: CSLI.
Lees, R. 1957. Review of Syntactic Structures. Language 33: 375–408.
Matthews, P. H. 1993. Grammatical Theory in the United States from
Bloomfield to Chomsky. Cambridge: Cambridge University Press.

Newmeyer, F. J. 1986. Linguistic Theory in America. Orlando, FL: Academic Press.
Tomalin, M. 2006. Linguistics and the Formal Sciences.
Cambridge: Cambridge University Press.

GENERATIVE POETICS
Generative poetics comprises all theories that seek to explain the
production and reception of literary works by reference to a set
of rules or algorithmic procedures. It is closely related to cognitive poetics. However, generative poetics has tended to draw
inspiration from Chomskyan generative grammar. In contrast, many writers in cognitive poetics have aligned their analyses with cognitive linguistics.
Early work in generative poetics tended to track developments
in generative grammar. In Noam Chomskys usage, a generative
grammar produces all and only the grammatical sentences of
a language. For the early theorists of generative poetics, then,
a narrative grammar should produce all and only well-formed
narratives. Moreover, in both cases, it was commonly thought
that this goal is best accomplished through a transformational grammar.
Much of this early work was illuminating and important.
However, there are several problems with modeling, for example, narratology on linguistic theory. First, linguistic theories
may change rapidly. If one bases ones narrative theory on any
specific syntactic theory, ones narrative theory will probably be
outmoded by the time it is published. Second, there is no reason
to believe, a priori, that rules for the generation of stories should
directly parallel rules of syntax anyway. Finally, it is not clear that
there is any narrative counterpart to grammaticality. In other
words, the ambition of generating all and only well-formed
stories may be misguided. (The situation is, of course, different
for areas of poetics that are directly governed by linguistic rules.
Indeed, much of the most important work in generative poetics has been done in such areas; see meter, verse line, and
poetic form, universals of.)
The point about well-formedness is worth considering in further detail. There are, of course, speech actions that are clearly
not stories and speech actions that are. The difficulty is that
there is a gradient of more or less marginal cases, rather than a
strict division between stories and nonstories. In short, "story" is a prototype concept. Given the prototype nature of "story," it
is not clear just how the relevant data are organized, thus, just
what needs explaining. In other words, it is not clear precisely
what structures the generative rules should generate.
More exactly, the outputs of a generative system clearly
need to match the relevant data. At least two aspects of the
Chomskyan approach are generalizable here. First, the system
should not overgenerate. For example, a generative grammar
might produce all grammatical sentences. But it is still invalid if it
also produces "The the of at" as a grammatical sentence. Second,
the grammar cannot produce only sentences that have already
occurred. Such a grammar would be falsified as soon as someone
uttered a new sentence. In short, the data comprise all possible
speech actions of the relevant type (sentences, narratives) and
no impossible ones. This returns us to the issue of intermediate cases. However we determine the data, a generative system

should produce structures in keeping with the categorial organization of the data. If the data have a sharp well-formed/not well-formed division with few intermediate cases, the system should generate structures divided in this way. If the data involve a more gradual gradient from excluded to included cases, as we find with stories, then the system should generate structures organized in this way.
In addition to this difference in the data, there is a normative
component to our concept of stories that is largely absent from
our concept of sentences. We routinely refer to some stories as
better than others. However, we do not commonly think of some
sentences as better than others.
Although it is not usually referred to as generative poetics,
some recent work in cognitive science and literature does fall
into that category as defined here. This work avoids the problems of earlier approaches by developing theoretical principles
independently of particular grammatical theories and by recognizing the prototype nature of our ordinary language concept of
stories.
In order to understand this recent work, we need to consider
what constitutes a generative rule system for a speech action.
Most obviously, a generative system needs a productive component and a receptive component. Parts of these components
will be directly parallel. In other words, many rules of production and reception have to be systematically coordinated so that
when I say Could you ask John to pass the salt? my addressee
will understand the question in such a way that I will most often
end up getting the salt. On the other hand, some parts of these
components will necessarily be different. For example, not everything involved in producing an alliterative, rhyming, metered
verse line is also involved in reading that line.
Whether speaking of productive or receptive speech actions,
there are several ways in which we could organize the rules that
constitute a generative system. No matter how we do this, we are
likely to have processes (e.g., activation), structures in which processes operate (e.g., episodic memory), and elements on which
processes operate (e.g., memories of particular experiences that
we activate from episodic memory). Note that these divisions
need not be absolute. Processes could themselves be the elements on which other processes (meta-rules) operate. Moreover,
some elements may incorporate processes operating on other
elements. We see this in the case of scripts, such as the script
for eating at a restaurant. When these scripts are activated, they
guide our speech and behavior by integrating various preexisting
processes and other elements. For example, when we decide to
eat at a restaurant and follow our script for doing so, such processes as activating our prototype for a menu are involved so
that we can recognize menus and respond appropriately when
the server holds one out to us. Finally, different processes may
only operate at different derivation levels. A common case of this
sort would be the distinction between basic construction rules
and rules that adjust the outputs of the basic construction rules.
For example, certain rules of politeness might not affect our
initial production of a sentence, but may enter as adjustments
performed while we are speaking.
As the preceding analysis suggests, there are different levels
of commonality for the components and operations of any rule
system. The main levels would be as follows: 1) universal, 2)

338

specific to a group, 3) specific to an individual, and 4) specific to


a speech action. Generative grammar focuses almost entirely on
the first and second levels. Generative poetics, however, must be
concerned equally with the third and fourth levels. This is connected with the fact that verbal art is a normative category. We
care about instances of verbal art, and we care about the people
who produce works of verbal art. (We commonly care less about
individuals who interpret works of verbal art. In keeping with this
preference, writers in generative poetics and related areas have
tended to focus on production rather than reception. Ultimately,
however, a research program treating these issues will need to
address reception as well; see competence and performance, literary, for one influential approach to reception in
this context.)
Consider a simple rule system for producing narrative. Such
a system might have three types of rules: 1) basic plot construction rules, 2) development principles that serve to specify the
basic plot, and 3) evaluation rules, a form of adjustment rules
that operates in authorial revision. There are two levels of basic
plot construction rules. At the first level, a few simple processes
generate a story from a goal. These processes might involve constructing an agent who pursues the goal and the development
of some problem that prevents the achievement of the goal.
Repeated over enough instances, such stories themselves form
a second constructive level narrative prototypes. Narrative
prototypes do not define basic conditions for being a story. They
crystallize what counts as a good case of a story. These prototypes help us to account for the gradient between central and
marginal cases of stories. One may put the point simply by reference to acquisition. A child learns that Nice day! is not a story
at all; I went out to buy a loaf of bread is a very marginal sort of
story; I went out to buy a loaf of bread, but there was a fifteencar accident on the highway is more prototypical and so on,
up to the experiences of Bambi, which are highly prototypical.
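Purely as an illustration of how a three-part rule system of this kind might be organized, the following sketch reduces plot construction, development, and evaluation to toy functions. Every name, template, and exemplar in it is hypothetical, invented for the example rather than drawn from any theory in the entry:

```python
import random

def construct_plot(goal):
    """Level-one plot construction: an agent, a goal, a blocking problem."""
    return {"agent": "the protagonist", "goal": goal,
            "problem": random.choice(["a rival", "a storm", "an accident"])}

def develop(plot, exemplar):
    """Development principle: particularize the skeleton via an exemplar."""
    plot = dict(plot, agent=exemplar)
    return (f"{plot['agent']} set out to {plot['goal']}, "
            f"but {plot['problem']} stood in the way.")

def evaluate(draft):
    """Evaluation rule: a crude adjustment pass (here, just capitalization)."""
    return draft[0].upper() + draft[1:]

story = evaluate(develop(construct_plot("buy a loaf of bread"), "a weary baker"))
```

The point of the sketch is only structural: a goal is turned into an agent-plus-problem skeleton, a development principle particularizes it, and an evaluation rule adjusts the output, mirroring the three rule types just described.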
Some narrative prototypes are narrative universals.
Three in particular recur cross-culturally: heroic, romantic, and sacrificial tragi-comedy. These are specified and combined in
particular genres that vary from tradition to tradition.
The nature of prototypes is such that they are relatively abstract
and common or ordinary. In contrast, successful literary works
are both concrete and distinctive. Development principles serve
to particularize the prototypes. In many, perhaps most, cases,
development principles are universal or common to a group,
though their precise operation involves individual idiosyncrasy.
For example, in specifying characters, one common development principle is to draw on exemplars, instances of particular people, real or literary. While this principle seems likely to be
universal or near universal, its results will vary with the precise
exemplars employed. These exemplars will, in turn, vary culturally and individually. Thus, we find that different traditions commonly have a limited number of exemplary characters who serve
as important models to later writers. The principle is cross-cultural, but of course the exemplars themselves differ (e.g., Jesus in
the Christian tradition versus Rama in the Hindu tradition).
Evaluation rules include self-conscious processes of adjustment for a projected audience or readership. However, they
more importantly involve unself-conscious sensitivity to patterning, suggestiveness (see dhvani and rasa and suggestion


structure), and other complex features of the work. There are obviously differences in the nature of these rules and the way they operate in writing and oral composition.
A brief example from Shakespeare should help to clarify these
points. Like any other author, Shakespeare had a diverse set of
principles that constituted his generative poetic system. These
principles were multiple and stood in complex relations with one
another. The relations are probably best understood roughly in
terms of connectionism. They were linked to one another with
different degrees of strength. Different principles were activated
at different times and in different degrees, depending on what
Shakespeare was reading, experiencing in his personal life, and
so forth. These different activations would sometimes produce
very different cascades of activation within the system, leading
to different products.
Obviously, we can never have a sense of all these particulars.
However, the particulars are not without patterns. For example,
Shakespeare largely confined himself to the universal narrative
prototypes. Moreover, there are patterns to his development
of those prototypes: cultural patterns (relating, for example,
to character types of Renaissance English drama) and individual patterns, patterns that are more distinctive of Shakespeare
himself. One standard development principle, a function of the
maximization of relevance (see the essay "Elaborating Speech and Writing: Verbal Art" in this volume), is alignment, where
the author parallels different levels of the narrative world (e.g.,
presenting society and nature as in simultaneous turmoil).
Shakespeare sometimes intensifies such parallelism, extending it to three or four levels: for example, mental health and family relations, along with society and nature. The most famous case
of such alignment in Shakespeare is in King Lear, when Lear's madness is paralleled with the division of his family, strife in his
kingdom, and a terrible storm.
There are other, less common, principles employed by
Shakespeare as well. Some are localized. In several cases, for
example, he represents rebels as having suicidal thoughts, even
when their rebellion seems entirely justified. Others range more
broadly across a work. For instance, a number of Shakespeares
development principles serve to foster ambivalence toward various actions and characters, complicating our sense of who is right
and who is wrong in particular conflicts. Thus, we sympathize with
Hamlet, knowing that Claudius is a murderer. But Shakespeare
makes Hamlet, too, commit murder, if in somewhat more ambiguous circumstances. (See Hogan 2006 on this and other cases.)
Without access to Shakespeare's drafts and revisions, it is difficult to isolate his evaluation rules. However, we might guess
that they involved systematizing the execution of such development principles (e.g., enhancing ambivalence through such
means as the addition of humanizing speeches for otherwise villainous characters).
Generative poetics is certainly not the only way of theorizing literature. However, it is highly promising, when dissociated
from particular linguistic theories and tied instead to a broader
understanding of rule systems in a cognitive context. As such, it
has great potential for helping us to understand both the universal patterns of verbal art and the specificity of particular works.

Patrick Colm Hogan

WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Hogan, Patrick Colm. 2006. Narrative universals, heroic tragi-comedy, and Shakespeare's political ambivalence. College Literature 33.1: 34–66.
Pavel, Thomas G. 1985. The Poetics of Plot: The Case of English Renaissance Drama. Minneapolis: University of Minnesota Press.
Prince, Gerald. 1973. A Grammar of Stories. The Hague: Mouton.

GENERATIVE SEMANTICS

Generative semantics (GS) began as an orthodox development within the standard theory (ST) of the transformational-generative grammar (TGG) framework developed by Noam Chomsky and his collaborators in the late 1950s and early 1960s. Its most prominent voices include one of those collaborators, Paul Postal, two of Chomsky's students, James D. McCawley and Haj (John Robert) Ross, and others in the general circle, chiefly George Lakoff and Robin Tolmach Lakoff. While the developers, and many onlookers, appeared to regard GS as a natural continuation of ST, Chomsky clearly did not, and antagonisms soon arose. GS had considerably more adherents for several years, but within a decade the situation had reversed, with more linguists adopting Chomsky's extended standard theory, or EST. GS rapidly disintegrated, leaving many proposals of lasting interest.

GS was a highly streamlined TGG model, with only two representations, meaning and form: the first a syntactic structure composed of semantic primes and heavily influenced by symbolic logic; the second a phonetically completed syntactic structure. The two were linked by a set of transformations, governed by a set of derivational conditions called global rules. While the generative role of semantics was both originary and titular, it quickly became peripheral to GS, as proponents began tackling a wide range of issues and phenomena previously discounted or unnoticed within TGG. For instance, grammaticality had been defined exclusively in terms of a specific grammar, but GSers regarded it as a psychosocial notion, relative to language users and their contexts. Similarly, lexical categories had been assumed to be discrete, but GSers explored categorization in fuzzy and context-sensitive terms resonant with prototype theory, developments which have been influential in cognitive grammar (e.g., Ross 1973; Lakoff 1972). GSers also brought performative analyses into TGG, typically positing a deletable hypersentence carrying the illocutionary force, such as "I warn you," into which a locution like "Don't make a move" was embedded (Sadock 1969). This research helped bring pragmatics into formal linguistics.

The legacy of GS is extensive, albeit notoriously unacknowledged (Postal 1988). Frederick J. Newmeyer offers this partial but significant catalog of topics introduced to TGG by GS: capturing semantic regularities in a representation utilizing symbolic logic; the early exploration of phenomena that led to the development of mechanisms like indexing devices (see indexicals), traces, and filters; lexical decomposition; and several specific proposals, such as the nonexistence of extrinsic rule ordering, postcyclic lexical insertion, and treating anaphoric pronouns as bound variables (1986, 138; see anaphora).

Randy Allen Harris



WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Harris, Randy Allen. 1993. The Linguistics Wars. New York: Oxford
University Press.
Huck, Geoffrey J., and John A. Goldsmith. 1995. Ideology and
Linguistic Theory: Noam Chomsky and the Deep Structure Debates.
London: Routledge.
Lakoff, George. 1972. Hedges: A study in meaning criteria and the logic
of fuzzy concepts. Proceedings of the Chicago Linguistics Society
8: 183–228.
Newmeyer, Frederick J. 1986. Linguistic Theory in America. 2d ed. New
York: Academic Press.
Postal, Paul. 1988. Advances in linguistic rhetoric. Natural Language
and Linguistic Theory 6: 129–37.
Ross, J. R. 1973. Nouniness. In Three Dimensions of Linguistic Theory,
ed. O. Fujimura, 137–258. Tokyo: TEC.
Sadock, J. 1969. Hypersentences. Papers in Linguistics 1: 283–371.

GENERIC- AND SPECIFIC-LEVEL METAPHORS


The distinction between generic-level and specific-level metaphors was introduced in conceptual metaphor theory
(Lakoff 1993; Lakoff and Turner 1989). It identifies hierarchical relations between metaphorical concepts that are
hypothesized to be used in the understanding of figurative
language.
Examples of specific-level metaphors are LOVE IS A JOURNEY, PEOPLE ARE PLANTS, or DEATH IS A THIEF. Examples
of generic-level metaphors are EVENTS ARE ACTIONS and
GENERIC IS SPECIFIC. The motivation for introducing this distinction was the observation that some topics are commonly
talked about using a small group of metaphors that share important characteristics. George Lakoff and Mark Turner (1989), in an
analysis of poetic language, noticed that time was metaphorically
personified as a destroyer, a thief, or a devourer, as in Milton's line "time, the subtle thief of youth" (cited on p. 35). While such personifications might be common and easily understood, personifications of time as, say, a child or a shop clerk do
not occur in the analyzed poems. The reason for this, Lakoff and
Turner argued, is that metaphorical understanding in terms of
relatively specific ideas (such as thieves or shop clerks) is constrained by generic-level metaphors, in this case EVENTS ARE
ACTIONS. Generic-level metaphors ensure that topics are metaphorically conceptualized using vehicles that share relevant
generic structure with the topic. In the example from Milton, the
event of middle age involves, among other things, the absence
of a previously experienced youth. Actions involving thieves as
agents typically result in the absence of previously held possessions. Evidently, the two share generic structure, which licenses
the personification of time as a thief via the events are actions
metaphor. Typical activities involving shop clerks as agents, on
the other hand, might be more difficult to relate to the experience
of middle age: Such activities do not share generic structure with
the onset of middle age. This is why, theoretically, the EVENTS
ARE ACTIONS metaphor should prevent poets from attempting
the line "time, the subtle shop clerk of youth." The generic-level
metaphor EVENTS ARE ACTIONS constrains the range of viable
specific-level personification metaphors.
Generic-level metaphors differ from specific-level metaphors
not only in the generality of the topics and vehicles they apply

340

to but also in their internal structure. Specific-level metaphors


are held to involve a set of specific mappings between a source
domain and a target domain (see source and target). In the
metaphor LOVE IS A JOURNEY, abstracted from conventional
expressions such as "our relationship has hit a dead-end street," "look how far we've come," or "the relationship isn't going anywhere"
(Lakoff 1993, 206), the lovers correspond to the travelers, the distance traveled corresponds to the duration of the relationship,
and so on. However, generic-level metaphors do not involve
such a fixed set of correspondences. For example, the metaphor
EVENTS ARE ACTIONS does not specify the events that can be
metaphorically understood as actions or the actions that can be
used as metaphor vehicles, and, consequently, it does not specify any particular correspondences between such events and
actions.
The idea of generic-level metaphors was a significant turning point in the development of conceptual metaphor theory
because it marked a step away from the extrapolation of hypothesized conceptual metaphors from linguistic data toward the
postulation of more abstract schemas with loose relations to
observable linguistic metaphors (e.g., Lakoff and Johnson 1999).
Whereas in earlier work (Lakoff and Johnson 1980), conceptual
metaphors were treated as generalizations over metaphors
in communication on the part of the analyst and, presumably, the speaker, the direction of reasoning was reversed with
respect to generic-level metaphors: These were explicitly held
to constrain the particularities of metaphor in language and
communication.
The idea of generic-level metaphors has attracted some debate.
One contention has been that the relation between actions and
events (EVENTS ARE ACTIONS) or between a specific instance
and a general phenomenon (GENERIC IS SPECIFIC) is not itself
metaphoric (Jackendoff and Aaron 1991; Stern 2000). Also, it
remains an open question whether abstract schemas such as
generic-level metaphors appropriately model metaphor understanding (e.g., Murphy 1996) or whether they are better regarded
as descriptive tools for the analyst. Some current cognitive-linguistic approaches to metaphor are more concerned with the
development of metaphoric ideas in discourse communities (e.g.,
Musolff 2004).
The influence of the idea of generic-level metaphors can be
seen in two developments. The notion of primary metaphors
(e.g., Grady and Johnson 2003) takes up the idea that metaphors
in verbal communication might be constrained by other metaphoric schemas that are not themselves manifest in language.
The notion of a generic space in conceptual blending theory preserves the idea of a cognitive structure that identifies the
shared generic structure of arguments brought together in a
blended space, while abandoning the notion that such generic
structure is itself metaphoric.
Jörg Zinken
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Grady, Joseph, and Christopher Johnson. 2003. Converging evidence
for the notions of subscene and primary scene. In Metaphor and
Metonymy in Comparison and Contrast, ed. R. Dirven and R. Pörings, 533–54. Berlin: Mouton de Gruyter.



Jackendoff, Ray, and David Aaron. 1991. Review of More Than Cool
Reason: A Field Guide to Poetic Metaphor, by G. Lakoff and M. Turner.
Language 67: 320–38.
Lakoff, George. 1993. The contemporary theory of metaphor. In
Metaphor and Thought, ed. A. Ortony, 202–51. Cambridge: Cambridge
University Press.
Lakoff, George, and Mark Johnson. 1980. Metaphors We Live By.
Chicago: University of Chicago Press.
———. 1999. Philosophy in the Flesh: The Embodied Mind and Its Challenge
to Western Thought. New York: Basic Books.
Lakoff, George, and Mark Turner. 1989. More Than Cool Reason: A Field
Guide to Poetic Metaphor. Chicago: University of Chicago Press.
Murphy, Gregory. 1996. On metaphoric representation. Cognition
60: 173–204.
Musolff, Andreas. 2004. Metaphor and conceptual evolution. Metaphorik.de 7: 55–75. Available online at: http://www.metaphorik.de.
Stern, Joseph. 2000. Metaphor in Context. Cambridge, MA: MIT Press.

GENES AND LANGUAGE


Why Should One Expect Genes to Play an Important Role in
Language?
Unlike offspring of any other species, ordinary human children
routinely acquire complex language, characterized by open-ended vocabularies and productive syntax. This cannot be a
result of input alone because juveniles of other closely related
primate species, such as chimpanzees, do not develop humanlike languages even with extensive tutelage (Terrace, Petitto,
et al. 1980). From the perspective of biology, the question is not
whether genes play roles in language but how (Fisher and Marcus
2006).
Clearly, the words and grammar that are specific to any particular language are learned through exposure to appropriate
models. Nevertheless, the peculiar human capacity to acquire
and use language depends on a rich mixture of neural systems
that must be biologically constrained. These include mechanisms that need to simultaneously coordinate syntactic,
semantic, phonological, and pragmatic representations
with one another, with motor and sensory systems, and with
both speakers' and listeners' knowledge of the world. The functional properties of the relevant neural systems are largely determined by the cellular architecture of the human brain, which
is itself the product of ongoing interactions between genes and
the environment (Marcus 2004). Genes contribute to the birth,
migration, differentiation, patterning, and connectivity of neurons during embryogenesis and development, and they continue to contribute to online functions in the mature brain, for
example, by mediating changes in the strengths of connections
between neurons. It is likely that hundreds or even thousands of
genes participate in the development and maintenance of the
neural systems that underlie language, some in ways that may
be tailored to linguistic functions, others (like "housekeeping"
genes that govern the basic metabolic processes of all cells) that
clearly are not.
As yet, it is unknown how many of the genes in the human
genome are closely tied to language, but studies of developmental syndromes that primarily disrupt speech and/or language
skills give strong reason to believe that such genes are there to be

found (Fisher, Lai, et al. 2003). Speech and language disorders


are repeatedly observed to cluster in families, and twin studies
indicate that they are highly heritable (Bishop 2001). In recent
years, geneticists have successfully located chromosomal sites
within the human genome that are likely to harbor genetic risk
factors involved in developmental language disorders (e.g., The
SLI Consortium 2002). Moreover, they have even been able to
zero in on a specific gene, FOXP2, that is implicated in one particular disorder affecting speech and language (Lai, Fisher, et al.
2001).

FOXP2: What Is It and How Was It Discovered?


The FOXP2 gene, found on human chromosome 7, codes for a
special type of regulatory protein, technically known as a forkhead-box (or FOX) transcription factor. This class of proteins
helps govern when and where genes are expressed (switched on
and off) during embryogenesis, in postnatal development, and
in the mature organism (Lehmann, Sowden, et al. 2003). Each
FOX protein contains a special structure, called a forkhead-box
domain, which enables it to bind to the DNA of a target gene and
affect how much of the product of that target gene is made in
the cell. Transcription factors like these may affect many downstream targets in chorus and, thus, represent central components
of gene regulatory networks that are important for implementing
developmental programs, allowing cells to respond to signals,
and so on.
The discovery of the human FOXP2 gene originated in studies of a large three-generational family (the KE family) suffering
from a rare form of speech and language impairment (Hurst,
Baraitser, et al. 1990; Gopnik and Crago 1991). The disorder is
characterized primarily by severe difficulties in the learning and
production of sequences of mouth movements that are necessary for fluent speech, usually referred to as developmental verbal dyspraxia or childhood apraxia of speech (Vargha-Khadem,
Watkins, et al. 1998). Affected individuals simultaneously display problems in a wide range of language-related abilities,
in both oral and written domains, with impact on receptive as
well as expressive skills (Watkins, Dronkers, et al. 2002; VarghaKhadem, Gadian, et al. 2005). All 15 of the affected people in the
KE family have inherited a mutation altering a single nucleotide
letter in the DNA code of the FOXP2 gene (Lai, Fisher, et al. 2001).
This change affects the structure of the encoded FOXP2 protein
and prevents it from functioning properly (Vernes, Nicod, et al.
2006).
Although the mutation in question is private to the KE family,
different mutations disrupting FOXP2 function have been discovered in other families, showing comparable problems with
speech and language acquisition (MacDermot, Bonora, et al.
2005). In all cases identified thus far, the mutations have been
heterozygous; that is, people with the disorder have a mutation in only one copy of FOXP2, while the other copy is intact.
(Humans are diploid organisms, carrying two copies of every
gene, one inherited from the father, the other from the mother,
with a few exceptions such as the genes on the sex chromosomes.) The consistent observations of heterozygosity in different cases of FOXP2-related disorder suggest that affected people
have reduced amounts of working FOXP2 protein in brain circuits that are important for speech and language. Therefore, the



amount (dosage) of FOXP2 may be a critical factor in the proper
development of speech and language skills.

Does That Make FOXP2 "the Language Gene"?


No. Although studies of people carrying damaged versions of
FOXP2 are consistent with a role (or roles) in the development
and/or processing of language, it is already apparent from
genetic studies that no single gene is exclusively responsible for
this distinctive human trait. Indeed, FOXP2 is implicated (thus
far) only in one rare form of disorder, and not mutated in people
diagnosed with more common variants of specific language
impairment (SLI) (Newbury, Bonora, et al. 2002). Instead, most
developmental language disorders are likely to be multifactorial: the product of multiple genetic risk factors, their interactions
with one another and interactions with the environment (Fisher,
Lai, et al. 2003). It is also worth noting that mutation of FOXP2
impairs not only aspects of speech and language but also aspects
of nonlinguistic orofacial motor control (Watkins, Dronkers, et
al. 2002; Vargha-Khadem, Gadian, et al. 2005).
More broadly, given what we know about the fundamentals of genetics, developmental biology, and neuroscience, it is
highly unlikely that there is a single human-specific gene whose
sole purpose is to endow our species with the capacity to acquire
language. Individual genes do not specify particular behavioral
outputs or aspects of cognitive function. Rather, they contain the
codes for assembling individual molecules that act in a highly
interactive fashion with other molecules in order to build and
maintain a working human brain (Marcus 2004). Often, a gene
will have a primary function that is very clearly defined at the
cellular level (for example, by encoding an enzyme, structural protein, ion channel, signaling molecule, or receptor), but the
pathways that link the gene to higher-order brain function will
nevertheless be complex, indirect, and difficult to disentangle
(Fisher 2006).
The "language gene" shorthand is also misleading because
most, if not all, of the genes that are involved in language are
likely to play other roles, elsewhere in the brain and/or in other
tissues of the body. The expression of FOXP2 is not confined to
classical language-related regions of the cortex, or even to the
brain. Instead, it extends to additional brain structures, such
as the basal ganglia, thalamus, and cerebellum (Lai,
Gerrelli, et al. 2003), and to other parts of the body (e.g., the lungs
[Shu, Yang, et al. 2001]); it also has close counterparts in all vertebrates, as discussed in the section on evolution. In sum, FOXP2 can properly be called "a gene that participates in language" but not "the language gene" or even "a gene that participates exclusively in language."

How Representative Is FOXP2? Are Other Genes Involved in Language Likely to Act in Similar Ways?

It is difficult to say for sure; thus far, the FOXP2 gene represents the only known example where point mutations have been linked to a developmental disorder that primarily affects speech and language. However, since disruptions of FOXP2 are found in only a very small subset of people with language-related disorders (Newbury, Bonora, et al. 2002; MacDermot, Bonora, et al. 2005), it is clear that there must be other genetic effects that remain to be discovered.

Genetic studies of typical forms of specific language impairment have identified other genomic regions that are likely to be relevant to language, and researchers are focusing considerable attention on those chromosomal sites in the hope of pinning down particular genes (e.g., The SLI Consortium 2002). For developmental dyslexia, a disorder primarily characterized by reading disability but underpinned by subtle persistent deficits in language processing, it has been possible to home in on several candidate genes (DYX1C1, KIAA0319, DCDC2, and ROBO1). These genes differ from FOXP2 in that no specific causal mutations have been identified; instead, it is thought that the increased risk of dyslexia stems from as-yet unknown variants in the regulatory parts of those genes that govern their expression (Fisher 2006). Nevertheless, there are some striking parallels with FOXP2; each of the dyslexia candidate genes shows widespread expression patterns in multiple circuits in the brain, and each is active in additional tissues, not only the brain. None of the genes is unique to humans; for example, highly similar versions of each are found both in other primates and in rodents. At this stage, there is little understanding of why alterations in these genes should have relatively specific effects on reading abilities, although their basic neurobiological functions are beginning to be defined; three of the genes (DYX1C1, KIAA0319, and DCDC2) have been linked to neuronal migration, and the fourth (ROBO1) codes for a receptor protein involved in signal transduction, which helps regulate axon/dendrite guidance.

At the time of writing, there are indications that alterations in gene dosage may emerge as a general theme underlying overt speech/language deficits. For example, it has recently been shown that duplications of a specific region on chromosome 7 (far away from the site of the FOXP2 gene) can cause speech deficits (Somerville, Mervis, et al. 2005). What is especially interesting about this finding is that the relevant part of chromosome 7, which contains several different genes, corresponds to the region that is most commonly deleted in cases of Williams syndrome, a well-studied disorder in which language skills can be relatively well preserved as compared to other abilities. In other words, while deletion of that part of chromosome 7 (i.e., reduced gene dosage) tends to spare language, duplication of this same set of genes (increased gene dosage) leads to speech disruptions. Similarly, there is evidence to suggest that the number of functional copies of a chromosome 22 gene called SHANK3, recently implicated in autism spectrum disorders, may be critical for speech development (Durand, Betancur, et al. 2007). Language, like many aspects of biology, is likely to depend on a precise balance among many different molecules.

What Can Genes Tell Us about the Evolution of Language?


Genes, like species, are the product of the process that Darwin
called "descent with modification." Each gene has an evolutionary history, with its current function a modification of earlier
functions. To the extent that the language system is the product
of descent with modification, most genes that are associated with
language can be expected to have counterparts in nonlinguistic
species. As such, comparisons of gene sequences and expression
patterns in different species can help cast light on language evolution, identifying which of the relevant neurogenetic pathways
are shared with other species and which have been modified



on the lineage that led to modern humans (Fisher and Marcus
2006).
FOXP2 again appears representative in this regard. Following
the discovery of the gene, molecular studies have shown that it is
present in similar form in many vertebrates, including mammals,
birds, reptiles, and fish, where it is expressed in corresponding
regions of the brain to those observed in humans (reviewed by
Vargha-Khadem, Gadian, et al. 2005; Fisher and Marcus 2006).
On the basis of such data, it appears that FOXP2 is evolutionarily
ancient, shared by many vertebrate species, regardless of speech
and language ability, where it may have conserved functions in
brain circuits involved in sensorimotor integration and motor-skill learning (Fisher and Marcus 2006). For example, the striatum in the basal ganglia is a conserved site of high FOXP2
expression, which shows reduced gray matter density in humans
carrying FOXP2 mutations (Vargha-Khadem, Gadian, et al.
2005). It is intriguing that in songbirds, changes in expression of
the gene in striatal Area X, a key nucleus of the brain system involved in song learning, appear to relate to alterations in vocal
plasticity (see White, Fisher, et al. 2006).
Despite such notable conservations across distantly related
species, a comparison of the locus in different primates has
demonstrated that there was accelerated change in the FOXP2
protein sequence during human evolution, most likely due to
positive selection (Enard, Przeworski, et al. 2002). Mathematical
analyses of genomic sequences from diverse human populations
suggest that the version of FOXP2 now ubiquitous in modern
humans arose within the last 200,000 years, concordant with several archaeological estimates of the time of emergence of proficient spoken language (ibid.). We may never know for certain
why these modifications spread throughout the population, but
it seems plausible that they proliferated due to some advantage
inherent in enhanced vocal communication, perhaps achieved
through modification of pathways already involved in motor-skill learning. Still, this does not mean that changes in FOXP2
were the sole reason for the appearance of speech and language,
even if they did represent an important factor in the evolution of
human communication.

What's on the Horizon?


Technological advances provide one reason for optimism.
Techniques for characterizing genes and genomes are quickly
becoming more rapid, cost-effective, and efficient, which can
only speed up ongoing searches for genes involved in speech and
language. For example, it has recently become possible to simultaneously screen hundreds of thousands of genetic markers in
people with a disorder of interest, and compare the data to those
obtained from a control set of unaffected individuals. Given an
adequate sample size, this kind of approach could uncover subtle
genetic differences that are correlated with developmental language disorders. Before long, it will even be feasible to sequence
the entire genome of every person participating in a study. We
might also expect to see developments in the ways we can image
gene expression patterns in the human brain, with the hope that
it may one day be possible to observe on-line changes in gene
expression in neural circuits during language processing. Still, it
is also clear that we will need major conceptual advances in order
to make sense of the vast quantities of sequence and expression

data that will soon emerge, both in terms of sheer data analysis
and in terms of relating these data to linguistic functions.
Another exciting prospect is the use of genetic manipulation in order to find out more about the functions of genes that
are involved in language, for example by examining the function
of nonhuman counterparts to those genes (White, Fisher et al.
2006). In this way, individual genes may provide the first molecular entry points into neural pathways involved in human communication, and a direct way to understand how the twin processes
of descent and modification led to the remarkable and uniquely
human faculty for complex language.
Gary F. Marcus and Simon E. Fisher
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Bishop, D. V. 2001. Genetic and environmental risks for specific language impairment in children. Philos Trans R Soc Lond B Biol Sci
356.1407: 369–80.
Durand, C. M., C. Betancur, et al. 2007. Mutations in the gene encoding
the synaptic scaffolding protein SHANK3 are associated with autism
spectrum disorders. Nat Genet 39.1: 25–7.
Enard, W., M. Przeworski, et al. 2002. Molecular evolution of FOXP2, a
gene involved in speech and language. Nature 418.6900: 869–72.
Fisher, S. E. 2006. Tangled webs: Tracing the connections between genes
and cognition. Cognition 101.2: 270–97.
Fisher, S. E., C. S. Lai, et al. 2003. Deciphering the genetic basis of speech
and language disorders. Annu Rev Neurosci 26: 57–80.
Fisher, S. E., and G. F. Marcus 2006. The eloquent ape: Genes, brains and
the evolution of language. Nature Reviews Genetics 7: 9–20.
Gopnik, M., and M. B. Crago. 1991. Familial aggregation of a developmental language disorder. Cognition 39.1: 1–50.
Hurst, J. A., M. Baraitser, et al. 1990. An extended family with a dominantly inherited speech disorder. Dev Med Child Neurol 32.4: 352–5.
Lai, C. S., S. E. Fisher, et al. 2001. A forkhead-domain gene is mutated in
a severe speech and language disorder. Nature 413.6855: 519–23.
Lai, C. S., D. Gerrelli, et al. 2003. FOXP2 expression during brain development coincides with adult sites of pathology in a severe speech and
language disorder. Brain 126 (Part 11): 2455–62.
Lehmann, O. J., J. C. Sowden, et al. 2003. Fox's in development and disease. Trends Genet 19.6: 339–44.
MacDermot, K. D., E. Bonora, et al. 2005. Identification of FOXP2 truncation as a novel cause of developmental speech and language deficits. Am J Hum Genet 76.6: 1074–80.
Marcus, G. F. 2004. The Birth of the Mind: How a Tiny Number of Genes
Creates the Complexities of Human Thought. New York: Basic Books.
Newbury, D. F., E. Bonora, et al. 2002. FOXP2 is not a major susceptibility gene for autism or specific language impairment. Am J Hum Genet
70.5: 1318–27.
Shu, W., H. Yang, et al. 2001. Characterization of a new subfamily of
winged-helix/forkhead (Fox) genes that are expressed in the lung and
act as transcriptional repressors. J Biol Chem 276.29: 27488–97.
SLI Consortium, The. 2002. A genomewide scan identifies two novel
loci involved in specific language impairment. Am J Hum Genet
70.2: 384–98.
Somerville, M. J., C. B. Mervis, et al. 2005. Severe expressive-language
delay related to duplication of the Williams-Beuren locus. N Engl J
Med 353.16: 1694–701.
Terrace, H. S., L. A. Petitto, et al. 1980. On the grammatical capacity
of apes. In Children's Language 2, ed. K. E. Nelson, 371–495. New
York: Gardner.
Vargha-Khadem, F., D. G. Gadian, et al. 2005. FOXP2 and the neuroanatomy of speech and language. Nat Rev Neurosci 6.2: 131–8.

Vargha-Khadem, F., K. E. Watkins, et al. 1998. Neural basis of an
inherited speech and language disorder. Proc Natl Acad Sci U S A
95.21: 12695–700.
Vernes, S. C., J. Nicod, et al. 2006. Functional genetic analysis of mutations implicated in a human speech and language disorder. Hum Mol
Genet 15.21: 3154–67.
Watkins, K. E., N. F. Dronkers, et al. 2002. Behavioural analysis of an
inherited speech and language disorder: Comparison with acquired
aphasia. Brain 125 (Part 3): 452–64.
White, S. A., S. E. Fisher, et al. 2006. Singing mice, songbirds, and
more: Models for FOXP2 function and dysfunction in human speech
and language. J Neurosci 26.41: 10376–9.

GESTURE
"no ideas, just irritable mental gestures."
(remark attr. to Lionel Trilling,
New York Times, June 21, 2006, A1)

What Is Gesture?
Lionel Trilling, in this non-motto, invokes an all-too-common view of gesture. The very phrase "hand waving" suggests triviality. But let us imagine Trilling's own gesture. It would have been (we can predict) what Cornelia Müller has called the "palm up open hand" (PUOH), the hand seeming to hold a discursive object, holding, in fact, Trilling's view. These kinds of gestures have been linked to the conduit metaphor, the metaphor whereby language or cognition is a container holding some content.
The PUOH is also one of a species of gesture termed by Kendon "gesticulation," one of several kinds of gesture he distinguished and that I arranged on "Kendon's Continuum":

Gesticulation – Speech-Linked – Pantomime – Emblems – Sign Language

Figure 1. Gesture combining entity, upward movement, and interiority in one symbol. (Computer art by Fey Parrill.)

Table 1. Gesture-speech binding resists interruption
Delayed auditory feedback: Does not disrupt speech-gesture synchrony.
Stuttering: Gesture stroke onsets resist stuttering; stuttering cancels ongoing strokes.

Even though gesticulation is only one point on the continuum, in storytelling, living-space descriptions, academic discourse (including prepared lectures), and conversations, gesticulation is the overwhelming gesture type (99+ percent of all gestures), and it is the gesture offering the greatest penetration into language itself. As one moves from gesticulation to sign language, two reciprocal changes take place. First, the degree to which speech is an obligatory accompaniment of gesture decreases. Second, the degree to which gesture shows the properties of a language increases. Gesticulations are obligatorily accompanied by speech but have properties unlike language. Speech-linked gestures, such as "the parents were fine but the kids were [finger across throat]," are also obligatorily performed with speech but relate to speech as a linguistic segment sequentially, rather than concurrently, and in a specific linguistic slot (standing in for the complement of the verb, for example). Pantomime, or "dumb show," by definition is not accompanied by speech. Emblems such as the "OK" sign have independent status as symbolic forms. Signs in American Sign Language (ASL) and other sign languages are not accompanied by speech, and while simultaneously speaking and signing is possible for ASL-English bilinguals, this is not typical, and the languages themselves have the essential properties of all languages.
Table 1 (continued):
Blindness: Gestures occur when blind speakers address listeners known to be blind.
Fluency: Speech and gesture are complex or simple in tandem.
Information exchange: Information seen in gesture is recalled as speech, and vice versa.

Clearly, therefore, speech and gesticulations (but not the other points along Kendon's Continuum) combine properties that are unalike, and this combination of unalikes occupies the same psychological instant, a fact of importance for creating an imagery-language dialectic. I use "gesture," rather than "gesticulation," in the remainder of this entry.
The gesture-first theory of language origin holds that the first form of language consisted largely of gestures, to be later supplanted by speech, an idea going back to Étienne de Condillac in the eighteenth century. Gesture-first has attracted much interest in recent years. A difficulty, however, is that it predicts the wrong gestures. The initial gestures would have been speechless pantomimes, nonverbal actions with narrative potential, but not the gesticulations that pose dialectic oppositions to language at the far end of Kendon's Continuum. Pantomime may indeed have been present but, if so, did not lead to the evolution of speech and gesture units (growth points). Such units would likely have had their own adaptive value. An implication is that different evolutionary trajectories landed at different points along the continuum, reflected today in different forms and timing patterns with speech.

Figure 2. Phases of a gesture timed with "and he bends it way back," from the utterance "so he gets a hold of a big [oak tree / and he bends it way ba ck]." The insert is a frame counter (1 frame = 1/30 sec.); the total elapsed time is about 1.5 seconds. Panel 1 (frame 2740): Preparation. Panel 2 (frame 2769): A prestroke hold while saying "he." Panel 3 (frame 2780): Middle of stroke, "bends it way ba(ck)." Panel 4 (frame 2828): End of stroke and beginning of the poststroke hold in the middle of "back."

Simultaneous Semiotic Modes


Figure 1 illustrates one gesture and how it is simultaneous with
coexpressive speech. The example is taken from the narration
of a cartoon story (the speaker had just watched the cartoon
and was recounting it from memory to a listener; instructions
emphasized that the task was storytelling without mention of
gesture). The speaker was describing an event in which one
character (Sylvester) attempted to reach another character
(Tweety) by climbing up a drainpipe conveniently attached
next to the window where Tweety was perched. He entered the
pipe and traversed it on the inside adding stealth to his effort.
The speaker said "and he goes up thróugh the pipe this time" (the illustration captures the moment at which she is saying the stressed vowel of "thróugh"). Coexpressively with "up" her hand rose upward; coexpressively with "through" her fingers spread outward to create an interior space. The upward movement and the opening of the hand took place concurrently, not sequentially, and these movements occurred synchronously with "up through," the linguistic package that carries the same meanings. The contrastive emphasis on "thróugh," highlighting interiority, is matched by the added complexity of the gesture, the spreading of the upturned fingers. What makes speech and gesture coexpressive is this joint highlighting of the ideas of upward motion and interiority.
Note the differences, too. In speech, meanings are analyzed and segregated. Speech divides the event into semantic units: a directed path ("up") plus the idea of interiority ("through"). Analytic segregation further requires that direction and interiority be combined in order to obtain the composite meaning of the whole. In gesture, this composite meaning is fused into one symbol, and the semantic units are simultaneous; there is no combination (meaning determination moves from the whole to the parts, not from the parts to the whole). The effect is a uniquely gestural way of packaging meaning, something like "rising hollowness," which does not exist as a semantic package in the lexicon of English at all. Thus, speech and gesture, at the moment of their synchronization, were coexpressive but nonredundant, and this sets the stage for doing one thing (conception of the cat's climbing up inside the pipe) in two forms, analytic/combinatoric and global/synthetic.

Properties of Gestures
THE UNBREAKABLE SPEECH-GESTURE BOND. Synchronized
speech and gesture comprise virtually unbreakable psycholinguistic units, unbreakable as long as speech and gesture are
coexpressive. A diverse range of phenomena show the inseparability of the two modes; Table 1 summarizes some of them. In
each case, some disruption to speech-gesture combination is
resisted; it holds despite the disruption. To break this bond, one
has to drain the combination of meaning, for example through
rote repetition.
PHASES AND THEIR SIGNIFICANCE. Gesture phases are organized
around the stroke: everything is designed to present it in proper
synchrony with its coexpressive speech segment(s). Figure 2
shows all gesture phases except retraction. The full span, from
the beginning of preparation to the end of retraction, brackets
what can be thought of as the lifetime of a specific idea unit in
language-geared imagery. We see the image in a state of activation that did not exist before and does not exist after this span.
The dawn of the idea unit is seen in the beginning of the preparation, and the idea unit itself is the unit formed of the synchronized coexpressive speech and stroke (called a growth point).

345

Gesture
WHEN DO GESTURES OCCUR? Somewhat surprisingly, the timing
of gestures in relation to speech has been the subject of controversy. The question is whether gestures tend to anticipate their
linked linguistic material or coincide with it. The anticipation
view is often accompanied by a further idea that gestures take
place during speech pauses. The synchrony view, clearly, implies
that gestures and speech are co-occurring. When the question
is examined with careful attention to the distinction between
preparation and stroke, the facts are clear: The preparation for
the gesture precedes the coexpressive linguistic segment (with
a pause or not); the stroke coincides with this segment about
90 percent of the time. Holds ensure that this synchrony is
preserved.

Discourse and Social Interaction


In addition to fueling idea units, a significant intersection of gestures with language is in the construction of discourse and social
interactions.
The gesture in Figure 1 was the second that this speaker had
performed for Sylvester's ascent of the pipe. In the cartoon,
Sylvester attempts to climb the pipe twice, first on the outside, as
a kind of ladder, and second on the inside, the version in Figure 1. The
outside gesture by this speaker, just before Figure 1, had been free
of pipelike features; it was pure ascent. The Figure 1 gesture thus
exhibited precisely what, in the immediate context, was distinctive (interiority), creating communicative dynamism. Narrators
who, due to error, do not mention the outside attempt but only the
inside ascent tend not to include interiority. The fact of interiority is
not sufficient; the gesture is sensitive to the distinctiveness of this
information in the discourse context. Coexpressive speech and
gesture apparently synchronize at points of high communicative
dynamism (experiments by S. Duncan and D. Loehr are currently
testing this hypothesis).
Gestures also code discourse frames by use of the second
hand. A two-handed gesture can initiate a discourse segment in
which one hand depicts events while the other hand maintains
the shape and/or location it had in the launching gesture, and
this frames the event in the continuing context. ASL exploits this
device for discursive cohesion.
A further concept provides an empirical route for finding the
context within which an idea unit is differentiated. A catchment
comprises multiple gestures with recurring form features and
exposes the discourse segment to which a growth point belongs
(the use of two hands for discourse frames comprises a catchment, but catchments are formed in a wide variety of ways).
Catchments offer a second insight for linguistics: Discourse itself
takes on imagery form.
In addition to discourse, gestures are sensitive to the social-interactive context of the speakers. Aslı Özyürek showed that
changing the number and the spatial loci of listeners has an effect
on the speaker's gestural imagery. Janet Bavelas has pioneered
the study of a class of gestures she terms "interactive gestures,"
whose significance lies in the structuring and management of
social interactions without yielding control of the floor. Along
similar lines, gesture mimicry and joint speaker-listener gesture
production cement social interactions. In roundtable discussions, gestures are parts of turn-taking and speaker dominance.
Gestures with an interactive focus are not discontinuous from
gestures relating to idea units. On the contrary, they exhibit continuity with ideas, as envisioned by L. Vygotsky.
David McNeill
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Beattie, G. 2003. Visible Thought: The New Psychology of Body Language.
Hove, UK: Routledge.
Goldin-Meadow, S. 2003. Hearing Gesture: How Our Hands Help Us
Think. Cambridge, MA: Harvard University Press.
Kendon, A. 2004. Gesture: Visible Action as Utterance.
Cambridge: Cambridge University Press.
McNeill, D. 1992. Hand and Mind: What Gestures Reveal about Thought.
Chicago: University of Chicago Press.
McNeill, D. 2005. Gesture and Thought. Chicago: University of Chicago
Press.
McNeill, D., ed. 2000. Language and Gesture. Cambridge: Cambridge
University Press.

GOVERNMENT AND BINDING


Government and binding (GB) theory, originally developed
by Noam Chomsky (see Chomsky 1981, 1982), is an approach
to the study of the syntax of human languages based on
abstract underlying representations and transformations
successively altering those structures. The approach is centered around universal principles, argued to be innately represented in the mind, and simple parameters, fixed by the
language learner from simple evidence, determining how
languages can differ.
GB theory developed out of Chomsky's earlier work in transformational grammar. Like all of his work, it centered on
two fundamental questions:
(1) What kind of capacity is knowledge of language?
(2) How does this capacity arise in the individual?

From his earliest work, Chomsky's answer to (1) posited a computational system that provided statements of the basic phrase
structure patterns of languages (phrase structure rules) and
operations for manipulating these basic phrase structures (transformations). GB strongly focused on question (2) by positing
heavier and heavier restrictions on the computational system,
thus limiting the choices available to the learner.
An innovation was the development of trace theory, which
proposed that when movement transformations operate, they
leave behind traces, silent placeholders marking the position
from which movement took place, as schematized in (3).
(3) Linguistics, I like t
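As a toy illustration (the list encoding and function name are my own, not part of the entry), the trace-theoretic idea of movement leaving a silent placeholder can be modeled directly:

```python
# Toy sketch: a movement transformation that fronts a phrase and leaves
# a coindexed trace ("t1") in the vacated position, as in example (3).
# The encoding of sentences as word lists is an illustrative assumption.

def topicalize(words, target):
    """Front `target`, leaving a silent placeholder in its original slot."""
    i = words.index(target)
    remainder = words[:i] + ["t1"] + words[i + 1:]
    return [target + "1"] + remainder

print(topicalize(["I", "like", "linguistics"], "linguistics"))
# → ['linguistics1', 'I', 'like', 't1']
```

The shared index (here written inline as "1") records that the fronted phrase and its trace form a single chain.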

Under trace theory, the earlier importance of deep structure (the initial representation in the syntactic derivation in the
standard theory) for semantic interpretation is ultimately
eliminated. It was already known that some aspects of meaning
(including scope of quantifiers, anaphora, focus) depend on surface structure. Once surface structure is enriched with traces, even
grammatical relations (subject of, object of, etc.) can be determined at that level of representation. Using the term LF (logical
form) for the syntactic representation that relates most directly

Government and Binding


to semantics and PF (phonetic form) for the one relating most
directly to phonetics, we have the so-called (inverted) Y-model in
(4), which was at the core of GB theorizing.
(4)
      D-Structure
           |   (Transformations)
      S-Structure
         /     \
       PF       LF

Modularity
The GB theory displayed a high degree of modularity. Complex
phenomena were seen as the result of interactions of simple
modules. The phrase structure module was virtually reduced to
x-bar theory (originally developed in Chomsky [1970] 1972),
with specific instantiations following from properties of particular lexical items. Further, the X-bar schema itself was extended
from just lexical categories (noun, verb, adjective, etc.) to functional categories. For example, a sentence came to be analyzed
as the projection of an inflectional head, Infl, containing tense
and agreement information.
The transformational module is also dramatically simplified
in comparison with its predecessors. The GB framework replaced
the earlier numerous specific transformations with very general
operations, Move α (displace any item anywhere), or even Affect
α (do anything to anything). There is thus very little transformational syntax that the child has to learn. A grammar this simple
and general would seem to massively overgenerate, producing
countless numbers of unacceptable sentences. To deal with this
overgeneration problem, GB theorists, further developing a line
of research begun in the 1960s, posited general constraints on
the operation of transformations (locality constraints, especially
subjacency, part of bounding theory) and also conditions
on the output of the transformational component (including
filters).

Parameters
The postulated universal (wired-in) parts of the computational system are called principles. The (limited) ways in which
languages can differ syntactically are called parameters. The system is fundamentally based on principles and parameters.
The child learning a language is preequipped with the principles
and needs only to set the values of the parameters. The standard
assumption is that there are few parameters, they are very simple, and their values can be determined by the child on the basis
of readily available primary linguistic data.

θ-Theory and the Lexicon

The X-bar schema for phrase structure is one module of the
theory. The lexicon is another. These modules determine
D-structure configurations via the regulation of a third module, theta (θ) theory. Subcategorization properties follow, in large
measure, from semantic properties. Thus, in a sentence with
the verb solve, there is a semantic function for a direct object to
fulfill, while there is no such function in the case of sleep. These
semantic functions that arguments fulfill are called thematic
(θ-)roles. The verb prove demands a direct object since the
object would fulfill a necessary θ-role (theme in this instance)
determined by the meaning of the verb. Conversely, an intransitive verb like sleep does not take a direct object since there would
be no θ-role for it to fulfill. These paired requirements on assigners and recipients of theta roles are called the θ-criterion: Every
θ-role must be assigned to one and only one argument, and every
argument must receive one and only one θ-role.
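The θ-criterion's one-to-one pairing of roles and arguments can be sketched as a small consistency check; the θ-grids below are illustrative assumptions of mine, not the entry's lexicon:

```python
# Toy sketch of the θ-criterion: every θ-role goes to exactly one
# argument and every argument gets exactly one θ-role, so a verb's
# θ-grid and its argument list must pair off one-to-one.

THETA_GRIDS = {                    # illustrative entries, not a real lexicon
    "solve": ["agent", "theme"],   # solve demands a theme (direct object)
    "sleep": ["agent"],            # sleep assigns no theme role
}

def satisfies_theta_criterion(verb, arguments):
    """True iff the verb's θ-roles and its arguments pair off one-to-one."""
    return len(THETA_GRIDS[verb]) == len(arguments)

print(satisfies_theta_criterion("solve", ["Mary", "the problem"]))  # → True
print(satisfies_theta_criterion("sleep", ["Mary", "the problem"]))  # → False: no role for the object
print(satisfies_theta_criterion("solve", ["Mary"]))                 # → False: theme unassigned
```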

Case Theory
S-structures result from the transformational component operating on D-structures. Given the generality of Move α, derivations often seem to yield ungrammatical sentences. One module
reining in this overgeneration by regulating S-structure is case
theory. There are characteristic structural positions that license
particular cases. In many languages (such as Latin, Russian,
German), these case distinctions are overtly manifested. In
English, only pronouns show an overt distinction between nominative and accusative, for instance, but Case Theory posits that all
noun phrases (NPs) have abstract case (henceforth, Case), even
when it is not phonologically visible. The requirement that all
NPs occur in appropriate Case positions is the Case Filter, a well-formedness condition on the S-structure level of representation.

Government
The GB approach always sought regularities and generalizations.
The notion government is itself a generalization of the X-bar
theoretic head-complement relation. The basic definition is as
follows:
(5) A head H governs Y if and only if every XP (highest projection of X) dominating H also dominates Y and conversely.
[Domination is ancestry in a phrase structure tree
diagram.]

By (5), a head governs its complement and also its specifier. Case
licensing then is under government, with the governor licensing the governee. A transitive verb governs its direct object NP; a
preposition governs its complement NP; Infl governs its specifier
(the subject of the clause). Thus, a Case-licensing head licenses
Case on a nominal expression that it governs. For example, a
transitive verb, such as prove, licenses (accusative) Case on its
complement direct object.
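Definition (5) is explicit enough to implement directly. A minimal sketch, assuming a hand-built tree for a simple transitive clause; the Node class, labels, and the tree itself are my own illustration, not the entry's formalism:

```python
# Toy sketch of the government relation in (5): H governs Y iff every
# XP (maximal projection) dominating H also dominates Y, and conversely.

class Node:
    def __init__(self, label, children=(), is_xp=False):
        self.label = label              # e.g., "V", "NP", "VP"
        self.children = list(children)
        self.is_xp = is_xp              # True for maximal projections (XPs)

def dominators(root, target, path=()):
    """Return the chain of nodes properly dominating `target`, or None."""
    if root is target:
        return path
    for child in root.children:
        found = dominators(child, target, path + (root,))
        if found is not None:
            return found
    return None

def governs(root, head, phrase):
    """(5): compare the sets of XPs dominating the head and the phrase."""
    return ({n for n in dominators(root, head) if n.is_xp}
            == {n for n in dominators(root, phrase) if n.is_xp})

# [IP [NP Mary] [I' I [VP [V proved] [NP the theorem]]]]
v = Node("V")                      # transitive verb
obj = Node("NP", is_xp=True)       # its complement
vp = Node("VP", [v, obj], is_xp=True)
subj = Node("NP", is_xp=True)      # specifier of IP
infl = Node("I")
ip = Node("IP", [subj, Node("I'", [infl, vp])], is_xp=True)

print(governs(ip, v, obj))      # → True: V governs (Case-licenses) its object
print(governs(ip, v, subj))     # → False: VP dominates V but not the subject
print(governs(ip, infl, subj))  # → True: Infl governs the subject
```

Run as-is, the checks confirm the entry's examples: the verb governs its object, Infl governs its specifier, and the verb does not govern the subject.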

Types of Movement
The transformational module of the theory recognizes three
major subtypes of movement. A-movement is movement to an
argument-type position (i.e., an A-position, especially subject
position. (non-A)-movement is movement of an XP (highest
projection of X, for variable X) to a non-A position. The movement of an interrogative expression, such as Who in (6) (WHmovement) is a central exemplar:
(6) Who will they hire t

WH-movement is standardly analyzed as movement to the


specifier of CP (complementizer phrase), a functional projection above IP (inflectional phrase). Both types of movement are
regarded as instantiations of one very general operation: Move .
The differences follow from independent properties of the items
moved and the positions moved to.

(8) [CP [C′ C [IP [NP Susan] [I′ I [VP [V is] [NP a linguist]]]]]]

Figure 1.

The third major type of movement is head movement, where
an X⁰, a minimal X-bar theoretic element, adjoins to a higher
head (the very next higher head, by the head movement constraint). One of the classic analyses of generative grammar was
restated in the GB framework in terms of head movement. Pairs
of sentences like those in (7) are related via movement of the
verb be/is to Infl, followed by movement of Infl to C, schematized
in (8) in Figure 1.

(7) a. Susan is a linguist
    b. Is Susan a linguist

Similar head movement, along with WH-movement, is involved
in the derivation of (6).

Binding
The binding part of government and binding theory has as
its core anaphoric relations, circumstances under which one
expression can or cannot take another as its antecedent, that is,
pick up its reference from the other. In (9), him can take John as
its antecedent, while in (10), it cannot.

(9) John said Mary criticized him
(10) John criticized him

That is, (10) has no reading corresponding to that of (11), with the
pronoun him replaced by the anaphor himself (see anaphora).

(11) John criticized himself

A pronoun cannot have an antecedent that is too close to it.
This is Condition B of the binding theory. Conversely, an anaphor
requires an antecedent quite close to it (Condition A). Compare
(11) with (12).

(12) *John said Mary criticized himself

The pertinent locality is, roughly, being in the same clause
(though in certain instances a more complicated notion involving government is implicated, hence Chomsky's name governing category for the relevant domain).
A third binding condition (Condition C) excludes an anaphoric connection between the higher in the tree She and the
lower Mary in (13), as contrasted with (14).

(13) *She thinks Mary will solve the problem [with She intended to
refer to Mary]
(14) Mary thinks she will solve the problem

The Role of Logical Form
In the core GB model schematized in (4), LF is not distinct from
S-structure. However, more and more arguments were put forward that transformational operations of the sort successively
modifying D-structure, ultimately creating S-structure, also
apply to S-structure, creating a distinct LF. (See especially May
1977, 1985.) One such operation, quantifier raising (QR), moves
quantifiers from their surface positions to positions more transparently representing their scope, with the traces of the moved
quantifiers ultimately interpreted as variables bound by those
quantifiers. Unlike the transformational operations mentioned
earlier, applications of QR exhibit no phonological displacement. This follows from the organization of the grammar. When
a transformation operates between D-structure and S-structure,
it will have an effect on the phonetic output, since S-structure
feeds into PF. On the other hand, a transformational application
between S-structure and LF will have no phonetic effect, since
LF does not feed into PF.
Another covert operation is the analog of overt WH-movement. Assume that overt WH-movement positions an interrogative operator in its natural position for interpretation (with the
trace it leaves behind in the natural position for a variable bound
by the operator). Then in sentences with multiple interrogatives,
such as (15), at the level of LF all are in sentence-initial operator
position, as illustrated in (16).

(15) Where should we put what
(16) what1 [where2 [we should put t1 t2]]

(16) is then rather transparently interpreted as:

(17) For which object x and which place y, we should put x at y

One of the most powerful arguments for covert WH-movement,
from Huang (1981/82), involves constraints on movement. For
example, it is difficult to move an interrogative expression out of
an embedded question (a question inside another sentence):

(18) *Why1 do you wonder [what2 [John bought t2 t1]]

If (18) were acceptable, it would mean "What is the reason such
that you wonder what John bought for that reason." In languages
where wh-phrases are in situ (unmoved) at S-structure, such as
Chinese, their interpretation apparently obeys this same constraint. So, in Chinese, an example like (19) is possible but one
like (20) is impossible.

(19) ni renwei [ta weisheme bu lai]
     you think he why not come
     LF: [weisheme1 [ni renwei [ta t1 bu lai]]]
     'Why do you think he didn't come?'
(20) *ni xiang-zhidao [Lisi weisheme mai-le sheme]
     you wonder Lisi why bought what
     LF: [weisheme1 [ni xiang-zhidao [Lisi t1 mai-le sheme]]]
     '*What is the reason such that you wonder what Lisi bought for
     that reason?'


This argues that even though the weisheme is not phonetically
displaced, it really is moving; that is why it obeys movement constraints. But this movement is covert, occurring in
the mapping from S-structure to LF, hence not contributing to
pronunciation.
The modular and very restrictive nature of the GB approach to
syntax led to theories that went well beyond descriptive adequacy
toward a high degree of explanatory adequacy, as it drastically
limited the types and numbers of grammatical rules available to
the learner. In fact, the success of the approach led Chomsky to
formulate a new program, minimalism, that aims to move even
beyond explanatory adequacy.
Howard Lasnik
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Chomsky, Noam. [1970] 1972. Remarks on nominalization. In Readings
in English Transformational Grammar, ed. Roderick A. Jacobs and
Peter S. Rosenbaum, 184–221. Waltham, MA: Ginn. Reprinted in
Noam Chomsky, Studies on Semantics in Generative Grammar, 11–61.
The Hague: Mouton.
Chomsky, Noam. 1981. Lectures on Government and Binding. Dordrecht, the
Netherlands: Foris.
Chomsky, Noam. 1982. Some Concepts and Consequences of the Theory of Government
and Binding. Cambridge, MA: MIT Press.
Chomsky, Noam, and Howard Lasnik. [1993] 1995. The theory of principles and parameters. In Syntax: An International Handbook of
Contemporary Research. Vol. 1. Ed. Joachim Jacobs, Arnim von Stechow,
Wolfgang Sternefeld, and Theo Vennemann, 506–69. Berlin: Walter
de Gruyter. Reprinted in Noam Chomsky, The Minimalist Program,
13–127. Cambridge, MA: MIT Press.
Haegeman, Liliane. 1994. An Introduction to Government and Binding
Theory. 2d ed. Oxford: Blackwell.
Huang, C.-T. James. 1981/82. Move wh in a language without wh-movement. Linguistic Review 1: 369–416.
Lasnik, Howard, and Juan Uriagereka. 1988. A Course in GB Syntax: Lectures
on Binding and Empty Categories. Cambridge, MA: MIT Press.
May, Robert. 1977. The Grammar of Quantification. Ph.D. diss.,
Massachusetts Institute of Technology.
May, Robert. 1985. Logical Form: Its Structure and Derivation. Cambridge,
MA: MIT Press.
Webelhuth, Gert, ed. 1995. Government and Binding Theory and the
Minimalist Program. Oxford: Basil Blackwell.

GRAMMATICALITY
A sentence that is well formed according to a given grammar
is grammatical. By analogy, the property can also be assigned
to constituents of a sentence, for example, noun phrases. In
generative linguistics, the term grammar has been used with
a systematic ambiguity: "We must be careful to distinguish the
grammar, regarded as a structure postulated in the mind, from
the linguist's grammar, which is an explicit articulated theory
that attempts to express precisely the rules and principles of the
grammar in the mind of the ideal speaker-hearer" (Chomsky
1980, 220).
Each sense of grammar corresponds to a sense of grammaticality. If grammar is taken in the sense of competence,
a component of the speaker's mind, grammaticality is a gradual property that is realized in judgments by the speaker (see
grammaticality judgments). If grammar is taken in the
sense of theory, grammaticality may be a purely binary property, depending on the degree and type of formalization of the
theory.
Noam Chomsky (1957, 15) introduces grammaticality in contrast to acceptability, illustrating them with the examples in (1).

(1) a. Colorless green ideas sleep furiously.
    b. *Furiously sleep ideas green colorless.
    c. Have you a book on modern music?
    d. *Read you a book on modern music?

Ungrammaticality in (1) is indicated by a star, following
a convention introduced in the early 1960s. Whereas (1a–b)
are difficult to interpret, (1c–d) are readily understandable.
Nevertheless, (1a) is grammatical and (1d) not. The acceptability
problem in (1a) is entirely due to semantic incongruence. It can
be overcome by extending the meanings of the words in the sentence. This is not possible for (1b). In (1d), the meaning is entirely
transparent, but English grammar does not allow questions to be
formed in this way.
As discussed in detail by Frederick J. Newmeyer (1983),
grammaticality is a theory-dependent property. The boundary
between the factors accounted for by the grammar and by other
components of knowledge is not given in advance. If a sentence
violates a condition of the grammar, it is ungrammatical, but if
it violates, for instance, only semantic or pragmatic conditions, as in (1a), it is grammatical. Grammaticality in relation to
competence is a matter of degree. Thus, Liliane Haegeman (1994,
565–73) discusses the analysis of contrasts such as (2).

(2) a. *Whomi do you know [the date [whenj [Mary invited ti tj]]]
    b. **Whenj do you know [the man [whomi [Mary invited ti tj]]]

Both sentences in (2) are ungrammatical, but (2b) is worse
than (2a). As this result is fairly robust, it is worth trying to
explain it in terms of grammatical theory. A fully formalized
grammar will normally partition the set of possible sentences
into grammatical and ungrammatical ones, without any intermediate degrees.
Pius ten Hacken
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Chomsky, Noam. 1957. Syntactic Structures. The Hague: Mouton.
Chomsky, Noam. 1980. Rules and Representations. New York: Columbia University
Press.
Haegeman, Liliane. 1994. Introduction to Government and Binding
Theory. 2d ed. Oxford: Blackwell.
Newmeyer, Frederick J. 1983. Grammatical Theory: Its Limits and Its
Possibilities. Chicago: University of Chicago Press. Discussion of grammaticality in Chapter 2.

GRAMMATICALITY JUDGMENTS
Grammaticality judgments involve explicitly asking speakers
whether a particular string of words is a well-formed utterance of
their language, with an intended interpretation stated or implied.
Among the many kinds of data available to linguistics, grammaticality judgments are particularly useful in distinguishing
possible from impossible utterances (the latter conventionally
marked *) among those that are not spontaneously produced.
Intermediate degrees of well-formedness may also be of interest (e.g., ? "questionable," ?? "highly questionable"). These
judgments can also bring to light knowledge of language in special populations whose production and even comprehension
may not evince it. Contra widespread belief, however, like all
performance data they bear on linguistic competence only
indirectly and call for the same attention to methodology as
other data sources.
Carson T. Schütze

GRAMMATICALIZATION
Grammaticalization is a historical process whereby fixed grammatical forms, such as prepositions, conjunctions, suffixes, and
auxiliaries, and the constructions of which they are part arise out
of what were previously independent categorial forms in freer
arrangements. The negative construction with ne ... pas in French
provides a good example: pas 'step, pace' was originally a noun
that functioned to reinforce the ne that supplied the negative
meaning, as in il ne va pas 'he doesn't go' (originally 'doesn't
go a step'). Nowadays, pas has become grammaticalized as a
general marker of negation, as in il (ne) parle pas 'he doesn't
speak,' the ne being increasingly dropped altogether. As a field,
grammaticalization is the study of those linguistic changes that
result in specifically grammatical forms and constructions. The
standard history of grammaticalization is Lehmann 1995.
Grammaticalization involves two aspects: structural change
and semantic change. These two kinds of change go hand
in hand, and it is impossible to assign priority to either of them.
Structural changes include reanalysis and phonological reduction. In reanalysis, adjacent forms are rebracketed: [I am going]
[to sell my pig] 'I'm on my way to sell my pig' becomes [I am going
to][sell my pig], and eventually [going to] assumes the meaning
of future tense. Characteristically, too, a major category, such as
verb or noun that was formerly the head of a phrase, is demoted to
a minor category, such as auxiliary or preposition, and becomes
a satellite to the new head; thus, when a cup full of flour becomes
a cupful of flour, the erstwhile head noun cup is reanalyzed as a
component of a quantifying expression, now reduced in status to
a determiner of the semantic head noun, and its place as semantic head noun is usurped by flour. Similarly, full, previously the
head of the adjective phrase full of flour, is reduced to the status
of a suffix on cup. This decategorialization of grammaticalized
forms (Hopper 1991, 30–3; Hopper and Traugott 2003, 106–14)
means that the forms lose the typical attributes of the older category, such as the ability to take modifiers and determiners, availability as an argument of the verb, and independent referential
status. Parallel restrictions are placed on verbs that become
auxiliaries.
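The rebracketing step of reanalysis can be stated mechanically. A toy sketch, using the entry's going to example; the function and list encoding are my own illustration:

```python
# Toy sketch of reanalysis as rebracketing: a word at the edge of one
# constituent is reassigned to the adjacent constituent, modeling
# [I am going][to sell my pig] -> [I am going to][sell my pig].

def rebracket(chunks, pivot):
    """Move `pivot` from the front of the second chunk to the end of the first."""
    first, second = chunks
    assert second[0] == pivot, "pivot must sit at the bracket boundary"
    return [first + [pivot], second[1:]]

before = [["I", "am", "going"], ["to", "sell", "my", "pig"]]
after = rebracket(before, "to")
print(after)
# → [['I', 'am', 'going', 'to'], ['sell', 'my', 'pig']]
```

The string of words is unchanged; only the constituent boundary moves, which is what lets [going to] later be reinterpreted as a future marker.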
Phonological reduction, or erosion (Heine and Reh 1984,
215), is a frequent but not inevitable accompaniment of grammaticalization. English let's is clearly derived from let us, but
now serves to introduce an adhortative predicate, as in let's
leave. When [going][to sell] was reanalyzed as [going to][sell],
going to became reduced to gonna. The French future tense in
(je) chanterai 'I will sing' (Benveniste 1968) came about when
the Old French descendants of Latin cantare habeo 'I have to
sing' were collapsed into a single word: cantar ayo > cantarayo
> chanterai. The end product of such changes is typically a paradigm in which the verb is decked out with person and number
affixes that were formerly pronouns and tense, modality, and
aspect affixes that were once auxiliaries.
In recent years, linguists have come to see forms undergoing
grammaticalization as spreading out into wider contexts and as
increasing their pragmatic usefulness (Traugott 1995; Traugott
and Dasher 2002). For example, the Old English ancestor of the
modal auxiliary can 'know how to' occurred exclusively with
human subjects. Later, the restriction to humans was modified
to include a wider variety of forms, including inanimates: These
trees can grow to a height of 100 meters. This widening distribution presupposes a semantic change from knowledge to ability
to possibility. The English going to/gonna construction provides
another example. There is a change from the sense of purpose,
as in Shakespeare's "letters to my friends, / And I am going to
deliver them" (Two Gentlemen of Verona 3.1, 51), that is, 'I am
on my way to deliver them,' to a predictive future tense in which
neither motion nor purpose is expressed, as in The ice carvings
are going to melt.
The changes characteristic of grammaticalization are gradual (Lichtenberk 1991) and unidirectional (Haspelmath 1999;
Hopper and Traugott 2003, 88–139). They evolve in the single
direction of semantic diffuseness and increased pragmatic range
and, often, phonological reduction and grammatical agglutination. The exceptions to this directionality are idiosyncratic and
frequently turn out not to be true exceptions (see Hopper and
Traugott 2003, 130–8 for further discussion).
Grammaticalization, while studied principally as a subfield
of the study of change, has implications for general linguistics
in pointing to the essential fluidity of grammar. It suggests that
the appearance of fixed forms and rules is illusory, that the
grammar-lexicon division is more blurred than is commonly
assumed, and that grammatical structure itself is emergent,
that is, subject to constant revision by speakers. While the paths
of grammaticalization are constrained by universal and cognitive factors, their proximate causes are in discourse, through
frequency of use (Bybee and Hopper 2001) and the consequent
routinization (Haiman 1994) of word combinations.
Paul J. Hopper
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Aitchison, Jean. 2001. Language Change: Progress or Decay? 3d ed.
Cambridge: Cambridge University Press. A readable introduction to
the study of change, with an excellent chapter on grammaticalization.
Benveniste, Emile. 1968. Mutations of linguistic categories. In Directions
for Historical Linguistics: A Symposium, ed. Winfred Lehmann and
Yakov Malkiel, 85–94. Austin: University of Texas Press.
Bybee, Joan, and Paul Hopper, eds. 2001. Frequency and the Emergence of
Linguistic Structure. Amsterdam: John Benjamins.
Haiman, John. 1994. Ritualization and the development of language.
In Perspectives on Grammaticalization, ed. William Pagliuca, 3–28.
Amsterdam: John Benjamins.
Haspelmath, Martin. 1999. Why is grammaticalization irreversible?
Linguistics 37: 1043–68.
Heine, Bernd, and Tania Kuteva. 2002. World Lexicon of
Grammaticalization. Cambridge: Cambridge University Press.
Heine, Bernd, and Mechthild Reh. 1984. Grammaticalization in African
Languages. Hamburg: Buske.
Hopper, Paul. 1991. On some principles of grammaticalization. In
Traugott and Heine 1991, I: 17–35.
Hopper, Paul, and Elizabeth Traugott. 2003. Grammaticalization. 2d ed.
Cambridge: Cambridge University Press. An up-to-date survey of the
field, with detailed discussion of current controversies.
Kuteva, Tania. 2001. Auxiliation: An Enquiry into the Nature of
Grammaticalization. Oxford: Oxford University Press.
Lehmann, Christian. 1995. Thoughts on Grammaticalization.
Munich: Lincom Europa. The standard work on the historical background to grammaticalization.
Lichtenberk, František. 1991. On the gradualness of grammaticalization. In Traugott and Heine 1991, I: 37–80.
Traugott, Elizabeth. 1995. Subjectification in grammaticalization. In
Subjectivity and Subjectivisation in Language, ed. Dieter Stein and
Susan Wright, 31–54. Cambridge: Cambridge University Press.
Traugott, Elizabeth, and Richard Dasher. 2002. Regularity in Semantic
Change. Cambridge: Cambridge University Press.
Traugott, Elizabeth, and Bernd Heine, eds. 1991. Approaches to
Grammaticalization. 2 vols. Amsterdam: John Benjamins.

GROOMING, GOSSIP, AND LANGUAGE


Language (and speech) are unique to humans and, of course,
play a crucial role in making possible the human social and cultural worlds. Yet the question as to why language might have
evolved, or even why it evolved only in the human lineage, has
seldom been asked. At best, there has been an implicit assumption that language evolved to allow our ancestors to create and
manufacture the stone tools that have been so important a part
of the story of human evolution or to plan the hunts required to
obtain meat with those tools. However, a number of alternative
views of language evolution have recently been proposed that
emphasize the social aspects of language use. Perhaps the best
known of these is the gossip hypothesis. Others include the need
to make social contracts and the role of mate choice.
It is important to appreciate in this context that the use of the
term gossip in the gossip hypothesis does not imply the kinds of
pejorative, often malicious, forms of gossip that are often associated with the term today. Although conversations may include
statements of this kind (in effect, it is a form of policing designed
to control others' behavior), the term gossip is here being used in
its much broader original sense to refer to all kinds of topics that
are essentially social in character (information about oneself,
one's likes and dislikes, one's relationship with one's interlocutor, other people, arrangements for future social events, etc.). It
has more to do with the kinds of casual conversation one might
have around the hearth or over the garden fence.
In one sense, a conversation is a statement of intent or interest in the other party: "I would rather be standing here talking to you than over there talking to so-and-so" (and it does not really matter what we talk about). In this respect, language can be seen as a form of social grooming, and one might envisage that the origins of language lie in some kind of wordless and contentless chorusing when two individuals were physically separated and engaged in other activities like feeding. One reason for suggesting an intermediate state of this type is that we, in fact, find exactly this kind of vocal exchange in some species of monkeys (e.g., the contact calls of baboons; see primate vocalizations). In gelada baboons, these calls are exchanged preferentially between grooming partners when they are feeding or traveling. However, an alternative (but not necessarily incompatible) hypothesis for the origin of language might be that it evolved to allow us to comment on, or even organize, our internal thoughts and only later became externalized in response to a social context.

Background
Over the past century, a great deal of work has been done on questions about the anatomical and neural bases of language production (i.e., speech and phonetics; see brain and language; speech anatomy, evolution of; phonetics and phonology, neurobiology of) and on the structural bases of language (i.e., grammar and related aspects of cognition). However, while all of these are important to the grand story of language evolution, none addresses the question of what we actually do with language: the reasons why it evolved in our lineage in the form we now have. Understanding the function of language might help to explain some of the design features of language, since it is function that drives evolution.
One aspect on which almost everyone would agree is that
grammar plays a central role in language: It is what allows us
to express complex thoughts and convey information to one
another. Without that capacity, language would be very impoverished, and humans would not have been able to achieve the
remarkable accomplishments of science, culture, and architecture that we have. However, the fact that grammar exists or even
has a particular structure does not necessarily tell us what it was
designed to do. The fact that we can use language to create science now does not necessarily mean that it originally evolved
for this purpose. Grammatical structure is an all-purpose tool
that allows any kind of information to be transmitted. The issue,
then, is which kinds of information are evolutionarily primitive (i.e., were the initial driving force that selected for language
capacity).
The suggestion that language evolved for essentially social
purposes (but has since been exploited for the conveyance of
technical information) emerged from studies of social bonding in monkeys and apes. Although grooming has an obvious hygienic function in all animals (namely, removing debris and perhaps parasites from the skin), it is clear that some species of monkeys and apes do far more grooming than is really
necessary for strictly hygienic purposes. Some of the more social
species, for example, devote as much as 20 percent of their day
to grooming each other. In these species, grooming functions
as a mechanism for social bonding: Through the calming and
other physiological effects that grooming has on the recipient,
it establishes the bases for friendship and cooperative alliances.
More importantly, it turns out that the amount of time devoted
to social grooming by a given monkey or ape species is related to
the size of its social groups: The bigger the group, the more time
spent grooming. However, there seems to be an upper limit on
time spent grooming at about 20 percent of total daytime. This
limit is set by the demands of other essential activities, such as
foraging and traveling between feeding sites.


The puzzle that emerged out of this research was the realization that if humans were to bond their groups in exactly the same
way as other monkeys and apes, then the size of typical human
social groups would require us to devote more than twice the
maximum amount of time that any monkey or ape species has
ever been known to devote to this activity, which would involve
nearly half of all the hours in the day. Since the realities of having
to find food seem to impose an upper limit on grooming time for
monkeys and apes, it seems unlikely that humans could bypass
this constraint. In addition, for primates, the amount of time that
is free to devote to grooming limits group sizes. Hence, if there
is an upper limit on the amount of time that could be devoted
to grooming, then human groups would be limited to the same
sizes as those of other primates. The fact that they are quite
obviously not so limited (however they are measured, modern
human groups are clearly many times larger than the largest
groups found in any primate species) means that some other
mechanism has been brought into play to enable humans to
break through the glass ceiling imposed by time limits on grooming. Language was suggested as the likely explanation.
Language has several properties that make it more efficient
than grooming as a mechanism for social bonding. Grooming
is very much a one-on-one activity (as it still is, in fact, with us
today). That means that if you have to invest a certain amount of
time in each social partner to create a working relationship, the
number of relationships is ultimately fixed by how much time
you can spare. But language has broadcast capacities that enable
us to have a one-to-many relationship with our social partners,
thus allowing us to service several relationships simultaneously.
Speech also has the useful property of allowing us to multitask: We can talk and walk, or talk and feed, at the same time,
whereas grooming does not allow that. In addition, language
allows us to exhibit badges of group membership: Being able to
use the local dialect or to understand subtle jokes or obscure
allusions helps to label us as members of a community.
Over and above these properties, however, the information-transfer capacities of language allow us to do one thing that grooming does not, and this is to pass on information about the state of the social network in the absence of firsthand knowledge. For monkeys and apes, what they do not see they will never know about. But we can ask and be told about what others have been up to in our absence. That way, we can keep track of our dynamically changing social world, know about cases in which individuals have reneged on their social obligations to us, and more generally avoid the worst social faux pas that come from not knowing who is or is not now friends with whom.

Evidence for the Gossip Hypothesis


Although we can never know what actually happened when language first evolved, the case for the gossip hypothesis would be
strengthened if we could show that language use was heavily
dominated by social functions (in the loose sense, gossip). A
number of studies provide evidence of this kind.
An early study of freely formed, natural conversations indicated that around 65 percent of conversation time by both genders was devoted to social topics, with all other topics (including sports, politics, religion, technical or work-related topics, and factual matters of all kinds) accounting for only about 35 percent among them. A more detailed analysis of the social content itself suggested that the vast majority was concerned with
factual personal experiences (~30% of all social conversation
time), personal social/emotional experiences (~30%), or with
third-party social/emotional experiences (~30%). The balance
was devoted to critical comments about third parties and seeking/giving advice.
More recently, an experimental study of language transmission in which groups of four participants were asked to relay
information passed on to them by a previous member of a chain
found that both gossip (in the racy sense) and social information (about the actions of people in a story) were transmitted
much more reliably than was factual information about a social
event (with no motivational content) or descriptive information
about a nonsocial event (for example, tourist information about
a location). This suggests that information with a strong social
(and perhaps emotional) content is more memorable in some
way than other kinds of information. The transmission rates for
social and gossip content did not differ significantly, suggesting
that the issue is not the raciness of the content but the fact that it
concerns individuals' social and emotional lives.

Alternative Hypotheses
Two alternative hypotheses as to how language might function in the social domain have been suggested: the symbolic contract hypothesis (proposed by Terrence Deacon in his book The Symbolic Species) and the Scheherazade effect (proposed by Geoffrey Miller). Both are concerned with aspects of reproductive behavior.
Deacon observed that humans have a unique mating system
based on pairbonds that are embedded within a large multimale/
multifemale social group. What makes this a particular problem
is that the division of labor means that mates are often separated
for long periods of time in the presence of potential rivals. Since
the presence of rivals creates risks for the sexual fidelity of a pairbond, Deacon argued that language must have been needed to
establish formal social contracts in which exclusive mating rights
are identified and publicly agreed upon ("This is my mate; you are not allowed to interfere with him/her while I am away").
Miller's Scheherazade effect is also concerned with the business of mating, but in this case, the focus lies with the intrinsic
(as opposed to the extrinsic) dynamics of the pairbond: how to
woo and keep your mate interested, rather than merely guarding
him/her against rivals. Miller argued that the capacity to be witty
and entertaining (in the Arabian Nights sense) would have been
strongly selected for, especially in a context where more attractive rivals were readily available.
While both suggestions are undoubtedly plausible, for neither has any substantive evidence yet been adduced. However,
an alternative view might be to see both of these mechanisms
as being derivative of an initial situation for which language
had been selected as a more general mechanism for bonding
large social groups. One reason for this suggestion is simply that
Deacon's paradox (the risk to pairbonds posed by the presence
of large numbers of rivals) can only be a problem when social
group size is large. But without some kind of bonding mechanism over and above social grooming, it is difficult to see how
our distant ancestors would have been able to create and hold

Habitus, Linguistic
together large social groups. Thus, we might see language evolving initially as a social bonding device and then subsequently see
the skills underpinning language having been exaggerated by
either or both of these two mechanisms.
R. I. M. Dunbar
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Deacon, T. 1997. The Symbolic Species: The Coevolution of Language and the Human Brain. Harmondsworth, UK: Allen Lane.
Dunbar, R. I. M. 1993. Coevolution of neocortex size, group size and language in humans. Behavioral and Brain Sciences 16: 681–735.
Dunbar, R. I. M., N. Duncan, and A. Marriott. 1997. Human conversational behaviour. Human Nature 8: 231–46.
Mesoudi, A., A. Whiten, and R. I. M. Dunbar. 2006. A bias for social information in human cultural transmission. British Journal of Psychology 97: 405–23.
Miller, G. 1999. Sexual selection for cultural displays. In The Evolution of Culture, ed. R. I. M. Dunbar, C. Knight, and C. Power, 71–91. Edinburgh: Edinburgh University Press.

H
HABITUS, LINGUISTIC
Habitus is one of Pierre Bourdieu's two fundamental conceptual tools of analysis, the other being field. If field relates to the objective conditions of social space, then habitus is an expression of subjectivity and is defined by Bourdieu as
systems of durable, transposable dispositions, structured structures predisposed to function as structuring structures, that is, as principles which generate and organize practices and representations that can be objectively adapted to their outcomes without presupposing a conscious aiming at ends or an express mastery of the operations necessary in order to attain them. Objectively "regulated" and "regular" without being in any way the product of obedience to rules, they can be collectively orchestrated without being the product of the organizing action of a conductor. ([1980] 1990, 53)

Both habitus and field are homologous in terms of structures that are both structured and structuring. Linguistic habitus concerns the language element of any individual's or group of individuals' habitus.
Linguistic habitus is a central theme in Bourdieu's attack on orthodox linguistics. In Language and Symbolic Power ([1982] 1991), he argues that conventional linguistic studies are based on a fundamental misunderstanding. In linguistics, language is studied very much as an object of contemplation. This tradition began with Ferdinand de Saussure and treats the social world as a series of phenomena (of which language is one) that can be decoded or deciphered according to a particular established theoretical code. Bourdieu refers to this tradition as an intellectualist philosophy, against which he wishes to pose his own theory of practice. For Bourdieu, it is not enough to treat language as symbolic interactions; it is also necessary to see them as expressions of symbolic power (see inequality, linguistic and communicative). Every act of language takes place in a space governed by the rules of a field, which itself both forms and regulates the linguistic habitus. Linguistic habitus is intimately connected to habitus as a whole, which it helps to form and express.
It is a kind of social personality and, in language, is expressed
in linguistic dispositions. Such dispositions can be conscious but
are mostly unconscious motives, behaviors, and tendencies to
act and, in this case, speak in a certain way, a way originating
from social background. The objective background of the linguistic environment establishes norms of behavior in language,
which are both sanctioned and policed within the field. In this
way, dominant linguistic patterns are established, legitimated,
and consecrated.
Linguistic habitus can express itself at any level of language.
However, it has particularly strong markers in phonetics, syntax, and paralinguistic features (see paralanguage), such
as propensity to speak, interest, and expression. In effect, what
Bourdieu is attempting to do with this approach to language and
linguistics is to integrate the study of objective social variation
from sociolinguistics (see Vann 2004) with the patterns of
subjective affective convergence and divergence found in social
psychology. In linguistic habitus, objective patterns of social
variations can be found together with subjective dispositions
expressed in and against a linguistic field background. Field here
refers to both overarching dominant fields and fields within
fields. Because linguistic habitus is essentially formed as part
of and expresses power relations, all language acts, whether of comprehension or articulation, oral or aural, need to be understood as power relations and as symbolizing a certain relation
both to language and the field in which it arises. There are consequently forms of both self-censure and selectivity in language
uses, which express the linguistic habitus and the logics of practice that formed it and with which it is now confronted. There are
also strategies in language use, for example, euphemism in place
of direct expression. Such strategies can be seen to be employed
by those occupying positions of linguistic dominance in the
field: They play with language as a part of linguistic mastery.
Condescension is another of these features of linguistic dominance. A further strategy is hypocorrection: The linguistically
dominant can descend into vulgar speech as an expression of
their complete control of the dominant vernacular. It is acknowledged as such by those around them as a sign of distinction in
the way someone plays with the popular form. These strategies
are not open to those less linguistically secure. In fact, some of
them may even have recourse to the opposite strategy, hypercorrection, where anxiety to be linguistically correct is merely
interpreted as evidence of linguistic insecurity and, therefore, an
inferior position in the field.
Bourdieu uses linguistic habitus in a number of ways and
field contexts. However, it has particular significance in the areas
of culture and education. In culture, linguistic habitus is the
base generator of a certain style and way of being in the world.
Whether affected, direct, stylized, abrupt, or prompt, linguistic
manners can be understood in terms of the social conditions that
produced them and the social differentiating purposes for which
they were created. In fact, at one point, he even argues that our
entire language and its classificatory systems can be understood
as the expression of opposing (antagonistic) adjectives (for example, high/low, fine/coarse, light/heavy, broad/narrow, common/unique, brilliant/dull) and that these adjectives have as their social derivation the structure of society (and its dominant fields to be found in social classes; see Bourdieu [1979] 1984, 468).
Language, of course, is the medium of education, and it is
here that the convergence or divergence between a particular linguistic habitus and the field that surrounds it is most
apparent. In publications such as The Inheritors ([1964] 1979),
Reproduction ([1970] 1977), and Academic Discourse ([1965]
1994), Bourdieu shows the link between academic language and
individual habitus, expressed in cognitive and mental structures (ways of thinking) and the very language of such expression. There are power relations between teachers and students
played out when one linguistic habitus (that of the teacher) faces
another (that of the student) (see Grenfell 1998, 2004). The fact
that some students come from the same social origins as the culture represented in education and others do not is the basis of
matches and mismatches that impact academic achievement.
Put succinctly, the linguistic habitus of some students results in
their feeling like a fish in water during their schooling while others are most certainly out of the water and left somewhat high
and dry. Parents' own linguistic habitus even complements their
own explicit collusion in this process of hidden social selection,
as some students pass through to the upper echelons of academia while others drop out. Linguistic habitus, in this sense, is
their very being.
Ultimately, Bourdieu is seeking to transform empirical thinking (everyday/common sense) into scientific thinking (in his
case, sociological) by altering the linguistic habitus of researchers in its empirical forms in such a way that everyday language
is replaced, at least in part or partially reflected upon, by such
analytical thinking tools as habitus, field, and so on.
Michael Grenfell
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Bourdieu, Pierre. [1964] 1979. The Inheritors: French Students and Their Relation to Culture. Trans. R. Nice. Chicago: University of Chicago Press.
Bourdieu, Pierre, with Jean-Claude Passeron and Monique De Saint Martin. [1965] 1994. Academic Discourse. Oxford: Polity.
Bourdieu, Pierre, with Jean-Claude Passeron. [1970] 1977. Reproduction in Education, Society and Culture. Trans. R. Nice. London: Sage.
Bourdieu, Pierre. [1979] 1984. Distinction. Trans. R. Nice. Oxford: Polity.
Bourdieu, Pierre. [1980] 1990. The Logic of Practice. Trans. R. Nice. Oxford: Polity.
Bourdieu, Pierre. [1982] 1991. Language and Symbolic Power. Trans. G. Raymond and M. Adamson. Oxford: Polity.
Bourdieu, Pierre, with Loïc Wacquant. 1989. Towards a reflexive sociology: A workshop with Pierre Bourdieu. Sociological Theory 7.1: 26–63.
Encrevé, Pierre. 1983. La liaison sans enchaînement. Actes de la recherche en sciences sociales 46: 39–66.
Grenfell, Michael. 1998. Language and the classroom. In Bourdieu and Education: Acts of Practical Theory, ed. M. Grenfell and D. James, 72–88. London: Falmer.
Grenfell, Michael. 2004. Bourdieu in the classroom. In Culture and Learning: Access and Opportunity in the Curriculum, ed. M. Olssen, 49–72. Westport, CT: Greenwood.
Vann, Robert. 2004. An empirical perspective on practice: Operationalising Bourdieu's notions of linguistic habitus. In Pierre Bourdieu: Language, Culture and Education, ed. M. Grenfell and M. Kelly, 73–84. Bern: Peter Lang.

HEAD-DRIVEN PHRASE STRUCTURE GRAMMAR


HPSG: A Preview
The core idea of head-driven phrase structure grammar (HPSG), as a theory of grammatical representations, is that local syntactic dependencies (patterns of covariation accounted for in earlier versions of generative grammar via transformations) can be reduced to epiphenomena of selectional specifications on heads. In order to implement this explanatory strategy, HPSG models linguistic expressions as complex symbols, which interact with a small network of very general constraints to admit a subset of possible representations. The constraints that restrict structural admissibility are defined on local phrase structure representations (phrasal categories and their daughters) and take the form of feature (in)equality requirements, where feature/value pairs are used to encode separately transmissible properties of linguistic expressions. These constraints apply at a single level of representation and jointly determine the status of complex linguistic signs. Roughly speaking, sentences of the language correspond one-to-one with these admissible signs.
A true theory of grammar is a completely explicit set of representations, a subtheory of models for the representations the
theory sponsors, and fully interpreted constraints on those representations. One objective of such a theory is to make it clear
why sentences such as Terry put that book on the table, That
book, Terry put on the table, On that table Terry put the book, and
so on are well formed, whereas not only *Table put the on the that
Terry book but also *Terry put the book, *Terry put the book on,
*Terry put on the table, and so on, are all bad. The first of these
bad examples has all the words of the well-formed versions cited
earlier but hopelessly scrambled together; the latter examples
have the right order, but all appear to be missing crucial parts.
The problem is to account for these facts and for the similarities
and differences in the seemingly related sentences exhibited.
The theory of syntax embodied in HPSG appeals to principles of valence satisfaction (how lexical items combine with other elements that they require as parts of their idiosyncratic individual properties), together with what are, in effect, lexical redundancy rules, to account for all facts of the kind noted in the previous paragraph and all syntactic dependencies, including parallelisms between seemingly related construction types. The following discussion presents a more technically fleshed-out instantiation of this general approach to characterizing grammatical well-formedness.

(1) [Figure 1: a partial attribute-value matrix for the lexical entry criticized, specifying PHON criticized and, within SYNSEM|LOC|CAT, a HEAD value verb with VFORM fin, a SUBJ list containing an NP and a COMPS list containing an NP, together with the CONT value w(j)(i) and CONX (contextual) and NONLOC information.]

(2) [Figure 2: (a) the head-complement structure licensed for criticized Leslie, in which the verb's COMPS requirement is discharged by the NP Leslie, leaving a mother with HEAD verb (VFORM fin), the SUBJ requirement still outstanding, an empty COMPS list, and CONT w(j)(i); (b) the same structure abbreviated as a VP dominating criticized and Leslie.]

An Illustration of the System: Constituency and Dependencies
As noted, local dependencies are uniformly accounted for in
HPSG as instances of nothing more elaborate than valence satisfaction, or systematic relationships between valence specifications of related classes of lexica. This approach is sufficient
to account for even quite distant linkages, such as It continues
to appear to have been raining during the night, by means of a
feature subj encoding subject-selection possibilities on lexical
heads and their phrasal projections.
LOCAL DEPENDENCIES. The best way to exhibit the nature of HPSG's approach to an account of the patterns reflected in natural language is to take an example of a single sentence and show how it is licensed by the interaction of HPSG constraints. Assume, for example, that the lexical entry for criticized comprises a number of separate (partial) descriptions that include a description roughly along the lines of (1) in Figure 1.
As this (partial) lexical entry illustrates, signs in HPSG comprise a ramified feature geometry that simultaneously specifies phon(ology) and syn(tax)/sem(antics), the latter a complex information package revealing loc(al) and nonloc(al) properties, itemizing, respectively, the inherent morphosyntactic and semantic properties of the category, on the one hand, and the presence of elements within the sign with nonlocal linkages (the presence of a gap linked to a filler arbitrarily far away, wh properties, etc.), on the other hand. Further subspecifications identify inherent morphosyntactic properties of the sign, including head (those necessarily shared with the head's phrasal mother by virtue of the mother/head-daughter relationship), along with semantic and contextual information.

The key valence features comp(lement)s and subj(ect) encode the combinatorial requirements of particular lexical elements, making it possible to eliminate specific phrase structure rules in favor of broad schemata that correctly project phrases from lexica regardless of the latter's valence peculiarities. Each such schema identifies a certain very general kind of structure, which will be fleshed out in detail, depending, in many cases, on the particular lexical item that heads the structure. Two of the most important conditions interacting with these schemata are the valence principle and the head feature principle, which can be stated roughly as follows:

Valence Principle: For any valence feature f, the value of f on the mother is the list containing the value of f on the head daughter, minus the values corresponding to the head daughter's sisters.

Head Feature Principle (HFP): For any phrasal category, the value of the head feature is identical to the value of the category's head daughter's head feature.
The valence principle specifies that the appearance of any required valent in a local structural relationship to a selecting head removes the corresponding element from the must-have list of the mother. This principle does not, of course, limit the number of valents that may appear as the head's sisters. But the schematic possibilities of English require a phrase of this type to be lexically headed, and in order to satisfy both this requirement and the valence principle, exactly the number and type of complement sisters that the verb identifies on its comps list (i.e., the list of descriptions that must be satisfied by the head's selected sisters) must appear in the structure so as to yield the empty complements lists on the mother. We thus license the structure in (2)a, which can be abbreviated as (2)b in Figure 2. The head-subject schema allows a completely saturated verb (V) to have a phrasal head daughter with a subj list of length one. Again, in conjunction with the valence principle, this schema allows a verb phrase (VP) combining with a constituent that exactly meets the description indicated in its subj to appear as a structure of type head-subject under an S (i.e.,
clausal) node, that is, a V with both an empty comps and an empty subj list. Thus, Robin criticized Leslie will be straightforwardly licensed.

(3) [Figure 3: (a) the filler/gap structure for Terry, I know stories about ___: the filler Terry bears a loc value 1, and the matching slash value is shared down through the clause, the VP know stories about ___, the NP headed by stories, and the PP headed by about to the gap site; (b) the lexical entry for the slash terminal (gap) category, with empty PHON and a SYNSEM whose LOC value 1 is identified with its NONLOC|SLASH value.]
Note further that *Robin criticized and *Robin criticized Leslie
certain books will both be ruled out. Neither of them will satisfy the
head-complement schema under the constraint imposed by the
valence principle. In the first case, the lack of a complement daughter will yield a result with a nonempty complements list, violating
the schema requirement. In the second case, the valence requirements on criticize will not equal the sum of those on the mother plus
the set of daughters that appear in the structure, as the full form of
the valence principle requires. Hence, both fail to be licensed.
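The licensing and blocking pattern just described can be made concrete in a deliberately simplified sketch. The following Python toy (the Sign class, its string-valued categories, and the two schema functions are illustrative inventions, not HPSG's typed feature structures) shows how the valence principle discharges items from the comps and subj lists and rejects over- or under-saturated structures:

```python
# Toy model of HPSG-style valence satisfaction (illustrative simplification).
from dataclasses import dataclass, field
from typing import List

@dataclass
class Sign:
    phon: List[str]                                  # PHON
    head: str                                        # HEAD (shared up the tree)
    subj: List[str] = field(default_factory=list)    # unsatisfied SUBJ valents
    comps: List[str] = field(default_factory=list)   # unsatisfied COMPS valents

def head_complement(head: Sign, sisters: List[Sign]) -> Sign:
    """Head-complement schema + valence principle: the mother's COMPS list is
    the head daughter's COMPS minus the valents realized as sisters."""
    if [s.head for s in sisters] != head.comps:
        raise ValueError("sisters must satisfy the COMPS list exactly")
    return Sign(phon=head.phon + [w for s in sisters for w in s.phon],
                head=head.head,      # head feature principle
                subj=head.subj,      # subject still outstanding
                comps=[])            # complements discharged

def head_subject(subject: Sign, vp: Sign) -> Sign:
    """Head-subject schema: a phrase with empty COMPS and a one-item SUBJ
    list combines with a matching subject, yielding a saturated S."""
    if vp.comps or vp.subj != [subject.head]:
        raise ValueError("subject must match the single SUBJ requirement")
    return Sign(phon=subject.phon + vp.phon, head=vp.head)

criticized = Sign(["criticized"], "verb", subj=["NP"], comps=["NP"])
vp = head_complement(criticized, [Sign(["Leslie"], "NP")])
s = head_subject(Sign(["Robin"], "NP"), vp)
print(" ".join(s.phon))   # Robin criticized Leslie
# *Robin criticized: head_complement(criticized, []) raises ValueError,
# because the COMPS list is left unsatisfied.
```

Supplying an extra sister, as in *Robin criticized Leslie certain books, fails the same check from the other direction: the sister list would exceed the comps list, mirroring the second ruled-out case above.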
The valence feature lists do not in fact specify information
about the entire sister sign selected by the head. Lexical selection
is universally blind to phonological form and also to the descending constituent structure of any selected phrasal constituent;
therefore, it makes sense to restrict valence specifications to synsem values. But the latter contain enough information to implement virtually all local dependencies as simple expressions of
selection. Thus, it is possible to select not only for coarse-grained
information, such as the number of valents and their respective
lexical category types, but also for information such as their case,
or the inflectional properties of their heads (on the assumption
that the latter are head features and will therefore be shared
between a phrasal valent and its lexical head daughter). The morphosyntactic dependency in English between the identity of auxiliary elements, on the one hand, and the inflectional form of the
verb that immediately follows them, on the other hand, follows
simply and directly by specifying that, for example, the auxiliary have selects as its complement a VP preserving, in its own feature specifications, the inflectional class of its head daughter (specifically, the value psp, encoding past-participial status).
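The same toy style can sketch this selection-via-head-features point (again an illustrative simplification: the Head dataclass and the bare psp/fin strings stand in for HPSG's head-feature sharing and typed values):

```python
# Toy sketch: a phrase carries the HEAD value of its lexical head daughter,
# so an auxiliary can constrain the VFORM of the verb heading its VP
# complement. Illustrative only, not the actual HPSG formalism.
from dataclasses import dataclass
from typing import List

@dataclass(frozen=True)
class Head:
    cat: str       # e.g. "verb"
    vform: str     # e.g. "fin", "psp" (past participle)

@dataclass
class Phrase:
    phon: List[str]
    head: Head     # identical to the lexical head daughter's HEAD value

def project(word: str, head: Head, comps: List[Phrase]) -> Phrase:
    """Project a phrase from its lexical head; HEAD percolates to the mother."""
    return Phrase([word] + [w for c in comps for w in c.phon], head)

def aux_have(vp: Phrase) -> Phrase:
    """Illustrative entry: the auxiliary have demands a VP[psp] complement."""
    if vp.head != Head("verb", "psp"):
        raise ValueError("have requires a past-participial VP")
    return project("have", Head("verb", "fin"), [vp])

print(" ".join(aux_have(project("criticized", Head("verb", "psp"), [])).phon))
# have criticized
# aux_have(project("criticize", Head("verb", "base"), [])) raises ValueError.
```

Because the VP's head value is shared with its lexical head daughter, the auxiliary never inspects the phrase's internal structure, which is the point of restricting selection to synsem-level information.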

Nonlocal Dependencies

Nonlocal dependencies, in contrast, do not depend on the selectional possibilities of particular (classes of) lexical items. Unbounded dependency constructions involving extraction, or the percolation of wh properties (as in so-called pied-piping phenomena), throughout arbitrarily large structures exist in natural language and must be accounted for.

HPSG follows the central innovation introduced by Gerald Gazdar a quarter of a century ago in treating filler/gap linkages as a by-product of the diffusion of a specific feature slash, whose value is correlated with that of the filler and which is driven from mother to (at least one) daughter by the nonlocal feature principle:

Nonlocal Feature Principle (NFP, simplified): For any nonlocal feature f, the value of f on a phrase at any point below the place in the structure where f was introduced is the union of the values of f on the set of daughters.

The NFP states, in essence, that below the point where the nonlocal feature is introduced, at least one daughter in each two-generation tree must bear the value of f specified. In the case of filler/gap constructions, the feature slash is identified at the top of the dependency with the relevant part (called the loc value, illustrated in Figure 2) of the filler. The descendants of the highest category with a nonempty slash feature preserve this value in their own specifications, until at the bottom of the dependency, the slash value is cashed out as an empty category. An example is illustrated in (3)a, with the relevant lexical entry for a slash terminal category listed in (3)b in Figure 3. This approach to filler/gap dependencies comports well with familiar phenomena in languages that mark such dependencies by local flagging of extraction pathways; it also has considerable advantages over movement-based approaches in the analysis of multiple gap linkages to a single filler, such as parasitic and across-the-board extractions in coordinate structures.

Remaining Issues

There are, of course, a number of other important aspects to the grammar architecture incorporated in HPSG, some of them involving major points of debate within the framework.

Hippocampus
The syntax/semantics interface issue is not fully resolved in
HPSG, in the sense that there are a number of competing
approaches to a variety of issues, ranging from the nature of
the objects in HPSG representations that map to semantic
representations to the mapping rules themselves. Different
answers to these questions typically correspond to major differences in syntactic analyses.
Probably the most fundamental point of contention is the disagreement within HPSG about the degree to which patterns in natural languages reflect, on the one hand, lexical properties and systematic mappings between lexical descriptions or, on the other hand, restrictions on specific constructions, as encoded by constraints imposed on types belonging to very elaborate ontologies. At one extreme,
HPSG includes a set of lexical heads with very abstract
properties and possibly no phonological realization, as in
earlier treatments of relative clauses. At the other extreme
are intricate multi-inheritance hierarchies in which properties of constructions are essentially posited as underived
primitives, or are derived by combining a number of such
underived primitives.
These and a number of other foundational issues are currently
under intense discussion and debate within the HPSG research
community, and there is no reason to expect a consensus on any
of these questions in the near future.
On the basis of the relatively brief history of the framework
so far, it seems very possible that divergences in approaches
to these fundamental matters will in time yield major schisms
within the theory, leading in the end to two or more major versions of the theory, in much the way that categorial grammar has
split into combinatory categorial grammar, on the one hand,
and the type-logical version based on the Lambek calculus, on
the other. It remains to be seen whether the theory will develop
along these lines.
Robert D. Levine
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Borsley, Robert. 1996. Modern Phrase Structure Grammar. London: Blackwell.
Levine, Robert, and Thomas Hukari. 2006. The Unity of Unbounded Dependency Constructions. Stanford, CA: CSLI Publications.
Pollard, Carl, and Ivan Sag. 1994. Head-Driven Phrase Structure Grammar.
Chicago: University of Chicago Press.

HIPPOCAMPUS
The hippocampus is a bilaterally symmetric subcortical structure adjacent to the lateral ventricle in the medial temporal lobe (MTL). Researchers in 1968 reported dramatic memory phenomena associated with hippocampal-MTL damage, and data
reported from 1998 to 2006 have indicated parallel phenomena
for many other aspects of cognition, including language. I first
discuss patient H. M., the initial source of data for the hippocampal-memory and hippocampal-language links. I then discuss
related patient groups and the theoretical significance of the
hippocampal-language link.

Because of the unique and circumscribed nature of his 1953 surgery, H. M. is probably the most studied patient in the history of neuropsychology (Ogden and Corkin 1991): A neurosurgeon inserted thin metal tubes above the eyes and, via suction, removed parts of H. M.'s hippocampus and directly linked MTL structures. This operation greatly ameliorated H. M.'s life-threatening epilepsy, left H. M.'s neocortex virtually undamaged, and spared all neocortex with known links to language comprehension. However, the operation caused a selective memory deficit, with normal recall of information familiar to H. M. before his operation and used frequently since then, but impaired recall of information newly encountered after his operation and not massively repeated since then (see MacKay et al. 2007).
H. M. has sentence-level language deficits that precisely mirror his memory deficits. D. G. MacKay et al. (2007) tested H. M.'s sentence-level comprehension in six tasks. In one task, participants identified the grammatical versus ungrammatical status of never previously encountered sentences that were either grammatical or ungrammatical (see grammaticality). Here, H. M. responded with the correct answer reliably less often than controls matched for age, IQ, and education. This comprehension deficit affected a wide variety of syntactic structures, including ones that memory-normal participants find easy to recall: H. M. exhibited equivalent comprehension deficits for easy- and difficult-to-recall sentences.
In a second task, H. M. again performed reliably worse than
controls in identifying grammatical sentences as grammatical
and in detecting, identifying, and repairing errors in sentences
containing incorrect and misordered words. A third task required
multiple-choice identification of who-did-what-to-whom in
novel sentences. Here, H. M. identified the correct thematic
role of sentence constituents reliably less often than controls. A
fourth task required multiple-choice recognition of the appropriate interpretation for sentences containing novel metaphors.
Here, H. M. chose the correct interpretation reliably less often
than controls, and his errors indicated failure to recognize that
the sentences were metaphoric. A fifth task required yes-no recognition of the appropriate interpretation for ambiguous sentences. Here, H. M. responded correctly less often than controls and sometimes responded "yes-and-no" despite repeated requests to respond "yes-or-no."
Consistent with several earlier results discussed next, H. M.'s ambiguity comprehension deficits were not due to memory overload associated with multiple meanings: In the ambiguity detection and description task of MacKay, Stewart, and Burke (1998), H. M. took much longer than controls to begin to describe the first of two meanings in ambiguous sentences, even when he never discovered the second meaning. H. M. also discovered both meanings without experimenter help less often than controls and often failed to understand meanings that the experimenter had just explained. Research isolated seven deficits in how H. M. described the sentence meanings: grammatically impossible interpretations, misreadings reflecting failure to comprehend sentence-level meaning, errors in pronoun use (anaphora), error correction failures, free associative responses, self-miscomprehensions,

and failures to follow experimenter requests for clarification.
Research also indicated comprehension failure involving an
initial meaning for sentences, ambiguous or not.
To summarize, in a wide range of tasks involving many fundamental aspects of sentence comprehension, H. M. exhibited
deficits not caused by his memory problems (for corroborating
evidence on H. M.'s comprehension deficits, see Corkin 1984; Lackner 1974; and Schmolck, Stefanacci, and Squire 2000). However, H. M.'s comprehension deficits were selective rather than across the board: Experiment six in MacKay et al. (2007) demonstrated that H. M. comprehended familiar words and phrases in isolation without deficit despite large deficits in comprehending these same stimuli when embedded within sentences. Besides demonstrating selectivity, these results indicated that H. M.'s deficits were not attributable to low motivation, to
failure as a child to learn the meaning of the critical words and
phrases, or to failure to understand and follow instructions for
the task.
H. M. also exhibited significant production deficits when describing the meanings of familiar words that he comprehended without deficit in MacKay et al. (2007): Judges blind to speaker identity rated H. M.'s meaning descriptions as reliably more redundant, less coherent, less grammatical, and less comprehensible than those of controls. These findings replicated earlier results indicating deficits in H. M.'s production of novel or non-cliché sentences (see MacKay, Stewart, and Burke 1998). Again, however, H. M. exhibited selective production deficits that mirrored his memory deficits, for example, spontaneously producing cliché phrases such as "in a way" (familiar from before his surgery) without errors (ibid.).
H. M. also exhibited similar deficits and sparing in the seemingly simple task of reading sentences aloud (MacKay and James
2001): He produced abnormal pauses at major syntactic boundaries unmarked by commas in the sentences, but normal pauses
at syntactic boundaries marked with commas, a prosodic marker
that H. M. had learned prior to his operation. H. M. also produced
abnormal pauses within unfamiliar phrases in the sentences, but
normal pauses within frequently used phrases. These and other
selective deficits indicated that he has difficulty with the process
of reconstructing novel aspects of sentence structure when reading aloud.
H. M. also exhibited similar deficits and sparing in visual
cognition: When detecting target figures hidden in concealing
arrays, he performed reliably worse than controls for unfamiliar
targets but not for familiar targets (MacKay and James 2000). In
short, H. M. exhibits similar selective deficits in visual cognition,
episodic memory, sentence-level comprehension, and sentence
production when speaking and reading aloud; impaired processing of never previously encountered events, visual figures,
phrases, and propositions; but spared processing of information familiar to him before his lesion and used frequently since
then.
Why are these parallels important? One reason is that H. M.
is not unique: Other patients with hippocampal-MTL damage
exhibit identical parallels, reinforcing the links among hippocampus-MTL, language, and memory. For example, other amnesiacs
exhibit deficits in detecting the two meanings in ambiguous sentences (Zaidel et al. 1995) and make errors resembling H. M.'s in reading novel sentences aloud (Friedman 1996; MacKay and James 2001).
Second, these parallels are difficult to explain in current systems theories, in which independent systems process memory,
language comprehension, language production, and visual cognition, and the hippocampus subserves only the memory system (see, e.g., Schmolck, Stefanacci, and Squire 2000). Under
systems theories, hippocampal-MTL damage should yield
memory deficits without deficits in other cognitive systems, and
certainly without parallel deficits and parallel sparing across
supposedly independent systems for sentence comprehension,
sentence production, visual cognition, and episodic memory.
These predictions have failed, and major attempts to rescue
current systems theories from these failed predictions have likewise failed (see MacKay 2001, 2006; and MacKay, James, and
Hadley 2008).
Third, a new theoretical framework known as binding theory (not to be confused with the anaphoric binding theory in linguistics; see Jackendoff 2003, 15) readily explains and, indeed, originally predicted the links between hippocampal-MTL damage and parallel deficits and sparing in memory, sentence-level language, and other aspects of cognition. Under binding theory, hippocampal-MTL damage impairs binding mechanisms for forming new internal representations in the cortex but does not affect mechanisms for activating already existing cortical representations (see, e.g., MacKay et al. 2007 and James and MacKay 2001 for important theoretical details regarding forgetting, frequency of use, and aging and language).
To illustrate in detail how binding theory explains his selective deficits, consider H. M.'s sentence production in a standard picture-description task requiring the incorporation of prespecified target words (MacKay et al. 2007): H. M. described the word-picture stimuli significantly less accurately and completely than eight controls, included fewer target words, and produced more incomplete sentences (e.g., lacking a subject or verb), violations of agreement rules, non sequiturs, and run-on sentences than the controls. Descriptions by H. M. (1a-2a) versus controls (1b-2b) for the same word-picture stimuli illustrate some of these differences.
(1a) H. M. description: "Because it's wrong for her to be and he's dressed just as this that he's dressed and the same way."
(1b) Control description: "Well, I think I'll take that one although it looks wrong."
(2a) H. M. description: "I want some of that pie either some pie and I'll have some."
(2b) Control description: "Uh, there are two people getting pie, but there's only one piece of blueberry pie left, and so, either one of them will have to have it."
Note that H. M.'s picture-description problems in 1a and 2a were selective: Unlike agrammatic aphasics, H. M. did not produce morphemes and nonsense words jumbled together into "morphological salads" (Jackendoff 2003, 264). Moreover, he produced frequently used units, such as "it's wrong," "to be," "the same way" (1a), "some of that," and "I'll have some" (2a), without errors. Under binding theory,
separately stored syntactic units and rules serve to activate already formed internal representations so that words and phrases are produced in the appropriate order. Because H. M.'s syntax-based activation mechanisms are intact and frequently used since his lesion, he produces familiar words, phrases, and propositions, such as "it's wrong" and "I'll have some," without errors. However, he lacks already formed internal representations for the novel propositions needed to describe the MacKay et al. (2007) word-picture stimuli. The word-picture stimuli, therefore, triggered familiar units that H. M. simply concatenated without forming complete, appropriate, and coherent utterances (see 1b, 2b).
In conclusion, the pressing problem for future research is to
test new binding theory predictions for relations among brain,
language, memory, and other aspects of cognition (see MacKay
et al. 2007).
Donald G. MacKay
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Corkin, S. 1984. Lasting consequences of bilateral medial temporal lobectomy: Clinical course and experimental findings in H. M. Seminars in Neurology 4: 249–59.
Friedman, R. B. 1996. Phonological text alexia: Poor pseudo-word reading plus difficulty reading functors and affixes in text. Cognitive Neuropsychology 13: 869–85.
Jackendoff, R. 2003. Foundations of Language: Brain, Meaning, Grammar, Evolution. Oxford: Oxford University Press.
James, L. E., and D. G. MacKay. 2001. H. M., word knowledge, and aging: Support for a new theory of long-term retrograde amnesia. Psychological Science 12: 485–92.
Lackner, J. R. 1974. Observations on the speech processing capabilities of an amnesic patient: Several aspects of H. M.'s language function. Neuropsychologia 12: 199–207.
MacKay, D. G. 2001. A tale of two paradigms or metatheoretical approaches to cognitive neuropsychology: Did Schmolck, Stefanacci, and Squire demonstrate that "Detection and explanation of sentence ambiguity are unaffected by hippocampal lesions but are impaired by larger temporal lobe lesions"? Brain and Language 78: 265–72.
MacKay, D. G. 2006. Aging, memory and language in amnesic H. M. Hippocampus 16: 491–4.
MacKay, D. G., D. M. Burke, and R. Stewart. 1998. H. M.'s language production deficits: Implications for relations between memory, semantic binding, and the hippocampal system. Journal of Memory and Language 38: 28–69.
MacKay, D. G., and L. E. James. 2000. Binding processes for visual cognition: A hippocampal amnesic (H. M.) exhibits selective deficits in detecting hidden figures and errors in visual scenes. Poster presented to the Cognitive Neuroscience Society, San Francisco.
MacKay, D. G., and L. E. James. 2001. The binding problem for syntax, semantics, and prosody: H. M.'s selective sentence-reading deficits under the theoretical-syndrome approach. Language and Cognitive Processes 16: 419–60.
MacKay, D. G., and L. E. James. 2002. Aging, retrograde amnesia, and the binding problem for phonology and orthography: A longitudinal study of hippocampal amnesic H. M. Aging, Neuropsychology, and Cognition 9: 298–333.
MacKay, D. G., L. E. James, and C. Hadley. 2008. Amnesic H. M.'s performance on the Language Competence Test: Parallel deficits in memory and sentence production. Journal of Clinical and Experimental Neuropsychology 30.3: 280–300.
MacKay, D. G., L. E. James, J. K. Taylor, and D. E. Marian. 2007. Amnesic H. M. exhibits parallel deficits and sparing in language and memory: Systems versus binding theory accounts. Language and Cognitive Processes 22.3: 377–452.
MacKay, D. G., R. Stewart, and D. M. Burke. 1998. H. M. revisited: Relations between language comprehension, memory, and the hippocampal system. Journal of Cognitive Neuroscience 10: 377–94.
Ogden, J. A., and S. Corkin. 1991. Memories of H. M. In Memory Mechanisms: A Tribute to G. V. Goddard, ed. W. C. Abraham, M. Corballis, and K. G. White, 195–215. Hillsdale, NJ: Erlbaum.
Schmolck, H., L. Stefanacci, and L. R. Squire. 2000. Detection and explanation of sentence ambiguity are unaffected by hippocampal lesions but are impaired by larger temporal lobe lesions. Hippocampus 10: 759–70.
Zaidel, D. W., E. Zaidel, S. M. Oxbury, and J. M. Oxbury. 1995. The interpretation of sentence ambiguity in patients with unilateral focal brain surgery. Brain and Language 51: 458–68.

HISTORICAL LINGUISTICS
Historical linguistics is the study of how languages change over
time. We can approach the study of change in various ways. One
is by studying the histories of individual languages. An example
would be analyzing the changes that have taken place in English
over the last thousand years. A second approach involves comparing various related languages in order to draw inferences
about the types of changes that have occurred since the time
they split from their common ancestor. We can further study the
frequency and naturalness of the changes that the languages are
hypothesized to have undergone and the effects of change in
one area on other parts of the language. Finally, historical linguistics is also concerned with finding explanations for change,
including why languages change and how a particular change
was actuated in a particular circumstance. Historical linguistics
also intersects with other fields. For example, a linguist trying
to reconstruct the geographic extent of a proto-language may
also make use of data from both archaeology and historical
anthropology.
The following discussion begins with an abbreviated history
of the field, from classical times to the early twentieth century.
From there, we turn to contemporary research and conclude
with a look at possible future directions for the field.

History of the Field


Modern historical linguistics was developed in the nineteenth century, although there were both eighteenth- and seventeenth-century scholars who practiced something that today's linguists would recognize. In contrast, Classical Greek and Roman linguists had little to say about language history despite considerable sophistication in their synchronic descriptive techniques (see synchrony and diachrony). The study of language change in the Graeco-Roman world was largely confined to etymology, that is, to claims about the origin of individual lexical items. For example, in Plato's Cratylus, the word anthropos is said to derive from the contraction of the phrase anathro:n ha opo:pen ("looking up at the things he's seen"). Similar methods were employed by Latin linguists such as Varro, who claimed that

anas 'duck' is related to the verb nare 'to swim'. Furthermore, although numerous similarities between Latin and Greek were noted, it was assumed that all such words were direct borrowings into Latin from Greek. Shared common ancestry from a language no longer spoken was never considered (see extinction
of languages). Such an assumption was in keeping with the
strong cultural debt that the Roman world owed to the Greek
(see Law 2003).
The classical etymological method continued to be employed
throughout the Middle Ages, where it was joined by theories of
language change and diversity built on the biblical story of the
destruction of the Tower of Babel (Gen. 11). A summary of the
theory can be found in Dantes de Vulgari Eloquentia (for one
English translation, see Shapiro 1990). Such a theory of diversity
specifies both a cause of language change and a partial model
of the origin of modern linguistic diversity. Work within a model
of change laid the foundation for much later linguistic scholarship, for it led to questions about the language that was spoken
by those erecting the Tower of Babel (and, therefore, what the
first human language was) and exactly how modern attested languages related to one another.
The de Vulgari Eloquentia is also a founding discussion of
relationships among the vernacular languages of Europe. The
rise of the study of Romance vernaculars led to an examination
of systematic differences among those languages, as well as comparison with Latin (for example, why Latin de is a preposition
meaning 'from', but in French and Italian it marks possession).
As R. H. Robins (1968, 100 ff.) notes, it was this examination that allowed the development of an adequate framework for diachronic linguistics, because it afforded a chance to study change where the parent language was already well understood.
In the early Middle Ages, there was also a highly sophisticated
Arabic comparative linguistic tradition. Ibn Hazm (994–1064) noted regular correspondences among Hebrew, Arabic, and Syriac, and in the Ihkam Ibn Hazm further identified changing pronunciation and language contact as driving forces in the creation of linguistic diversity (see contact, language).
In the sixteenth to eighteenth centuries, we begin to see the
study of language change linked to other branches of linguistics,
such as typology. Scholars such as Konrad Gessner ([1555] 1974), Joseph Justus Scaliger ([1599] 1610), and later Andreas Jäger (1686) and Peter Simon Pallas (1786) collected and compared vocabularies of the languages available to them and made hypotheses about linguistic relationships on this basis. However, the comparisons are unsystematic and based mostly on very few features. For example, Scaliger divides the languages of Europe into four major classes, depending on whether their word for 'god' is based on deus, theos, gott, or bog (roughly corresponding to Latin/Romance, Greek, Germanic, and Slavic, respectively). Gessner ([1555] 1974, 110) deduces that Armenian is closely related to Hebrew because of the similarity of words such as lezu 'tongue' (Hebrew laschon in Gessner) and hhatz 'cross' (Hebrew etz or hetz). Moreover, until G. W. Leibniz, there is no
conception that languages could have been descended from a
language that is no longer attested.
The beginning of modern historical linguistics and the comparative method is often said to date from a speech given by Sir William Jones to the Asiatic Society in Calcutta in 1786, in which he noted similarities between Sanskrit and the languages of Europe
and hypothesized that they may come from a common ancestor: "[N]o philologer could examine the Sanskrit, Greek, and Latin, without believing them to have sprung from some common source, which, perhaps, no longer exists. There is a similar reason, though not quite so forcible, for supposing that both the Gothic and the Celtic had the same origin with the Sanskrit."
(The speech is quoted in almost all introductory textbooks for
historical linguistics; see, for example, Campbell 2004 and Trask
1996.) However, as we have seen, elements of comparative and
historical linguistic methods predate Jones by several hundred
years.
The nineteenth century saw an explosion of work on historical linguistics and the reconstruction of language history;
in fact, many methods developed during this period are still in
use. It is this period that gives us the idea of the correspondence
set: a set of words in related languages that are descended from
a common proto-form and which exhibit regular phonological correspondences. For example, English three, Latin trēs, Greek treis, Sanskrit trayas, and Gothic þreis all reflect proto-Indo-European *treyes. Furthermore, the correspondences
among phonemes in these languages are regular. For example,
English th corresponds to Latin t in cognate words (cf. father
and pater, brother and frater, among others; for the method and
further reconstructions, see Trask 1996). The first detailed discussion of such correspondence sets dates to A. Turgot ([1756]
1961), and the method was systematized by Rasmus Rask
(1818). We also, shortly afterwards, find the first reconstructions
of Indo-European. These are in the work of August Schleicher
(1848), who not only reconstructed lexical items but also constructed an Indo-European fable. Schleicher also introduced
the Stammbaum, or family tree, model of linguistic relationship
(see language families ).
Neither Schleicher nor his contemporaries placed much
weight on the importance of regularity in correspondences,
however. Scholars of the following generation, including Karl
Brugmann and Berthold Delbrück (the Neogrammarians or
Junggrammatiker), were the first to recognize the importance
of regularity in sound change for the comparative method
and to use it as a tool for discriminating between inheritance
and analogy. The recognition of regularity in sound change
is the pillar of historical reconstruction and forms the
basis of much modern work on historical linguistics. Without
a conception of sounds changing regularly in particular phonetic environments, it is impossible to identify irregularities,
to reconstruct proto-forms, and thereby to form a reliable idea
of linguistic relationships. However, arguments about the universal regularity of sound change continue. One area involves
the paradox between apparent regularity at the macro level
and irregularity when a language at a particular stage in time
is examined. That is, a language is not homogeneous at any
stage because of the amount of dialectal variation among
speakers. A second point of debate is the applicability of models relying on regularity of sound change to languages outside
of Europe, for instance, those spoken by hunter-gatherer communities in Australia.

Historical Linguistics
A further methodological coup is due to Ferdinand de
Saussure ([1915] 1972). Saussure hypothesized, on the basis of
both internal evidence of root structure and varied correspondences among vowels in Indo-European languages such as Greek
and Sanskrit, that there had once been a further set of laryngeal
consonants (possibly /h/, // and /w/) that had disappeared in
all environments in attested languages. Saussure's contribution
is very important to historical linguistics for two reasons: First, it
is the foundation of internal reconstruction, that is, the
hypothesis of reconstructions based on synchronic patterns in
one language, rather than a direct comparison of forms between
languages. Second, it demonstrates very clearly the power of the
comparative method and the importance of regularity in correspondence sets. The subsequent decipherment of inscriptions in
Hittite and Luwian confirmed Saussure's hypothesis, since two of
the laryngeals are preserved in these languages precisely where
we should expect to find them.
Historical linguists in the twentieth century made progress in
the reconstruction of families outside Europe, in the growing use
of quantitative methods in modeling and describing language
change, and in historical syntax (see syntactic change). Most
of the methods used today were developed through the reconstruction of proto-Indo-European; however, the methods have
also been successfully applied to other families. Much early historical work was done on Finno-Ugric languages (e.g., Sajnovics
[1770] 1968; Gyarmathi 1799), and more recently, there has been
considerable progress in the reconstruction of Austronesian
(Pawley and Ross 1993), Niger-Congo (cf. Hombert and Hyman
1999), and numerous families in North America (see, for example, Campbell 1997).
While the family tree model has been very influential in historical linguistics, other models of language change are also
used. Perhaps the most commonly cited is the wave theory of
Johannes Schmidt and Jules Gilliéron (see Gilliéron 1921), who proposed that sound changes diffuse through the lexicon, gradually
affecting more and more instances of phonemes in a given environment. Regularity in correspondences is thus epiphenomenal
and only appears once a change is complete. Another common
appeal to wave theory is in subgrouping, where it is argued that
the family tree is not an accurate representation of language splitting. Rather, linguistic differentiation occurs through the gradual
building up of isoglosses and changes affecting individual lexical
items.

work on historical syntax concerns word order change in the synchronic analysis of ancient languages or the causes of syntactic
change, rather than reconstruction per se. Influential here has
been the work of David Lightfoot (e.g., 1999), who has developed
a theory of change that assigns the primary cause of change to
a childs acquisition of the language. Since children are exposed
to linguistic data that is slightly different from what their parents
were exposed to, they therefore draw slightly different conclusions about the syntactic structure of their language. We see this
reflected in the historical record as a syntactic change. Others are
less comfortable in ascribing change solely to grammar change at
acquisition and argue that syntactic changes also occur in adult
speakers as a result of exposure to new languages and dialects,
changing prestige, and other factors. There is ongoing debate
over the extent to which a person may spread change as an adult,
since there are also clearly generational differences in linguistic
production (see age groups).
The study of reconstructions within historical phonology,
morphology, and syntax may also be used to classify languages
into genetic families. Language classification must also take ideas
of language contact into account. Extensive contact between two
unrelated languages may over time lead to enough similarities
that it is difficult to tell whether they are related or not. Several
languages have been misclassified on this basis. For example,
Armenian was originally classified as an Indo-Iranian language
rather than as its own branch of Indo-European because of the
number of loans it exhibits.
There are other less widely accepted methods of investigating linguistic prehistory. One is lexicostatistics, which involves
estimating genetic relatedness by comparing the percentage of
vocabulary common to pairs of languages. Underlying the method
is the assumption that languages that share more common material are likely to be more closely related. Glottochronology uses
the estimations from lexicostatistics to estimate the time depth
of a particular family. Mass comparison (e.g., Greenberg 1987)
involves using large-scale word lists to reconstruct further back
than the strict application of the comparative method allows, by
granting more exceptions to the regularity of sound change and
greater latitude in semantics. In each case, the methods are not
widely accepted. For example, gross similarity between two
languages may be caused by several factors apart from common
genetic inheritance, including chance and borrowings, and only
detailed reconstruction of correspondences by the comparative
method allows us to choose between them.

Current State of Research


The methods of historical linguistics can be applied to all areas
of language study. Within historical phonology, there has been
a great deal of work on types of sound change, the plausibility
of different changes, and the mechanisms by which change is
spread throughout a speech community. (For discussion and
different approaches, see Ohala 1993, Blevins 2004, and Labov
2001.) The study of morphological change was particularly important in the nineteenth century, and the reconstruction of morphemes and paradigms is still an important area of
research, as is grammaticalization theory.
Historical syntax has received rather less attention than historical morphology or phonology, mostly because of the difficulties of applying the comparative method to syntactic constructions.

Future Prospects
Currently, our knowledge of the history of different language
families in the world is very uneven. Some families, including Indo-European, Finno-Ugric, and Algonquian, have been
reconstructed in detail. In other cases, we are not even sure
which languages belong to the family, let alone what the proto-language looked like.
There are still many pressing concerns and active areas of
research. The first is in the reconstruction of various language
families. Basic original reconstruction research is needed for
much of the world. Secondly, there is an ever-increasing concern
with questions about how and why languages change. We have
long moved away from arguments of language change involving sloppy speech or linguistic degeneration by ignorant speakers.
Instead, research has focused on the relative importance of language acquisition in language change versus social factors, such
as peer pressure, prestige, and diffusion. Linguistic research is
also important for the study of prehistory and ancient population
movement.
Not everyone is convinced that the methods discussed in
this entry are generally applicable to all languages and language families in the world. As already noted, the family tree
has been the predominant model for 150 years. However, some
have pointed out its reliance on transmission from parents
to children, which ignores other types of transmission that
can lead to rapid language change, such as creolization (see
creoles) and the formation of mixed languages (Thomason
and Kaufman 1988).
Finally, historical work increasingly involves computational
modeling and the integration of techniques used in computational biology. The last 10 years have seen an increasing amount
of sophisticated statistical analysis and computational modeling
in research (for an overview, see McMahon and McMahon 2006).
We also see work that aims at estimating time depth and rates of
language change. It remains to be seen, however, how successful
this work will be. No matter how sophisticated the techniques for
statistical analysis, any estimates of time depth need also to take
into account sophisticated theories of language change. At this
point, we have no idea why languages change and split at different rates, although such differing rates are clearly observable in
the historical record.
Claire Bowern
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Blevins, Juliette. 2004. Evolutionary Phonology. Cambridge: Cambridge
University Press.
Campbell, Lyle. 1997. American Indian Languages: The Historical
Linguistics of North America. Oxford: Oxford University Press.
———. 2004. Historical Linguistics: An Introduction. Cambridge, MA: MIT
Press.
Gessner, Konrad. [1555] 1974. Mithridates: de Differentiis Linguarum
tum Veterum tum quae Hodie apud Diversas Nationes in Toto Orbe
Terrarum in Usu Sunt. Neudruck der Ausgabe Zürich. Aalen: Scientia
Verlag.
Greenberg, Joseph. 1987. Language in the Americas. Stanford,
CA: Stanford University Press.
Gilliéron, Jules. 1921. Pathologie et thérapeutique verbales.
Paris: Champion.
Gyarmathi, Sámuel. 1799. Affinitas linguae Hungaricae cum linguis Fennicae originis grammatice demonstrata. Vocabularia dialectorum Tataricarum et Slavicarum cum Hungarica comparata.
Göttingen.
Harris, Alice, and Lyle Campbell. 1995. Historical Syntax in Cross-Linguistic Perspective. Cambridge: Cambridge University Press.
Hock, Hans H. 1991. Principles of Historical Linguistics. Berlin:
Mouton.
Hombert, Jean-Marie, and Larry Hyman. 1999. Bantu Historical
Linguistics: Theoretical and Empirical Perspectives. Stanford,
CA: CSLI.
Jäger, Andreas. 1686. De Lingua Vetustissima Europae, Scytho-Celtica et
Gothica. Wittenberg.

Labov, William. 2001. Principles of Linguistic Change. Oxford: Blackwell.
Law, Vivian. 2003. The History of Linguistics in Europe: From Plato to
1600. Cambridge: Cambridge University Press.
Lightfoot, David. 1999. The Development of Language: Acquisition,
Change and Evolution. Oxford: Blackwell.
McMahon, April, and Robert McMahon. 2006. Language Classification by
Numbers. Oxford: Oxford University Press.
Ohala, John. 1993. The phonetics of sound change. In Historical
Linguistics: Problems and Perspectives, ed. Charles Jones, 237–78.
London: Longman.
Pallas, Peter Simon. 1786. Linguarum totius orbis vocabularia comparativa. St. Petersburg.
Pawley, Andrew, and Malcolm Ross. 1993. Austronesian historical linguistics and culture history. Annual Review of Anthropology
22: 425–59.
Rask, R. 1818. Undersøgelse om det gamle nordiske eller islandske sprogs
oprindelse. Copenhagen.
Robins, R. H. 1968. A Short History of Linguistics. Bloomington: Indiana
University Press.
Sajnovics, János. [1770] 1968. Demonstratio idioma Ungarorum et
Lapponum idem esse, ed. Thomas Sebeok. Bloomington: Indiana
University Press.
Saussure, Ferdinand de. [1915] 1972. Course in General Linguistics
[Cours de linguistique générale]. Trans. Roy Harris. Peru, IL: Open Court
Classics.
Scaliger, Joseph Justus. [1599] 1610. Diatriba de Europaeorum linguis. In
Opuscula varia antehac non edita. Paris.
Schleicher, August. 1848. Sprachvergleichende Untersuchungen. / Zur
vergleichenden Sprachgeschichte. 2 vols. Bonn: H. B. Koenig.
Shapiro, Marianne. 1990. De Vulgari Eloquentia, Dante's Book of Exile.
Lincoln and London: University of Nebraska Press.
Thomason, Sarah Grey, and Terrence Kaufman. 1988. Language Contact,
Creolization, and Genetic Linguistics. Berkeley: University of California
Press.
Trask, R. L. 1996. Historical Linguistics. London: Arnold.
Turgot, A. [1756] 1961. Étymologie. Brugge: De Tempel.

HISTORICAL RECONSTRUCTION
Like all complex things in nature, languages change over
time. Historical reconstruction is a process of inference by
which changes are undone so as to recover certain aspects
of historically nonattested linguistic structure and content in
hypothetical form. Although it can be applied to languages
for which written documentation is available (e.g., the reconstruction of proto-Romance next to the extant records of Latin)
or to single languages by means of internal reconstruction, historical reconstruction typically is directed to prehistoric languages and requires the simultaneous comparison of multiple
witnesses.
The set of procedures used in historical reconstruction is
called the comparative method. After the elimination of
chance, borrowing, and universals as plausible causes of cross-linguistic similarity, it can be shown that the resemblances
between two or more languages must result from a common
origin (usually called genetic relationship), coupled with divergent descent. In this respect, historical reconstruction in linguistics shows striking parallels to the study of the evolutionary
history of natural species, and much of its terminology is conceptually modeled on that of biological taxonomy. Once it has
been determined that languages are genetically related, a more
exact picture of their historical connection can be achieved by
the reconstruction of a proto-language or hypothetical common
ancestor.
Although some progress has been made with other aspects
of historical reconstruction, it is widely agreed that the
comparative method has been successfully applied only in
phonology. For reconstruction to begin, it is necessary to
identify a corpus of cognate morphemes, that is, morphemes
that have a common historical origin, identified inductively
as forms of similar meaning that exhibit recurrent sound
correspondences. Distinct sound correspondences that are
found in the same or closely similar environments normally
must be attributed to different proto-phonemes, and the
inventory of proto-phonemes so inferred forms a hypothesis
about the sound system of the proto-language of a language
family. Since proto-phonemes can only be reconstructed in
lexical forms, phonological and lexical reconstructions
are inextricably bound together. Reconstructed phonemes
and reconstructed words (proto-forms) are preceded by an
asterisk to indicate their hypothetical status. Some linguists
take the position that the phonetic substance of such symbols is beyond recovery and that proto-phonemes are, therefore, little more than abstract formulas used to summarize
sound correspondences (the formulaic position). The majority view (the realist position) is more sanguine; although the
phonetic nature of some proto-phonemes clearly is controversial, many others permit little latitude in interpretation.
A related, though distinct, issue concerns the structure of
reconstructed phonological systems, since some of these
have violated implicational universals in typology.
Where this occurs, most historical linguists today would
question the validity of the reconstruction, using the typological generalizations of the present as a guide to inferences
about the past.
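The bookkeeping at the heart of this step, tallying recurrent sound correspondences across cognate sets, can be sketched as follows. The word pairs are invented and pre-aligned segment by segment, which sidesteps the genuinely hard parts of the method (identifying cognates, aligning segments, and weighing environments).

```python
from collections import Counter

# Invented, pre-aligned cognate pairs from two hypothetical daughter
# languages; each word is a tuple of segments of equal length.
cognates = [
    (("p", "a", "t"), ("f", "a", "t")),   # 'stone'
    (("p", "u", "k"), ("f", "u", "k")),   # 'fire'
    (("t", "a", "p"), ("t", "a", "f")),   # 'water'
]

def correspondences(pairs):
    """Count segment-by-segment sound correspondences across cognate sets."""
    counts = Counter()
    for word_a, word_b in pairs:
        for seg_a, seg_b in zip(word_a, word_b):
            counts[(seg_a, seg_b)] += 1
    return counts

corr = correspondences(cognates)
print(corr[("p", "f")])  # the p : f correspondence recurs 3 times
```

A correspondence that recurs across many cognate sets, like p : f here, is a candidate reflex of a single proto-phoneme; distinct correspondences in the same or closely similar environments would instead point to distinct proto-phonemes, as described above.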
Once lexical reconstructions are available, it becomes possible to determine sound changes in a large number of daughter languages. No topic in linguistics has a longer history than
the study of sound change, which commenced during the first
quarter of the nineteenth century with the pioneering studies of
Rasmus Christian Rask and Jacob Grimm. A major point of controversy in the study of sound change is the issue of regularity.
It is now generally agreed that the strict Neogrammarian position, which ruled out the possibility of unconditioned phonemic splits, is overly restrictive. A second issue, which is yet to be
resolved, is whether all sound change is phonetically (or phonologically) motivated.
An examination of sound changes leads not only to theoretical models of how (and why) this process occurs but also to
evidence for subrelationship within a language family. Sound
changes that are exclusively shared (exclusively shared innovations) form the basis for linguistic subgroups. Subgrouping
allows linguists to move beyond the mere recognition of a
language family as an internally undifferentiated collection
of related languages to the reconstruction of a family tree that
defines the historical order of splits of major and minor groups
of languages within the family. The structure of a family tree,
in turn, supports inferences about the most likely center of dispersal, or homeland, of a language family (or subgroup ancestor) and, hence, gives rise to hypotheses about direction
of migration that can be tested against the evidence of other
scholarly disciplines, such as archaeology or population genetics. However, it has been recognized since at least the 1870s
that not all processes of linguistic differentiation are treelike,
and it is widely accepted that both family tree and wave models accurately describe the process of language split, the former
under conditions of sharp social separation and the latter under
conditions of gradual differentiation of independent languages
from a dialect complex.
The reconstruction of the proto-Indo-European case system
in the first half of the nineteenth century marked the beginning
of work on comparative morphosyntax, but many would argue
that since this involves the identification of cognate affixes, it is
a variant of lexical reconstruction. In recent years, greater attention has been paid to problems of reconstruction in other areas
of syntax, such as word order, and in semantics. It is noteworthy that the models for such work almost invariably derive
from typological approaches to synchronic linguistic structure,
rather than from formal theories of syntax (see synchrony and
diachrony).
A proto-language inevitably presents a very incomplete picture of the language that must actually have existed. Nonetheless,
the comparative method, which is generally thought to permit
reconstruction of languages up to about 6,000 years old, is a
powerful tool that allows a variety of inferences about prehistoric language communities and their cultures. The potential
use of linguistic reconstruction for inferences about culture history was recognized in the second half of the nineteenth century
and labeled linguistic palaeontology by Ferdinand de Saussure.
However, little use was made of this potential until relatively
recent times. Since roughly the 1970s there has been increasingly
fruitful interdisciplinary cooperation, especially between historical linguists and archaeologists, in exploring Holocene human
prehistory. This has led to a renewed inquiry into the antiquity of
the Indo-European settlement of Europe and has been a powerful force in understanding the prehistoric human settlement of
the Pacific.
Robert Blust
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Blust, Robert. 1987. Lexical reconstruction and semantic reconstruction: The case of Austronesian house words. Diachronica
4: 79–106.
Campbell, Lyle. 1999. Historical Linguistics: An Introduction. Cambridge,
MA: MIT Press.
Harris, Alice C., and Lyle Campbell. 1995. Historical Syntax in Cross-Linguistic Perspective. Cambridge: Cambridge University Press.
Kiparsky, Paul. 1995. The phonological basis of sound change. In The
Handbook of Phonological Theory, ed. John A. Goldsmith, 640–70.
Oxford: Blackwell.
Renfrew, Colin. 1998. Archaeology and Language: The Puzzle of Indo-European Origins. London: Cape.
Ross, Malcolm, Andrew Pawley, and Meredith Osmond, eds. 1998.
The Lexicon of Proto Oceanic: The Culture and Environment
of Ancestral Oceanic Society. Vol. 1. Material Culture. Pacific
Linguistics C-152. Canberra: Department of Linguistics, Research

School of Pacific and Asian Studies, The Australian National
University.
———. 2003. The Lexicon of Proto Oceanic: The Culture and Environment
of Ancestral Oceanic Society. Vol. 2. The Physical Environment. Pacific
Linguistics 545. Canberra: Department of Linguistics, Research
School of Pacific and Asian Studies, The Australian National
University.

HOLOPHRASTIC STAGE, THE


Despite extensive individual variation, children on average
begin to produce words around the end of the first year of life.
Initially, there is a period of slow word learning of several months,
when the rate of learning is usually not much more than eight or
so words a month. Toward the end of this period of slow word
learning, two changes take place. First, word learning begins to
increase, such that the rate eventually increases to between one
and two new words a day. Second, children begin to combine
words to form their first sentences (see two-word stage ).
This period of word acquisition, demarcated at the start by the
first words and at the end by the onset of word combinations,
is commonly referred to as the holophrastic stage of language
acquisition.
An alternative would be to refer to this period as the one-word stage. This option would be descriptively adequate, in that it captures the fact that children's productions are limited to a single word. Lois Bloom (1973), in fact, in an extensive study of her daughter's first words, made this choice, entitling her book One Word at a Time. The term "holophrastic," however, is making a somewhat different claim about children's language acquisition when they are producing single words. "Holophrastic" can be defined as a single word expressing the ideas of a phrase or sentence. The term more explicitly states that something more than single word production is taking place. When adults produce single words, such as "doggie" or "eat," it is usually with the intent to express the meaning of the individual word. When a child around age one says "doggie" (or probably a little later "eat"), however, he or she combines the meaning of the individual words with the communicative intention of the utterance, intending a broader meaning, such as "there is a doggie, and I can see it" or "there is an apple, and I want to eat it."
The decision to call the period of slow word learning the one-word stage versus the holophrastic stage is more than a matter of personal taste. The former choice reflects a more conservative view of the child's grammatical knowledge at this point in language acquisition. The latter choice, on the other hand, represents the position that children's knowledge of grammar may be greater than that directly seen in single-word productions. The present entry provides an overview of these two viewpoints and discusses the nature of children's emerging grammatical knowledge during the time in which they produce one-word utterances. It suggests that grammatical development begins during this period, as reflected in children's understanding of sentences and in some of the patterns of their single-word productions, particularly as they near the point of making word combinations.


Theoretical Approaches
At the outset, it is important to understand the possible claims about grammatical knowledge during the holophrastic period. The most conservative claim would be, basically, "what you see is what you get," that is, that children only produce single words because that is all they know at that point. This approach has come to be known as a lean interpretation of children's knowledge. At the other extreme, one can claim that children's grammatical knowledge is greater than what is reflected in single-word productions. This rich interpretation is based on the fact that language acquisition is rapid and that children must have innate language learning mechanisms that enable them to determine the grammatical characteristics of their language at a very early age (see innateness and innatism).
These two positions can be demonstrated by again looking at a child's production of the words "doggie" and "eat." A lean interpretation would be that the child has some initial semantic categorization of word meaning, such that "doggie" represents an emerging category of animate objects, and "eat" represents an emerging category of actions. A rich interpretation would propose that innate principles enable the child to establish explicit grammatical knowledge from these semantic categories. One principle would be that languages universally categorize things as nouns and actions as verbs. Another principle would be that categories like noun and verb are heads of larger units or phrases, that is, noun phrases (NPs) and verb phrases (VPs). Another principle would be that NPs and VPs are semantically connected through semantic relations such as agent and action. Seeing the doggie eating would lead to the establishment of the sentence category. This process of building grammatical knowledge from semantics has been called semantic bootstrapping (Pinker 1984).
How, then, does one decide between these two positions? The answer depends on the importance placed on supporting children's grammatical advances with observable changes in their linguistic behavior. The rich interpretation deals with grammatical development as a logical problem. Since the complexities of language are acquired so rapidly, it seems reasonable to assume that linguistic principles must be at work at the very onset of language acquisition. Researchers who study children's actual comprehension and production of language, however, set an additional requirement. Claims about children's abilities must be supported in the way children are comprehending and producing language. The remainder of the present entry discusses the evidence for grammatical knowledge during the holophrastic stage, based on studies of children's comprehension and production.

Language Comprehension
Studying the language comprehension of one-year-olds can be
difficult since they are not capable of responding to the tasks typically used for older children and adults. Researchers, however,
have come up with several clever ways to get at least a general
idea of children's understanding at this age. If children's knowledge of language is limited to single words, then one would predict that they should do as well or better responding to utterances of one word than to those of multiple words. The results of several studies, however, have shown that children can respond to
multiword utterances and that they have an awareness that the
words are related in a way suggestive of the relations between
words in a sentence.
E. Shipley, C. Smith, and L. Gleitman (1969) examined young children's responses to commands directed toward them by their mothers while the children played. The commands were a single word, for example, "ball!"; two words, for example, "throw ball!"; and well-formed commands, for example, "throw me the ball!" The results showed that children around the end of the holophrastic period and beginning of word combinations actually responded most often when they heard the well-formed commands. While not directly showing that the children understood the well-formed commands, the results indicated that children were aware that such commands met the characteristics of English sentences, while the other two did not.
Other studies have examined more directly whether very young children can differentiate multiword utterances on the basis of the specific words in them. One potential problem in testing children on this aspect is that they may give the appearance of understanding a multiword utterance when they are only doing what they typically would do. For example, a child who throws a ball when told "throw the ball" may do so just because that is what children typically do with a ball. To avoid this problem, J. Sachs and L. Truswell (1978) used novel combinations that were not likely to be part of children's experiences. Test sentences included unusual commands, such as "smell truck" and "kiss truck." The results indicated that young children were able to respond correctly to such novel commands, suggesting that they were aware of at least two-word relations in the sentences they heard. Other research by J. Miller and colleagues (1980) has examined the range of sentence types that are understood by children during the holophrastic stage. The results indicate that this range is limited. Children did best on responding to sentences that communicated an action-object relation (e.g., "kiss the shoe") and to those that communicated a possessor-possessed relation (e.g., "mama's shoe"). They did less well on other relations, such as agent-action (e.g., "make the horsey kiss").
These studies examined relations between lexical words in sentences. A further question would be whether or not holophrastic children are also becoming aware that sentences contain smaller functional words as well, such as articles and auxiliaries. The fact that young children preferred the well-formed commands suggests that this may be the case. N. Katz, E. Baker, and J. Macnamara (1974) explored this issue by showing children pictures that contained either a single instance of a nonsense figure (e.g., an odd-shaped form called "zav") or a picture of more than one instance. The children were then asked either "show me zav" or "show me the zav." It was found that the children tended to indicate the picture of the single instance in the former case and the picture of the multiple instances in the latter case. The children in this study were a bit older than holophrastic children, but they were not yet using articles in their spontaneous speech.
In summary, a variety of studies on children in or around
the holophrastic stage indicate that they are beginning to

understand the nature of multiword utterances. They can process relations between at least two words in some sentences and
are aware that sentences contain both stressed lexical words
and unstressed words. They may not be aware of the nature of
the latter words, but they know that sentences require them in order to be well formed.

Language Production
The claim that preliminary knowledge of grammar takes place during the holophrastic stage would be strengthened if evidence could also be found in children's spoken language. At first glance, this would seem impossible since holophrastic children are only producing a single word at a time. There are, however, aspects of children's spoken language at this stage that, taken together, indicate emerging grammatical knowledge as well.
It is well known that holophrastic children are not very intelligible, the result of the fact that they are limited in their phonetic skills and are, in most cases, mixing their single-word productions with babbling. A. Peters (1983) also pointed out that some children are not exclusively single-word producers. She identified children who do not just attempt single words but attempt to imitate and repeat longer sentences. These longer utterances are often hard to interpret and may be identified in many cases as some form of jargon, that is, attempts to produce sentence-length productions without meaning. Peters found, however, that some of these jargon productions may be meaningful, though the meaning may be missed by their parents. These productions do not represent evidence that holophrastic children know grammar, but they support the previously stated results on well-formed commands, that is, that the children know that sentences consist of more than single words.
Peters (1983) and others also drew attention to the fact that even single-word productions are not exclusively single words. With the advent of advanced tape recorders, researchers found in their phonetic transcriptions that words often were preceded by brief phonetic material that was often difficult to hear or interpret. For example, a child who was saying "book" was actually saying something like "uh book" or "uhm book." These brief phonetic instances have been given several names, such as filler syllables, phonetically consistent forms, and presyntactic devices. The last term reflects the opinion of many researchers that these filler syllables are not yet syntactic units, such as articles or auxiliaries, but they are evidence that children are taking notice that such units exist and noting their distributional characteristics.
A further characteristic of holophrastic speech is that single words are produced in sequences. Bloom (1973) examined the successive single-word utterances of her daughter to see if these sequences reflected later word combinations. For example, does a child who says "eat cookie" when she begins word combinations show earlier single-word sequences, such as "eat, cookie," during the holophrastic stage? Such cases would provide evidence of grammatical knowledge during the holophrastic stage. Bloom found that her daughter's early sequences did not show these relations, in that each word had
its own context, that is, a distinctly associated action. Later, however, toward the end of the holophrastic period, sequences of single words with a shared event began to emerge. For example, one sequence of words involving "up," "neck," "zip" was
all produced in the context of her daughter wanting her mother
to zip up her coat.
The last piece of evidence to suggest that early words are holophrastic comes from returning to the original sources of the term. Early diary studies many years ago by parents on their children's language learning recorded observations that their children used single words with differing communicative contexts. One of the most famous was the study by W. Leopold (1939–49) on his daughter, Hildegard. Leopold noted that Hildegard used her single words to express distinct communicative intents. Some utterances were intended to show her noticing something, some to make a request for the parent to perform some action, and others to demand something such as a toy out of reach. These different functions were distinguished by variations in the prosody of the word and by the child's gestures.

Summary
Children go through a period of language acquisition of six months or so during the second year of life when they produce single words one at a time. This time of acquisition has been referred to as the holophrastic stage. The reason is that the single words used often communicate the idea of a sentence, that is, the meaning of the word expressed and the communicative intent of the utterance. The term suggests that preliminary grammatical acquisition is taking place during this period. Research on children's comprehension and production during this stage suggests that this may be the case.
David Ingram
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Bloom, L. 1973. One Word at a Time. The Hague: Mouton.
Ingram, D. 1989. First Language Acquisition. Cambridge: Cambridge
University Press.
Katz, N., E. Baker, and J. Macnamara. 1974. What's in a name? A study of
how children learn common and proper names. Child Development
45: 469–73.
Leopold, W. 1939–49. Speech Development of a Bilingual Child: A
Linguist's Record. 4 vols. Evanston, IL: Northwestern University
Press.
Miller, J., R. Chapman, M. Bronston, and J. Reichle. 1980. Language
comprehension in sensorimotor stages V and VI. Journal of Speech
and Hearing Research 23: 284–311.
Peters, A. 1983. The Units of Language Acquisition. Cambridge: Cambridge
University Press.
Pinker, S. 1984. Language Learnability and Language Acquisition.
Cambridge, MA: Harvard University Press.
Sachs, J., and L. Truswell. 1978. Comprehension of two-word instructions by children in the one-word stage. Journal of Child Language
5: 17–24.
Shipley, E., C. Smith, and L. Gleitman. 1969. A study in the acquisition of
language: Free responses to commands. Language 45: 322–42.

only by locating them within complexes of relations or structures.


One simple and influential technique of structuralist analysis,
developed by Claude Lvi-Strauss, is the isolation of homologies.
A homology is a simple binary opposition mapped onto another
binary opposition as its structural equivalent. One model here is
phonology, where, for example, voiced/unvoiced pairs (e.g.,
b/p and d/t) may be understood as structurally equivalent with
respect to voicing. Lvi-Strauss takes homologies to manifest
important structural relations across a range of higher linguistic
levels, including, for example, narrative. Thus, we might say that
in Shakespeares Hamlet, Hamlet is to Laertes as Claudius is to
Hamlet.
In a way, this particular homology is self-evident. But why
does it work? Hamlet killed Laertes father, just as Claudius killed
Hamlets father. Moreover, Hamlet was in love with Laertes sister, just as Claudius was in love with Hamlets mother. Finally,
Hamlet was partially responsible for Laertes losing his sister,
as Claudius was responsible for Hamlet being, to some degree,
separated from his mother.
As this suggests, homologies operate through larger complexes of relations. In his four-volume Introduction to the Science
of Mythology (see, for example, Lvi-Strauss 1969), Lvi-Strauss
systematically explored structural analysis beyond homologies
through the concept of transformation sets. A transformation set
is a series of multiplace structures that map onto one another.
The mapping is defined by transformation rules, which are triggered by specifiable conditions, often cultural conditions. For
example, there may be a transformational relation between the
myths of two groups say, agriculturalists and fishers such that
when the myths of one group refer to earth, the parallel myths of
the other group refer to water.
One model here is morphology. For instance, in English,
the plural morpheme is pronounced s after unvoiced nonsibilants (as in cats), z after voiced nonsibilants (as in dogs),
and z after sibilants (as in bushes). S, z, and z form a transformation set, and the contextual trigger defining the transformation is phonological.
Clearly, the situation is more complex and the analysis less
straightforward with higher-level structures, such as literary
works. Consider Hamlet. In addition to the structures already isolated, Hamlet and Ophelia may be mapped onto each other in
losing their fathers due to an older relative/older lover, in feigning
madness/going mad, in contemplating suicide/committing suicide, and so on. The establishment of such a transformation set
raises intriguing questions. For example, are the Hamlet/Ophelia
structures differentiated by a simple gender context (i.e., male
vs. female)? If so, how is this related to the mapping of Hamlet's mother onto Laertes's sister in the Hamlet/Laertes transformation set? Lévi-Straussian analysis allows us to recognize such complex
patterns and, perhaps, begin to understand them as well.
Patrick Colm Hogan
WORK CITED

Lévi-Strauss, Claude. 1969. The Raw and the Cooked. Trans. John and Doreen Weightman. New York: Harper. This is the first volume of Introduction to the Science of Mythology.

ICON, INDEX, AND SYMBOL
The nineteenth-century American philosopher C. S. Peirce developed extensive sign theories in order to explain reference,
meaning, communication, and cognition. One of the central
and most innovative features of his theories was the icon, index,
symbol classification of signs.
A crucial aspect of understanding Peirce's icon, index, symbol division is his account of sign structure. According to Peirce,
any instance of signification consists of three interrelated parts: a
sign, an object, and an interpretant. For the sake of simplicity,
we can think of the sign as the signifier, for example, a written
word or an animal's footprint. The object, on the other hand,
is whatever is signified, for example, the object denoted by the
written word or the animal that left the print. The interpretant is
the understanding or interpretation that the sign/object relation
generates, for example, that the word or utterance is meant to
refer to its object or that the animal track signifies the presence
of the animal that made it. The importance of the interpretant
for Peirce is that signification is not a simple dyadic relationship
between sign and object: A sign signifies an object only if it can
be interpreted as such.
With this structure in mind, Peirce was interested in classifying the various ways in which the sign/object relation might
generate an interpretant. In particular, he thought that a sign
might come to signify its object, and so generate an interpretant,
in three possible ways. First, a sign may be understood as signifying in virtue of similarities or shared qualities between it and its
object. As Peirce says, "I call a sign which stands for something merely because it resembles it an icon" (1935b, 362). His own
preferred examples of icons are portraits or mathematical diagrams; indeed, he thought icons were especially important to
mathematical thought. However, we can also include examples
such as color swatches, sculptures, and so on. What is central to
iconic signification is that the qualities of the sign are also qualities of the signified object and that this sharing of qualities is crucial in enabling the sign to signify.
The second way in which a sign might be understood as signifying is in virtue of some physical or causal connection between
it and its object. Such a sign is an index. Peirce's own description of an index is as "a sign which refers to the object that it denotes by virtue of being really affected by that object" (1935a, 248).
Again, there are numerous and wide-ranging examples, including demonstratives and indexical expressions, weather vanes,
barometers, fever as a sign of an underlying illness, or smoke
as a sign of fire. What is crucial to indices is that the object has
a causal effect upon the sign (as in the case of fire causing the
smoke that indicates it) or has some spatio-temporal proximity
to its sign, which can be used to aid an interpreter of the sign
to grasp that object (as in the case of pointing to some nearby
object).
The third way in which a sign might be understood as signifying is in virtue of some convention or law that connects it to its
object. Peirce's own description of a symbol is as "a sign which refers to the object that it denotes by virtue of a law, usually an association of general ideas, which operates to cause the symbol to be interpreted as referring to its object" (1935a, 249). There
are numerous examples of symbols, from the various words and
utterances in human languages to such things as road signs.
What is crucial in the case of symbols is that there exists some
underlying convention, agreement, habit, or law that means
that invoking some symbol invokes its associated object. For
instance, a red traffic light's being symbolic of a lack of priority
at a road junction works because we have all agreed (by habit, by
convention, and by imposing traffic regulations) to use red traffic
lights this way.
Throughout his life, Peirce made numerous alterations to his
account of signs (see, for instance, Short 2004), but the broad
division among icons, indices, and symbols tends to find a place
throughout. There are, of course, some subtleties to Peirce's account. For instance, it is not clear that there are very many examples of signs that are purely iconic, indexical, or symbolic, that is, signs which do not overlap with one or both of the other elements of the trichotomy. As an example, take a painted portrait
as a sign of the person it depicts. This sign is an icon in that it signifies its object in virtue of the qualities it shares with that object: the skin and hair color of the depicted person are replicated in
the painting. But, of course, many of the things that make a portrait a successful depiction of its sitter are due to particular conventions governing paintings and how particular blocks of color
in two dimensions can stand for some subject. This seems to
make the painting look as though it has symbolic elements, too.
Similar considerations hold for indices such as barometers: although such signs indicate their objects in virtue of a causal and physical connection with their object, conventions about
how we should interpret this physical connection also seem to
play a part in signification. What's more, there are clear instances
of symbols that have some iconic element. Obvious examples
might include forms of writing, such as Chinese, that involve
pictograms, at least partially. Even onomatopoeic words such as "cuckoo" present clear cases of symbols with a strong iconic element: the phonic qualities of the object are aped by the phonic
qualities of the word.
Peirce was aware of the various overlaps among icons, indices, and symbols, and at some point proposed to call icons and indices with symbolic elements "hypo-icons" and "subindices" as
a way of acknowledging this. However, in any case where more
than one of the three elements is present, one will be most
prominent. Consequently, we can think of Peirce's trichotomy
as dividing signs according to whether they are predominantly
iconic, indexical, or symbolic.
The main influence of Peirce's division is in semiotics,
where his work is considered foundational. However, the icon,
index, symbol distinction has had some influence in philosophy, particularly through the work of Arthur Burks (1949), and
has even been used in such diverse areas as literary theory (see,
for example, Sheriff 1989), film theory (see, for example, Wollen
1969; see also film and language), and musicology (see Turino
1999; see also music, language and). The use and relevance of
this distinction to linguistics are similarly diverse, but it features
most prominently in analyses of the relation between animal
communication and human language and in some explanations of the evolution of language.

In explaining animal communication, the distinction is especially useful since it allows us to classify various cases of animal language without treating all such instances as uniform.
Consequently, a diverse range of animal camouflage or cases
of mimicry can be classified as iconic instances of communication. For example, the harmless milk snake's mimicking of the poisonous coral snake's red, black, and yellow coloring in order to avoid predation is easily explained as an instance of iconic communication: these colors mean "poisonous!" As for indexical
communication, a well-discussed case is vervet monkey warning
calls (see Seyfarth, Cheney, and Marler 1980; see also primate
vocalizations). In such an example, the calls are classifiable
as indexical since they rely upon a causal and physical connection with particular predators in order to refer the calls are made
in response to the snakes, eagles, or leopards whose presence is
perceived. And this is all in contrast to human language, which
is predominantly symbolic and can enable communication even
if the objects referred to are not present. Ingar Brinck and Peter Gärdenfors (2003) make compelling use of the icon, index, symbol trichotomy in explaining animal communication, where they
discuss the role of such communication in cooperation.
The most prominent use and interesting development of
Peirce's icon, index, symbol trichotomy is Terence Deacon's
(1997) account of the coevolution of human language and
brains. According to that account, language evolution is to be
explained by seeing iconic, indexical, and symbolic communication and reference as related to one another in a hierarchy.
What this means is that in order to master symbolic communication, we must first master indexical communication. And in
order to master indexical communication, we must first master
iconic communication. For instance, a predator's inability to distinguish the milk snake's coloring from that of a coral snake is suggestive of iconic reference: it is manifest in the predator's inability to distinguish one type of snake from the other.
However, this iconic communication needs to be in place in
order for the predator to take the coloring of those snakes as
an indexical signifier of the poisonous status of the snake: red, yellow, and black banding are an index of a venomous snake.
Other instances of indexical reference work in just this way. It is
because the vervet monkey sees the eagle above as being qualitatively similar to previously experienced eagles (that is, as an
icon of a recognized predator) that it is able to produce a warning cry (an indexical reference) when that predator is present.
Symbolic reference requires the presence of indexicals but also
requires that the indexical relationship between words/sounds
and their objects has become ingrained, habitual, and appropriately interconnected with other symbols so that reference and
communication are maintained even if the stimulus to indexical
reference is lost or removed.
Once this symbolic threshold is achieved, complex relationships between words develop, allowing words to signify other
words and explain the relationships that exist among them. Such
a model is useful for explaining various differences between
cases like vervet monkey warning calls, captive chimpanzee
symbol manipulation, and human language learning: in the two former cases, the connection between sign and object is lost
when the object is absent for sustained periods. Consequently,
the habituation and interconnectedness of indexical signs that
allows for the symbolic communication typical of human language is never attained, and vervet monkey calls and chimpanzee symbol manipulation never rise above the level of indexical
communication.
Albert Atkin
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Brinck, Ingar, and Peter Gärdenfors. 2003. Co-operation and communication in apes and humans. Mind and Language 18: 484–501.
Burks, Arthur. 1949. Icon, index and symbol. Philosophy and Phenomenological Research 9: 673–89.
Deacon, Terence. 1997. The Symbolic Species: The Co-evolution of
Language and the Human Brain. New York: Norton.
Liszka, James Jacob. 1996. A General Introduction to the Semiotic of
Charles S. Peirce. Bloomington: Indiana University Press.
Peirce, C. S. 1935a. The Collected Papers of Charles S. Peirce. Vol. 2.
Cambridge: Harvard University Press.
———. 1935b. The Collected Papers of Charles S. Peirce. Vol. 3.
Cambridge: Harvard University Press.
Seyfarth, R. M., D. L. Cheney, and P. Marler. 1980. Monkey responses to three different alarm calls: Evidence for predator classification and semantic communication. Science 210: 801–3.
Sheriff, John K. 1989. The Fate of Meaning: Charles Peirce, Structuralism, and Literature. Princeton, NJ: Princeton University Press.
Short, Thomas. 2004. The development of Peirce's theory of signs. In The Cambridge Companion to Peirce, ed. Cheryl Misak, 214–40.
Cambridge: Cambridge University Press.
Turino, Thomas. 1999. Signs of imagination, identity, and experience: A Peircian semiotic theory for music. Ethnomusicology 43: 221–55.
Wollen, Peter. 1969. Signs and Meaning in the Cinema. London: Secker
and Warburg/British Film Institute.

IDEAL SPEECH SITUATION


This term was coined by the German social theorist and philosopher Jürgen Habermas to refer to the conditions necessary for free
and transparent communication and discussion. The concept of
ideal speech situation plays a key part in his early formulations of
a theory of communicative action and of universal pragmatics (Habermas 1979, 1–68; 1984; 1987). In his later writings,
the term has tended to be replaced by Karl-Otto Apel's notion of an "unrestricted communication community" (Apel 1980;
Habermas 1990, 88).
An ideal speech situation may be understood as the conditions that would allow for open discussion between free and
equal participants, who strive to come to an agreement upon
any topic purely through the force of better argument. Thus, the
participants enter a discussion assuming that their ideas may be
challenged by any other participant, but that only those ideas
and arguments that are rationally formulated and supported
by relevant and persuasive evidence will survive interrogation.
The personality, status, power, or rhetorical abilities of the person holding the idea will be rendered irrelevant in the course of
debate.
The idea of an ideal speech situation has its origins in the work
of the American pragmatist philosopher Charles Sanders Peirce.
In his philosophy of science, Peirce proposed the notion of an ideal
community of scientists. He recognized that scientific research is
a necessarily communal enterprise. Typically, scientists work in
teams, but even if they work in individual isolation, they will still
be required to submit their research results to a process of peer
review. Scientific hypotheses are thus formulated, refined, and
finally accepted as (provisionally) true only through a process of
collective debate, criticism, and defense. Peirce was aware that
in practice, scientific debate falls well short of any ideal process
of rational scrutiny. Imperfections will occur in part due to practical limitations. Certain evidence may be unavailable due, for
example, to lack of sufficiently refined experimental technology
to test a hypothesis rigorously. More important for Habermas's use of Peirce, however, is the distorting role that hierarchies of
status and power may play within the scientific community. The
opinions of certain figures within the scientific community will
carry more weight than those of others. The opinions of a senior
researcher trump those of a laboratory assistant. Peirce's concern is that open and rational debate is then being compromised by hierarchies of power and status within the scientific community. The senior researcher's opinions count not because of his or
her greater rationality or insight but simply because he or she is
in a position of power. Junior scientists may feel unable to raise
their criticisms in debate or may believe that their opinions have
no place in the debate. Peirce's notion of an ideal community of
scientists can then be understood, in potential at least, as a critical tool that draws attention to the deficiencies of real scientific
communication.
Habermas may, therefore, be seen to use the notion of the
ideal speech situation, particularly in his early formulations of
it, in a similarly critical manner. It encapsulates a perhaps unrealizable standard against which actual communication and discussion can be measured. Real communication will be distorted,
perhaps because of a lack of relevant information, perhaps
because of deficiencies in the participants' ability to recognize
good argument and evidence, but also, crucially, because some
will exercise power over the discussion. Power can be used to
introduce topics, to silence certain forms of criticism or suppress evidence, and to silence certain potential contributors.
Power can be exercised openly through, for example, threats and
intimidation. More subtly, it may be exercised through rhetorical
means so that the other participants fail to recognize that weak
arguments have been given or that relevant evidence has not
been presented. Perhaps most significantly, Habermas argues
that power differentials may be so ingrained in a culture that
participants take their inferiority or superiority for granted and,
as such, do not notice the influence of power on discussion. For
example, in a patriarchal society, women will typically have less
opportunity to raise topics in conversation or to challenge and
interrupt other (male) speakers (see gender and language).
Such implicit and unacknowledged power is characterized as
systematically distorted communication (Habermas 1970).
The images of an ideal community of scientists and an ideal
speech situation have a utopian ring about them, suggesting perfect societies at the end of human history. Habermas is keen to
reject such utopian interpretations of the ideal speech situation
(Habermas 1982, 261f). This becomes clearer in his later formulations of the argument. The ideal speech situation is understood as a counterfactual assumption made by all participants
in conversation and discussion. Upon entering a conversation
where the participants strive for mutual agreement rather than
manipulation of one another, Habermas argues, they all presuppose, with a rather studied naïveté, that the other participants
are telling the truth and being sincere in their participation in
the conversation, and that they have the right to speak and act
as they do. In practice, these assumptions can quickly be overturned. However, Habermas's point is that a person could not
enter into a conversation assuming that the other participants
were lying or systematically trying to deceive. Every utterance
would be treated with suspicion, and ultimately no fixed meaning could be attributed to it. While some form of social interaction might continue, it would not be a true conversation or what
Habermas understands as communicative action. That is to say,
the participants would not be seeking to reach a mutual agreement (constrained only by the force of better argument). Rather,
each would be trying to manipulate the other (in what Habermas calls "strategic action"; 1982, 266).
Andrew Edgar
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Apel, Karl-Otto. 1980. Towards a Transformation of Philosophy.
London: Routledge and Kegan Paul.
Habermas, Jürgen. 1970. On systematically distorted communication. Inquiry 13: 360–75.
———. 1979. Communication and the Evolution of Society. Trans. Thomas McCarthy. Boston: Beacon.
———. 1982. A reply to my critics. In Habermas: Critical Debates, ed. John B. Thompson and David Held, 219–83. London: Macmillan.
———. 1984. The Theory of Communicative Action. Vol. 1, Reason and the Rationalisation of Society. Trans. Thomas McCarthy. Cambridge: Polity.
———. 1987. The Theory of Communicative Action. Vol. 2, Lifeworld and System: A Critique of Functionalist Reason. Trans. Thomas McCarthy. Cambridge: Polity.
———. 1990. Moral Consciousness and Communicative Action. Trans. Christian Lenhardt and Shierry Weber Nicholsen. Cambridge, MA: MIT Press.

IDENTITY, LANGUAGE AND


Linguistic Identity
Our identities (who we are) are bound up with how we speak,
write, and sign. Whether or not we intend it or are even aware of
it, other people interpret clues from our use of language in order
to assign to us identity categories of all sorts, including gender,
race, ethnicity, nationality, the region, or even the precise locality we come from, age or generation, sexual orientation, religion,
level of education, and that vague complex of factors that bundle
together as class. All this is in addition to and part and parcel
with the decisions they make about our intelligence, likeability,
and trustworthiness and whether or not to believe what we are
telling them.
Although self-identity has long been given a privileged role,
the identities we construct for ourselves and others are not different in kind, only in the status we accord to them. The gap between
the identity of an individual and of a group seems most like a true
difference of kind, with group identities more abstract than individual ones. Brazilianness, after all, does not exist separately
from the Brazilians who possess it, except as an abstract concept.

Yet combinations of such abstractions are what our individual
identities are made up of, and group identity frequently finds
its most concrete manifestation in a single, symbolic individual.
Group identities nurture our individual sense of who we are but
can also smother it.
Recent work on the evolution of language has suggested that it
came about to fulfill something more than the two purposes traditionally ascribed to it, communication and representation.
Language also exists for the purpose of reading other people, in
order to discriminate useful allies from potential competitors.
sociolinguistic inquiry into identity and language is concerned with the way people read each other, in two senses. First,
how are the meanings of utterances interpreted, not just following idealized word senses and rules of syntax as recorded in
dictionaries and grammars, but in the context of who is addressing whom in what situation? Secondly, how are speakers themselves read, in the sense of the social and personal identities their
listeners construct for them based on what they say and how they
say it? This is a complex process because speakers' output is usually shaped in part by how they have read their listeners and their
expectations. Every day, each of us repeatedly undertakes this
process of constructing our reading of the people we encounter,
in person, on the telephone, on the radio or the screen, or in writing, including on the Internet, on the basis of their language, at
least in part and in some of the media just mentioned, on the
basis of that alone.

Targeting Identity in the Analysis of Language


Modern linguistics has moved slowly but steadily toward embracing the identity function as central to language. The impediment
has been the dominance of the traditional outlook that takes
representation alone to be essential, with even communication
relegated to a secondary place. Although significant developments within linguistics (surveyed in Joseph 2004) pushed it in
the direction of attending to identity over the course of the twentieth century, a crucial prompting came from social psychology,
where one approach in particular needs to be singled out: social
identity theory, developed in the early 1970s by Henri Tajfel (see
ethnolinguistic identity). This approach was novel in being concerned not with power but with the status we give ourselves as members of in-groups and out-groups. This would come
into even greater prominence in the self-categorization theory
that developed as an extension of the original model, notably in
the work of Tajfel's collaborator J. C. Turner (see Turner et al.
1987).
Partly under the influence of such work, many sociolinguists reoriented their object of investigation. L. Milroy (1980)
reported data from studies she conducted in Belfast showing
that the social class of an individual did not appear to be the
key variable allowing one to make predictions about the forms
that the person would use. Rather, the key variable was the person's social network, a concept borrowed from sociology, and defined as "the informal social relationships contracted by an individual" (Milroy 1980, 174). Where close-knit localized network structures existed, there was a strong tendency to maintain
nonstandard vernacular forms of speech.
Over the next two decades, sociolinguistic investigation
of groups ideologically bound to one another shifted from
statistically based examination of social networks to more interpretative examination of communities of practice, defined as "an aggregate of people who come together around mutual engagement in an endeavor" (Eckert and McConnell-Ginet 1992, 464). In the course of this endeavor, shared beliefs, norms,
and ideologies emerge, including, though not limited to, linguistic and communicative behavior. This line of research is thus
continuous with another one that has focused more directly
on the normative beliefs or ideologies by which national and
other group identities are maintained (see Verschueren 1999;
Blommaert 1999; Kroskrity 2000).
Other features of recent work on language and identity
include the view that identity is something constructed, rather than essential, and performed, rather than possessed: features that the term identity itself tends to mask, suggesting as it does something singular, objective, and reified. Each of us performs
a repertoire of identities that are constantly shifting and that we
negotiate and renegotiate according to the circumstances.

Co-constructing National Identity and Language


Within these repertoires, any particular identity can become the
salient one in a given context. None inherently matters more
than the rest. However, national identity requires a separate
discussion because of its unique impact on views about what
a language is. Modern nationalism has been grounded in a
belief that the best proof of a people's historical authenticity and
right to self-determination is the possession of a language that
is uniquely theirs. Hence, one of the first obstacles to be overcome in establishing a national identity is the nonexistence of a
national language.
The nation-state myth (that basic view of the world as consisting naturally of autonomous states, each corresponding to an ethnically unified nation) assumes that national languages are a primordial reality. Dante's treatise De vulgari eloquentia (ca. 1306) lays out the process by which he claimed to discover, not
invent, the national language of a nation, Italy, that would take
five and a half centuries to emerge politically. This all seems a fiction, a pretense of discovery in what will actually be Dante's invention of an "illustrious vernacular," which will, in turn, camouflage how much of it is actually based on his native Tuscan.
But Dante's volgare illustre became the template upon which
other modern European standard languages were modeled.
Once the national languages existed, their invention was
promptly forgotten. The people for whom they represented
national unity inevitably came to imagine that the language had
always been there and that such dialectal difference as existed within it was the product of recent fragmentation when, in fact, it
had preceded the unification by which the national language was
forged. By the early nineteenth century, this nationalist mythology would lead to Romantic theorizations of national political
identities being grounded in a primordial sharing of language.
One of the strongest expressions was that of Johann Gottlieb
Fichte ([1808] 1968, 190–1):
The first, original, and truly natural boundaries of states are
beyond doubt their internal boundaries. Those who speak the
same language are joined to each other by a multitude of invisible bonds by nature herself, long before any human art begins; they understand each other and have the power of continuing to make themselves understood more and more clearly; they belong together and are by nature one and an inseparable whole.

Fichte was writing in order to rouse the German nation
to repel the advance of Napoleon. However, in 1870, the shoe
was on the other foot when the Franco-Prussian War led to the
German annexation of Alsace, a German-speaking province that
had been part of France for more than two centuries and whose
population was mainly loyal to France in spite of their linguistic
difference. This provoked a sharp turn away from the Fichtean
view on the part of French linguists, such as Ernest Renan (1882),
who formulated a new view of national identity as based not in
any primordially determining characteristic such as language
but on a shared will to be part of the same nation, together with
shared memories.
The nation, in other words, exists in the minds of the people
who make it up. This is the conception that B. Anderson ([1983]
1991, 6) would return to in defining the nation as an "imagined political community." The legacy of memories to which Renan
pointed would dominate future philosophical and academic
attempts to analyze national identity. M. Billig, a colleague and
collaborator of Tajfel, has explored how the continual acts of
imagination on which the nation depends for its existence are
reproduced (1995, 70), sometimes through purposeful deployment of national symbols but mostly through daily habits of
which we are only dimly aware. Examples include the national
flag hanging in front of the post office and the national symbols
on the coins and banknotes we use each day. Billig introduced
the term banal nationalism to cover the ideological habits that
enable the established nations of the West to be reproduced. In
Billig's view, an identity is to be found in the embodied habits of social life (1995, 8), including language. A. D. Smith (e.g., 1998,
Chapter 8) has emphasized how much of the effort of nationalism construction is aimed at reaching back to the past in the
interest of ethno-symbolism, and this can be seen particularly
in the strong investment made by modern cultures in maintaining the standard language, by which is meant a form resistant to
change, hence, harking backward (see Hobsbawm 1990). Every
time we attend to the fact that someone has spoken or written
in a standard or nonstandard way, we take part, usually without
realizing it, in both the national construction of our language and
the linguistic construction of our nation.
John E. Joseph
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Anderson, B. [1983] 1991. Imagined Communities: Reflections on the Origin
and Spread of Nationalism. 2d ed. London and New York: Verso.
Billig, M. 1995. Banal Nationalism. London: Sage.
Blommaert, J., ed. 1999. Language Ideological Debates. Berlin: Mouton de
Gruyter.
Bourdieu, P. [1982] 1991. Language and Symbolic Power: The Economy
of Linguistic Exchanges. Trans. G. Raymond and M. Adamson, ed.
J. B. Thompson. Cambridge: Polity, in association with Basil Blackwell.
Cameron, D., and D. Kulick. 2003. Language and Sexuality.
Cambridge: Cambridge University Press.
Eckert, P., and S. McConnell-Ginet. 1992. Think practically and look locally: Language and gender as community-based practice. Annual Review of Anthropology 21: 461–90.

Fichte, J. G. [1808] 1968. Addresses to the German Nation. Trans. R. F. Jones and G. H. Turnbull, ed. G. A. Kelly. New York: Harper Torch Books.
Fishman, J. A., ed. 1999. Handbook of Language and Ethnic Identity.
Oxford: Oxford University Press.
Hobsbawm, E. J. 1990. Nations and Nationalism since 1780: Programme,
Myth, Reality. Cambridge: Cambridge University Press.
Holmes, J., and M. Meyerhoff, eds. 2003. The Handbook of Language and
Gender. Malden, MA, and Oxford: Blackwell.
Joseph, J. E. 2004. Language and Identity: National, Ethnic, Religious.
Houndmills, Basingstoke; New York: Palgrave Macmillan.
Kroskrity, P. V., ed. 2000. Regimes of Language: Ideologies, Polities, and
Identities. Santa Fe, NM: School of American Research Press.
Milroy, L. 1980. Language and Social Networks. Oxford and New
York: Blackwell.
Renan, E. 1882. Qu'est-ce qu'une nation? Conférence faite en Sorbonne,
le 11 mars 1882. Paris: Calmann Lévy.
Smith, A. D. 1998. Nationalism and Modernism: A Critical Survey
of Recent Theories of Nations and Nationalism. London and New
York: Routledge.
Tajfel, H. 1978. Social categorization, social identity and social comparison. In Differentiation between Social Groups: Studies in the
Social Psychology of Intergroup Relations, ed. H. Tajfel, 61–76.
London: Academic Press.
Turner, J. C., M. A. Hogg, P. J. Oakes, S. D. Reicher, and M. J. Wetherell.
1987. Rediscovering the Social Group: A Self-Categorization Theory.
Oxford: Blackwell.
Verschueren, J., ed. 1999. Language and Ideology: Selected Papers from
the 6th International Pragmatics Conference. Antwerp: International
Pragmatics Association.

IDEOLOGY AND LANGUAGE


The concept of ideology refers broadly to the ways in which a
person's beliefs, opinions, and value systems intersect with the
wider social and political structures of the society in which he
or she lives (cf. politics of language). Many linguists, especially those working in the traditions of critical linguistics (e.g.,
Fowler et al. 1979) and, more recently, critical discourse
analysis, take the view that language (or, more exactly, a range
of language practices) is influenced by ideology. From this
perspective, all texts, whether spoken or written, are seen as
being inexorably shaped and determined by a mosaic of political
beliefs and sociocultural activities. The critical linguistic position
on language is, therefore, one that challenges directly the liberal
construal of texts as natural products of the free communicative
interplay between individuals in society. For critical linguists,
texts are anything but neutral or disinterested, and so it falls to
close linguistic analysis to help us understand how ideology is
embedded in language and, consequently, to become aware of
how the reflexes of dominant or mainstream ideologies are
sustained through textual practices.
Although coined in the early 1800s by the French philosopher Destutt de Tracy, the term ideology is normally associated
with Karl Marx, particularly with his treatise on The German
Ideology, a project developed in 1845–6, but published, in
various languages and installments, from the 1930s onward (see
Marx [1933] 1965). Over the intervening years, the concept has
been adopted more widely (and without any necessary adherence to Marxist doctrine) to refer to the belief systems that are
held either individually or collectively by groups of people and
to the social conditions that frame these systems. In Marx's
original conception, ideology is seen as an important means by
which dominant forces in society, such as royalty, the aristocracy, or the bourgeoisie, can exercise power over subordinated
or subjugated groups, such as the industrial and rural proletariat. His famous axiom that "the ideas of the ruling class are in
every epoch the ruling ideas" ([1933] 1965, 61), along with his
observation that "the ruling material force is at the same time the
ruling intellectual force," has had a profound impact on the way
contemporary linguistic research has understood discourse
in the public sphere. Ideology, and its expression in the textual
practices that shape our everyday lives, is not something that
exists in isolation as a product of free will but is instead partial
and contingent. It is something whereby, as Louis Althusser
suggests, ideas are inserted into the hierarchical arrangement of
socially and politically determined practices and rituals, which
are themselves defined by material ideological state apparatuses
(Althusser 1971, 158). In short, in the Marxist tradition, ideology
is, most importantly, a system of beliefs that fosters consent to
social hierarchy, particularly class hierarchy. Subsequent writers influenced by this tradition have expanded the notion from
class hierarchy (as in class or, more specifically, capitalist ideology; see marxism and language ), to sex (patriarchal ideology; see gender and language ), to colonialism (colonial
ideology), and so on.
Against this theoretical backdrop, scholars researching the
interconnections between language and ideology build from the
premise that patterns of discourse are framed in a web of beliefs
and interests. A texts linguistic makeup functions to privilege certain ideological positions while downplaying others such that
the linguistic choices encoded in this or that text can be shown
to correlate with the ideological orientation of the text. Even the
minutiae of a text's construction can reveal an ideological standpoint, and productive comparisons can be drawn between the
ways in which a particular linguistic feature is deployed across
different texts. For instance, the following three simple examples
differ only in terms of the main verb used:
The senator explained the cutbacks were necessary.
The senator claimed the cutbacks were necessary.
The senator imagined the cutbacks were necessary.
Whereas the first example suggests that while the cutbacks were
unavoidable, the senator's actions are helpfully explanatory,
the more tenuous "claimed" of the second example renders the
senator's attempt to justify an unpopular measure less convincing. The third example is arguably more negative again, with the
nonfactive verb "imagined" suggesting that the obverse condition applies in the embedded clause, namely, that the senator is
mistaken in his belief that the cutbacks were necessary.
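The contrast among these reporting verbs can be made concrete computationally. The following sketch (the verb list and stance labels are illustrative assumptions for this example, not a published lexicon) tags each sentence by the stance its reporting verb encodes:

```python
# Illustrative stance lexicon: reporting verbs classified by how far
# they endorse the truth of the clause they introduce. The labels are
# assumptions for this sketch, not drawn from any published resource.
STANCE = {
    "explained": "endorsing",     # factive: presents the claim as true
    "claimed": "distancing",      # withholds endorsement
    "imagined": "discrediting",   # implies the claim is mistaken
}

def tag_stance(sentence):
    """Return the stance label of the first recognized reporting verb."""
    for word in sentence.lower().split():
        if word in STANCE:
            return STANCE[word]
    return "unknown"

for s in [
    "The senator explained the cutbacks were necessary.",
    "The senator claimed the cutbacks were necessary.",
    "The senator imagined the cutbacks were necessary.",
]:
    print(s, "->", tag_stance(s))
```

On this view, the analyst's question is precisely why a given text selects one verb from such a set rather than another.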
Another important assumption in work on ideology and language is that the linguistic structure of a text often works silently
or invisibly to reproduce relationships of power and dominance.
In consequence, the processor of a text (the reader of a
tabloid newspaper, for example) is encouraged to see the world in
particular ways and in ways that are often aligned with the dominant or mainstream ideology espoused by the paper. Crucially,
these ideological assumptions are transmitted surreptitiously and
are mediated through forms of language that present as "natural"
or "common sense" certain beliefs and values that may prove to
be highly contestable or dubious in their own terms.
Take as an example the following discourse event that
unfolded over a few months in the British tabloid The Sun. This
popular daily voiced vehement opposition to the British government's plans to celebrate the advent of the millennium by
the construction, at taxpayers' expense, of a festival dome in
Greenwich, London. Notice how in these extracts the paper
sometimes uses italicization to enforce the common-sense status of its position on the Millennium Experience:
The Sun Speaks Its Mind: DUMP THE DOME, TONY! (June 17,
1997; original emphasis)
MPs, businessmen and charities yesterday backed our see-sense
campaign to axe the £800 million Millennium Exhibition planned
for Greenwich. (June 18, 1997; italics in original)
That damned Dome has disaster written all over it. The creative
director accuses the Dome secretary of acting like a dictator
who is too easily swayed by public opinion. If only he was. Maybe
then this waste of public money would be axed. For that's what
public opinion wants. (Jan. 1, 1998; italics in original)

An appeal to commonsense values of the sort displayed here
allows the paper to present its objection to the dome as a position with which any sensible member of society could concur.
Among other things, the paper's tactic is a good example of naturalization (Fairclough 1992, 67–8), which is the process whereby
an ideological position is presented as if it were simply part of
the natural order of things. Naturalization encourages us to align
ourselves with mainstream or dominant thinking, even when
that thinking is itself partisan, self-serving, and driven by economic and political interests. Indeed, to demur from The Sun's
position would be to place oneself outside the community of
notional sensible subjects who share the same set of normative values as the paper. Yet if proof were needed of the partisan
and capricious nature of such naturalized ideological positions
in discourse, consider as a footnote the following breathtaking
U-turn that appeared in the same tabloid newspaper shortly
after the publication of the previous diatribes:
The plans for the Millennium Experience are dazzling. If it all
comes off, the Prime Minister's prediction will be correct: the
Dome will be a great advert for Britain. (Feb. 24, 1998; italics in
original)

It may have been entirely coincidental that this sudden change
in direction occurred on the same day that the paper's owner
pledged £12 million worth of sponsorship to the Millennium
Dome.
A range of linguistic models have been used over the last
quarter of a century to explore the interconnections between
language and ideology. Prominent among these has been
the concept of register, which is a variety of language that
is defined according to context and use. Linguists have
noticed that in times of war and conflict, in particular, spurious specialist registers of discourse are quietly disseminated
through the public sphere by influential social and political
groups. In the specific context of the widespread proliferation
of nuclear arms in the 1970s, critical linguists adopted the term
Nukespeak, in an echo of George Orwell's Newspeak, to refer
to a (mis)use of register in order to mask what for the general
public were the unpalatable horrors of nuclear conflict (see
Chilton 1985). In fact, Nukespeak still reverberates in the contemporary discourses of war: "Collateral damage" refers to the
unintentional killing of civilians and noncombatants, "incontinent ordnance" to poorly aimed missiles, and "friendly fire" to
the inflicting of injury or death to one's allies. While the phrase
"human remains transportation pods" is a heavily sanitized label
for body bags, the expression "advanced marine biological systems" refers rather improbably to dolphins, which, when suitably
armed, apparently make excellent seaborne weapons systems.
In addition to the exploitation of register, the strategic use
of metaphorical language has also been identified as a mechanism for sustaining and disseminating ideological dogma (see
Charteris-Black 2004). Paul Simpson (2004, 42) offers the following examples from print and broadcast coverage of the conflict
in Iraq in 2003:
The third mechanized infantry are currently clearing up parts
of the Al Mansur Saddam village area.
The regime is finished, but there remains some tidying up
to do.
Official sources described it as a mopping up operation.
These examples rehearse the same basic conceptual metaphor through three different linguistic realizations. The experience of war, the target domain of the metaphor, is relayed by the
idea of cleaning, the source domain (see source and target),
such that the metaphorical formula might be rendered thus: WAR
IS CLEANING. The ideological significance of this metaphor is
that it downplays the significance (and indeed risk) of the conflict, implying that it is nothing more than a simple exercise in
sanitation, a perspective, it has to be said, that is unlikely to be
shared by military personnel on the opposing side.
Ideological standpoint in language can also be productively
explored by means of comparisons between different texts, especially when the texts analyzed cover the same subject matter. Of
the range of linguistic models that have been thus employed,
those from functional linguistics have proved particularly
useful as an analytic tool for investigating ideological standpoint
across different portrayals of the same event or experience (see
Fowler 1991; Simpson 1993).
The investigation of ideology in language is an undeniably
important focus for the language sciences. That said, there have
been a number of stinging attacks on this area of study from
respected authorities (e.g., Widdowson 1995, 1996; Stubbs 1997;
Blommaert 2005), which have called into question the validity of key parts of the whole endeavor. These criticisms have
tended to cluster around three main issues. The first concerns
the term ideology itself, which, ever since its inception in the
work of Marx and de Tracy, has proved too broad and too vague
a concept to slot comfortably into a formal analytic framework.
Indeed, Michel Foucault has argued that the notion of ideology is difficult to make use of because it "always stands in virtual opposition to something else" (1984, 60). The second type
of criticism is about the sorts of texts that analysts choose to
subject to ideological analysis. To be blunt, if we know a text to
be ideologically problematic at the outset, then any subsequent
linguistic analysis will only confirm what we already know, and
any linguistic feature uncovered through the analysis can by
imputation be passed off as ideologically insidious. This deterministic approach connects with the third major area of concern, which is simply that studies of ideology and language tend
to be elitist. If the main purpose of the analysis is to uncover and
challenge the repressive discourse practices of powerful, interested groups, then what needs to be considered before anything
else are the effects of these practices on ordinary (nonacademic)
people. Reactions of ordinary communities to what the analysts
deem ideologically insidious discourse are rarely considered;
instead, the academic analyst comfortably assumes the perspective of those for whom the text was intended, moving seamlessly
in and out of the multiple interpretative positions of specialist
and nonspecialist alike. It is still too early to say how these serious
and far-reaching criticisms will affect the ways in which scholars
investigate the widespread interconnections between ideology
and language.
Paul Simpson
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Althusser, L. 1971. Lenin and Philosophy and Other Essays.
London: NLB.
Blommaert, J. 2005. Discourse: A Critical Introduction. Cambridge:
Cambridge University Press.
Charteris-Black, J. 2004. Corpus Approaches to Critical Metaphor Analysis.
Basingstoke and New York: Palgrave MacMillan.
Chilton, P. 1985. Language and the Nuclear Arms Debate: Nukespeak
Today. London: Pinter.
Fairclough, N. 1992. Discourse and Social Change. Cambridge: Polity
Press.
———. 2001. Language and Power. 2d ed. London: Longman.
Foucault, M. 1984. Truth and power. In The Foucault Reader, ed.
P. Rabinow, 51–75. Harmondsworth, UK: Penguin Books.
Fowler, R. 1991. Language in the News: Discourse and Ideology in the
Press. London: Routledge.
Fowler, R., R. Hodge, G. Kress, and T. Trew. 1979. Language and Control.
London: Routledge.
Marx, K. [with Frederick Engels]. [1933] 1965. The German Ideology. Ed.
and trans. S. Ryazanskaya. London: Lawrence and Wishart.
Simpson, P. 1993. Language, Ideology and Point of View.
London: Routledge.
———. 2004. Stylistics. London: Routledge.
Stubbs, M. 1997. Whorf's children: Critical comments on critical discourse analysis (CDA). In Evolving Models of Language, ed. A. Ryan
and A. Wray, 100–16. Swansea: BAAL.
Widdowson, H. 1995. Discourse analysis: A critical view. Language and
Literature 4.3: 157–72.
———. 1996. Reply to Fairclough: Discourse and interpretation: Conjectures and refutations. Language and Literature
5.1: 57–70.

IDIOMS
Idioms hold an important place in the class of fixed, nonliteral expressions. These have a perplexing characteristic: They
communicate something other than what the words usually
mean. Idioms are the best and most well-known representative
of this class. Other examples are speech formulas ("Nice to meet
you"), proverbs ("While the cat's away, the mice will play"),
clichés ("Easy does it"), and expletives ("For heaven's sake!"). These
differ from metaphors, which, though also nonliteral, are made up
of novel word combinations. Terminology is inconsistent, but the
term formulaic language has become standard (Wray 2002).
Definitions are elusive, and lines between the categories are
not always clear. In defining formulaic expressions, it is easier
to say what they are not: They are not newly created phrases or
sentences made up of lexical elements (words) according to
grammatical rules. Instead, they are learned, stored, and processed as unitary configurations. The meaning of the idiom "She
has him eating out of her hand" does not convey information
about eating or hands but, instead, refers to a complex human
relationship whereby one person is submissive to the other.
Idioms benefit from this ability to pack an aura of connotations
in a complex meaning.
Idioms have two characteristic properties: stereotyped form
and conventional meaning. Stereotyped form means that certain words appear in a particular order with a particular speech
melody. The idiom "I wouldn't want to be in his shoes," to be
well formed, must have precisely those words in that order,
with the accent on "his." Changes may be introduced because
idioms are decomposable. Linguists have attempted to characterize syntactic operations that may be performed on idioms,
and many psychological studies have pursued this question.
It appears that much potential variation exists, depending on
communicative need. The complex meanings associated with
idioms contain emotional and attitudinal nuances. The idiomatic expression "It's a small world" signals recognition of a
chance meeting in an unexpected place by two acquaintances,
reflecting surprise and serendipity. In contrast, the literal
statement "It's a small tree" merely conveys information about
relative tree size. The stereotyped forms and conventional
meanings of idioms are known to the native speaker but difficult for the second language learner. Idioms are learned and
processed by cognitive and neurological mechanisms different
from those underlying novel expressions (Van Lancker Sidtis
2006). Their number is often underestimated; in efforts at compiling lists, no upper limit has yet been determined. Corpus
studies often utilize computer search techniques to quantify
incidence, but a human interface is necessary for identifying
idiomatic forms.
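The corpus-search step described here can be sketched briefly. The toy corpus, idiom list, and function below are illustrative assumptions (real studies use large corpora and curated idiom dictionaries), and the closing comment marks exactly the limit the entry notes: string matching cannot decide whether a hit is idiomatic or literal.

```python
import re
from collections import Counter

# Toy corpus and candidate-idiom list (illustrative only).
CORPUS = """It's a small world, she said. She has him eating out
of her hand. I wouldn't want to be in his shoes, he replied.
It's a small world after all."""

IDIOMS = ["it's a small world", "eating out of her hand", "in his shoes"]

def count_idioms(text, idioms):
    """Count literal-string occurrences of each candidate idiom."""
    normalized = " ".join(text.lower().split())  # collapse line breaks
    return Counter({
        idiom: len(re.findall(re.escape(idiom), normalized))
        for idiom in idioms
    })

counts = count_idioms(CORPUS, IDIOMS)
# A human analyst must still inspect each hit: the search cannot tell
# an idiomatic use from a literal one ("he stood in his shoes").
```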
Diana Van Lancker Sidtis
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Cacciari, C., and P. Tabossi, eds. 1993. Idioms: Processing, Structure, and
Interpretation. Hillsdale, NJ: Lawrence Erlbaum.
Nunberg, Geoffrey, I. Sag, and T. Wasow. 1994. Idioms. Language
70: 491–538.
Titone, Deborah, and C. Connine. 1999. On the compositional and noncompositional nature of idiomatic expressions. Journal of Pragmatics
31: 1655–74.
Van Lancker Sidtis, Diana. 2006. Where in the brain is nonliteral language? Metaphor and Symbol 21.4: 213–44.

Wray, Alison. 2002. Formulaic Language and the Lexicon. Cambridge: Cambridge University Press.

IDLE TALK AND AUTHENTICITY


Martin Heidegger's magnum opus, Being and Time, advances a
hermeneutical phenomenology, that is, an interpretive description of what it means to be Dasein (a German word for existence, which Heidegger uses to mean human being). Dasein
is not the disembodied rational soul, mind, or subject described
by traditional metaphysics and epistemology but an embodied,
worldly person. Having a world is constitutive of human existence; as Heidegger says, Dasein's being is essentially being-in-the-world (In-der-Welt-sein).
To be in the world is to be thrown into an environment and a
tradition beyond your choosing and to take future-defining possibilities from it as your own or to fail to do so. The way in which
people typically fail to make their possibilities, hence their being,
their own is by simply doing what one does or, as Heidegger
says, being ruled by the one (das Man). Social conformity
cannot be avoided altogether, of course, but beyond a certain
point it amounts to what he calls inauthenticity, or disownedness (Uneigentlichkeit). Authenticity, by contrast, he describes
as forerunning resoluteness (vorlaufende Entschlossenheit),
which is to say, facing up (resolutely) to the concrete situation
and embracing (running up into) your death. Forerunning resoluteness is nothing self-destructive or suicidal, though, since
by "death," Heidegger means neither the biological end nor the
biographical conclusion of a life but, rather, the constant closing
down of possibilities.
A crucial contributing factor to Dasein's characteristic lapse
into inauthenticity is its unavoidable involvement in a public
language governed by anonymous norms of correctness and
propriety. In Being and Time, Heidegger refers to language
not as a formal syntactic or semantic system, but as the
concrete manifestation of discourse (Rede), or expressivecommunicative behavior broadly construed. Most everyday
discourse, he says, is idle talk (Gerede) or chitchat, generic
conversation in which we merely pass the word along, as
opposed to speaking authentically. Although the public, generic
character of a shared language contributes to our lapse into idle
talk, not all speech is inauthentic. Dasein, that is, can speak in
conformity with publicly recognized norms of correctness and
yet still speak in its own voice.
Heidegger's concept of idle talk owes a large but unacknowledged debt to Søren Kierkegaard's account of talkativeness, or
chat, in his critique of the present age in A Literary Review
([1846] 2001). Heidegger even follows Kierkegaard in describing
the banalizing, conformity-inducing effect of chatter as a kind of
leveling process. But whereas Kierkegaard saw such conformism and superficiality as characteristic of modern European
culture, Heidegger regarded it as essential to our being-in-the-world. For Heidegger, that is, there could be no shared public
world at all in the absence of a relatively bland background
of established ways of conducting oneself, linguistically and
nonlinguistically.
Taylor Carman

WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Blattner, William. 2006. Heidegger's Being and Time. London:
Continuum.
Carman, Taylor. 2003. Heidegger's Analytic: Interpretation, Discourse, and
Authenticity in Being and Time. Cambridge: Cambridge University
Press.
Dreyfus, Hubert L. 1991. Being-in-the-World: A Commentary on
Heidegger's Being and Time, Division I. Cambridge, MA: MIT Press.
Heidegger, Martin. 1962. Being and Time. Trans. J. Macquarrie and
E. Robinson. New York: Harper and Row.
Kierkegaard, Søren. [1846] 2001. A Literary Review. Trans. A. Hannay.
New York: Penguin.
Wrathall, Mark. 2005. How to Read Heidegger. New York: Norton.

IJTIHĀD (INTERPRETIVE EFFORT)


Ijtihād, sometimes contrasted with taqlīd or uncritical acceptance of someone else's opinion (Chaumont 1997, 11), is a concept from Muslim legal theory (see legal interpretation). It
refers to the interpretive effort involved in understanding a point
of law. The various schools of Muslim legal thought differ on the
degree to which they accept or advocate ijtihād.
There are two places in which ijtihād is likely to arise as an
issue. One is in the extension of particular judgments to new
cases. When this occurs, the common method of inference is
qiyās, or analogy (see Jokisch 1997). Take a simple example
using one hadīth. (A hadīth is a brief narrative of a decision or
practice, most often from Muhammad, that has been passed
down by tradition and has roughly the status of a precedent in
law.) Muhammad says that there is no tax payment required
for fewer than five camels (Ali n.d., 215). One might narrowly
constrain the application of the hadīth, not extending it beyond
camels. Alternatively, one might calculate the worth of the
camels and use that as a basis for application. A third possibility involves qiyās. But this is not simple. For example, five
cars would probably not parallel five camels. Perhaps the analogy should be based on the importance of the objects for ordinary functioning in society. This is where interpretive effort is
required.
The second obvious place for ijtihād is in understanding the
general meaning of texts. The Qur'an suggests this, explaining of
itself that some of its verses are "decisive" and others are "allegorical" (Ali 1995, 3.6). The mere existence of the "allegorical" or,
in Dawood's translation, "ambiguous" verses indicates the
necessity of interpretive effort in these cases. Indeed, the difference in translations suggests the importance of interpretive
effort. Moreover, once meaning is determined, interpretative
effort may be required for understanding the purpose of a passage. For example, many stories in the Qur'an are presented as
literally true accounts of Allah's responses to human behavior.
These frequently serve as warnings. But understanding these
warnings may require interpretive effort to discern the exact
nature of the actions forbidden. When someone is punished,
we must infer what construal to give that person's actions so as
to understand the point of the punishment. For example, Lot's
wife was punished because she remained behind when Lot
left (Dawood 1990, 7:83). What is her sin: disbelief in a prophet,
disobedience of her husband, association with the sinful? (This
is what writers in the European ethical tradition refer to as the
problem of relevant act descriptions [see Chapter 2 of Nell
1975].)
There are some general principles that may guide ijtihād. For
example, in cases that involve punishment of an offender, one is
to incline toward mercy (see Waines 1996, 81). More generally,
interpretation is to some degree constrained by two sorts of
coherence: coherence with the body of the law and coherence
with the views of jurists. On the other hand, jurists often disagree.
Moreover, apparent contradictions with the body of law may be
resolved by reinterpreting other parts of the law. Thus, in debating a passage P, one jurist may cite passage N against the interpretation put forth by a second jurist. But this second jurist may
interpret the passage N in such a way that it fits with his or her
initial interpretation of passage P. In this way, the criterion of
coherence is often not decisive.
In many ways, ijtihād is as much an attitude as a method. The
point is suggested by a hadīth that contrasts Qur'anic verses that
anyone can understand with those that someone with a pure
mind can understand (Gleave 1997, 33). Ijtihād necessarily
draws on techniques of inference (such as analogy) and principles of selection across possible inferences (such as coherence).
But it fundamentally involves sincere and open-minded work to
understand the purport of a legal or other passage.
In this respect, ijtihād may be seen as an instance of a general pattern in Islam. The word Islam means "submission."
Submission to Allah is paired with a struggle against anything
that would inhibit that submission. A possible inhibition may be
a fitnah, or trial. When faced with such a possible inhibition, the
believer engages in a struggle, or jihad, in order to bear witness to
truth. That inhibition may be one's own inclination to err. In that
case, one's jihad is against that inclination. It may also be against
some violence perpetrated upon oneself or others. In that case,
jihad may take the form of battle (hence, the popular conception
of jihad as "holy war").
One may think of ijtihād as analogous to jihad, thus as a struggle to establish and bear witness for truth, in this case through
interpretation. (The words are etymologically related.) In keeping with this, the Qur'an opposes right interpretation to fitnah
connected with false interpretation (Ali 1995, 3.6).
Though linked in its origins to religious attitudes and faith in
revelation, the general concept of ijtihād may not be irrelevant
to secular forms of interpretation within or outside of Muslim
hermeneutic traditions (see philology and hermeneutics).
Indeed, the concept seems germane to a range of legal, literary, conversational, and even scientific discourses (on the
relevance of issues in interpretation, including scriptural interpretation, to science, see Lecture 4 of van Fraassen 2002). It suggests
something about the ways in which we come to understand
these discourses when our spontaneous or automatic response
has been interrupted and we must engage in self-conscious
reflection on meaning (cf. passing theories). Indeed, its
opposition to taqlīd suggests partially parallel oppositions
found in influential Western theories, such as those of Bakhtin
(see dialogism) and Martin Heidegger (see, for example,
Heidegger 1962, 21114).
Patrick Colm Hogan

WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Ali, Maulana Muhammad, ed. and trans. 1995. The Holy Qur'ān.
Columbus, OH: Ahmadiyyah Anjuman Ishaat Islam.
———. N.d. A Manual of Hadith. Lahore: Ahmadiyya Anjuman Ishaat
Islam.
Chaumont, Éric. 1997. Ijtihād et histoire en islam sunnite classique selon
quelques juristes et quelques théologiens. In Gleave and Kermeli
1997, 7–23.
Dawood, N. J., ed. and trans. 1990. The Koran. New York: Penguin.
Gleave, Robert. 1997. Akhbārī Shī'ī usūl al-fiqh and the Juristic Theory of
Yūsuf b. Ahmad al-Bahrānī. In Gleave and Kermeli 1997, 24–45.
Gleave, R., and E. Kermeli, eds. 1997. Islamic Law: Theory and Practice.
London: Tauris.
Heidegger, Martin. 1962. Being and Time. Trans. John Macquarrie and
Edward Robinson. New York: Harper and Row.
Jokisch, Benjamin. 1997. Ijtihād in Ibn Taymiyya's fatāwā. In Gleave
and Kermeli 1997, 199–37.
Nell, Onora. 1975. Acting on Principle: An Essay on Kantian Ethics.
New York: Columbia University Press.
van Fraassen, Bas C. 2002. The Empirical Stance. New Haven, CT: Yale
University Press.
Waines, David. 1996. An Introduction to Islam. Cambridge: Cambridge
University Press.

I-LANGUAGE AND E-LANGUAGE


In order to counter earlier misunderstandings, Noam Chomsky
(1986, 20ff.) made a distinction between E-language and
I-language. E-language stands for "externalized language" and
I-language for "internalized language." E-language is defined
as language independent of the properties of the mind/brain.
I-language, in contrast, is language seen as a property of an
individual's mind/brain. The neologism mind/brain reflects
Chomsky's belief that theories of mental faculties, particularly
generative grammars, are ultimately about the brain at
some level of abstraction.
There are various ways of seeing language as external to the
human mind. The immediate target of Chomsky (1986, 20) is
language as a collection (or system) of actions or behaviors of
some sort. This is the behavioristic view of language that Chomsky
criticized in L. Bloomfield and B. F. Skinner. A related notion,
defended by W. V. Quine and R. Montague, sees human languages
as analogous to formal languages. According to this conception,
languages are extensionally defined as sets of sentences or well-formed formulas, and there is no empirical dimension to the question of which grammar is the correct one: Grammars are equally valid
variants if they generate extensionally equivalent languages.
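The extensional view can be illustrated with a small sketch (the toy grammars are hypothetical, not examples from Quine or Montague): two grammars with different rules generate exactly the same set of strings, so on a purely extensional definition neither is more correct than the other.

```python
from collections import deque

def generate(rules, start, max_len):
    """All terminal strings of length <= max_len derivable from `start`
    by a context-free grammar. Symbols with entries in `rules` are
    nonterminals; every right-hand side is non-empty, so sentential
    forms never shrink and pruning by length is safe."""
    language, seen, queue = set(), {start}, deque([start])
    while queue:
        form = queue.popleft()
        if len(form) > max_len:
            continue  # can only grow further; discard
        i = next((k for k, c in enumerate(form) if c in rules), None)
        if i is None:  # no nonterminals left: a sentence of the language
            language.add(form)
            continue
        for rhs in rules[form[i]]:  # rewrite the leftmost nonterminal
            new = form[:i] + rhs + form[i + 1:]
            if new not in seen:
                seen.add(new)
                queue.append(new)
    return language

# Two different grammars for "one or more a's followed by b":
# G1: S -> aS | ab          G2: S -> Ab;  A -> aA | a
G1 = {"S": ["aS", "ab"]}
G2 = {"S": ["Ab"], "A": ["aA", "a"]}

# Extensionally equivalent up to the length bound, though the two
# grammars assign different derivations to the same strings.
print(generate(G1, "S", 6) == generate(G2, "S", 6))
```

The comparison is relativized to a length bound because the full language is infinite, but the point survives: extensional equivalence says nothing about which grammar is psychologically real, which is exactly the empirical dimension the I-language perspective reinstates.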
A very common tradition of E-language sees languages as
largely existing in some mind-external cultural record. This tradition goes back to Aristotle, was selectively revived by J. G. von
Herder and the German Romanticists, and deeply influenced
the European structuralism of Ferdinand de Saussure and the
American structuralism of F. Boas and E. Sapir. Ludwig
Wittgenstein's idea that language consists of conventional rules
constitutive of language games also fits this tradition, as does
Karl Popper's proposal that language is part of his world 3 (some
recent ideas related to E-language may be found in meaning
externalism). Relativistically interpreted, this tradition is at
variance with universal grammar.


A last influential tradition of E-language is platonism,
revived by J. Katz and others. According to this view, language
consists of abstract objects and properties that have an existence
independent of both our mind and cultural record.
As for I-language, Chomsky (1986, 21) refers to O. Jespersen,
who claimed that there is some notion of structure in the mind
of the speaker. According to Chomsky, it is a distinct system of
the mind/brain that grows in the individual from an initial state
S0 to a stable state Ss. This growth, comparable to the growth of
an organ, involves only minimal external factors, such as those
that help set the parameters that distinguish the grammars of different languages. Seen this way, the study of language is part of
individual cognitive psychology and, ultimately, part of human
biology. In order to counter the obvious objection that language
also involves external elements, Chomsky makes a distinction
between the faculty of language in the narrow sense (FLN) and
the faculty of language in the broad sense (FLB). The notion of
I-language particularly applies to FLN, which has recursion as its
core property (Hauser et al. 2002).
It is a matter of controversy whether the opposition of
E-language to I-language makes sense, since FLN, no matter how
internal, involves words and therefore externally coded conventions. Even in its narrowest sense, then, language seems to integrate E- and I-elements.
Jan Koster
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Chomsky, Noam. 1986. Knowledge of Language. New York: Praeger.
Hauser, Marc D., Noam Chomsky, and W. Tecumseh Fitch. 2002. The faculty of language: What is it, who has it, and how did it evolve? Science 298
(November): 1569–79.

ILLOCUTIONARY FORCE AND SENTENCE TYPES


John L. Austin (1962) and John R. Searle (1969, 1975) distinguish
the meaning (see sense and reference) of a sentence from
its illocutionary force. Austin and Searle conceive of the illocutionary force as the act performed by the speaker with his/
her utterance. Examples of illocutionary force are expressing a
belief, asking the addressee a question, and warning or advising someone. The illocutionary force must be intended by the
speaker, and it must be possible for the addressee to recognize
it (see also speech-acts, performative and constative,
perlocution).
Illocutionary force interacts with sentence grammar. This
short entry focuses on illocutionary consequences of the syntactic sentence types declarative and interrogative, with some
reference to imperatives. (See König and Siemund 2007 on
the cross-linguistic usefulness of this traditional three-way
distinction.)
THE MEANING OF DECLARATIVES. In philosophical logic and
linguistic semantics, propositions traditionally represent
the content of both that-clauses (that it is raining) and declaratives (It is raining). Austin and Searle draw on this notion in distinguishing the meaning of a clause from its illocutionary force
according to the schema Force(proposition). For example, the
two aspects of the statement It is raining can be rendered as The
speaker expresses that (s)he believes (that it is raining).
THE MEANING OF INTERROGATIVES. Semantic theory (see, for
example, Karttunen 1977; Groenendijk and Stokhof 1982) has
since developed meanings for syntactic interrogatives that cover
unembedded interrogatives like (1a), (2a) and embedded interrogatives like (1b), (2b). These meanings (differing in detail
between authors) may be paraphrased as in (1c), (2c). The meanings assigned by Jeroen Groenendijk and Martin Stokhof (1982)
are called partitions.
(1) a. Is it raining?
    b. Mary wonders [whether it is raining]
    c. the true one of {that it is raining, that it is not raining}

(2) a. Whom did he invite?
    b. Mary wonders [whom he invited]
    c. the true one(s) of {that he invited Bill, that he invited Jane, …}

Note that the partitions in (1c) and (2c) do not contain a component of illocutionary force. They are also the meanings of embedded interrogatives like (1b) and (2b), and embedded clauses are
standardly assumed not to have illocutionary force.
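The partition idea can be illustrated with a small computational sketch. This is a toy model under stated assumptions, not Groenendijk and Stokhof's own formalization; names such as `partition_by` and `worlds` are purely illustrative. A question's meaning groups possible worlds into cells whose members give the same complete answer:

```python
from itertools import product

# Toy model: a possible world is an assignment of truth values to atomic facts.
facts = ["raining"]
worlds = [dict(zip(facts, values)) for values in product([True, False], repeat=len(facts))]

def partition_by(question, worlds):
    """Group worlds into cells; worlds in one cell give the same complete answer."""
    cells = {}
    for w in worlds:
        answer = question(w)
        cells.setdefault(answer, []).append(w)
    return list(cells.values())

# The polar question "Is it raining?" partitions the worlds into two cells,
# corresponding to (1c): {that it is raining, that it is not raining}.
cells = partition_by(lambda w: w["raining"], worlds)
```

With more atomic facts, the same function yields the multi-cell partitions of constituent questions like (2c), one cell per complete answer.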
Thus, the meaning of a declarative is a proposition and the
meaning of an interrogative is a partition. Let us consider how
illocutionary force of two kinds may be added to these meanings: i) illocutionary force as a statement: the speaker expresses
that (s)he believes something to be true, and ii) illocutionary force
as a question: the speaker wants to learn from the addressee the
truth in an as-yet open issue. Let us begin by assuming that the
choice between statement and question is pragmatically inferred
rather than grammatically triggered.
THE ILLOCUTIONARY FORCE OF DECLARATIVES. The meaning of a
declarative (It is raining) as a proposition (that it is raining) makes
it likely that the speaker intends a statement interpretation of his/
her utterance (the speaker conveys that [s]he believes that it is raining), perhaps in connection with the maxim of quality in Grice
(1975), which requires such truthfulness (see also cooperative
principle). If we assume that this statement interpretation is not
hardwired into declaratives, we correctly allow declarative questions, as in It is raining? Here, a declarative sentence with question intonation is used as a question. Unlike in statements, the
speaker does not convey that he/she believes that it is raining.
THE ILLOCUTIONARY FORCE OF INTERROGATIVES. No single state
of affairs (ordinary proposition) is offered in the interrogative
meaning (the true one among …; see [1c] and [2c]), and so
there is no specific content for an interpretation as a statement.
The interrogative meaning makes a question interpretation
particularly likely, in which the speaker wants the addressee
to pick the true one among the possibilities. If we assume that
this illocutionary force is not hardwired into root interrogatives,
we correctly allow the flexibility for untypical uses. In monological questions, such as in a lecture, the speaker later provides the answer him-/herself: Who was behind these reforms?
Everything points towards Bismarck (Brandt et al. 1992). In rhetorical questions like Am I your servant? the true one among
the given possibilities is to be inferred and need not be stated.

Thus, the semantic distinction between declaratives and interrogatives is useful for distinguishing typical and untypical
uses of declaratives and for distinguishing typical and untypical
uses of interrogatives. The flexibility for different uses for both of
these sentence types may suggest a purely pragmatic assignment
of illocutionary force to propositions and partitions.
GRAMMAR AFFECTS ILLOCUTIONARY FORCE. Yet, there is evidence that grammar interferes with the assignment of illocutionary force:
First, a purely pragmatic account of declarative questions leads
to the expectation that a declarative question like John is at
home? shares the illocutionary force of a yes/no question like
Is John at home? In the declarative question, the speaker would
indicate by the intonation that he/she does not commit to p, and
so the next likely interpretation would be that the speaker is wondering whether p is true and seeks the help of the addressee in
this regard. In fact, however, declarative questions strongly differ
from yes/no questions.
(3) [Telephone conversation]
    Ann: Hang on, I'll ask John
    Steve: John is at home?

(4) [Steve calls Ann on the phone]
    Ann (picks up phone): Hello?
    a. Steve: # John is at home?
    cf. b. Steve: Is John at home?

The declarative question John is at home? can be asked only where the speaker can assume that the addressee believes that
John is at home (Gunlogson 2001). This is the case in (3) but
not in (4a), where the declarative question is accordingly infelicitous (symbolized by #; see also felicity conditions).
Interrogatives do not carry this strong requirement; compare
(4b). Thus, even though declaratives cannot have a statement
interpretation hardwired into them, they seem to have something more flexible hardwired into them that goes beyond the
propositional content: A declarative sentence seems to either
commit the speaker to p (statement) or assume a commitment of
the addressee to p (declarative question) (see Gunlogson 2001).
Second, root interrogatives also seem to have an illocutionary
element hardwired into them. If Steve and Ann share ignorance
of things mechanic, Steve may say (5a) to Ann, but not (5b).
(5) Steve to Ann (both ignorant about car problems):
    a. The carburetor of my car is broken. I wonder whether the repairs will be expensive.
    b. The carburetor of my car is broken. # Will the repairs be expensive?

This is unexpected on the view that a semantic partition is flexibly augmented with illocutionary force by pragmatic inferences. In
(5a), the matrix predicate I wonder conveys that Steve is interested in the answer but does not have an expectation toward Ann
to provide it. The felicity of (5a) shows that this is not excluded
for lack of conversational relevance or for other pragmatic reasons. Why cannot the unembedded partition Will the repairs be
expensive? in (5b) have a pragmatically inferred similar illocutionary interpretation? It seems that some hardwired element
of illocutionary interpretation here leads to an expectation of
an answer from the addressee and thus to the inappropriateness in (5b). A promising hypothesis about such a hardwired requirement is again that a commitment by a salient individual
is assumed or expressed. In the case of interrogatives, a commitment to (1c)/(2c) is a commitment to the true element(s) of the
partition, that is, knowledge of the correct answer (whatever it
may be). The assumption of such a commitment in root interrogatives is compatible with the examples discussed here. In rhetorical questions, knowledge of the correct answer is assumed of the
speaker and perhaps of the addressee. In monological questions,
knowledge of the correct answer can be assumed of the speaker,
who will later provide this answer. In standard questions,
knowledge of the correct answer would then be assumed of the
addressee. Then (5b) would be infelicitous because knowledge of
the correct answer could not be assumed of either the addressee
or the speaker (see Truckenbrodt 2004, 2006a, 2006b).
Third, German declaratives (syntactically V2-clauses) can
replace embedded dass-(that)-clauses under a range of attitude verbs, including sagen (say) and glauben (believe) and
excluding bezweifeln (doubt) and leugnen (deny): Maria
glaubt/*bezweifelt, Hans ist zu Hause, "Mary believes/*doubts,
Hans is at home." A semantic restriction seems to be in effect that
relates declaratives to a salient individual's beliefs (or a similar
domain): Maria's beliefs in the preceding example, the beliefs
of the speaker or the addressee in unembedded use (statements
and declarative questions) (Truckenbrodt 2006a, 2006b). More
generally, there is a class of syntactic phenomena (root phenomena) that occur in unembedded clauses, as well as embedded under the verbs believe, say, and other predicates that have
been characterized as assertive (see discussion and references
in Heycock 2006). It is possible that root phenomena are phenomena that trigger a semantic requirement of a commitment
by a salient individual. Such a requirement might lead to the
observed restrictions on embedded use. The requirement would
crucially contribute to illocutionary force in unembedded use.
Finally, verbal mood (indicative, subjunctive, imperative) is a
grammatical category that interacts with illocutionary force. The
imperative seems to be interpreted deontically (Schwager 2005),
that is, along the lines of You should, with strength varying
from a demand or request (Open the window!) to an invitation
(Have another piece of cake!) or even a wish (Have a good break!).
A strong interaction of verbal mood with illocutionary force is
this: Have finished eating by 12:30! cannot be a statement to the
effect that the addressee has finished eating by 12:30 (rather than
by 12:45, as the addressee may mistakenly believe). Nor can it
be a question that wants the addressee to clarify whether in fact
the addressee has finished eating by 12:30. More generally, the
deontic interpretation of the imperative cannot be replaced with
an epistemic interpretation (i.e., one that negotiates knowledge),
while the indicative in declaratives and interrogatives typically
leads to epistemic (statement and question) interpretations. (See
Portner 1997 on indicative and subjunctive verbal mood.)
In sum, the grammatical distinction between declaratives
and interrogatives leads to a semantic distinction between a
proposition and a partition. This distinction is useful for a first
understanding of the typical and untypical illocutionary force of
declaratives and of interrogatives. Illocutionary force may in part
be assigned pragmatically. However, the assignment seems also
to be subject to a grammatically triggered interpretation in root
clauses. For declaratives and interrogatives, these interpretations can be approximated in terms of a commitment by a salient individual that is expressed or assumed. Sentence mood (such as
imperative vs. indicative) plays a further crucial role in conditioning the illocutionary interpretation of a clause.
Hubert Truckenbrodt
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Austin, John L. 1962. How To Do Things with Words. Oxford: Clarendon.
Brandt, Margareta, Marga Reis, Inger Rosengren, and Ilse Zimmermann. 1992. Satztyp, Satzmodus und Illokution. In Satz und Illokution 1, ed. Inger Rosengren, 1–90. Tübingen: Niemeyer.
Grice, H. P. 1975. Logic and conversation. In Syntax and Semantics 3: Speech Acts, ed. P. Cole and J. Morgan, 41–58. New York: Academic Press.
Groenendijk, Jeroen, and Martin Stokhof. 1982. Semantic analysis of WH-complements. Linguistics and Philosophy 5: 175–233.
Gunlogson, Christine. 2001. True to form: Rising and falling declaratives as questions in English. Ph.D. diss., University of California at Santa Cruz.
Heycock, Caroline. 2006. Embedded root phenomena. In The Blackwell Companion to Syntax. Vol. 2. Ed. M. Everaert and H. van Riemsdijk, 174–209. Oxford: Blackwell.
Karttunen, Lauri. 1977. Syntax and semantics of questions. Linguistics and Philosophy 1: 3–44.
König, Ekkehard, and Peter Siemund. 2007. Speech act distinctions in grammar. In Language Typology and Syntactic Description. Vol. 1. Ed. T. Shopen, 276–324. Cambridge: Cambridge University Press.
Portner, Paul. 1997. The semantics of mood, complementation, and conversational force. Natural Language Semantics 5: 167–212.
Schwager, Johanna Magdalena. 2005. Interpreting imperatives. Ph.D. diss., University of Frankfurt am Main.
Searle, John R. 1969. Speech Acts: An Essay in the Philosophy of Language. Cambridge: Cambridge University Press.
———. 1975. A taxonomy of illocutionary acts. In Language, Mind, and Knowledge, ed. K. Gunderson, 344–69. Minneapolis: University of Minnesota Press. Repr. in John R. Searle, Expression and Meaning (Cambridge: Cambridge University Press, 1979), 1–29.
Truckenbrodt, Hubert. 2004. Zur Strukturbedeutung von Interrogativsätzen. Linguistische Berichte 199: 313–50.
———. 2006a. On the semantic motivation of syntactic verb movement to C in German. Theoretical Linguistics 32: 257–306.
———. 2006b. Replies to the comments by Gärtner, Plunze and Zimmermann, Portner, Potts, Reis, and Zaefferer. Theoretical Linguistics 32: 387–410.

IMAGE SCHEMA
A foundational concept in cognitive linguistics, image
schemas are associated most closely with the work of Mark
Johnson (1987) and his collaborator George Lakoff (1987). Image
schemas are thought to play a key role in the acquisition, structure, and use of language; a central tenet of cognitive linguistics
is that the perceptual interactions and motor engagements with
the physical world shape the way we think. An image schema
can be defined as a condensed representation of bodily motor
experience for purposes of mapping spatial structure onto conceptual structure. According to Johnson, these patterns emerge
as meaningful structures based on our bodily movements
through space, our manipulations of objects, and our perceptual interactions (1987, 29). Jean Mandler aptly likens image schemas to the representations one is left with when one has forgotten most of the details (2004, 79–81).
Lack of specificity and content makes image schemas highly
flexible preconceptual and primitive patterns used for reasoning
in an array of contexts (Johnson 1987, 30). A partial inventory of
image schemas is as follows: container; balance; blockage;
counterforce; restraint removal; source-path-goal; link;
center-periphery; verticality; scale; part-whole; support.
Since the publication of the influential works of Johnson and
Lakoff, the notion of image schemas has been theorized, investigated, and applied in multiple domains of inquiry. Within
cognitive linguistics, the notion of an image schema has taken
center stage in research in semantic and grammatical analysis, psycholinguistics, cognitive development, and neurocomputational modeling, to name the most prominent areas of
activity in the language sciences. Controversy has ensued over
the definition of and criteria for positing something as an image
schema, over its status as conscious or unconscious representations, and even over its status vis-à-vis individual and social cultural cognition.

Image Schemas Among Cognitive Linguists


Image schemas play an important role in studies focusing on
polysemy of words and constructions, semantic change,
and text analysis. Studies of words and constructions include
Alan Cienki's (1998) comparison of the metaphoric projections of straight and prjamo in English and Russian. Likewise,
image schemas figure prominently in arguments about semantic change. For instance, Marolijn Verspoor (1995) argues that
semantic change preserves image schematic structure, whereas
Yo Matsumoto (1995) challenges a strong version of that hypothesis. Much text analysis and literary criticism inspired by cognitive linguistics specifies image schemas as that which makes
linguistic innovation possible. Literary and textual criticism
within the image schema tradition is represented most prominently by Mark Turner (1991).
Raymond W. Gibbs (2005) has conducted extensive psycholinguistic experiments designed to demonstrate that image schemas organize not only experience but also semantic structure and
usage. His early experiments support the claim that image schemas are psychologically real and imply that they are enduring
mental representations, while his later experiments refine
this view, suggesting that they are real but not representational
structures per se. Rather, they are emergent structures continuously created on the fly as part of human beings' dynamic
simulations of actions and situations. In short, Gibbs argues that
the psychological reality of image schemas emerges from continuous interaction in a three-dimensional world and not from
their being prestored representations in long-term memory
(2005, 132).
Image schemas figure prominently in some developmental accounts of the acquisition of concepts and language. Two
notable lines of research in this area include Mandler (2004)
and Sinha and Jensen de Lopez (2000). Mandler has argued that
image schemas arise from a process of perceptual meaning analysis. Human neonates and infants appear to be engaging in forms
of perceptual analysis of space and motion, such that primitive
schemas for goal-path; linked path; self-motion; animate motion; and caused motion emerge as preverbal, perceptually
based conceptual primitives underlying cognitive development
and language acquisition. Chris Sinha and K. Jensen de Lopez
take a slightly different tack in emphasizing the social-cultural
and artifact dimensions of the acquisition of locative prepositions in and under in Danish-acquiring, English-acquiring, and
Zapotec-acquiring children. They conclude that the sociocultural environment plays a greater role in cognitive development
than previously thought. Image schemas may in fact be distributed throughout the local environment, with considerable differences appearing among cultures with very different material
makeups.
Finally, image schema theory is a central pillar of the neural
theory of language project initiated by Jerome Feldman (2006),
Lakoff (1987), Srini Narayanan, and others. A crucial component
of this neural computational model is the execution or x-schema
protocol for representing human actions, which include models
for enacting drop-schema that simulate the neural computational activity involved. For instance, a computational representation for distinguishing between verbs such as push and shove
invokes the slide x-schema but differs in the microdetails of
body-part movement and acceleration.
Todd Oakley
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Cienki, Alan. 1998. Straight: An image schema and its transformations.
Cognitive Linguistics 9.2: 107–49.
Feldman, Jerome. 2006. From Molecule to Metaphor. Cambridge,
MA: MIT Press.
Gibbs, Raymond W. 2005. The psychological status of image schemas.
In Hampe 2005, 113–35.
Hampe, Beate, ed. 2005. From Meaning to Perception: Image Schemas
in Cognitive Linguistics. Berlin: Mouton De Gruyter. An edited collection of essays presenting a wide range of views on the nature of image
schemas.
Johnson, Mark. 1987. The Body in the Mind. Chicago: University of
Chicago Press. This book is considered the locus classicus of image
schema theory.
Lakoff, George. 1987. Women, Fire, and Dangerous Things.
Chicago: University of Chicago Press.
Mandler, Jean. 2004. The Foundations of Mind. New York: Oxford
University Press.
Matsumoto, Yo. 1995. From attribution/purpose to cause: Image
schema and grammaticalization of some cause markers in Japanese.
In Lexical and Syntactic Constructions and the Construction of
Meaning, ed. Marolijn Verspoor, K. Dong, and E. Sweetser, 287–307.
Amsterdam: John Benjamins.
Sinha, Chris, and K. Jensen de Lopez. 2000. Language, culture, and the
embodiment of spatial cognition. Cognitive Linguistics 11.1/2: 17–41.
Turner, Mark. 1991. Reading Minds. Princeton, NJ: Princeton University
Press.
Verspoor, Marolijn. 1995. Predicate adjuncts and subjectification.
In Lexical and Syntactic Constructions and the Construction of
Meaning, ed. Marolijn Verspoor, K. Dong, and E. Sweetser, 433–49.
Amsterdam: John Benjamins.

IMPLICATIONAL UNIVERSALS
typological universals seek to capture the limits of grammatical variation, that is, the extent to which particular traits or features of languages may covary, both within languages
and across languages. Typological universals may be classified
along several dimensions. The most important distinction is
between substantive and implicational universals. The former deal with a single property and reflect what occurs in all
languages (e.g., all languages have vowels). Since such unrestricted universals tell us little about variation, they are considered to be of moderate interest. For the study of variation,
implicational universals are of considerably greater interest
since they reflect not only the existence of variation but, crucially, the constraints imposed on it.
Implicational universals relate two (or more) logically independent features or properties and apply to some subset of
languages for which the given features or properties obtain.
Implicational universals may be absolute or statistical.
Absolute implicational universals hold for all languages; that is,
they specify that if a language has feature A, then it will have
feature B. By contrast, statistical implicational universals specify relationships that hold only at a certain level of probability.
Statistical implicational universals take the form, If a language
has feature A, then with greater than chance frequency it will
have feature B. Examples of the two types of implicational universals are given in (1a) and (2a), respectively, and the distribution of features that each universal reflects is presented in (1b)
and (2b) (in the form of a tetrachoric table). (The data depicted in (2b), where the numerals reflect the number of languages with the specified features, are taken from Greenberg 1963, Appendix II.)
(1) a. If the demonstrative follows the head noun, then the relative clause also follows the head noun.
    b.
                DemN    NDem
        RelN     +       −
        NRel     +       +

(2) a. If a language has basic subject-object-verb (SOV) order, then with greater than chance frequency it will have postpositions.
    b.
                SOV    non-SOV
        prep     5       73
        post    97       10

Note that the five prepositional SOV languages are counterexamples to the universal. Although many absolute implicational
universals have been posited in the typological literature, most,
particularly of the simple kind (see the following), have turned
out to be in fact statistical. Needless to say, these may be assessed
from the point of view of their relative strength (the number of
overall languages considered, the number of languages displaying the features in question, the number of exceptions to the
universals) and their validity in relation to criteria such as those
presented in Stassen (1985, 201).
Implicational universals may be monodirectional or bidirectional. The absolute implicational universal in (1) is monodirectional since NRel order is compatible with both DemN and
NDem. Consequently, while NDem order entails NRel, NRel
order does not entail NDem. The statistical implicational universal in (2), on the other hand, is bidirectional; not only does SOV
order favor postpositions over prepositions, but postpositions also favor SOV order over non-SOV order. However, given the
data in (2b), the strength of the implicational universal with SOV
as the antecedent and postpositions as the consequent is greater
(95%) than the converse universal with postpositions as the antecedent and SOV order as the consequent (91%).
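The two strengths follow directly from the counts in (2b). A minimal sketch (the variable names are illustrative, not from the typological literature):

```python
# Counts from the tetrachoric table in (2b) (Greenberg 1963, Appendix II):
#            SOV   non-SOV
#   prep      5      73
#   post     97      10
sov_prep, nonsov_prep = 5, 73
sov_post, nonsov_post = 97, 10

# Strength of "SOV -> postpositions": postpositional SOV languages
# as a share of all SOV languages (97/102, about 95%).
sov_to_post = sov_post / (sov_post + sov_prep)

# Strength of the converse, "postpositions -> SOV": SOV languages
# as a share of all postpositional languages (97/107, about 91%).
post_to_sov = sov_post / (sov_post + nonsov_post)

print(f"SOV -> post: {sov_to_post:.0%}, post -> SOV: {post_to_sov:.0%}")
```

The five prepositional SOV languages show up here as the gap between 95% and a perfect 100%, that is, as the counterexamples that make the universal statistical rather than absolute.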
The implicational universals in (1) and (2) are simple ones,
as they specify a dependency between only two traits. Relations
between several traits are captured by means of complex universals that may involve conjunctions or disjunctions of traits within
the antecedent and/or the consequent, as depicted in (3).
(3) a. X → (Y & Z)
    b. (X or Y) → Z

Individual implicational universals may be combined into chains or hierarchies, such that the implicatum (or conclusion)
of the first universal is the implicans (or premise) of the second,
the implicatum of the second is the implicans of the third, and so
on. Since representing chains of implicational universals as such,
that is, as in (4a), is quite cumbersome, they tend to be depicted
in the form of a hierarchy, as in (4b).
(4) a. ((A → B) & (B → C)) & (C → D)
    b. D > C > B > A

It is important to note that the distributions captured in the form of these typological hierarchies define a frequency cline, such
that the phenomenon in D is more frequent than that in C, which
in turn is more frequent than in B, and so on. This follows from
the fact that if any term involved in the hierarchy is present, all
the terms to the left of it on the chain must also be present. And if
any term involved in the hierarchy is absent, all the terms to the
right of it must also be absent.
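This consistency condition can be sketched in a few lines (a toy check under stated assumptions: a hierarchy is a left-to-right list of terms, a language is a set of present terms, and the function name is illustrative):

```python
def consistent_with_hierarchy(hierarchy, present):
    """Check that a language's feature set respects a typological hierarchy:
    if any term is present, every term to its left on the chain must be too
    (equivalently: if any term is absent, everything to its right is absent)."""
    for i, term in enumerate(hierarchy):
        if term in present:
            # All terms to the left of a present term must also be present.
            if not all(left in present for left in hierarchy[:i]):
                return False
    return True

hierarchy = ["D", "C", "B", "A"]  # D > C > B > A, as in (4b)
print(consistent_with_hierarchy(hierarchy, {"D", "C"}))  # a licit cut-off point
print(consistent_with_hierarchy(hierarchy, {"D", "B"}))  # B present without C: excluded
```

The licit feature sets are exactly the left-closed prefixes of the chain, which is why such hierarchies constrain cross-linguistic variation so strongly.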
Typological hierarchies thus make very strong statements
about the possible distribution of properties across languages
and their overall frequency of occurrence. Consequently, much
typological research has been aimed at elaborating such hierarchies, be it with respect to segmental inventories (see, e.g.,
the sonority hierarchy of Hooper 1976, 196), morpho-syntactic
encoding (see, e.g., the complement deranking-argument
hierarchy of Cristofaro 2003, 131), or behavioral properties
(see, e.g., Keenan and Comrie's 1977 noun phrase accessibility hierarchy).
While statistical implicational universals are the dominant
means of expressing typological generalizations by typologists,
it must be pointed out that there is a long-standing controversy
over whether such universals do indeed capture significant
relationships between aspects of linguistic structure or merely
incidental relationships resulting from historical accident
(see, e.g., Maslova 2000; Bakker 2008). A technique of critically
evaluating such universals has been recently developed by
Elena Maslova (2003), and an excellent account of typological
universals in general is provided by William Croft (2003, 49–69, 122–8).
Anna Siewierska
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Bakker, Dik. 2008. LINFER: Inferring implications from the WALS database. STUF 61.3: 186–98.
Cristofaro, Sonia. 2003. Subordination. Oxford Studies in Typology and
Linguistic Theory. Oxford: Oxford University Press.
Croft, William. 2003. Typology and Universals. 2d ed.
Cambridge: Cambridge University Press.
Greenberg, Joseph. 1963. Some universals of grammar with particular reference to the order of meaningful elements. In Universals of Language, ed. Joseph Greenberg, 73–113. Cambridge, MA: MIT Press.
Hooper, Joan Bybee. 1976. An Introduction to Natural Generative Phonology. New York: Academic Press.
Keenan, Edward, and Bernard Comrie. 1977. Noun phrase accessibility
and universal grammar. Linguistic Inquiry 8: 63–99.
Maslova, Elena. 2000. A dynamic approach to the verification of distributional universals. Linguistic Typology 3/4: 307–33.
———. 2003. A case for implicational universals. Linguistic Typology 7.1: 101–8.
Stassen, Leon. 1985. Comparison and Universal Grammar. Oxford: Basil Blackwell.

INDETERMINACY OF TRANSLATION
In Word and Object (1960), one of the most influential books in
the history of American philosophy, W. V. O. Quine presents an
argument along the following lines. One may always formulate
mutually exclusive hypotheses regarding the meaning of a linguistic item such that there is no fact as to which hypothesis is
correct. Put differently, there are always divergent ways of defining a given word such that no one of these ways is correct. This
highly influential, and highly controversial, indeterminacy of
translation thesis extends Quine's argument against analyticity (the view that some statements are true simply due to the
meaning of the terms; see Quine 1961; see also meaning and
belief) and points toward his conception of ontological relativity (that reference "is not absolute, but relative to a coordinate
system" [Quine 1969, 48], that "Factuality is internal to our
theory of nature" [Quine 1981, 23]).
To explore Quines idea, we might begin with a simple distinction between decidability criteria and demarcation criteria. A decidability criterion allows us to chose among different
hypotheses regarding a words meaning (see word meaning).
A demarcation criterion distinguishes in principle between the
validity of two such hypotheses. A demarcation criterion presumably isolates a factual difference; a decidability criterion, first of
all, isolates an evidential difference (or, more broadly, a distinction by the lights of scientific method; thus, it may incorporate
simplicity or other adjudicative criteria). Quine argues that there
is no strict decidability criterion for meaning. We may refer to
this as uncertainty. But he also maintains that for at least some
set of mutually exclusive hypotheses, there is no demarcation
criterion either. The latter is indeterminacy proper.
Quine's arguments are bound up with his holism, the
view that hypotheses of any sort, including hypotheses regarding meaning, are part of larger complexes of belief and are not
understandable outside those complexes. Crucially, this means
that apparent counterevidence regarding one part of the whole
need not have simple, local consequences. Such counterevidence may be accommodated by alterations elsewhere in the
whole. Suppose I am a field linguist investigating a new language.
I encounter the word vagavai, which I take to mean gold. I then

find a native speaker apparently referring to brass as vagavai. I


might conclude that vagavai does not mean gold. But I might also
conclude that the speaker does not know the difference between
brass and gold, that the speaker (e.g., a young child) is mistaken
about the meaning of vagavai, and so on.
Of course, Quine's claims are more significant and less intuitive than this example suggests. To get a better idea of what is at stake in the indeterminacy of translation, we might distinguish two levels or perhaps two poles of uncertainty and indeterminacy. The first is global. This is the level at which we may say that any meaning may be mapped onto any translation, given enough manipulation of the rest of the system. For example, if we are willing to revise enough of our beliefs about the world, we can maintain a translation of the French word lapin ("rabbit") as "stone." Of course, it won't be easy. But it is possible. We expect any word to have a number of misuses even by fluent speakers. Typically, however, the correct uses overwhelmingly outnumber the incorrect uses. We may have to change our assumed proportions in the case of lapin. For example, we might infer that French speakers are correct in their use of lapin only when they point to rabbit sculptures or when, at a distance, they mistake a stone for a rabbit, to put the matter in our commonsense, lapin = "rabbit" idiom.
Chomsky's influential criticisms of Quine's views on indeterminacy most obviously concern this pole. Chomsky rightly points out that all theories are underdetermined by evidence (1980, 15). One consequence is that we rely on other criteria for adjudication, such as simplicity. It is presumably simpler (or, in another terminology, less ad hoc) to assume that fluent speakers err at similar rates for all frequently used common nouns than to assume that lapin is exceptional in this regard. Thus, one might conclude that there is uncertainty here. However, the uncertainty is adjudicable (at least within limits), and there is no reason to conclude that this uncertainty implies indeterminacy.
The other level or pole is highly local. It is limited to differences that we cannot formulate or, more generally, that we do not encode (i.e., roughly, make into information that may be processed cognitively) in particular cases. For instance, one of Quine's central examples is that of a field linguist encountering "gavagai." The field linguist takes the term to mean "rabbit." However, the available evidence is equally consistent with "undetached rabbit part," "rabbit stage," and so on. Here, one might argue that this is really best understood as a case of global uncertainty/indeterminacy, which we have already considered. Specifically, given the structure of human perception and human memory, it seems very unlikely that speakers of other languages would not encode part and stage, thus distinguishing rabbit, undetached rabbit part, and rabbit stage in some way. But that is merely a problem with the particular example. There are clear cases where one language involves distinctions that speakers of another language do not encode. For instance, a particular group may not distinguish between gold and brass or, better still, fool's gold. Examples such as this may suggest that there is a sort of indeterminacy at the subencoding level across languages, or even within languages. For example, when I use the word "gold," I have no sense of different types of gold. In this sense, relative to jewelers, my use of "gold" is indeterminate among those different types. On the other hand, one might reasonably contend that this is not really indeterminacy at all, since there is presumably some

psychological fact about the degree of vagueness or ambiguity of my usage at particular times and places.
These arguments, then, point toward global, though still adjudicable, uncertainty along with local, subencoding indeterminacy in the limited sense of vagueness or ambiguity. Both are significant and both are highlighted by Quine's discussions. However, neither is radical indeterminacy.
On the other hand, to make these arguments is to go against a range of Quine's other views, for example, his behaviorism and his commitment to ontological relativity. Like all theories, Quine's own theories operate holistically. Even if he were to accept some version of the preceding arguments, he could accommodate these arguments by alterations elsewhere in the system. Thus, despite such arguments, radical indeterminacy remains at least a continuing and important philosophical challenge.

deictic use (see deixis), but indexicals have many other uses. "I" is used demonstratively when (1) is written near a picture of Truman with an arrow pointing at the picture. It is used anaphorically in (2) as opposed to (3):

(2) Truman believes I am president.
(3) Truman believes that I am president.

It refers to Truman in (2), the speaker in (3). Indexicals are often used pseudodeictically in novels. When Gore Vidal wrote (4) in Live From Golgotha,

(4) I am Timothy,

he was not referring to himself. Finally, indexicals are often used nonreferentially, as in

(5) "Je" means I in French.
(6) Many a car is such that it needs gas.

Patrick Colm Hogan


WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Chomsky, Noam. 1980. Rules and Representations. New York: Columbia University Press.
Føllesdal, Dagfinn, ed. 2001. Philosophy of Quine. Vol. 3: Indeterminacy of Translation. New York: Garland.
Glock, Hans-Johann. 2003. Quine and Davidson on Language, Thought and Reality. Cambridge: Cambridge University Press.
Quine, W. V. 1960. Word and Object. Cambridge, MA: MIT Press.
———. 1961. From a Logical Point of View: Logico-Philosophical Essays. 2d ed. New York: Harper and Row.
———. 1969. Ontological Relativity and Other Essays. New York: Columbia University Press.
———. 1981. Theories and Things. Cambridge: Harvard University Press.

INDEXICALS
An indexical is a linguistic expression whose reference (extension) may vary from one context of use to another even when used with the same meaning (sense) and with respect to the same possible world or story. An example is the pronoun "I," which I use to refer to myself and you to yourself. Sentence (1) itself is indexical: true when used by Harry Truman in 1951 but not when used by Churchill.

(1) I am president.

Tensed verbs (am, was) are indexical, as are many adjectives (foreign, local), common nouns (enemy, neighbor), and adverbs (yesterday, recently).
Indexicals are often defined simply as terms with different extensions in different contexts. But "plane" has different extensions (airplanes versus wood planes) because it is ambiguous rather than indexical. "I" too is ambiguous, meaning one (1) or iodine as well as me. But it is used with the very same meaning when you and I use it to refer to ourselves. Furthermore, while "the 34th president" refers to Dwight Eisenhower when describing the actual world, it refers to Adlai Stevenson when describing a hypothetical case in which he won in 1954. The referent of an indexical varies even when speakers are describing the same possible world.
It is often said that the meaning of "I" is given by the rule that its referent in any context of use is the speaker. This holds for the

In a semantic use like (5), it is crucial that "I" be used with the first-person meaning we have been focusing on; if it means iodine, (5) is false. Sentence (6) illustrates the quantificational use, with pronouns bound by quantifiers like variables in quantification theory.
Indexicals present a problem for formal theorists because assigning extensions and intensions (functions from possible worlds to extensions) to indexicals cannot adequately represent their role in determining the truth conditions of sentences (see truth conditional semantics). David Kaplan's ([1977] 1989) seminal solution was to represent the meanings of indexicals by assigning them characters (functions from contexts to intensions). For "I," Kaplan had in mind the function satisfying the following condition: i(c) is the intension whose value in any world w is the speaker uttering "I" in c. Thus, the value of i(c) in any context in which Theodore Roosevelt used "I" is the constant function tr(w) whose value is Theodore Roosevelt for every world. The value of i(c) in any context in which Franklin Roosevelt used "I" is the constant function fdr(w) whose value is Franklin Roosevelt for every world.
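Kaplan's two-level machinery lends itself to a sketch in terms of higher-order functions. The fragment below is our illustrative gloss, not Kaplan's own notation: contexts are modeled as dictionaries, and the names (character_of_I, c1, c2) are invented for the example. A character maps a context to an intension; the resulting intension for "I" is a constant function on worlds.

```python
# Illustrative sketch of Kaplan's character/intension distinction.
# Contexts and worlds are plain Python values; the names below are
# our own, not Kaplan's notation.

def character_of_I(context):
    """Character of 'I': a function from contexts to intensions."""
    speaker = context["speaker"]      # fixed once the context of use is given
    def intension(world):
        return speaker                # constant: same value in every world
    return intension

c1 = {"speaker": "Theodore Roosevelt"}
c2 = {"speaker": "Franklin Roosevelt"}

tr = character_of_I(c1)   # the constant function tr(w)
fdr = character_of_I(c2)  # the constant function fdr(w)

print(tr("actual world"))          # Theodore Roosevelt
print(tr("counterfactual world"))  # Theodore Roosevelt: constant across worlds
print(fdr("actual world"))         # Franklin Roosevelt
```

The point of the two levels shows up in the code: the context fixes which constant function you get, and the world argument then makes no difference to its value.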
The character theory of meaning has many of the difficulties of referential theories generally. For example, "I" and "the person identical to me" have the same character, but not the same meaning (Frege's problem). The character function for "he" would seem to be undefined when it is used in "After Santa came down the chimney, he left the presents"; Santa does not exist (Russell's problem). Quantificational and semantic uses are nonreferential for different reasons. Another difficulty is the Enterprise problem. Suppose Mary points at the bow of a big ship and says "That is the Enterprise" and then points at the stern and says "That is not the Enterprise." The character function for "that" would assign it the same intension in both these contexts, the one whose value at every world is the Enterprise. Yet Mary is not contradicting herself the way she would if the same pointing gesture accompanied both her utterances. The essential indexical problem is that sentences like (2) and (7) may differ in truth value, as in cases of amnesia or delusion.
(7) Truman believes that Truman is president.

On the standard analysis, S believes p is true if and only if S stands in the belief relation to the proposition expressed by
p. But if propositions are intensions, then the complement in (2) expresses the same proposition as the one in (7): the proposition true in any world if and only if Truman is president. Fregean theories solve the essential indexical problem by taking indexicals to express a distinctive type of mode of presentation (concept or mental representation) and taking propositions to be structured entities consisting of modes. Then, (2) entails that Truman believes the proposition whose subject is his own self-concept, whereas (7) entails that Truman believes the proposition whose subject is the concept of Truman. Mary's two utterances are not contradictory because they express propositions with different modes of presentation of the Enterprise. A difficulty for Fregean theories is to explain why "I" is not ambiguous if it is used by different speakers to express different concepts. Another is to account for nonreferential uses, in which "I" is not used to express the speaker's self-concept.
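The Fregean move can be made concrete in a toy model (ours; the mode-of-presentation labels are invented for illustration): if propositions are structured pairs of modes, the propositions reported by (2) and (7) come apart even though their intensions coincide.

```python
# Toy model of Fregean structured propositions. The mode-of-presentation
# labels below are invented for illustration only.

SELF_MODE = "Truman's self-concept"   # first-person mode of presentation
NAME_MODE = "concept of 'Truman'"     # third-person mode, via the name

prop_2 = (SELF_MODE, "is president")  # proposition reported by (2)
prop_7 = (NAME_MODE, "is president")  # proposition reported by (7)

# In an amnesia case, Truman might believe (7)'s proposition but not (2)'s:
trumans_beliefs = {prop_7}

print(prop_2 == prop_7)          # False: distinct structured entities
print(prop_2 in trumans_beliefs) # False: (2) can fail while (7) holds
```

Because the two pairs differ in their first member, the account predicts exactly the truth-value divergence that defeats the pure intension analysis.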
Wayne A. Davis
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Braun, D. 1996. Demonstratives and their linguistic meanings. Noûs 30: 145–73.
Burks, A. W. 1949. Icon, index, and symbol. Philosophy and Phenomenological Research 9: 673–89.
Evans, G. 1980. Pronouns. Linguistic Inquiry 11: 337–62.
Forbes, G. 1987. Indexicals and intensionality: A Fregean perspective. Philosophical Review 96: 3–31.
Frege, G. [1918] 1977. Thoughts. In Logical Investigations, ed. P. T. Geach, 1–30. Oxford: Basil Blackwell.
Gale, R. M. 1967. Indexical signs, egocentric particulars, and token-reflexive words. In Encyclopedia of Philosophy. Vol. 4. Ed. P. Edwards, 151–4. New York: Macmillan.
Kaplan, D. [1977] 1989. Demonstratives. In Themes from Kaplan, ed. J. Almog, J. Perry, and H. Wettstein, 481–563. Oxford: Oxford University Press.
Künne, W. 1997. First person propositions: A Fregean account. In Direct Reference, Indexicality, and Propositional Attitudes, ed. W. Künne, A. Newen, and M. Anduschus, 49–68. Stanford, CA: CSLI Publications.
Nunberg, G. 1993. Indexicality and deixis. Linguistics and Philosophy 16: 1–43.
Perry, J. 1979. The problem of the essential indexical. Noûs 13: 3–21.

INEQUALITY, LINGUISTIC AND COMMUNICATIVE


The idea that linguistic variation goes hand in hand with linguistic inequality was brought on stage by Basil Bernstein and started influencing the development of the new discipline of sociolinguistics, which became a discipline devoted to the study of the unequal and nonrandom distribution of linguistic resources in society. Bernstein (1971), in an attempt at understanding class differences in the education system, distinguished between two codes, one "elaborated" and characteristic of middle-class children, another "restricted" and characteristic of working-class and minority children. The elaborated code was the normative one: It was privileged by the teachers and used as the yardstick, not only for "good language" but also for wider behavioral and cognitive assessments of pupils. Bernstein demonstrated that speaking different varieties of language meant speaking unequal varieties of language, varieties that did not offer the same social, cultural, and political rewards as others.
The central insight here is indeed the parallelism between linguistic variation and social differentiation, between linguistic differences and social hierarchies and forms of stratification often organized around the dynamics of prestige and stigma. This insight has been developed by a number of scholars in sociolinguistics and related fields of study (Irvine 1989). I first give a brief survey of the development of this topic in sociolinguistics and then turn to a discussion of two authors: Pierre Bourdieu and Dell Hymes.

Inequality in Sociolinguistics
Bernstein's thesis about elaborated and restricted codes became a deeply controversial one, though often for the wrong reasons. It coincided with the emergence of modern sociolinguistics, a discipline that saw itself initially as devoted to the study of linguistic variation and distribution as a horizontal exercise: an exercise in mapping linguistic varieties over fragments of a population, based on an assumption of the fundamental equivalence of every linguistic variety. Thus, Bernstein's vertical image of stratified variation, variation that comes with differential value attribution, was countered by William Labov (1972) and others, who argued that a linguistic variety such as black American English should not be seen as "bad English" (which Labov chose to equate with Bernstein's restricted code) but as a complex, sophisticated code used by virtuoso speakers. Labov, like Bernstein, started from an awareness of real inequalities in language: The black American English variety was stigmatized in U.S. education and its speakers were often negatively categorized. But Labov's efforts were aimed at demonstrating the intrinsic linguistic equivalence of these different varieties, whereas Bernstein addressed the extrinsic (attributional, ideological) nonequivalence of the varieties. Seen from a historical distance, both efforts were connected by a joint concern for how particular language varieties disenfranchised speakers, not because of their intrinsic inferiority but because of social and political perceptions of the varieties and their speakers. In other words, both were addressing a language-ideological effect in which linguistically equivalent varieties indexed socially unequal features and categories.
Similar ideas were to be found in the work of many of the early sociolinguists. John Gumperz (1982) stressed the different ways in which intercultural misunderstandings were not a result of participants' intentions but were an effect of small differences in linguistic and communicative structures in a particular context: of different contextualization cues. In his work as well, we encounter an awareness of relativity: Different language varieties can fulfill the same functions, but the perception of these functions by people not sharing the contextual conventions of a particular variety can differ. Here again, we encounter the tension between intrinsic equivalence and extrinsic nonequivalence.
From a different vantage point, sociolinguists concerned
with language policy stressed the fact that the political
institutionalization of language in multilingual societies (see
bilingualism and multilingualism) usually involved a
stratification based not on intrinsic superiority of one language
but on prestige hierarchies and ideological images of society.
J. Fishman (1974) presented studies in which former colonial

as well as local languages were lifted ideologically to the status
of prestige language by means of language policies favoring a
standardized, tightly controlled, and preferably ethnically
neutral language; C. Eastman (1983) described the different
stages of language planning, demonstrating how language varieties could be turned into institutionalized, power-emanating
elements of social structure.
A central ingredient of every form of language planning is the
construction of a written, orthographically standardized variety
of language. Scholars in the field of literacy also devoted attention to the ways in which literacy may introduce stratification in
languages and communities and often becomes an opportunity as well as an obstacle defining people's social mobility trajectories (Street 1995; Kress 2000). The creation of a literate norm
almost invariably means that access to the prestige variety of
the language comes to be controlled by the education system,
which functions as a very effective filter on social mobility.

State of Affairs
In all of the work discussed so far, there is an awareness of inequality, even where the emphasis falls on difference and variation rather than on the power and inequality dimensions of difference and variation. Other scholars developed full-blown theories of linguistic and communicative inequality, and two stand out: Bourdieu and Hymes.
Bourdieu (1991) emphasized the ways in which language is
part of the superstructural apparatus of society; as a nonmaterial resource, it can nevertheless be imbued with an economic
value: social or cultural symbolic capital. As in hard economic
sectors, the value of different resources is different and can fluctuate, and not every resource can be traded for another one; the
terms of exchange are unpredictable. Bourdieu's work on language fitted into his larger program of analyzing the economies
of taste, culture, and ideas from within a generalized materialism
and with the aim of redefining (and empirically substantiating)
the notion of social class. In his view, the traditional distinction between material and immaterial resources in society was
unjustified since the same forces of differential value allocation
appeared to operate on both: Class is both a material and an
immaterial thing, and language differences do play a role in this
value allocation. Consequently, the stability of class as a material
complex is also there in the immaterial aspects: A lack of real capital is often paralleled by a lack of symbolic capital, and social hierarchies play as much into the symbolic diacritics as they do into
the material ones. (See also field and habitus, linguistic.)
It is not difficult to see the similarities between the project developed by Bourdieu and the (more limited) one presented by Bernstein. Both treat language not just as an opportunity (a positive feature of humans) but also as a constraint, as something inherently limited and limiting. And rather than enabling people to perform particular acts, language also restricts them and prevents them from performing other acts. Underlying this idea of language as constraint is a model of society as nonegalitarian and stratified, with relatively stable strata, some of which are tightly controlled as to access while others are more democratically accessible. Particular language resources are required to move into the more controlled strata, and membership of these strata (for example, elite professional milieux) requires

a constant reenactment of communicative practices dependent on these exclusive resources (Bourdieu's analyses of academic discourse are telling in this respect; see Bourdieu, Passeron, and de Saint Martin 1994).
It is at this point that the contribution of Hymes (1980, 1996) comes into view. He draws on a long anthropological tradition of seeing language in context and use (speech), and when language is seen from this practical angle, diversity and inequality are the rule rather than the exception. Whereas Bernstein's and Bourdieu's reflections on language remained largely abstract and general, Hymes's approach is soundly empirical. Hymes starts from registers, repertoires, and genres rather than languages, and such practical linguistic and communicative instruments are performed in contexts where they take actual shapes: conversations, stories, lectures, and so on. Hymes himself focused on narrative, and he observed that one form of inequality in our society has to do with "rights to use narrative, with whose narratives are admitted to have cognitive function" (1980, 126). The reason is that conventions for producing narratives are culturally and socially sensitive, and particular contexts in society require particular types of narrative. People can be very good storytellers in their neighborhood but ineffective ones in front of a judge in court, because the stylistic and genre conventions for those particular events are fundamentally different. Thus, the capacity to be recognized as a competent speaker or an articulate storyteller, lecturer, or conversationalist is a socially sensitive phenomenon on which social forces of differential value attribution operate. For Hymes, this dialectic between real capacity and expected performance defines voice, and one can be voiceless for many reasons (Blommaert 2005).
The approaches of Bourdieu and Hymes both present a fully developed theory of inequality in the field of language and communication. From their work, we can see how the production of meaning is a regulated, regimented process in which the speaker's choice is never unlimited and in which the effects of his/her words are judged by others using social and ideological yardsticks. The advantage of Hymes's approach is its strongly developed empirical dimension, which offers a range of opportunities for applied research.

Evaluation
Despite its implicit focus on inequality, sociolinguistics privileges difference and variation as its objects and avoids explicit
analyses of how such variation converts into social inequality.
This is an effect of the descriptive bias in sociolinguistics, as well
as of the hesitant and ambivalent relationship between sociolinguistics and social theory (Williams 1992). Many studies of
variation, consequently, consider differences merely in terms of
spread and distribution and see the power and inequality relationships among language varieties in simplified terms. The fact
that speaking with a stigmatized accent is not merely a matter
of distribution but a matter of social opportunities, determined
by the way in which others rank such accents in an ideologically
informed hierarchy, could be more central to sociolinguistic
reflections.
This is a critical insight and eminently applicable to many
sociolinguistic and discursive phenomena in the contemporary

world. Emergent applied studies have shown its relevance,
for example, in the field of asylum seekers' narratives (Maryns
2006), literacy in education (Collins and Blot 2003), and social
work (Hall, Slembrouck, and Sarangi 2006).
Jan Blommaert
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Bernstein, B. 1971. Class, Codes and Control. Vol. 1. Theoretical Studies
Towards a Sociology of Language. London: Routledge and Kegan Paul.
Blommaert, J. 2005. Discourse: A Critical Introduction. Cambridge:
Cambridge University Press.
Bourdieu, P. 1991. Language and Symbolic Power. Cambridge,
UK: Polity.
Bourdieu, P., J. C. Passeron, and M. de Saint Martin. 1994. Academic
Discourse: Linguistic Misunderstanding and Professorial Power.
Stanford, CA: Stanford University Press.
Collins, J., and R. Blot. 2003. Literacy and Literacies. Cambridge:
Cambridge University Press.
Eastman, C. 1983. Language Planning: An Introduction. San Francisco:
Chandler and Sharp.
Fishman, J., ed. 1974. Advances in Language Planning. The Hague:
Mouton.
Gumperz, J. 1982. Discourse Strategies. Cambridge: Cambridge University
Press.
Hall, C., S. Slembrouck, and S. Sarangi. 2006. Language Practices in Social
Work. London: Routledge.
Hymes, D. 1980. Language in Education: Ethnolinguistic Essays.
Washington, DC: Center for Applied Linguistics.
. 1996. Ethnography, Linguistics, Narrative Inequality: Toward an
Understanding of Voice. London: Taylor and Francis.
Irvine, J. 1989. When talk isn't cheap: Language and political economy. American Ethnologist 12: 738–48.
Kress, G. 2000. Early Spelling: Between Convention and Creativity.
London: Routledge.
Labov, W. 1972. Language in the Inner City. Philadelphia: University of
Pennsylvania Press.
Maryns, K. 2006. The Asylum Speaker: Language in the Belgian Asylum
Procedure. Manchester: St Jerome.
Street, B. 1995. Social Literacies. London: Longman.
Williams, G. 1992. Sociolinguistics: A Sociological Critique.
London: Routledge.

INFANTILE RESPONSES TO LANGUAGE


Infant responses to speech and language have been used to study
both early communicative development and early language perception in infants. Observations of interactions between infants
and caregivers have shown that infants under three months of
age become quiet in response to speech. Infants as young as
six weeks engage in interactions or protoconversations, or turn-taking in gaze and/or vocalizations with caregivers (Bruner 1975;
Trevarthen 1974). Between four and six months, infants respond
to speech with vocalizations (see babbling), and by six to nine
months they begin initiating interactions, including games such
as peek-a-boo.
Experimental investigations into infant responses to speech have yielded information about infants' abilities to decode language well before they begin to produce language themselves. Numerous experimental methods have been employed to examine their perception, including behavioral, biological, and, more recently, brain measures. The major behavioral measures rely on infants' ability to associate a behavioral response with stimuli. Behavioral measures examine either their responses to new stimuli after habituation or their attention (e.g., eye gaze, head turn) to certain types of speech or language stimuli. For example, High-Amplitude Sucking and Conditioned Headturn are both habituation paradigms. Researchers using these paradigms habituate the infant to the stimulus, change the stimulus, and then measure whether the infant responds to the change. Other procedures, namely, the Headturn Preference Procedure and the Intermodal Preferential Looking Paradigm, measure how long infants attend to stimuli by measuring either headturns toward attention-getting lights or gazes to pictures/video accompanying the stimuli. Biological techniques measure reactions to speech, most frequently by measuring infant heart rate. More recent research has turned to brain measures, including electrophysiological responses such as event-related potentials (ERP) and the electroencephalogram (EEG), and neuroimaging techniques such as fMRI (functional magnetic resonance imaging), to examine early speech and language perception. For an overview of methods, see Jusczyk (1997); Mills, Coffey-Corina, and Neville (1997).
Using experimental methods, researchers have investigated a wide range of issues concerning infants' understanding of speech and language, including responses to individual speech contrasts, differences in speakers, sensitivity to the prosody of the language, lexical learning, and ability to detect grammatical patterns. (See also speech perception in infants.) This research has shown that fetuses can hear during the third trimester of development (although the sound is low-pass filtered by the amniotic fluid and the uterine walls) and will recognize recurrent patterns (e.g., rhymes) from the mother's speech (DeCasper et al. 1994). Neonates prefer their mothers' voices to others' (DeCasper and Fifer 1980). In addition, neonates and infants prefer to listen to infant-directed speech (speech with fewer words per utterance, more repetition, slower articulation, and greater prosodic swings) over adult-directed speech (Fernald 1985; Cooper and Aslin 1990).
Two-month-old infants can distinguish their native language from other languages on the basis of prosodic characteristics; by six months infants are sensitive to their native language stress patterns, and by nine months to native language phonotactics. Infants use prosodic information to group words and are sensitive to prosodic units that mark both clauses (by 4–6 months) and phrases (by 9 months). As prosodic units often correspond to grammatical units, this ability may give infants additional information about grammatical structure. There is also evidence that prosody can help infants remember information about speech (Jusczyk 1997).
Work on the perception of speech contrasts has shown that from birth, infants can discriminate both consonants and vowels that contrast in any of the world's languages, even if the sounds are not found in the infants' native language (Werker and Curtin 2005; Kuhl 2000). This ability holds even if the sounds are produced by different speakers of different ages. Work has also investigated the acoustic cues infants use to detect differences, as well as the nature of categorical perception. Results suggest that infants can be sensitive to the same range of acoustic cues as adults, including the effects of coarticulation (where sounds in a word affect the articulation of each other) and the multiple acoustic cues that can indicate the identity of a phonetic contrast. The ability to discriminate phonetic contrasts from native and unfamiliar languages is maintained until 6 months, but by 10–12 months, infants only discriminate contrasts that are phonologically contrastive in the target language. This shift from language-general to language-specific discrimination occurs first for vowels (by 6–8 months) and later for consonants (10–12 months) (Werker and Curtin 2005).
Infants' perception of speech sounds shifts as they begin to learn words. Infants younger than 8 months are sensitive to phonetic differences, while infants approaching one year ignore allophonic variation and attend only to phonological differences found in their native language(s). At the same time, young word learners (14-month-olds) do not attend to some phonetic detail (e.g., "dih" vs. "bih") when in a word-learning task (Stager and Werker 1997), even though younger infants can discriminate these contrasts in perception tasks and older infants attend to these differences in word-learning tasks. Thus, the experience of learning language influences how and when contrasts are perceived (Kuhl 2000).
Infants also attend to statistical regularities in the speech
stream. These abilities have been shown in the perception of
phonetic segments, in word segmentation, and grammatical
patterns (Saffran 2003). Attention to statistical regularities can
help infants extract familiar patterns and track relationships
within a speech stream and, thus, may help infants uncover
basic structural relations in sentences as they move into the
acquisition of grammar. Infants sensitivity to grammar has
been demonstrated both behaviorally and with brain measures.
It begins with sensitivity to the phonetic form of function words
that mark grammatical relations. English-learning infants of
1011 months demonstrate sensitivity to the phonetic properties of functor words and can use them to help segment nouns.
French-learning and German-learning infants show even earlier
sensitivity to functor words, segmenting them by 78 months of
age. By 16 months, infants can distinguish passages in which
English functors are properly or improperly ordered (Shady,
Jusczyk, and Gerken 1998). By 18 months, infants are beginning
to be able to track grammatical relations, such as the dependency between is and -ing.
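The statistical-learning mechanism credited to infants above can be sketched computationally with transitional probabilities between syllables, which are high within words and dip at word boundaries. The toy syllable stream, the invented three-syllable "words," and the boundary threshold below are illustrative assumptions, not Saffran's actual stimuli or model.

```python
from collections import Counter

def transitional_probabilities(syllables):
    """Estimate P(next | current) from bigram counts over the stream."""
    unigrams = Counter(syllables[:-1])
    bigrams = Counter(zip(syllables, syllables[1:]))
    return {(x, y): n / unigrams[x] for (x, y), n in bigrams.items()}

def segment(syllables, tps, threshold=0.75):
    """Posit a word boundary wherever the transitional probability dips."""
    words, current = [], [syllables[0]]
    for x, y in zip(syllables, syllables[1:]):
        if tps[(x, y)] < threshold:
            words.append("".join(current))
            current = []
        current.append(y)
    words.append("".join(current))
    return words

# A toy "language" of three words (bidaku, padoti, golabu) concatenated
# without pauses, in the spirit of Saffran-style familiarization streams.
stream = ("bi da ku pa do ti go la bu bi da ku go la bu "
          "pa do ti bi da ku").split()
tps = transitional_probabilities(stream)
print(segment(stream, tps))
# → ['bidaku', 'padoti', 'golabu', 'bidaku', 'golabu', 'padoti', 'bidaku']
```

Within-word transitions here have probability 1.0 while cross-boundary transitions have probability 0.5, so a single threshold recovers the "words" with no lexicon at all.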
Infants acquiring both English and German are able to perform these tasks, though their ability to do so is influenced by language-specific factors (Santelmann and Jusczyk 1998; Höhle et al. 2006). English infants are sensitive to the distance between the morphemes, whereas German infants are sensitive to the type of constituent. This difference is most likely due to the structure of the two languages: English allows few elements between auxiliary and main verbs (mostly adverbs or, in questions, subjects), whereas the structure of German (with nonfinite verbs occurring at the end of the sentence) requires objects and adverbs to occur between the auxiliary and main verbs. It appears that infants in the two languages are sensitive to these distributional properties of the syntax and use these patterns to help organize the speech stream. In addition, by 17 to 18 months, infants are able to use the presence of nouns in a sentence to help decode verb meanings: Sentences with two nouns (Oscar chases Elmo) are matched with transitive actions, while sentences with a compound noun (Oscar and Elmo are running) are matched with intransitive actions. Thus, infants are able both to track grammatical information and to use word order to determine relationships in English (Hirsh-Pasek and Golinkoff 1996).
More work is needed to examine how infants from different backgrounds, in particular infants from non-Western cultures, learners of non-Indo-European languages, and bilingual infants, perceive speech and language. Existing work with bilingual infants suggests that exposure to more than one language helps infants
maintain categorical contrasts found in both languages (Werker
and Tees 1984) and that lateralization can occur independently
for each language to which an infant is exposed (Conboy and
Mills 2006; see bilingualism, neurobiology of). Research
is just beginning to explore the relationship between early speech
perception and later language development. Early results suggest that some aspects of speech perception are related to later
vocabulary development (e.g., Newman et al. 2006).
One area of debate concerns protoconversations (turn-taking in gaze and/or vocalizations): whether they are driven primarily by the caregivers or whether the infant takes an active role in shaping them. Research suggests an active role for infants in
co-constructing interactions (e.g., Trevarthen, Kokkinaki, and
Fiamenghi, Jr. 1999). Another long-standing debate resulting
from experimental work concerns whether infants' responses
result from a specialized speech-processing system or from
general auditory processing plus categorization abilities. Most
researchers currently argue that the ability to discriminate
speech contrasts is domain-general with some innate perceptual
biases (Kuhl 2000). A further current debate parallels an ongoing debate in cognitive science concerning whether infants use
statistical regularities alone or create abstract rules from statistical information (Saffran 2003). Evidence from other domains
(e.g., production) suggests that infants can abstract rules, and
most linguists would argue for rule creation rather than pure
statistical learning (Marcus et al. 1999), while many connectionists would argue that rule-like effects are artifacts of stable
connections. They argue that learning is based on the specific
stimuli and is possible without the abstraction of rules (Conway
and Christiansen 2006).
Lynn Santelmann
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Bruner, Jerome S. 1975. From communication to language: A psychological perspective. Cognition 3: 255–87.
Conboy, Barbara T., and Debra L. Mills. 2006. Two languages, one developing brain: Event-related potentials to words in bilingual toddlers. Developmental Science 9.1: F1–F12.
Conway, Christopher M., and Morten H. Christiansen. 2006. Statistical learning within and between modalities: Pitting abstract against stimulus-specific representations. Psychological Science 17: 905–12.
Cooper, R. P., and R. N. Aslin. 1990. Preference for infant-directed speech in the first month after birth. Child Development 61: 1584–95.
DeCasper, A. J., and W. P. Fifer. 1980. Of human bonding: Newborns prefer their mothers' voices. Science 208: 1174–6.
DeCasper, Anthony J., Jean-Pierre Lecanuet, Marie-Claire Busnel, Carolyn Granier-Deferre, and Roselyne Maugeais. 1994. Fetal reactions to recurrent maternal speech. Infant Behavior and Development 17: 159–64.
Fernald, A. 1985. Four-month-old infants prefer to listen to motherese. Infant Behavior and Development 8: 181–95.
Hirsh-Pasek, Kathy, and Roberta M. Golinkoff. 1996. The Origins of Grammar: Evidence from Early Language Comprehension. Cambridge, MA: MIT Press. Overview of studies examining infants' sensitivity to grammar, including description of methodologies.
Höhle, B., M. Schmitz, L. M. Santelmann, and J. Weissenborn. 2006. The recognition of discontinuous verbal dependencies by German 19-month-olds: Evidence for lexical and structural influences on children's early processing capacities. Language Learning and Development 2.4: 277–300.
Jusczyk, P. W. 1997. The Development of Speech Perception. Cambridge, MA: Bradford Books. Overview of studies examining infants' sensitivity to grammar, including description of methodologies.
Kuhl, Patricia K. 2000. A new view of language acquisition. Proceedings of the National Academy of Sciences 97: 11850–7. Overview of historical positions and framework incorporating both innate biases and statistical learning.
Marcus, G. F., S. Vijayan, S. Bandi Rao, and P. M. Vishton. 1999. Rule learning by seven-month-old infants. Science 283: 77–80.
Mills, Debra L., Sharon Coffey-Corina, and Helen J. Neville. 1997. Language comprehension and cerebral specialization from 13 to 20 months. Developmental Neuropsychology 13: 397–445.
Newman, Rochelle S., Nan Bernstein Ratner, Ann Marie Jusczyk, Peter W. Jusczyk, and Kathy A. Dow. 2006. Infants' early ability to segment the conversational speech signal predicts later language development: A retrospective analysis. Developmental Psychology 42: 643–55.
Saffran, Jenny R. 2003. Statistical language learning: Mechanisms and constraints. Current Directions in Psychological Science 12.4: 110–14.
Santelmann, Lynn M., and Peter W. Jusczyk. 1998. Sensitivity to discontinuous dependencies in language learners: Evidence for limitations in processing. Cognition 69: 105–34.
Shady, Michele, Peter W. Jusczyk, and LouAnn Gerken. 1998. Infants' sensitivity to function morphemes. Proceedings of the Annual Boston University Conference on Language Development 19: 553–63.
Stager, Christine, and Janet F. Werker. 1997. Infants listen for more phonetic detail in speech perception than in word learning tasks. Nature 388: 381–2.
Trevarthen, Colwyn. 1974. Conversations with a two-month old. New Scientist 62: 230–5.
Trevarthen, Colwyn, Theano Kokkinaki, and Geraldo A. Fiamenghi, Jr. 1999. What infants' imitations communicate: With mothers, with fathers and with peers. In Imitation in Infancy, ed. Jacqueline Nadel and George Butterworth, 127–85. New York: Cambridge University Press.
Werker, Janet F., and Suzanne Curtin. 2005. PRIMIR: A developmental framework of infant speech processing. Language Learning and Development 1.2: 197–254. Develops a model of early speech perception/word learning. Concise overview of findings of speech perception research.
Werker, J. F., and R. C. Tees. 1984. Cross-language speech perception: Evidence for perceptual reorganization during the first year of life. Infant Behavior and Development 7: 49–63.

INFORMATION STRUCTURE IN DISCOURSE


The phrase information structure is used to indicate the organization of elements within a sentence in terms of their pragmatic contribution (givenness-newness, theme-rheme) to a piece of discourse or text, as opposed to their syntactic
role (subject, object, etc.) or their semantic role (agent, goal,
beneficiary, etc.). A sentence element's degree of importance, or salience, will have certain repercussions on its linguistic realization; in particular, it will influence grammatical choices
(most prominently word order patterns but also, for instance,
voice), prosodic choices (choice of intonation contours and,
especially, placement of sentential stress), and lexical choices
(such as definiteness, ellipsis, pronominalization, and use of
specific particles).
Modern notions of information structure can be traced back
to the Prague School work on functional sentence perspective, as
summarized most accessibly in Firbas (1992). J. Firbas developed
the idea that sentence elements have varying degrees of communicative dynamism (CD), depending on the extent to which they
carry the message forward. In the unmarked case (the basic
distribution), degrees of CD will be reflected in the linear ordering of elements, with sentences starting with the element carrying the lowest CD (defined as the theme), followed by a gradual
rise in CD. It should be noted that there is a relationship between the degree of CD of an element and its status in terms of given-new (see the following).

Given-New Structure versus Theme-Rheme Structure


Two dimensions, which may be to some extent correlated but
should be kept conceptually distinct, are important here. On
the one hand, the degree of givenness of a piece of information
reflects the extent to which an element can be treated as in some
way recoverable from what precedes or can be assumed to be
present in the hearer's consciousness; its thematicity, on the
other hand, reflects the extent to which it represents what the
message is about. As an illustration of how these dimensions
influence linguistic choices, consider the following example:
(1) A: What happened to Mary? Is she still single?
    B: Well, Mary married John last year, but she's already divorced him.

In B's response, the noun phrase (NP) Mary in the first sentence (or, more accurately, the piece of referential information represented by the NP Mary) is given information, as it was mentioned in the immediately preceding turn; John, on the other hand, is new information, as it cannot be assumed to be in the hearer's consciousness. In B's second sentence, both
Mary and John are now given information, which explains why
they can be pronominalized as she and him, respectively. At
the same time, it could be argued that both sentences are more
about Mary than about John and that Mary is, therefore, the
theme of both sentences, the remaining information being rhematic (i.e., providing additional information about the theme).
Although both levels are functionally independent, they do
appear to correlate strongly, in that themes tend to consist of
given information.
Example (1) also illustrates some tendencies with regard to
word order and prosody. First of all, new information tends to
occur toward the right of the sentence, whereas given information tends to be initial (in English, therefore, being a subject-verb-object (SVO) language, there is a strong correlation between givenness and grammatical subjecthood, as shown by Mary in B's contribution). Secondly, thematic information tends to be sentence-initial. In fact, some linguists argue that the theme is initial by definition (e.g., Halliday 1994). Thirdly, the main sentence stress, or tonic nucleus (i.e., the syllable carrying the strongest degree of prosodic prominence), normally falls somewhere on the new information, namely, on John in sentence 1 of B's turn and perhaps on the second syllable of divorced in sentence 2.
The effects of givenness and/or thematicity on word order
are acknowledged by most authors. However, in a substantial
proportion of the worlds languages, word order is partially, or
even predominantly, determined by syntactic rather than information structure considerations. English is a case in point since
it has a fairly strict SVO order from which it is hard to deviate.
Some languages, therefore, have developed other mechanisms
for marking information structure, such as particular nonprototypical patternings resulting in different ordering of elements.
Examples that are prominent in English are the passive voice
(not, strictly speaking, a word order variation but nevertheless
a syntactic operation that has a big impact on linear order); cleft
and pseudocleft constructions (e.g., It is a beer that I would like
or What I would like is a beer); the fronting of an element (sometimes referred to as topicalization, or Y-movement; e.g., A
beer I would like); extraposition of an element (e.g., It is enjoyable to drink a beer on a warm evening); and left and right dislocation (e.g., [As for] John, he loves beer and He loves beer, John
[does], respectively). Some languages may employ other formal
resources for marking thematicity or givenness-newness, the
best known examples perhaps being particles (such as ga and
wa in Japanese).
Example (1) also reveals some potential problems for any definition of givenness or thematicity. First of all, it raises the question of whether the given-new distinction is a simple binary
distinction. The verb married in B's first sentence, for instance,
is new in the sense that it has not been mentioned, but one could
also argue that previous mention by A of the element single has
to some extent activated its antonym (see spreading activation), so that married does not have quite as high a degree of
newness as, say, last year. Secondly, it is important here to distinguish between thematicity on a sentence level, which attempts to
provide an answer to the question what a sentence (or clause) is
about, and the notion of aboutness on the macro level of a longer
stretch of discourse; in the latter case, the term topicality or topichood is often employed (see Brown and Yule 1983). In the previous example, one could argue that the entire exchange is more
about Mary than it is about John, making Mary the discourse
topic for the whole exchange (but not necessarily the theme for
all sentences comprising the exchange).
It should be noted, incidentally, that conceptual vagueness
and terminological confusion abound in the literature: The
terms topic and theme are often used interchangeably, as are
terms such as given-new and background-focus. As has already
been pointed out, Firbas defines the theme as the element with
the lowest degree of CD. Others define theme in terms of aboutness, which is reminiscent of the concept of psychological subject, or what the speaker is talking about. M. A. K. Halliday
defines the theme as the point of departure for the message,
which always correlates with clause-initial position; the rheme
then represents what is said about the theme. He thus assumes
a direct link between discourse function (theme) and linguistic
form (word order).


Previous Research on Given-New


In her influential paper on the given-new distinction, E. F. Prince
(1981) starts with an appealing classification of the literature on
given-new according to three approaches to givenness: givenness in terms of shared knowledge, in terms of cognitive salience,
and, finally, in terms of recoverability or predictability. When
givenness in terms of shared knowledge is evaluated, the basic
criterion is the speaker's assumption that the hearer knows,
assumes, or can infer a particular thing (but is not necessarily
thinking about it). This definition is problematic in that it fails
to distinguish systematically between knowledge that the hearer
can derive through contextual clues (such as is needed for the
decoding of an anaphoric element) and what he or she knows as part of his or her background knowledge of the world.
Givenness in terms of cognitive salience can be defined as
knowledge that the speaker assumes to be in the hearer's consciousness at the time of utterance. One problem with this view
is that while it might well represent an accurate picture of the
decisions that the speaker has to make on a cognitive level, this
dimension is inherently inaccessible to the outside observer. A
discourse analyst confronted with textual data can only assess
what the speaker might have assumed as given by examining the
linguistic choices the speaker has made (in terms of word order,
intonation, pronominalization, definiteness, and so forth).
The view of givenness in terms of recoverability, finally, is most
often associated with Hallidays systemic-functional framework
(e.g., 1994), which was in turn influenced by the Prague School.
Given information is defined as information that is predictable
or derivable from the preceding discourse context, new information being defined as what the speaker presents as not being
recoverable. Important to note here is that, for Halliday, given-new information is marked almost exclusively through prosody,
more particularly the placement of the tonic nucleus (or sentence
stress). Word order does play an important part in information
structure, but is argued by Halliday to be an indicator of thematicity rather than givenness-newness. Having said that, he does
acknowledge that some noncanonical word order formats, such
as cleft sentences, can be used as markers of given-new flow.
Prince's taxonomy of given-new information (1981) defines
givenness (or assumed familiarity, as she calls it) in terms of
speaker assumptions regarding an elements cognitive salience
but offers an attempt at a more sophisticated classification.
First of all, elements can be new, inferrable, or evoked (given).
New entities can be either brand new (i.e., not assumed to be in
any way known to the hearer) or unused (i.e., part of the hearer's background knowledge but not in his/her consciousness at
the time of utterance). Inferrables can be retrieved from other
entities via inferential processes (as in The car was useless, as
the battery was flat, where the battery is inferrable from the car
through the inference that cars have batteries). Evoked entities,
finally, can be either textually evoked (i.e., from the surrounding text) or situationally evoked (i.e., through the physical context). The familiarity status of an element will have an influence
on its potential linguistic realization: Brand-new entities, for
instance, will be typically realized as indefinite full NPs (e.g.,
a guy in I met a weird guy yesterday), whereas unused entities
will be definite NPs (e.g., the game in I went to the Knicks game
yesterday).
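Prince's classification can be summarized as a small tree; the sketch below encodes the taxonomy together with the typical realizations discussed above. The nested-dictionary representation and the deictic example for situationally evoked entities are illustrative assumptions, not part of Prince's own formalism.

```python
# Prince's (1981) taxonomy of assumed familiarity as a nested structure,
# leaves annotated with typical linguistic realizations from the text.
taxonomy = {
    "new": {
        "brand-new": "indefinite full NP ('a weird guy')",
        "unused": "definite NP ('the Knicks game')",
    },
    "inferrable": "definite NP licensed by inference ('the battery' from 'the car')",
    "evoked": {
        "textually evoked": "pronoun ('she' for previously mentioned Mary)",
        "situationally evoked": "deictic expression ('you', 'this')",
    },
}

def statuses(node, path=()):
    """Enumerate the leaf familiarity statuses of the taxonomy."""
    if isinstance(node, dict):
        for key, child in node.items():
            yield from statuses(child, path + (key,))
    else:
        yield path

print(sorted(" > ".join(p) for p in statuses(taxonomy)))
```

Walking the tree recovers the five familiarity statuses (brand-new, unused, inferrable, textually evoked, situationally evoked) that the prose enumerates.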


Challenges for Future Research


One important drawback of most research in this area of discourse analysis has been that quite often, a top-down analytical
apparatus is employed, whereby a classification of givenness
types is proposed on the basis of constructed examples and
is then applied to actual texts (often focusing on narratives, mostly written narrative texts). On the whole, information structure in conversation has not received the attention it
deserves; some so-called word order variations appear primarily to be interactive phenomena, cases in point being left and
right dislocations (see Geluykens 1992, 1994). In particular, the
dynamic, procedural nature of conversation as a collaborative
enterprise and the effect that the turn-taking system may have
on the givenness status of elements have not yet been examined systematically. In addition, there is a dearth of experimentally based analyses trying to determine the exact effect of one
potential variable (such as the referential distance between an
element and its previous mention, for instance) on the givenness status of an item. If one assumes that givenness erodes
over time due to the limits of short-term memory (a reasonable
enough assumption), then that effect should be measurable
under controlled conditions. Experimental studies, however,
tend to be limited to the impact of givenness status on prosodic
realization.
Ronald Geluykens
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Birner, B. J., and G. Ward. 1998. Information Status and Noncanonical
Word Order in English. Amsterdam and Philadelphia: John
Benjamins.
Brown, Gillian, and George Yule. 1983. Discourse Analysis.
Cambridge: Cambridge University Press.
Firbas, J. 1992. Functional Sentence Perspective in Written and Spoken Communication. Cambridge: Cambridge University Press.
Geluykens, R. 1992. From Discourse Process to Grammatical
Construction: On Left-Dislocation in English. Amsterdam: Benjamins.
. 1994. The Pragmatics of Discourse Anaphora in English: Evidence
from Conversational Repair. Berlin: Mouton de Gruyter.
Halliday, M. A. K. 1994. An Introduction to Functional Grammar. 2d ed.
London: Edward Arnold.
Lambrecht, K. 1996. Information Structure and Sentence Form.
Cambridge: Cambridge University Press.
Prince, E. F. 1981. Toward a taxonomy of given-new information. In Radical Pragmatics, ed. P. Cole, 223–55. New York: Academic Press.

INFORMATION THEORY
This is a general-purpose and abstract theory for the study of
communication. Standard information theory was founded
by Claude Shannon (1948) and is based on a communication
framework in which a) the sender must transform a message
into a code and send it through a channel to the receiver, and
b) the receiver must obtain a message from the received code.
Communication is successful if the message of the sender is the
same as the message obtained by the receiver. For instance, a
speaker utters a word for a certain meaning and then the hearer
must infer the meaning that the speaker had in mind. Noise can alter the code when traveling from the sender to the receiver through the channel (for instance, the speaker produces Paul but the hearer understands ball).
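Two core quantities behind this framework can be made concrete with a minimal sketch: the entropy of a message source and the equivocation introduced by noise. The probabilities below are invented for illustration, not drawn from Shannon's paper.

```python
import math

def entropy(probs):
    """H(X) = -sum p*log2(p): average information per message, in bits."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

# A sender choosing among four equiprobable words carries 2 bits per word.
print(entropy([0.25, 0.25, 0.25, 0.25]))  # → 2.0

# Noise: if the hearer recovers the intended word only 90% of the time
# ("Paul" heard as "ball" in the remaining 10%), each transmission is
# degraded by an equivocation of entropy([0.9, 0.1]) bits.
print(entropy([0.9, 0.1]))
```

Successful communication in Shannon's sense means the mutual information between sent and received messages stays high despite this noise term.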
While mainstream linguistics is focused on human language,
information theory has been applied to many other contexts,
such as the communication systems of other species (McCowan,
Hanser, and Doyle 1999; Suzuki, Buck, and Tyack 2006), genetic
information storage in the DNA (Li and Kaneko 1992; Naranan
and Balasubrahmanyan 2000), and artificial systems such as
computers and other electronic devices (Cover and Thomas
1991).
Information theory has myriad applications even within the
domain of the language sciences. I give only a few examples. First,
it provides powerful metrics in psycholinguistics for measuring the cognitive cost of a) processing a word (McDonald and Shillcock 2001), b) an inflectional paradigm (Moscoso del Prado Martín, Kostić, and Baayen 2004), or c) the whole mental lexicon (Ferrer i Cancho 2006). Second, information theory allows
one to explain certain actual properties of human language. For
instance, the tendency of words to shorten as their frequency
increases (Zipf 1935) can be interpreted as increasing the speed of information transmission (e.g., the number of messages per second) by assigning shorter codes to more frequent messages.
Another well-known property of human language is G. Zipf's law for word frequencies, one of the most famous laws of language. It has been argued that this law could be an optimal solution for maximizing the information transmitted when the mean length of words is constrained (Mandelbrot 1966) or for maximizing the success of communication while the cognitive cost of using words is minimized (Ferrer i Cancho 2006). Third, information theory has shed light on the evolution of language. It
has been hypothesized that the presence of noise in the communication channel could have favored the emergence of syntax
in our ancestors (Nowak and Krakauer 1999), which turns out to
be a reformulation of fundamental results from standard information theory (Plotkin and Nowak 2000). Finally, information
theory offers an objective framework for studying the differences
between animal communication and human language.
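The word-shortening tendency mentioned above can be sketched by comparing mean code lengths under two assignments. The word frequencies and the unary "codes" (length is their only relevant property) are invented for illustration; this is not Zipf's or Mandelbrot's actual derivation.

```python
# Invented token frequencies for four words.
words = {"the": 1000, "of": 600, "information": 20, "psycholinguistics": 5}
total = sum(words.values())

def mean_length(assignment):
    """Expected number of symbols transmitted per word token."""
    return sum(words[w] * len(code) for w, code in assignment.items()) / total

by_freq = sorted(words, key=words.get, reverse=True)
# Frequent words get short codes (the Zipfian arrangement) ...
short_first = {w: "x" * (i + 1) for i, w in enumerate(by_freq)}
# ... versus the perverse arrangement: frequent words get long codes.
long_first = {w: "x" * (len(by_freq) - i) for i, w in enumerate(by_freq)}

print(mean_length(short_first), mean_length(long_first))
```

The frequency-aware assignment transmits markedly fewer symbols per token, which is exactly the sense in which shortening frequent words speeds up transmission.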
It is well known that in human language the occurrence of a given word depends on distant words within the same sequence, for example, a text (Montemurro and Pury 2002; Alvarez-Lacalle et al. 2006), and information theory studies in other species provide evidence that long-distance dependencies are not
uniquely human (Suzuki, Buck, and Tyack 2006; Ferrer i Cancho
and Lusseau 2006). Furthermore, research on humpback whale
songs (Suzuki, Buck, and Tyack 2006) questions the conjecture
of Marc Hauser, Noam Chomsky, and W. Tecumseh Fitch (2002)
that only humans employ recursion to structure sequences.
Ramon Ferrer i Cancho
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Alvarez-Lacalle, Enric, Beate Dorow, Jean-Pierre Eckmann, and Elisha Moses. 2006. Hierarchical structures induce long-range dynamical correlations in written texts. Proceedings of the National Academy of Sciences USA 103.21: 7956–61.
Cover, Thomas M., and Joy A. Thomas. 1991. Elements of Information
Theory. New York: Wiley.
Ferrer i Cancho, Ramon. 2006. On the universality of Zipf's law for word frequencies. In Exact Methods in the Study of Language and Text: To
Honor Gabriel Altmann, ed. Peter Grzybek and Reinhard Köhler, 131–40. Berlin: Gruyter. It discusses the problems of classic models for explaining Zipf's law for word frequencies.
Ferrer i Cancho, Ramon, and David Lusseau. 2006. Long-term correlations in the surface behavior of dolphins. Europhysics Letters
14: 1095–1101.
Hauser, Marc, Noam Chomsky, and W. Tecumseh Fitch. 2002. The faculty of language: What is it, who has it and how did it evolve? Science
298: 1569–79.
Li, Wentian, and Kunihiko Kaneko. 1992. Long-range correlations and
partial 1/f spectrum in a noncoding DNA sequence. Europhysics
Letters 17: 655–60.
Mandelbrot, Benoit. 1966. Information theory and psycholinguistics: A theory of word frequencies. In Readings in Mathematical Social Sciences, ed. P. F. Lazarsfeld and N. W. Henry, 151–68. Cambridge, MA: MIT Press.
McCowan, Brenda, Sean F. Hanser, and Laurance R. Doyle. 1999. Quantitative tools for comparing animal communication systems: Information theory applied to bottlenose dolphin whistle repertoires. Animal Behaviour 57: 409–19.
McDonald, Scott A., and Richard Shillcock. 2001. Rethinking the word
frequency effect: The neglected role of distributional information in
lexical processing. Language and Speech 44: 295–323.
Montemurro, Marcelo, and Pedro A. Pury. 2002. Long-range fractal correlations in literary corpora. Fractals 10: 451–61.
Moscoso del Prado Martín, Fermín, Aleksandar Kostić, and Harald R. Baayen. 2004. Putting the bits together: An information theoretical perspective on morphological processing. Cognition 94: 1–18.
Naranan, Sundaresan, and Vriddhachalam K. Balasubrahmanyan. 1998. Models for power law relations in linguistics and information science. Journal of Quantitative Linguistics 5.3: 35–61. A summary of many optimization models of Zipf's law for word frequencies that are based on information theory.
. 2000. Information theory and algorithmic complexity: Applications to linguistic discourses and DNA sequences as complex systems. Part I: Efficiency of the genetic code of DNA. Journal of Quantitative Linguistics 7.2: 129–51.
Nowak, Martin A., and David Krakauer. 1999. The evolution of language.
Proceedings of the National Academy of Sciences USA 96: 8028–33.
Plotkin, Joshua, and Martin A. Nowak. 2000. Language evolution and
information theory. Journal of Theoretical Biology 205.1: 147–59.
Shannon, Claude E. 1948. A mathematical theory of communication. Bell System Technical Journal 27: 379–423, 623–56.
Suzuki, Ryuji, John R. Buck, and Peter L. Tyack. 2006. Information
entropy of humpback whale songs. Journal of the Acoustical Society
of America 119: 1849–66.
Zipf, George Kingsley. 1935. The Psycho-biology of Language.
Boston: Houghton Mifflin.

INNATENESS AND INNATISM


Where does our knowledge come from? Innatism is the position
that at least some of our knowledge is inborn rather than derived
from experience. If so, the question naturally arises as to what types of knowledge should be taken to be innate,
giving rise to specific hypotheses about innateness. In linguistics, innateness became an important issue with the development of generative grammar, giving rise to a debate that
is inextricably linked to language acquisition, namely, whether
or not language acquisition is based on an innate universal
grammar (UG). Innatism originated as a philosophical position, often focusing on concepts varying from God to justice to mathematics. The issue in linguistics is empirical, though. The
question of whether some knowledge of language must be innate
has sparked a great amount of controversy over the last decades,
during which philosophical positions (rationalism versus empiricism) were not always kept separate from the empirical issue.
Prominent protagonists include Noam Chomsky, Jerry Fodor,
Ray Jackendoff, Steven Pinker, Kenneth Wexler, Geoffrey Pullum,
Terrence Deacon, Michael Tomasello, and earlier, from somewhat different perspectives, W. V. O. Quine and Eric Lenneberg.
The debate arises in essentially two domains: concept learning and the acquisition of the structure of a language. Concept
learning is associated with Quine's thesis of the indeterminacy of translation (1960), illustrated by his gavagai
example: Suppose one is trying to learn a hitherto unknown language. One is walking around accompanied by one of the speakers of the language. Suddenly a rabbit runs by and the speaker
utters gavagai. As Quine shows, there are innumerable English
translations one could come up with for gavagai, among which
it would be impossible to decide on the basis of the experience
alone. To the extent to which in practice we succeed in doing
so, this is something to be explained, for instance, by ascribing
to us an innate conceptual structure, which is how Fodor (1983
and subsequent work) proposes that we should account for our
convergence.
The task of acquiring a language can be characterized as follows (as in Chomsky 1986 and subsequent work). A human infant
who is exposed to a language will in the course of roughly four to
six years (seven years if one counts full mastery of irregular morphology, or raising verbs; for the latter, see Hirsch and Wexler
2007) acquire a knowledge of that language identical to that of
a human adult who knows the same language (with the exception of the lexicon, which keeps on growing in adulthood). In this
respect, he or she is entirely unlike apes, dogs, bees, and so on,
that, regardless of how much they are exposed to a language, will
never reach anything like human adult competence. It is also
uncontroversial that a child is not more predisposed to learning one language than another. But given that a child exposed
to Dutch ends up learning Dutch and not, for instance, Chinese,
and that the converse holds for a child exposed to Chinese, the
input must be crucial in determining which language ends up
being learned.
Consequently, language acquisition can be schematically represented as in Figure 1, where the various stages that a child goes
through are represented by Si and the Di are the data to which
he or she is exposed, yielding a change in state (not prejudging
the question of whether these stages involve major qualitative
discontinuities). S0 is termed the initial state, before the child is
exposed to language. (For the discussion, it is immaterial at what
age we put S0. If exposure to language can already take place in
some form in the womb, as there is evidence to suggest, S0 can be
put just before the first exposure in the womb.) Sn is the steady
adult state that does not change anymore if over time more data
are presented (with the exception of the lexicon).
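The schema just described amounts to folding an update function over the input data: each datum maps the current state to the next one, and the steady state absorbs further data without changing. The set-of-patterns state and the update function below are deliberately crude, invented stand-ins for grammatical knowledge, sketched only to make the progression from S0 to Sn concrete.

```python
from functools import reduce

# Each datum D_i maps the current state S_i to S_{i+1}. States here are
# just sets of observed patterns, not a claim about actual acquisition.
def update(state, datum):
    """S_{i+1} = update(S_i, D_i)."""
    return state | {datum}

S0 = frozenset()                      # initial state, before any exposure
data = ["SVO", "det-noun", "aux-ing", "SVO", "det-noun"]
Sn = reduce(update, data, S0)         # the state reached on this input
print(sorted(Sn))                     # → ['SVO', 'aux-ing', 'det-noun']

# Steady state: data already accommodated change nothing.
assert reduce(update, ["SVO", "aux-ing"], Sn) == Sn
```

Even this toy version exhibits the two properties the schema requires: the input determines which "language" is reached, and repeated exposure past the steady state leaves it unchanged.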
This sketch as such should be uncontroversial, and even by
a professed non-nativist such as Tomasello (2003), it can be
rejected only at the price of incoherence. The real issue involves
the properties of S0 (abstracting away from the possible role of
maturation of the brain).

Innateness and Innatism

Figure 1. Language acquisition as a succession of states: exposure to data D1, D2, ..., Di, ..., Dn moves the child from the initial state S0 through intermediate states S1, S2, ..., Si to the steady state Sn; further data Dn+1, Dn+2, ... leave Sn unchanged.

By definition, S0 is the state of being
able to acquire language, distinguishing humans from apes,
dogs, and so on. That is, it reflects man's innate capacity for
language. By the definition of the term universal grammar as it
is used in generative grammar, S0 coincides with UG. Although
a non-nativist could object to the term UG, nothing more is
involved than a terminological issue. What is really at stake are
questions such as the following: i) What properties must S0 have
in order to be able to account for the fact that language can be
acquired, given what we know about time course and access
to data? ii) Which of these properties are specific to man? And
iii) What aspects of S0 are specific to language, and how are the
other aspects related to other human cognitive capacities? The
first question can only be successfully approached by carefully
investigating necessary and sufficient conditions for
learning and for learnability of (classes of) languages of the
human type. The second and third questions require an understanding not only of the human language capacity but also of
those other cognitive capacities among which it is embedded.
Logically, S0 could be empty. However, this would entail no
difference between humans and animals, contrary to what we
know. So for this empirical reason alone, S0 cannot be empty.
But as is discussed in the next section, it would also entail that
humans cannot acquire language, contrary to what we know.

The Logical Problem of Language Acquisition


A useful strategy for demonstrating the difficulty of a problem is
to simplify it. If the simplified problem is still hard, one knows
that the original problem is at least as hard. So we take a highly
simplified question as a starting point (Wexler and Culicover
1980): What does a person who knows a language minimally
know? A reasonable answer is the following: A person who knows
a language knows at least which strings of words correspond
to well-formed sentences in that language and which strings
don't. (This simplification is valid irrespective of the changes in
the significance attached to this particular aspect of linguistic
knowledge, from Chomsky 1957 to Chomsky 1995.)
In this simplified picture, we view a language as a subset
of the set of all expressions one can form over a given vocabulary. That is, assuming that the vocabulary of English contains
the elements the, dog, bites, man, the set of English sentences
will contain the dog bites, the dog bites the man, and so on, but
not the bites man, bites dog the, and so on. The task of the child
acquiring English, therefore, minimally includes determining
what the full set of English sentences is like on the basis of the
sentences he/she is exposed to for some period of time, let's say
for six years. The question is then to get an impression of how
hard this task is.
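The subset view sketched above can be made concrete with a toy membership test. The two-sentence "grammar" below is an invented illustration over the four-word vocabulary mentioned in the text, not a claim about English:

```python
# Toy illustration: a "language" as a subset of the strings
# formable over a given vocabulary.
VOCAB = {"the", "dog", "bites", "man"}

# The well-formed subset (here simply enumerated for illustration).
LANGUAGE = {
    ("the", "dog", "bites"),
    ("the", "dog", "bites", "the", "man"),
}

def is_english_like(sentence):
    """Return True iff the word string belongs to the toy language."""
    words = tuple(sentence.split())
    if any(w not in VOCAB for w in words):
        return False
    return words in LANGUAGE

print(is_english_like("the dog bites the man"))  # True
print(is_english_like("bites dog the"))          # False
```

The child's task, on this simplification, is to determine the characteristic function `is_english_like` for all of English from a finite sample of its positive members.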
Note that there is no upper bound to the length of individual
sentences. This makes the set of sentences in a language effectively infinite. However, even if restricted to sentences under

a reasonable length, the number of well-formed sentences in
English is astronomical. It has been estimated that the number
of grammatical English sentences of 20 words or fewer is 10^20
(Levelt 1967). (Note that this very normal sentence is exactly 21
words and takes nine seconds to pronounce.) At an average of six
seconds per sentence, it would take 19 trillion years to say (or hear)
them all. In the case of nonstop listening, the percentage of these
a child could have heard in six years' time is 0.000000000031,
clearly still a gross overestimation as compared to what the child
can actually be expected to hear. So on the basis of at most such
an extremely small percentage of the potential input, the child
gets to know how language works.
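The estimates above (10^20 sentences, an assumed average of six seconds per sentence, six years of nonstop listening) can be checked with a few lines of arithmetic:

```python
# Back-of-the-envelope check of the estimates in the text.
SENTENCES = 10**20            # grammatical sentences of <= 20 words (Levelt 1967)
SECONDS_PER_SENTENCE = 6      # assumed average speaking time
SECONDS_PER_YEAR = 365.25 * 24 * 3600

# Time needed to hear every sentence once.
years_to_hear_all = SENTENCES * SECONDS_PER_SENTENCE / SECONDS_PER_YEAR
print(f"{years_to_hear_all:.1e} years")    # about 1.9e13, i.e., 19 trillion years

# Sentences a child could hear in six years of nonstop listening,
# as a percentage of the whole set.
heard = 6 * SECONDS_PER_YEAR / SECONDS_PER_SENTENCE
print(f"{100 * heard / SENTENCES:.2e} %")  # about 3.2e-11 percent
```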
There are many further practical complications we ignored,
such as lack of homogeneity and the presence of errors in the
data. If we were to take these into account, the task would only
become more formidable. This sets the stage for the logical
problem of language acquisition (for instance, Chomsky 1965,
1986), which can be formulated as the projection problem:
Consider a given finite set taken from some (infinite) superset.
Determine (the characteristic function of) this superset on the basis
of this subset.

As anyone can see, this task is in its generality impossible. For
any finite given set, the projection problem has infinitely many
solutions.
For a concrete illustration, consider the following task involving
the completion of a series of numbers:

(1) 1, 2, 3, 4, 5, 6, 7, ...

One might say that finding the next number is easy. It should
obviously be 8. But of course, this is not at all guaranteed. It is easy
to think of a perfectly well-behaved function that enumerates the
first seven natural numbers, followed by their doubles, triples,
quadruples, and so on. This illustrates a very simple point: There
is no general procedure to establish the correct completion of
some initial part of a sequence, whether a sequence of numbers
as in (1) or a data sequence (D1, D2, ..., Di, ..., Dn) as in Figure 1.
This fact reflects a poverty of the stimulus in a fundamental
sense, as a trivial logical truth.
The completion task may become possible, however, if it is
redefined as the task to find a solution within a restricted space of
possible solutions. In that case, certain instances of the projection
problem even become trivial. For instance, (1) can be trivially
completed if it is given that there is a constant difference between
each member of the series and its successor.
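Both points, the ambiguity of unrestricted completion and the triviality of completion under a constant-difference restriction, can be sketched as follows (the particular rival enumeration function is an invented illustration):

```python
# Two "well-behaved" rules that both generate 1, 2, ..., 7 and then diverge.
def identity(n):
    return n

def rival(n):
    # (n-1)(n-2)...(n-7) vanishes for n = 1..7, so the extra term
    # only shows up from the eighth value on.
    extra = 1
    for k in range(1, 8):
        extra *= (n - k)
    return n + extra

print([identity(n) for n in range(1, 9)])  # [1, 2, 3, 4, 5, 6, 7, 8]
print([rival(n) for n in range(1, 9)])     # [1, 2, 3, 4, 5, 6, 7, 5048]

# Restricting the hypothesis space to series with a constant difference
# makes the completion trivial: the common difference fixes the next term.
data = [1, 2, 3, 4, 5, 6, 7]
print(data[-1] + (data[1] - data[0]))      # 8
```

Given only the first seven terms, nothing in the data decides between `identity` and `rival`; given the constant-difference restriction, only `identity` remains.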
As E. Gold (1967) showed, even highly restricted hypothesis
spaces may not ensure a solution of the projection problem as
defined. The task may become easier if the input does contain
systematic evidence as to what is not in the target language.
Since, as is generally acknowledged, the input to the child does
not contain systematic negative evidence, it becomes of prime
importance to identify the types of hypothesis spaces that do
allow learning of natural languages by presentation only. The
absence of negative evidence, together with the fact that a substantial number of actual utterances a child may hear will be less
than entirely well formed, are often referred to as poverty of the
stimulus as well. But this is, in fact, not the same notion as the
fundamental, logical one employed in the discussion of Figure 1
and (1). For a rational debate, it is crucial to keep the first, logical sense apart from the second, narrower, empirical sense.
Given the poverty of the stimulus in the logical sense, language acquisition cannot be accounted for without the assumption of innate, genetically determined restrictions. The poverty
of the stimulus in the empirical sense may help provide further
evidence on what these are. In the generative literature, it is
often claimed that these restrictions have the form of an inventory of grammatical principles. It is presumably this reference
to grammatical principles that led to the poverty of the stimulus
debate (that is, for or against the existence of innate principles
of grammar) as it is usually conducted, with its emphasis on the
poverty of the stimulus in the narrower sense. But for a fruitful
discussion, it is crucial to distinguish between the minimal properties that S0 must have in order to explain language acquisition
and the further question of whether S0 has properties that are
specific to language.

Learnability and Complexity


Each restriction on the hypothesis space defines a class of grammars and languages. As pointed out by Gold (1967) and Kenneth Wexler
and Peter Culicover (1980), learnability does not depend on the
complexity of the individual language/grammar but only on the
structure of the class in which the selection must be carried out.
Many contributions to the debate center on specific examples, such as the question of whether the child uses and understands utterances that are unexpected given the input up to a
certain stage, whether or not the input is restricted (as in the case
of motherese, a restricted register caretakers use in addressing
their children), or whether the input is richer and provides more
clues than one might have initially expected. As Geoffrey Pullum and
Barbara Scholz (2002) point out, one must distinguish between
the general issue of a specific genetic endowment and debates
as to whether particular clues that have been argued to be nonexistent can or cannot be found in natural language corpora. But
important as issues of the latter type are, they are independent of
the general problem.
Deacon (1997) attributes to Chomsky the position that natural
language is too complex to be learned without rich innate mechanisms and then proceeds to argue that it is in fact not as complex as is being claimed. Part of the argumentation in Tomasello
(2003) is based on the same premises. As demonstrated in the
previous section, none of these issues bears on the poverty of the
stimulus in its logical sense.
The same applies to arguments stating that the input is much
richer than nativists presuppose. As Wexler and Culicover
(1980) show, proposals involving an enriched input (a more
structured presentation in which pragmatic clues are provided
by the context of the utterance, as in the case of motherese) magnify the logical problem instead of decreasing it. Such proposals

require a theory of how to acquire the ability to use the context of
utterance, or a formal proof that this structuring of the presentation sufficiently aids the child in setting up correct hypotheses
and rejecting incorrect ones but, crucially for the non-nativists,
without attributing to the child innate knowledge as to what evidence is to be absorbed and what evidence ignored. These are all
equally susceptible to the poverty of the stimulus argument in
the logical sense.
Clearly, it is important to have a conception of what it means
for something to be a property of UG. For instance, Chomsky
(1980) suggests that the specified subject condition (SSC), defining the domain in which anaphors must be bound (condition A) and pronominals must be free (condition B), is part of
our genetic endowment. The question is, then, what it means
for UG. Must an SSC be hardwired as such? Or is it sufficient
if the restrictions on binding descriptively captured by the SSC
follow from basic properties of our computational system, modulo properties of the mental space in which these computations
take place? For instance, Chomsky (1995) and subsequent work
argue that grammar is based on a very simple set of combinatory principles (essentially merge and Agree) and conditions
that follow from general properties of computation. If so, that
is what UG essentially amounts to. Reuland (2005) shows
that the core of condition B (the need to license reflexivity of
a predicate) follows from the fact that no computational system can distinguish between indistinguishables, as in the case
of the arguments of a reflexive predicate. Elements like self or
other morphological markers must be added for the system
to handle these arguments. To derive condition A, no more is
needed than a general principle of economy of encoding, the
general combinatorics of the language system, and the lexical
semantics of self as an identity predicate. Thus, prima facie
substantive properties of language and UG reduce to the interaction between general properties of mental computations and
lexical representations. If so, there is indeed no sense in which
conditions A and B are acquired. They reflect basic properties
of the system embedded in our wetware, but it takes extensive
linguistic research to show that this is so.

Language and Pattern Recognition


Non-nativists crucially invoke general learning strategies that
originate in our general cognition. However, a statement that
language is an emergent property of our general cognitive system requires a substantive theory of its workings, specifying how
its operations account for language with the same amount of
precision as the rules of formal linguistics (none of the properties in the explanans should invoke the explanandum). So far, no
precise proposals have been made available.
The essence of Tomasello's claim is that there is no interesting problem in language acquisition, since whatever is needed is
provided by our abilities for pattern recognition. Tomasello (and
others) assume that there exists an ability for pattern recognition
that provides the tool for finding the patterns in language. Of
course, we humans have the ability to find patterns. However, it
is a fallacy to think that what we do is finding the patterns that
are out there. The main message about concept learning to be
gleaned from Quine's insights is that our mind must impose patterns. Elementary considerations from particle physics show us



the same: Our common senses are blatantly incapable of seeing
reality as it is. As our extended senses (in the form of experimentation and model building) teach us, what is actually out
there bears minimal resemblance to what we can observe with
our common senses. If even individual events that we observe,
imprint in our memory, and store can have only a remote resemblance to what is there, all the more so for the patterns that we
find.
Any pattern involves extrapolation beyond what can be
observed; even the simplest of observations requires an active
mind shaping our internal representation of what is out there.
We can do no more than impose a pattern (one of the zillions of
possible patterns any piece of reality embodies), hopefully in
a way compatible with our survival. The good thing is that evolution resulted in our patterns being useful enough for everyday purposes since we managed to survive (for the moment).
The bad thing is that we can do so not because we are so smart
but because we are so limited. We can learn only if we ignore
the many logically possible alternatives. Evolution keyed us
to the universe. The key to understanding how we learn is to
understand our limitations, the logically possible patterns we
ignore.
In this respect, learning a language is like finding the patterns in the surrounding universe. In another respect, there is
a difference, although it has been misconstrued. Deacon (1997)
proposes that our capacity to learn language is not surprising
since language is a human product and, therefore, is made to
be learned. However, unless we are careful, this leads us back
to the whole range of issues discussed, framed slightly differently: How are the properties allowing language to be learned
reflected in its structure, and what does the fact that we acquire
language tell us about our cognitive abilities?
Nevertheless, it also contains a relevant insight. Unlike the
physical world, language is a product of the human mind. So we
know that the child's mind is keyed to getting to know language
in a way he or she will never know the physical world: Complete
knowledge is attainable. As in the case of the physical world, the
input for learning language is external. However, unlike in the
case of the physical world, we know the nature of the input to
an extent that is unique. So language acquisition reflects laboratory conditions for learning, facilitating an understanding of
learning per se.
It is surprising that so many researchers of human learning
have such a hard time accepting the implications of the projection problem, though the moral is so simple, like the first law of
thermodynamics prohibiting the perpetuum mobile. As it reads
in the well-known words of C. P. Snow:
You cannot win (that is, you cannot get something for nothing)

Applied to learning:
Learning a recursive step by presentation only, without restrictions
on the hypothesis space, is as impossible as creating the perpetuum
mobile.

The last sentence in Tomasello reads: "How children become
competent users of a natural language is not a logical problem
but an empirical problem" (2003, 328). Paradoxically, I agree
and, at the same time, would like to say that it illustrates the
depth of misunderstanding involved: It is an empirical problem,
but analyzing the logical problem is essential for solving it.
All this does not demonstrate that the restrictions necessary
for language acquisition are specific to language. It does show
that such restrictions are there and have to be studied if we are to
understand language acquisition. No insight can be gained unless
precise and substantive hypotheses are formulated and tested in
a way that reflects what we already know about language.
Eric Reuland
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Chomsky, Noam. 1957. Syntactic Structures. The Hague: Mouton.
. 1965. Aspects of the Theory of Syntax. Cambridge, MA: MIT Press.
. 1980. On cognitive structures and their development. In
Language and Learning, The Debate between Jean Piaget and Noam
Chomsky, ed. M. Piattelli-Palmarini, 35–52. Cambridge: Harvard
University Press.
. 1986. Knowledge of Language: Its Nature, Origin and Use. New
York: Praeger.
. 1995. The Minimalist Program. Cambridge, MA: MIT Press.
Deacon, Terrence. 1997. The Symbolic Species. New York: Norton.
Fodor, Jerry. 1983. The Modularity of Mind. Cambridge, MA: MIT Press.
Gold, E. 1967. Language identification in the limit. Information and
Control 10: 447–74.
Hirsch, C., and K. Wexler. 2007. The late development of raising: What
children seem to think about seem. In New Horizons in the Analysis
of Control and Raising, ed. W. Davies and S. Dubinsky, 37–70.
Heidelberg: Springer.
Kandel, Eric, James Schwartz, and Thomas Jessell. 2000. Principles of
Neural Science. 4th ed. New York: McGraw-Hill.
Lenneberg, Eric H. 1967. Biological Foundations of Language. New
York: Wiley.
Levelt, Willem. 1967. Over het Waarnemen van Zinnen. Groningen,
the Netherlands: Wolters.
Pullum, Geoffrey K., and Barbara C. Scholz. 2002. Empirical assessment
of stimulus poverty arguments. Linguistic Review 19.1/2: 9–50.
Quine, Willard Van Orman. 1960. Word and Object. Cambridge, MA: MIT
Press.
Reuland, Eric. 2005. Binding conditions: How are they derived? In
Proceedings of the HPSG05 Conference, Department of Informatics,
University of Lisbon, ed. Stefan Müller. Stanford, CA: CSLI Publications.
Available online at: http://csli-publications.stanford.edu/
Tomasello, Michael. 2003. Constructing a Language. Cambridge: Harvard
University Press.
Wexler, Kenneth, and Peter Culicover. 1980. Formal Principles of
Language Acquisition. Cambridge, MA: MIT Press.

INTEGRATIONAL LINGUISTICS
This is the application of integrational semiology to the study
of language. Integrational linguistics is based on the assumption that human communication, whether verbal or nonverbal,
involves the creation of signs in particular contexts whereby two
or more individuals engage in joint integration of their respective activities. Thus, for example, speech communication would
be impossible without integration of the biomechanically separate activities of vocalization and hearing. Written communication requires the production of marks on a surface that can be
integrated with programs of optical scanning. The appropriate
integration of these and many other activities is one of the major

functions of the cerebral cortex, and failure to achieve the integrational proficiency required for social fluency in communication is commonly perceived as a defect or handicap of some
kind (e.g. deafness, dyslexia, etc.) when due to physiological
factors. Integrationists differentiate forms of communication
according to the range of activities typically integrated by the
participants, and the kinds of integrational proficiency typically
required for participation. A blind person, for obvious reasons,
lacks the integrational proficiency presupposed in various forms
of visual communication (see blindness and language).
From this perspective, the term language does not correspond
to any one mode of communication but straddles or conflates
several. In this respect, integrational linguistics differs radically
from mainstream schools of thought in linguistics and neighbouring disciplines, which tend to assume that language is a single human faculty, common to all humanity, and that languages
(English, French, Latin, etc.) are different social codes enabling
individual users to exercise this faculty. For integrationists, on
the other hand, individuals are not language users but language
makers. They make language by their creative integration of verbal signs into a myriad of diverse activities, in both expected and
unexpected ways, with due regard for the circumstances, just as
they make human relationships by the ways in which they interact with others in particular cases.

Axioms of Integrational Semiology


The axioms of integrational semiology are as follows:
1. What constitutes a sign is not given independently of the
situation in which it occurs or of its material manifestation in
that situation.
2. The value of a sign (i.e., its signification) is a function of the
integrational proficiency that its identification and interpretation presuppose.
As applied to language studies, this means that verbal communication of whatever kind cannot be decontextualized.
Episodes of communication are episodes in the lives of particular individuals at particular times and places. These episodes
have to be studied as such. We learn nothing from an analysis
telling us, for instance, that someone uttered the sentence John
loves Mary, that John is the subject of the sentence, Mary is the
direct object of the verb love, love is a transitive verb, and so on;
nothing, that is, except information about the metalinguistic
assumptions of the analyst. (These assumptions may be worth
studying in their own right, but that is not the same as studying
the facts pertaining to the utterance in question. On the contrary,
the analysis is one that already embarks on a decontextualization
of the episode allegedly described.)
It is not simply a matter of knowing who said what to whom,
where, and in what circumstances. Nor are the circumstances merely
what happened immediately before and after. Orthodox modern
linguistics, like traditional grammar, routinely assumes the legitimacy of abstracting from all these features of context. Its statements are supposedly generalizations across indefinitely many
unidentified episodes of language use. Integrational linguistics
rejects on principle the legitimacy of such generalizations: Again,
they tell us nothing about linguistic facts, only about the intellectual preferences or prejudices of the analyst.


The Principle of Noncompartmentalization


It also follows from the semiological axioms that there is no strict
or objective dividing line between linguistic knowledge and nonlinguistic knowledge, or, as some theorists put it, between knowledge of a language and knowledge of the world. Recognition of
this indivisibility is referred to by integrationists as the principle
of noncompartmentalization. In other words, human beings do
not live in a communicational environment where what pertains
to language belongs to one compartment and the rest belongs to
some other compartment (or compartments).
Noncompartmentalization is also heresy in orthodox linguistics, since it implies that linguistics cannot be a science.
(A physicist who confessed inability to differentiate between the
facts of the physical world and the nonphysical world would be
confessing to a similar heresy.) In an academic milieu where
every inquiry aspires to be scientific, this doctrine is not
popular.
The integrationist view can be illustrated by considering what
happens at a cocktail party. Physically, a certain level of audible
vibration is generated (often said to be "deafening," though very
small as compared with the energy required to light one electric
lamp). Physiologically, there is much expenditure of effort in
terms of the muscular action of vocal apparatus (but again very
small by comparison with the effort required to walk across the
road). Mentally, there is doubtless engagement in interactions
with others, but it cannot be quantified. So where in all of this is
the language component? It seems to be in there somewhere,
but exactly where defies exact location. To ask where is itself a
nonsense question. And to grasp why it is a nonsense question is
already halfway to subscribing to the integrationist principle of
noncompartmentalization.
The orthodox linguistic answer is that the language component resides somewhere in the heads of the talkers and listeners. But so, presumably, does their knowledge of football, food,
local politics, and everything else; that is, all the things being
talked about at the cocktail party. So exactly the same compartmentalization perplexity arises at one remove. Human beings
cannot in their everyday lives distinguish between knowing
something about X and being able to talk about it.

The Principle of Cotemporality


The integrationist principle of cotemporality complements the
principle of noncompartmentalization. Everyday experience
recognizes that an event occurring at time t may affect how we
interpret an event occurring earlier or later than t. Temporal
sequence is an intrinsic aspect of contextualization.
Compare the situation in which (1) landlord says "The water
is turned off today" and tenant says "I must have a shower" with
the situation in which (2) tenant says "I must have a shower" and
landlord's response is "The water is turned off today." Ostensibly,
the same information has been exchanged and the same words
used. But what emerges from the communicational episode is
quite different in the two cases.
For integrationists, the question of temporal sequence
involves both verbal and nonverbal behavior. In brief, there is no
way that what is said can be set apart from the train of events in
which it occurs, whether these are verbal or nonverbal. We all
know this. That is why in legal disputes courts treat a case in
which A insulted B, who then struck A, differently from a case in
which A struck B, who then insulted A. As individuals, we are
time-bound agents in all our activities. Our linguistic acts do
not have some special time track of their own. There is no such
thing as a contextless linguistic sign.
This does not mean that the context simply is a given
sequence of events. Integrationists take a very different view of
context from that usually found in orthodox linguistics. Context
is not to be equated with situation (which may be irrelevant)
and even less with preceding or following speech-acts. It is
not some kind of local backdrop against which communication
takes place. Context, for the integrationist, is always the product
of contextualization, and each of us contextualizes in our own
way. The individual participants in any communication situation
will each contextualize what happens differently, as a function
of the integrational proficiency each exercises in that situation.
This does not mean that we can never reach communicational
agreement, but it explains why we often do not. It is not enough
to say that every act of communication is unique. Each such act
is in principle subject to multiple contextualizations and recontextualizations. That is what makes it essential in linguistic analysis for the analyst to specify what forms of contextualization are
presupposed (a requirement that the great majority of orthodox
linguists ignore). For integrationists, a sign is not a sign until it
has been contextualized: The act of contextualization and the
establishment of the sign are one and the same.

Meaning
It follows, then, that integrationists take a quite different view of
meaning from that which informs most work in orthodox linguistics, where the meaning of a linguistic form is usually construed
as some kind of concept (as in the definitions of conventional
dictionaries) or, even more vaguely, mental representation.
This is traditionally imagined to yield a more or less permanent
value attached to the form and known to all competent speakers and writers of the language in question (even if the language
is no longer living). Thus, the meaning of classical Latin aqua is
treated as timeless: It still means for Latin specialists what it
meant in the days of Julius Caesar and will mean in a thousand
years from now.
For integrationists, this assumption conflicts with the principle of cotemporality. There are no atemporal invariants in language. Meaning is made by participants as part of the process of
communication. It is thus subject to the principle of cotemporality in just the same way as all other aspects of communication.
There are no fixed meanings. There is nothing in language to provide us with a miraculous guarantee of the stability of meaning(s)
over time or even from one moment to the next. To demand
such a guarantee for any mode of communication is as futile as
demanding that a currency remain stable in value from one day
to the next. (The indeterminacy holds regardless of whether people shopping in High Street are aware of it.)
We have no option but to interpret particular episodes of
communication by integrating them into the unique temporal
sequence of events that constitutes our previous experience.
Thus, where there are two or more participants, what is communicated must be open to two or more interpretations. These
cannot be guaranteed to coincide. People are no more obliged

to agree about the meanings of words than they are obliged to


agree about the value of goods. In both cases, it usually suits their
purposes to make some kind of compromise with those they are
dealing with. But the nature of this semantic compromise is
essentially ad hoc.

Parameters of Communication
For integrationists, there are certain capacities required of an
individual in order to participate in communication. These
capacities are of three kinds: biomechanical, macrosocial, and
circumstantial. The first relates to the human organism and the
ability to integrate activities requiring a wide variety of physiological and mental processing (e.g., the very different biomechanical requirements involved in speech production and
hearing). The second relates to the human ability to integrate
particular activities into sets of assumptions provided by social
conventions of various kinds. The third relates to the ability to
integrate one's activities into whatever else is going on at the
time.
A simple illustration of all three is provided by what happens
when a motorist encounters a pedestrian about to cross the road.
At the macrosocial level, there are assumptions about the road,
the conventions of the highway code, and so on. At the biomechanical level, there are questions about what the motorist sees
within a certain range of visibility and the alertness necessary for
initiating the appropriate physiological actions for driving the
vehicle. At the circumstantial level, even when all of these aspects
have been taken into account, the individual motorist must constantly be prepared to modify any action taken, depending on the
moment-to-moment behavior of the pedestrian and other road
users.
This does not mean that the levels of integration are independent. Circumstantial factors intervene across the board. An
accident may be caused because the pedestrian is shortsighted
or the road badly lit. The driver's exercise of biomechanical skills
may be affected by knowing or not knowing that the brakes of
this particular vehicle are not very reliable.
Much recent work in integrational linguistics has focused on
analyzing how, at the macrosocial level, societies succeed in constructing supercategories, which integrate what would otherwise
be quite separate activities in such a way as to set up a common
framework for the intellectual and practical pursuits of society as
a whole. These supercategories include science, art, history, and
religion, each with a dedicated discourse of its own.
Roy Harris
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Harris, R. 1998. Introduction to Integrational Linguistics. Oxford:
Pergamon. A concise survey of the whole field.
Harris, R., and C. Hutton. 2007. Definition in Theory and Practice.
London: Continuum. An integrationist approach to problems of definition, with particular reference to lexicography and the law.
Love, N., ed. 2006. Language and History: Integrationist Perspectives.
London: Routledge. Collection of papers bearing on the construction
of the macrosocial supercategory "history."
Toolan, M. 1996. Total Speech: An Integrational Linguistic Approach to
Language. Durham, NC: Duke University Press. Includes interesting
discussions of literal meaning, metaphor, and related issues.

Wolf, G., and R. Harris, eds. 1998. Integrational Linguistics: A First Reader.
Oxford: Pergamon. Collection of papers covering a wide range of topics
from an integrationist perspective.

INTENSION AND EXTENSION


Semantic theories commonly distinguish two aspects of linguistic meaning: intension and extension. Roughly, the intension of a linguistic expression is what it means, and its extension is what it refers to. For example, on some views, "Neil Armstrong" and "the first human being to walk on the Moon" have the same extension but different intensions. Often it is held that intension determines extension (expressions with the same intension have the same extension) but is not determined by extension (expressions with the same extension may have different intensions).
This entry describes various roles that intensions and extensions play in contemporary semantic theories. While much
recent discussion uses formal methods, the presentation here is
nontechnical.
Some possible terminological confusions should be noted. A long tradition of distinguishing different aspects of meaning (including Porphyry's third-century commentaries on Aristotle, the Port-Royal logic of 1662, Mill [1843] 1872, and Frege [1892] 1997) has left us with a hodgepodge of terms and distinctions. One should be wary of assuming that, for example, the Port-Royal logic's "comprehension" and "extension," or Frege's Sinn and Bedeutung (often translated "sense" and "reference"; see sense and reference), or Mill's "connotation" and "denotation" mark the same distinction as "intension" and "extension" in recent theories. Moreover, "extension" is sometimes, but not always, taken to be synonymous with "denotation," "designation," or "referent," while "intension," "intensional," and "intensionality" are sometimes, but should not be, confused with "intention," "intentional," and "intentionality." Arguably, there is a relation between a speaker's intentions and the meanings of the words he or she uses; however, the connection should not be drawn via terminological confusion. Intentionality, the distinctive property of thoughts and other mental phenomena, is yet something else entirely.

Why Accept a Distinction Between Intension and Extension?

Knowing what sentence (1) means is enough to be able to know
that it is true.

(1) Mark Twain is Mark Twain.

However, (2) differs.

(2) Mark Twain is Samuel Clemens.

Although (2) is true, merely knowing what (2) means is not enough to be able to know that it is true. So (1) and (2) differ in meaning. But (2) is just like (1) except for an occurrence of "Samuel Clemens" instead of "Mark Twain." So, "Mark Twain" and "Samuel Clemens" differ in meaning, since the meanings of (1) and (2) are determined only by the meanings of their parts and the way they are put together. (See compositionality.)
Similarly,

(3) Lola believes that Mark Twain is Mark Twain.

differs in meaning from

(4) Lola believes that Mark Twain is Samuel Clemens.


For if (3) and (4) have the same meaning, then they are either both true or both false. But (3) may be true while (4) is false: for example, if Lola does not realize that "Mark Twain" and "Samuel Clemens" refer to the same man. Since (3) is just like (4) except for an occurrence of "Samuel Clemens" instead of "Mark Twain," "Mark Twain" and "Samuel Clemens" differ in meaning.
Such are two arguments that "Mark Twain" and "Samuel Clemens" differ in meaning. But if the extension of a proper noun is its referent, then "Mark Twain" and "Samuel Clemens" have the same extension. Thus, some (but not all) conclude, there is a distinction between two aspects of meaning: intension and extension.

Extensional and Nonextensional


Arguably, then, a true sentence (3) may be turned into a false sentence (4) merely by substituting "Samuel Clemens" for an occurrence of "Mark Twain." Suppose that the extension of a sentence determines its truth value and that "Mark Twain" and "Samuel Clemens" have the same extension. If so, then to change (3) into (4) is to replace part of sentence (3) with a coextensional expression, while failing to preserve the extension of the entire sentence.
Contexts like "Lola believes that" are typically thought to be nonextensional. An extensional context is a context wherein substitution by a coextensional expression always preserves the extension of the larger expression. A nonextensional context is thus a context wherein substitution by a coextensional expression sometimes fails to preserve the extension of the larger expression.
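The substitution test for extensionality can be made concrete with a toy model. The following sketch (my illustration, with invented data, not drawn from the entry's sources) represents Lola's beliefs as a set of sentences, so that substituting one coextensional name for another can flip the truth value of the belief report:

```python
# Toy nonextensional context: "Lola believes that ..." evaluated
# against a store of believed sentences (invented data).
extension = {"Mark Twain": "twain", "Samuel Clemens": "twain"}  # coextensional
lola_believes = {"Mark Twain is Mark Twain"}  # but not the Clemens variant

def believes(sentence: str) -> bool:
    """Truth value of 'Lola believes that <sentence>' in the toy model."""
    return sentence in lola_believes

s1 = "Mark Twain is Mark Twain"
s2 = "Mark Twain is Samuel Clemens"  # s1 with a coextensional substitution
assert extension["Mark Twain"] == extension["Samuel Clemens"]
print(believes(s1))  # True
print(believes(s2))  # False: substitution failed to preserve truth value
```

The point of the sketch is only that membership in the belief store is sensitive to the form of the sentence, not to the extensions of its parts, which is what makes the context nonextensional.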
Nonextensional contexts appear to be widespread in natural language. In English, for instance, propositional attitudes (like "Lola believes that," "John desires that"), as well as many other constructions (involving, for example, "seeks," "admires," "avoids," "resembles," "necessary," "possibly," "must," "may," "obviously," and "because"), have been held to be nonextensional.
Nonextensional contexts are often called intensional. But sometimes, more carefully, "intensional" is reserved for those contexts wherein substitution of a cointensional expression always preserves the intension of the larger expression. Contexts wherein cointensional substitution may fail are then called nonintensional or hyperintensional. "Lola believes that" is one context thought by some to be nonintensional in this sense. For example, according to some, "eye doctor" and "ophthalmologist" have the same intension, but "Lola believes that Eve is an eye doctor" and "Lola believes that Eve is an ophthalmologist" have different intensions: If Lola doesn't know the word "ophthalmologist," the former may be true and the latter false.

Semantics Without Intensions


W. V. O. Quine argued that intensions and other elements of traditional theories of meaning have no place in a scientific description of the world. He claimed that intensions are on a par with the Homeric gods: Intensions play no useful explanatory role in a scientific description of the world. According to one of his most influential arguments, there is no noncircular way to make sense of traditional notions like meaning, synonymy, analyticity, and the like. Indeed, he argued, there is no distinction between sentences true simply in virtue of meaning (analytic truths) and other true sentences (synthetic truths) (Quine 1953, 1960).
Greatly influenced by Quine, Donald Davidson (1967) proposed a semantic theory for natural language with no place for intensions. In a Davidsonian theory, each linguistic expression is paired with an extension (other terms, like referent or semantic value, are often used instead of extension), and rules of composition state how the extension of a larger expression is determined by the extensions of its parts. For example, in one version of the theory, the extension of "John" is John, the extension of "runs" is the set of running things, and the extension of "John runs" is a truth value (see truth conditional semantics).
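The compositional idea can be sketched in code. This toy model (an illustration under simplifying assumptions, not Davidson's own formalism) assigns individuals to names and sets to predicates, and computes a truth value as the extension of a simple subject-predicate sentence:

```python
# Toy extensional semantics: a sketch, not Davidson's formal theory.
# Names denote individuals; predicates denote sets of individuals.
name_extension = {
    "John": "john",
    "Mary": "mary",
}
predicate_extension = {
    "runs": {"john"},            # the set of running things
    "sleeps": {"john", "mary"},  # the set of sleeping things
}

def sentence_extension(name: str, predicate: str) -> bool:
    """Extension of 'Name Predicate' = True iff the name's referent
    is a member of the predicate's extension."""
    return name_extension[name] in predicate_extension[predicate]

print(sentence_extension("John", "runs"))  # True
print(sentence_extension("Mary", "runs"))  # False
```

The rule of composition here is just set membership; richer fragments add further rules, but each still determines the extension of the whole from the extensions of the parts.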
Challenging for this approach are the problems that drive some to posit intensions in the first place. If "Mark Twain" and "Samuel Clemens" have the same extension, then how to explain the differences between (1) and (2), or between (3) and (4)? But if "Mark Twain" and "Samuel Clemens" have different extensions, then how to explain why "Mark Twain is Samuel Clemens" is true? (For possible answers see, for example, Larson and Segal 1995.)

Semantics with Intensions


More commonly, intensions are taken seriously. In Meaning and Necessity, Rudolf Carnap ([1947] 1956) presented his method of intension and extension as an improvement over Gottlob Frege's way of distinguishing sense and reference. In Carnap's system, each meaningful expression is assigned both an extension and an intension. For example, the extension of "human" is the class of human beings, the intension of "human" is the property of being human, the extension of "Walter Scott" is Walter Scott, the intension of "Walter Scott" is an individual concept of Walter Scott, the extension of a sentence is its truth value, and the intension of a sentence is the proposition it expresses. Notably, unlike Frege's senses, Carnap's intensions do not vary with linguistic context, thereby, according to Carnap, avoiding a serious objection both to Frege's approach and to the related proposal by the influential logician Alonzo Church (1951).
In applying a precise distinction between intension and extension to natural language, Richard Montague (1974; see montague grammar) was particularly influential. Inspired by Carnap and Church, Montague took an intension to be a mathematical function. This permitted a graceful way for meanings to combine, as the application of function to argument. In one version of this sort of theory (possible world semantics), the intension of a sentence is a function from possible worlds to truth values, and the extension of that sentence relative to a possible world is the result of applying its intension function to that possible world. For example, the intension of the sentence "Hong Kong is in China" might be a function yielding the value true for all possible worlds where Hong Kong is in China and false otherwise. Variations and improvements to Montague's general approach have dominated the field of formal semantics (Gamut 1991; Heim and Kratzer 1998; Chierchia and McConnell-Ginet 2000).
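The core possible-world machinery can be sketched as follows. This is a toy illustration with two invented worlds, not Montague's formal system: an intension is a function from worlds to truth values, and the extension at a world is the result of applying that function:

```python
# Possible-world semantics sketch: an intension is a function from
# worlds to truth values; extension-at-a-world applies that function.
# The worlds and their facts are invented for illustration.
worlds = {
    "w1": {"hong_kong_in_china": True},
    "w2": {"hong_kong_in_china": False},
}

def intension_hk(world: str) -> bool:
    """Intension of 'Hong Kong is in China': maps a world to a truth value."""
    return worlds[world]["hong_kong_in_china"]

def extension_at(intension, world: str) -> bool:
    """Extension of a sentence relative to a world = intension applied
    to that world."""
    return intension(world)

print(extension_at(intension_hk, "w1"))  # True
print(extension_at(intension_hk, "w2"))  # False
```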

Complications and Problems


Complications ensue when accounting for context-sensitive expressions like the indexicals "I," "here," or "now." Montague proposed complex intensions, mapping not possible worlds but indices: a combination of a possible world with persons, places, times, and so on. David Kaplan (1989) instead divided intensions into two pieces: character (the linguistic meaning of an expression type) and content (the meaning of an occurrence of an expression). Roughly, character plus context determines content, and content plus context yields extension.
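Kaplan's two-stage picture can be sketched in code (a toy illustration with invented contexts and worlds, not Kaplan's own formal system): the character of "I" maps a context to a content, and that content maps a world to an extension, picking out the context's speaker rigidly:

```python
# Kaplan-style two-stage semantics, sketched with invented data.
# character(context) -> content; content(world) -> extension.

def character_I(context: dict):
    """Character of the indexical 'I': in any context, the resulting
    content picks out that context's speaker at every world (rigidly)."""
    speaker = context["speaker"]
    def content(world: str) -> str:
        return speaker  # the same individual, whatever the world
    return content

ctx_a = {"speaker": "Alice", "world": "w1"}
ctx_b = {"speaker": "Bob", "world": "w1"}

content_a = character_I(ctx_a)  # content of 'I' as uttered by Alice
content_b = character_I(ctx_b)  # content of 'I' as uttered by Bob
print(content_a("w1"))  # Alice
print(content_b("w2"))  # Bob
```

The sketch shows why two utterances of "I" share a character yet differ in content: the closure returned for each context fixes a different speaker.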
Kaplan's arguments about indexicals and demonstratives, as well as Saul Kripke's and Hilary Putnam's on proper names and natural kind terms, have led many to accept that these terms are devices of direct reference; in effect, occurrences of these terms have extensions but no intensions. Others have responded by adopting the framework of two-dimensional semantics, where expressions are assigned two different intensions (Garcia-Carpintero and Macià 2006).
While semantic theories inspired by Carnap and Montague do deal in intensions, they have much in common with the no-intension theories discussed earlier. Not only are both sorts of theories compositional and referential, but both sorts of theories can also arguably fail in various ways to capture intuitive notions of meaning. For example, if the intension of a sentence is a function from possible worlds to truth values, then if two sentences have the same truth value at every possible world, they have the same intension. That means that all true mathematical sentences have the same intension, for presumably a mathematical truth is true at every possible world. But "2 < 3" and "2 + 2 = 4" do not have the same meaning.
Within current frameworks, lively discussion about these and
other questions continues. (For a useful introduction, see von
Fintel and Heim 2007.) More radically, others propose to rework
the foundations of semantics to produce more fine-grained
intensions (e.g., Fox and Lappin 2005). And then there is the
extreme view of J. J. Katz (1990): We should give up the claim that
intension determines extension and adopt an internalist notion
of intension better suited to the traditional duty of explaining
analyticity, synonymy, meaningfulness, and so forth.
A variety of views about the distinction between intension and extension remain alive. Widely, but not universally, it is thought that
both intensions and extensions are needed in semantics. Debate
continues about what intensions and extensions are, about what
the relation between intension and extension is, and about the
sorts of linguistic expressions that have them.
Patrick Hawley
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Carnap, Rudolf. [1947] 1956. Meaning and Necessity: A Study in Semantics
and Modal Logic. Chicago: University of Chicago Press.
Chierchia, Gennaro, and Sally McConnell-Ginet. 2000. Meaning and
Grammar. 2d ed. Cambridge, MA: MIT Press.
Church, Alonzo. 1951. A formulation of the logic of sense and denotation. In Structure, Method and Meaning: Essays in Honor of H. M. Sheffer, ed. P. Henle, H. Kallen, and S. Langer, 3–24. New York: Liberal Arts Press.
Davidson, Donald. 1967. Truth and meaning. Synthese 17: 304–23.
Reprinted in Inquiries into Truth and Interpretation (Oxford: Oxford
University Press, 1984).
Fox, Chris, and Shalom Lappin. 2005. Foundations of Intensional
Semantics. Oxford: Blackwell.

Frege, Gottlob. [1892] 1997. On sense and reference. In The Frege Reader, ed. M. Beaney, 151–71. Oxford: Blackwell.
Gamut, L. T. F. 1991. Logic, Language and Meaning. Chicago: University
of Chicago Press.
Garcia-Carpintero, Manuel, and Josep Macià, eds. 2006. Two-Dimensional
Semantics. Oxford: Oxford University Press.
Heim, Irene, and Angelika Kratzer. 1998. Semantics in Generative
Grammar. Oxford: Blackwell.
Kaplan, David. 1989. Demonstratives. In Themes from Kaplan, ed. J. Almog, J. Perry, and H. Wettstein, 481–563. New York: Oxford University Press.
Katz, J. J. 1990. The Metaphysics of Meaning. Cambridge, MA: MIT Press.
Kripke, Saul. 1980. Naming and Necessity. Cambridge: Harvard University Press.
Larson, Richard, and Gabriel Segal. 1995. Knowledge of Meaning. Cambridge,
MA: MIT Press.
Mill, John Stuart. [1843] 1872. A System of Logic. 8th ed.
London: Longmans.
Montague, Richard. 1974. Formal Philosophy: Selected Papers of Richard
Montague, ed. Richmond Thomason. New Haven, CT: Yale University
Press.
Porphyry. 2003. Porphyry's Introduction. Trans. and commentary by
Jonathan Barnes. Oxford: Oxford University Press.
Putnam, Hilary. 1975. The meaning of "meaning." In Philosophical
Papers. Vol. 2. Mind, Language and Reality. Cambridge: Cambridge
University Press.
Quine, W. V. 1953. Two dogmas of empiricism. Philosophical Review 60: 20–43. Reprinted in From a Logical Point of View, 2d ed. (Cambridge: Harvard University Press, 1961).
. 1960. Word and Object. Cambridge, MA: MIT Press.
von Fintel, Kai, and Irene Heim. 2007. Intensional Semantics. Manuscript,
Massachusetts Institute of Technology.

INTENTIONALITY
Aboutness
The closest thing to a synonym for intentionality is aboutness; something exhibits intentionality if and only if it is about something. The relevant sense of about is best elucidated by example: The sentence "Saul Kripke is a philosopher" is about Saul Kripke; my belief that the weather in South Bend is dreary is about the city of South Bend, Indiana; the black lines and curving blue stripe on the map in my hand are about the streets of South Bend and the St. Joseph River; the position of the needle on the gas gauge in my car is about the amount of gasoline in its tank. While it is difficult to find an uncontroversial and illuminating paraphrase of the relevant sense of about, it's hard to deny that there is some reasonably clear sense of aboutness common to these examples.
This characterization of intentionality as aboutness is only true to a first approximation, because something can exhibit intentionality without being about anything if it purports to be about something. "Zeus" is not about, does not represent, anything; this name, unlike "Saul Kripke," does not have a worldly correlate. Nonetheless, "Zeus" counts as an example of intentionality by virtue of the fact that it (in a difficult to explain sense) aims to be about something, even if it does not succeed.

Intentionality, Content, and Reference

Glossing over a wealth of distinctions, we see that the vocabulary used in discussions of intentionality is divisible into two broad categories. On the one hand, we have reference, denotation, and extension; on the other hand, we have content, meaning, sense, connotation, and intension. The relationship between these categories of terms is best illustrated via the intentionality of linguistic expressions.
Just as names are about, in the relevant sense, the objects for which they stand, so, one might think, predicates are about the things of which they are true. "Green" is about the green things, "happy" about the happy things, and so on. The things that words are about, in this sense, are their references (denotations, extensions). But, plausibly, a theory of reference for a language would not be a full account of the content (meaning, sense) of expressions of the language. To adapt an example from W. V. O. Quine (1953), the sentences "Dolly is a renate" and "Dolly is a cordate" may be alike with respect to the reference of the expressions that compose them (because the set of cordates is identical to the set of renates) even though, intuitively, the two sentences say different things about Dolly. So it seems that two expressions can have the same reference while differing in content. But many have thought that, as Gottlob Frege (1892) suggested, the converse does not hold: Two expressions can't have the same content without also having the same reference. Intuitively, two sentences can't say the same thing about the world or express the same thought without being about the same things. This combination of views (that the content of an expression is standardly something over and above its reference, and that the content of an expression determines its reference) is very widely accepted. (Though not universally; it is rejected by defenders of a Chomskyan internalist view of meaning who take meanings to be internal to the language-processing systems of language users [Pietroski 2003] and by skeptics about content [Quine 1953, 1960; Kripke 1982].)
These views about the relationship between content and reference structure much contemporary work on intentionality, for if content determines reference, it is natural to think that content explains reference: Intentional phenomena come to be about things by virtue of their possessing a content. This way of thinking about intentionality has several virtues. One is that it seems to offer an explanation of the aforementioned example of "Zeus"; if aboutness is typically explained by possession of a content, then perhaps the sense in which "Zeus" aims to be about something is that it, like expressions that are genuinely about something, has a content. It's just that in the case of "Zeus," this content fails to determine a reference.
Virtually nothing more can be said about content, reference, and the relationship between the two without entering into matters about which there is not even rough agreement. Theorists differ about what sorts of things contents are, about whether there are any expressions for which content and reference coincide, and about whether there are any kinds of expressions that cannot possess a content without possessing a reference. Canonical works on these topics include Frege (1892), Russell (1905), Frege (1918), Carnap (1947), Kripke (1972), and Kaplan (1989).

Intentionality, Intensionality, and Intentions

It is worth mentioning at this point two persistent, though purely terminological, sources of confusion about intentionality: the distinctions between intentionality and intensionality, on the one hand, and intentionality and intentions, on the other.
Intensionality is a property of sentence contexts. Given any context in a sentence, we can then ask: Can we, by replacing one expression or phrase in that context with another that has the same reference, change the truth value of the sentence as a whole? If so, then the context is said to be intensional.
So far, the connection between intentionality and intensionality may seem to be merely orthographic. But it has been claimed that the latter is a criterion for the former: that descriptions of intentional phenomena will always include an intensional context (Chisholm 1957). For descriptions of propositional attitudes like beliefs, this seems plausible. For example,

John believes that the world's most famous sheep is famous.

may be true while

John believes that Dolly the sheep is famous.

is false, even if "the world's most famous sheep" and "Dolly the sheep" have the same reference. But the criterion seems to fare less well in other cases. For example,

The thick blue line on my map of South Bend represents the St. Joseph River.

appears to ascribe the right sort of aboutness to qualify as a sentence about intentionality, but the sentence does not contain any intensional contexts. And many sentences that do contain intensional contexts don't seem to be descriptions of intentional phenomena. For example,

Mammals have a greater chance of heart failure than flatworms because they are cordates.

and

Mammals have a greater chance of heart failure than flatworms because they are renates.

differ in truth value, even though "cordates" and "renates" have the same reference.
A second potential source of confusion is the similarity of "intention" and "intentionality." "Intention," like "belief" and "desire," is the name of a type of mental state. Like beliefs and desires, intentions exhibit intentionality, but they are no more essential to intentionality than other mental states.

Intentionality and Mentality


Though "intentionality" is derived from intentio, a technical term that had wide use in medieval philosophy, and intentio is itself a translation of technical terms from premedieval Arabic philosophy, modern usage of the term is usually traced to Franz Brentano's 1874 Psychology from an Empirical Standpoint. Brentano is standardly taken to have made two basic claims about intentionality, the first of which is that intentionality is internally related to mentality:
Every mental phenomenon is characterized by what we might call direction toward an object. Every mental phenomenon includes something as an object within itself, although they do not all do so in the same way. In presentation something is presented, in judgement something is affirmed or denied, in love loved, in hate hated, in desire desired, and so on.
This is characteristic exclusively of mental phenomena. No physical phenomenon exhibits anything like it. We can, therefore, define mental phenomena by saying that they are those phenomena which contain an object intentionally within themselves. (Brentano [1874] 1997, II.i.5)

We can think of Brentano's thesis as having two components:

Intentionality is necessary for mentality; all mental states exhibit intentionality.
Intentionality is sufficient for mentality; everything that exhibits intentionality is a mental state.
The claim of necessity is uncontroversial when we are thinking of propositional attitudes like believing, supposing, and judging. It is more controversial, but still plausible, when we think of perceptual states; the sense in which my visual experience is currently of or about a computer screen is recognizably the same as the sense in which a name is a name of its bearer.
Bodily sensations like itches and pains, however, may seem to be counterexamples to Brentano's claim that intentionality is necessary for mentality. My sensation of throbbing pain is clearly a mental state, but can it be said to represent, or be about, anything at all? Many have thought not and have seen the attempt to find intentionality in sensations as an ad hoc attempt to find something common to mental phenomena (Rorty 1979). But this negative verdict can be challenged, and it has been in recent philosophy of mind. For one thing, pains are felt as located, and given this, it is not implausible to think of them as about the part of the body where they are felt to be (Tye 1995; Byrne 2001).
On the face of it, the other half of Brentano's thesis, that intentionality defines the mental, seems to be less well-off. How can one claim that intentionality is sufficient for mentality when things that are clearly not mental states (like words, parts of maps, and gas gauges) exhibit intentionality?

Original and Derived Intentionality


The best answer to this question invokes a distinction between original and derived intentionality. We began by noting the diversity of things that exhibit intentionality: mental states, linguistic expressions, maps, gas gauges. But it is plausible to think that at least some of these intentional phenomena acquire this status via a relation to some other, more fundamental intentional phenomenon. If this is correct, we can recast the second half of Brentano's thesis as the claim that only mental phenomena have original intentionality: intentionality not explicable in terms of other intentional phenomena.
This sort of defense of Brentano carries with it a commitment to the research program of explaining the intentionality of language, maps, and gas gauges in terms of the intentionality of the mental. This research program has considerable promise and has received sophisticated development over the last few decades, with most of the attention focused on explanations of linguistic meaning in terms of mental content.
One well-developed attempt to provide such an explanation begins with the thought that linguistic expressions mean what they do because of what speakers intend to convey by using them (Grice 1957, 1969; Schiffer 1972, 1982). On this view, what a speaker means by uttering an expression on an occasion (speaker-meaning) is a function of the beliefs that that speaker intends to bring about in his or her audience via their recognition of that communicative intention; further, what an expression means in a community is a function of what speakers mean or would mean by using the expression on various occasions. By this two-part reduction (of expression-meaning to speaker-meaning, and of speaker-meaning to communicative intentions), the intentionality of language is explained in terms of the intentionality of intentions. Critics of this approach have focused on its inability to explain uses of language in thought and apparently normal examples of communication in which speakers lack the requisite communicative intentions (Chomsky 1975; Schiffer 1987). But despite the problems faced by specific versions of this reductive program, there is widespread agreement that there is some way of explaining the intentionality of language via the intentionality of the mental states of language users: if not their intentions, then perhaps their beliefs (Lewis 1975). (Opposed views of the source of the intentionality of language are defended in Laurence 1996 and Brandom 1994.)

The Reduction of Original Intentionality


Supposing that there is a genuine distinction between original and derived intentionality, there is a further question about whether original intentionality can itself be explained. The second thesis about intentionality often associated with Brentano is that it can't be: Original intentionality is not only definitive of mentality but also inexplicable in nonintentional terms. By contrast, the view dominant in recent years may be summed up as follows:
I suppose that sooner or later the physicists will complete the catalogue they've been compiling of the ultimate and irreducible properties of things. When they do, the likes of spin, charm, and charge will perhaps appear on their list. But aboutness surely won't; intentionality simply doesn't go that deep. If aboutness is real, it must really be something else. (Fodor 1987, 97)

In part because most recent theorists have adopted the view, sketched here, that original intentionality is found at the level of thought, these theorists have approached the task of explaining original intentionality by constructing theories of mental content. The standard method of theory construction takes as given the following broad thesis: Being in a certain mental state is a matter of being in an internal state that has properties that make it a mental state of the relevant type with the relevant content. This view is sometimes called the representational theory of the mind (though this label is sometimes used for the conjunction of the present view with the language of thought hypothesis, about which more later) and other times is called functionalism (though this label is sometimes used for the conjunction of the present view with the thesis that the content-determining properties of internal states are their functional roles).
The natural next questions are, therefore: What properties of internal states make them mental states of a certain type with a certain content? And which internal states are the bearers of content?


The internal states in question will presumably be complex physical states of subjects. Given this, we can ask: Is the content of such states derived from the contents of its parts (so that, in the case of a state that has the content that grass is green, the state would have one part representing grass, and another representing the color green), or are the fundamental content-conferring properties a matter of the propositional attitude state as a whole? To take the former option is to endorse the language of thought hypothesis (Fodor 1975; Rey 1995), and to take the latter is to reject it (Stalnaker 1990; Blackburn 1984).
Whether or not the language of thought hypothesis is true, the principal challenge in constructing a theory of content is to specify the properties that confer contents on those representations. Here, the proliferation of theories is such that it is hardly possible to do better than the following list of candidate completions of "an internal representation x has the content p if and only if":

x is actually caused by p's being the case / p's being the case would, under epistemically ideal conditions, cause that internal state (Stalnaker 1984) / x covaries with p's being the case during the learning period when the state is acquiring a content (Dretske 1981).
It is the biological function of x to be present when p is the case (Millikan 1989).
x has nomological connections of specified kinds with property p (Fodor 1990).
There is an isomorphism between the system comprised of x and the rest of the agent's internal representations and a system containing p which maps x onto p (Cummins 1989).
A (specified) theory maps x's functional role (its causal connections to perceptual input, behavioral output, and other internal representations) onto p (Block 1986; Harman [1988] 1999).
The discussion so far leaves open an important metaquestion
about intentionality: Supposing that there is no reduction of
original intentionality to nonintentional facts, what attitude
should we take toward the claims about the intentionality of
mental states to which we unhesitatingly subscribe in daily life?
Some who have rejected such analyses have put alleged intentional facts into the same category as alleged facts about phlogiston, witches, and other posits of false theories (Quine 1960;
Churchland 1981); others have taken the failure of reductions of
original intentionality to show that intentionality is an unanalyzable feature of the world, and no less real for that (Chisholm
1957; Searle 1983).
Jeff Speaks
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Blackburn, S. 1984. Spreading the Word. Oxford: Clarendon Press.
Block, N. 1986. Advertisement for a semantics for psychology. In
Mental Representation: A Reader, ed. S. Stich and T. Warfield, 81–141.
Cambridge, MA: Basil Blackwell.
Brandom, R. 1994. Making It Explicit. Cambridge: Harvard University
Press.
Brentano, F. [1874] 1997. Psychology from an Empirical Standpoint.
London: Routledge.
Byrne, A. 2001. Intentionalism defended. Philosophical Review
110: 199–240.
Carnap, R. 1947. Meaning and Necessity: A Study in Semantics and Modal
Logic. Chicago: University of Chicago Press.
Chisholm, R. 1957. Perceiving: A Philosophical Study. Ithaca, NY: Cornell
University Press.
Chomsky, N. 1975. Reflections on Language. London: Temple-Smith.
Churchland, P. 1981. Eliminative materialism and the propositional
attitudes. Journal of Philosophy 78.2: 67–90.
Crane, T. 1998. Intentionality as the mark of the mental. In Current Issues
in Philosophy of Mind, ed. A. O'Hear, 229–51. Cambridge: Cambridge
University Press.
Cummins, R. 1989. Meaning and Mental Representation. Cambridge,
MA: MIT Press.
Dretske, F. 1981. Knowledge and the Flow of Information. Cambridge,
MA: MIT Press.
Fodor, J. 1975. The Language of Thought. Hassocks, UK: Harvester.
. 1987. Psychosemantics: The Problem of Meaning in the Philosophy
of Mind. Cambridge, MA: MIT Press.
. 1990. A theory of content, II: The theory. In A Theory of Content
and Other Essays, 89–136. Cambridge, MA: MIT Press.
Frege, G. 1892. On sense and reference. In Translations from the
Philosophical Writings of Gottlob Frege, ed. P. Geach and M. Black,
56–78. Oxford: Basil Blackwell.
. 1918. Thought. In The Frege Reader, ed. M. Beaney, 325–45.
Oxford: Basil Blackwell.
Grice, P. 1957. Meaning. Philosophical Review 66.3: 177–88.
. 1969. Utterer's meaning and intentions. In Studies in the Way of Words, 86–116. Cambridge: Harvard University Press.
Harman, G. [1988] 1999. Wide functionalism. In Reasoning, Meaning,
and Mind, 235–43. Oxford: Clarendon.
Husserl, E. [1901] 2002. Logical Investigations. 2 vols.
London: Routledge.
Kaplan, D. 1989. Demonstratives. In Themes from Kaplan, ed.
J. Almog, J. Perry, and H. Wettstein, 481–563. Oxford: Oxford
University Press.
Kripke, S. 1972. Naming and Necessity. Cambridge: Harvard University
Press.
. 1982. Wittgenstein on Rules and Private Language: An Elementary
Exposition. Cambridge: Harvard University Press.
Laurence, S. 1996. A Chomskian alternative to convention-based semantics. Mind 105: 269–301.
Lewis, D. 1975. Languages and language. In Language, Mind, and
Knowledge, ed. K. Gunderson, 3–35. Minneapolis: University of Minnesota Press. Reprinted in Lewis 1983, 163–88.
. 1983. Philosophical Papers. Vol. 1. Oxford: Oxford University
Press.
Millikan, R. 1989. Biosemantics. Journal of Philosophy 86.6: 281–97.
Moran, D. 1996. Brentano's thesis. Proceedings of the Aristotelian Society Supplementary Volume 70: 1–27.
Perler, D., ed. 2001. Ancient and Medieval Theories of Intentionality.
Boston: Brill.
Pietroski, P. 2003. The character of natural language semantics. In
Epistemology of Language, 217–56. Oxford: Oxford University Press.
Quine, W. V. 1953. Two dogmas of empiricism. In From a Logical Point
of View, 20–46. Cambridge: Harvard University Press.
. 1960. Word and Object. Cambridge, MA: MIT Press.
Rey, G. 1995. A not merely empirical argument for a language of
thought. Philosophical Perspectives 9: 201–22.
Rorty, R. 1979. Philosophy and the Mirror of Nature. Princeton,
NJ: Princeton University Press.
Russell, B. 1905. On denoting. Mind 14: 479–93.
Schiffer, S. 1972. Meaning. Oxford: Oxford University Press.
. 1982. Intention-based semantics. Notre Dame Journal of Formal Logic 23: 119–56.
. 1987. Remnants of Meaning. Cambridge, MA: MIT Press.
Searle, J. 1983. Intentionality. New York: Cambridge University Press.
Stalnaker, R. 1984. Inquiry. Cambridge, MA: MIT Press.
. 1990. Mental content and linguistic form. In Context and Content, 225–40. New York: Oxford University Press.
Tye, M. 1995. Ten Problems of Consciousness: A Representational Theory
of the Phenomenal Mind. Cambridge, MA: MIT Press.

INTERNAL RECONSTRUCTION
Internal reconstruction (IR) is a method, or group of methods,
used to establish unattested earlier forms of languages. It differs
from the comparative method, which has similar aims, in
being based on the features of one language, without external
reference, and it is therefore appropriately applied to languages
without relatives, such as language isolates or reconstructed
protolanguages. Languages reconstructed by these means are
referred to as prelanguages, as opposed to the protolanguages
that result from the application of the comparative method.
IR arose in the later nineteenth century as an extension of the comparative method. For example, Ferdinand de Saussure's reconstruction of "sonant coefficients" (later termed laryngeals) for Indo-European was based largely on this method, though the method was not recognized as legitimate at the time, and his reconstructions were largely ignored. The method was identified in the early twentieth century but was considered too speculative, methodologically unsound, and lacking in controls on its results. It was not until documentary evidence for laryngeals was found in Hittite that its value was recognized, and then not universally.
IR is not a single method but a group of approaches having in common a reliance on internal features of the language. It is possible to identify three such approaches, which are neither completely distinct nor mutually exclusive: historical morphophonemics, regularization of systems, and universal and typological reconstruction.

Historical Morphophonemics
This approach, applicable only to phonology, is fairly well defined methodologically. It is based on the structuralist principle of the morphophoneme, which also corresponds to the underlying phoneme of classical generative phonology. A morphophoneme comprises a set of morphologically related phonemes, that is, phonemes occurring in different forms of the same morpheme. The principle is that the different phonemes result from the application of different phonological changes in different contexts; by reducing these alternating phonemes to a single form, we effectively establish the original phoneme from which they are derived. For example, in Latin /s/ became /r/ between vowels, giving alternations such as flos/flor-. Synchronically, this provides us with a morphophoneme {S}; historically, we assume that this corresponds to an original phoneme /s/. A weakness of this method is that it assumes that there was no morphophonemic alternation in earlier stages of the language.
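The morphophonemic step just described can be sketched computationally. The following is a minimal illustration (not a reconstruction tool) of reversing the Latin rhotacism change cited in the text, /s/ > /r/ between vowels, so that attested oblique forms such as flor- are mapped back to pre-forms with the original *s; the data and the single-rule model are assumptions made for the example.

```python
# Minimal sketch of internal reconstruction via a morphophonemic alternation:
# undo the Latin sound change /s/ > /r/ between vowels (rhotacism), recovering
# the pre-Latin form with original *s. Illustrative only; ignores vowel length.

VOWELS = set("aeiou")

def undo_rhotacism(form: str) -> str:
    """Reconstruct a pre-form by reversing intervocalic /s/ > /r/."""
    chars = list(form)
    for i in range(1, len(chars) - 1):
        if chars[i] == "r" and chars[i - 1] in VOWELS and chars[i + 1] in VOWELS:
            chars[i] = "s"  # the morphophoneme {S} surfaces as /r/ here
    return "".join(chars)

# Attested alternation: nominative flos vs. oblique floris (stem flor-).
print(undo_rhotacism("floris"))   # -> flosis (pre-form with original *s)
print(undo_rhotacism("honoris"))  # -> honosis
print(undo_rhotacism("flos"))     # -> flos (no intervocalic r; unchanged)
```

Note that the sketch encodes exactly the weakness the text identifies: it assumes every intervocalic /r/ in the paradigm goes back to *s, i.e., that there was no /s/~/r/ alternation in the earlier stage itself.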

Regularization of Systems
While historical morphophonemics works by eliminating alternations and apparent irregularities, the regularization of systems method involves, in effect, the generalization of this process to whole systems. Thus, where there are inconsistencies in paradigms (for example, with different classes of nouns or verbs), the method seeks to eliminate these differences in order to produce a single paradigm, on the assumption that inconsistencies arise through change. For example, the different declensions or conjugations of Latin and Greek, or the forms of Germanic strong verbs, can be reduced to single patterns. Again, a weakness of this approach is that it assumes complete consistency in the prelanguage.

Universal and Typological Reconstruction


This is the most speculative, and perhaps methodologically the least controllable, approach; to some extent it goes beyond purely internal reconstruction inasmuch as it invokes general properties of languages. These properties may be either universal or typological. In the former case, the method relies on language universals, which are held to determine either the synchronic structures of languages or the diachronic processes of change (see synchrony and diachrony). These have the effect of constraining possible reconstructions, since we may not postulate earlier states or processes of change that do not conform to the universals.
Thus, in phonology we may assume universal phonetic
processes (for example, the palatalization of velar consonants
in the neighbourhood of front vowels); a reconstruction that
assumes the reverse (palatalization with back vowels) would be
ruled out. Similarly, the universal process described by the term grammaticalization (the development of lexical items into grammatical items) precludes reconstructing the reverse.
There may also be constraints on structures and systems; if all
languages must have, say, vowels, then we cannot reconstruct
an earlier stage without them. In all such cases, our reconstructions will seek to establish earlier forms in compliance with
these universal constraints.
The typological approach (see typology) is similar, but it
reflects language types rather than universal properties. The
principle here is that language typology involves not just isolated features but, rather, sets of harmonizing or co-occurring
features. According to one widely used typology, languages fall
into two basic types, VO and OV, according to the order of verb
and object, but this ordering is also reflected in the order of other
items, such as noun and adjective (VO languages generally also
have NA, and OV languages have AN), and the occurrence of
prepositions or postpositions (VO languages favor the former,
OV languages the latter). Given such principles of harmony, a
language that has one such order, say, NA, will be expected to
have the other harmonic orders; if it does not, we can reconstruct
this harmony for an earlier period of the language. In the case
of English, for example, which has VO but AN, we can assume
an earlier stage that had either NA or OV (the latter is assumed).
Other typological parameters can be used in an analogous fashion. A weakness of this method, however, is that it assumes that
languages will originally have been typologically consistent,
though consistency is not necessarily an inherent attribute of
language.
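The harmony-based reasoning above can be sketched as a simple consistency check: given a language's basic VO/OV type and its observed features, flag the disharmonic ones as candidates for reconstructing an earlier, harmonic stage. The two-feature inventory and the harmony table here are simplifications assumed for the example, not a full typological database.

```python
# Illustrative sketch of typological internal reconstruction: find word-order
# features that are disharmonic with a language's basic VO/OV type. Such
# mismatches are the candidates for positing an earlier harmonic stage.

HARMONY = {
    "VO": {"noun_adjective": "NA", "adposition": "preposition"},
    "OV": {"noun_adjective": "AN", "adposition": "postposition"},
}

def disharmonic_features(basic_order: str, observed: dict) -> dict:
    """Return {feature: (observed, expected)} for each harmony mismatch."""
    expected = HARMONY[basic_order]
    return {f: (v, expected[f])
            for f, v in observed.items() if expected.get(f) != v}

# Modern English: basic VO order, prepositions, but adjective-noun (AN) order.
english = {"noun_adjective": "AN", "adposition": "preposition"}
print(disharmonic_features("VO", english))
# -> {'noun_adjective': ('AN', 'NA')}
# AN is disharmonic with VO, so one may reconstruct an earlier stage with
# either harmonic NA order or basic OV order (the text notes OV is assumed).
```

As the entry warns, the inference is only as good as the assumption that the earlier language was typologically consistent.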
In spite of potential weaknesses, the methods of IR have
proved useful in establishing earlier stages of languages in cases
where the comparative method cannot be applied. The methods, especially the approach that relies on universals and typology, are extremely powerful, and for that very reason they have often been regarded with some suspicion and must be used with caution.
Anthony Fox
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Campbell, Lyle. 1998. Historical Linguistics: An Introduction.
Edinburgh: Edinburgh University Press. See Chapters 7 and 8.
Fox, Anthony. 1995. Linguistic Reconstruction: An Introduction to Theory
and Method. Oxford: Oxford University Press. See Chapters 7 and 8.
Lehmann, Winfred. 1992. Historical Linguistics. London: Routledge. See
Chapter 8.

INTERPRETATION AND EXPLANATION


An interpretation is an account of what is said, done, and thought by some person or people: an account at the level of content. An interpretation may take as its subject some particular episode involving some agent's action. Consider a boss's remarks to a new employee about her skirt. (Is the boss generally a fashion maven? Does the boss wear skirts or chase skirts?) What is the agent saying, doing, and thinking? An interpretation may take as its subject some kind of utterance or activity in a social group. Consider some students at a game chanting "We're gonna beat the hell out of you." What were they really saying, doing, and thinking? An understanding of such matters, an interpretation, depends epistemically on an associated explanatory understanding of why such things are said and done.
An interpreted episode or a phenomenon typically involves some linguistic activity. Yet an interpretation is more than a straightforward translation of those utterances. The chanting of the students can be homophonically translated. But is their utterance primarily a part of a ritual of hope and group identification or an expression of belief? Consider also one historical episode: George W. Bush's insistence that it is morally wrong to destroy life in order to save life (quoted in Stolberg, 21 May 2005). Again, for speakers of English, homophonic translation is uncontroversial. Yet just what was expressed? What would his principle allow, and what would it rule out? What is Bush saying and doing here?
Interpretation goes beyond translation in part because interpretation comes to terms with pragmatic elements of an utterance: what speech-acts are performed and how conversational implicature may outrun what is explicit. An acceptable treatment of these matters is bound up with rich understandings of agents' situated projects, their beliefs and desires.
An interpretation may constrain the translation of some subject's language on which it depends. This is common in anthropological studies of a people's religion or magic, as one translation may strongly suggest an understanding that is at odds with an anthropological interpretation, whereas an alternative translation may support that interpretation. The classic debates between symbolist and intellectualist anthropologists over how to interpret various folk religions had implications for whether the associated linguistic constructions should be translated into terms that suggested theories about causes (as intellectualists such as Horton 1970, 1982 urge) or into more guarded terms that suggested symbolic expression (as symbolists such as Leach 1954 and Beattie 1970 urge), perhaps understandings and influences. Here, translation (treating the literal said) and interpretation (of what is ultimately said, done, and thought) are interdependent (Turner 1980; Henderson 1993; Risjord 2000; Stueber 2006). As Donald Davidson (1984) argued, in settling upon an interpretation and translation, several matters must be sorted out together: interrelated matters having to do with belief, desire, and meaning (see meaning and belief, radical interpretation, agreement maximization, and charity, principle of).
What, then, is the mark of a good interpretation? Surprisingly, many theorists, both historical and contemporary, from various traditions can be seen to agree on an answer to this epistemological question: A successful or adequate interpretation affords an explanatory understanding of what is said, done, and thought. An interpretation of someone or some folk as doing or saying some sort of thing is of a piece with an explanatory understanding. Epistemologically, the two stand or fall together. The agreement here is pervasive but does not run deep. Writers quickly come to differ over what makes for a successful explanation of thought and deed and, thus, what marks good interpretation. (For example, those urging a strong principle of charity, such as Davidson, typically think of explanation here as a matter of exhibiting rationality, while others may see more place for explanations that do not rationalize. Various conceptions of what makes for explanatory understanding are discussed later.)
Focus for a time on the agreement. Consider again Bush's assertion that it is wrong to destroy life in order to save life. The first thing to notice is the wide range of information that we recognize as relevant to its interpretation. It was uttered in the context of legislation regarding research using embryonic stem cells. In keeping with the venerable discussions of hermeneutics (see philology and hermeneutics), one should consider what else was said in the wider context in which the assertion was advanced. Its interpretation is bound up with the interpretation of the whole of Bush's remarks, which, in turn, is dependent on successful treatment of the various component utterances. Information having to do with much beyond this set of remarks is also relevant. For example, how did Bush and his political advisers understand their political situation? That Bush was politically beholden to right-wing conservative Christians (even was one) is relevant. That they hold that human life begins at conception is relevant. So also is information regarding the wider political context. Was the conservative Christian component of his political base at that time disenchanted and thought to need firing up? Sweeping moral statements are commonly best interpreted as containing an implicit ceteris paribus clause. One might wonder if Bush's statement should be understood as likewise qualified. What is known about whether Bush would authorize the military to destroy lives to save lives? How does Bush understand collateral damage in warfare? Is the military context one in which ceteris is not paribus? Is the principle unqualified except that it expresses a prima facie duty that can be overridden by other duties? But what other duties? All this information, and much more, bears on which of several possible interpretations is ultimately most satisfactory. Bush might be understood as having
advanced a moral principle that is implicitly qualified so that
there is no inconsistency with his military policies. Alternatively,
there might have been unnoticed inconsistency as all humans
exhibit some inconsistency. Finally, one might consider whether
Bush might have noticed an inconsistency that he conveniently
omits to mention.
The central point concerns how diverse information allows one to decide among such alternative interpretations: It bears on the explanation of what is said, done, and thought in the episode interpreted. The political concerns of Bush and his advisers might help to explain the assertion as political boilerplate. Certainly the staking out of political positions, the responsiveness to constituencies, and the like can lead politicians to cast about for simple (simplistic) formulations of sweeping principles in which to wrap the desired policy. Is this the explanation, and interpretation, of Bush's assertion? Understood and explained as boilerplate, the questions about the exact content of the asserted principle (whether or how it is implicitly qualified), the way it then squares with various policies, and the obviousness of inconsistency (if any) may become less pressing, as they matter less to the explanation of the episode. What is then central is the sense of certain politicians for how taking the moral high road will play in the relevant constituencies. But perhaps one senses that Bush is a politician whose public face is here tied to his own moral view of the world. Then, the degree of inconsistency and how he could be insensitive to it become more central as a matter of explanatory concern. In either case, one is informed both by a general sense for what makes humans tick (cognitively and otherwise) and, for an antecedently formed impression of this person in particular, by the sedimented results of past interpretations. This is to draw upon generic resources for explaining human action and thought and on resources more specific to this individual. Again, each is the fruit of past practice that is both interpretive and explanatory. One has a sense for how humans think and act and for variations. One has a sense for Bush's character as one variation, one that is the product of his biography, and one that is evinced in past interpreted and explained practice.
Just how one ends up interpreting the assertion in question
and the associated political or moral act depends on the explanation for this episode that one judges to be most likely, given all that
one knows about the agent or agents involved and about humans
as agents. Interpretation, it seems, is a matter of abduction or
inference to the best explanation of an ongoing, always revisable, sort. (Readers should find it easy to explore many further
alternatives and wrinkles in understanding the Bush case and
to assure themselves that their plausibility devolves onto the
question of greater or lesser explicability.)
The interdependence of interpretation and explanation, their
being two faces of the same coin, is widely appreciated. What is
contested is how to understand the explanation of thought and
deed. There are two broad schools of thought here, although representatives of each have been diverse.
Some think of explanation as subsumption as a matter of
deploying generalizations to show that the case or phenomenon
in question was dependent on certain antecedents. To conceive
of the explanation of thought and action along these lines is to generalize a common understanding of scientific explanation. C. Hempel (1965) provides a particularly clear variant in his venerable hypothetical-deductive and statistical-probabilistic models of explanation. (However, there are reasons to think that this approach to scientific explanation is itself flawed; see, for example, Salmon 1989.) A very different subsumptive model is provided by James Woodward's (2000) discussions of explanation in the special sciences. Woodward thinks that the generalizations produced and deployed are themselves not exceptionless nomic generalizations, but relatively invariant generalizations holding within imperfectly specified limits.
The second school of thought views explanation of thought and action as revealing intelligibility: explanation is understood as a matter of understanding what is said and done as having or expressing a significance or meaning so that the whole thereby becomes intelligible. Just what being intelligible comes to is itself understood variously. Certainly a kind and degree of internal contentful coherence plays a role, as does the interpreter's ability to then see how and why one would do as indicated on the basis of such reasons. Those who draw on the hermeneutic tradition are representatives. So also is R. G. Collingwood (1946), with his conception of explanation as reenactment. Something like Collingwood's approach has enjoyed a contemporary revival of sorts within cognitive psychology, where there has been much work on explanation of thought and deed as a kind of simulation in which one's own cognitive processes are taken off line and put to work on imagined input that reflects one's provisional interpretation of a subject. One imaginatively puts oneself in the other's shoes and deliberates or reasons; if one's simulation then accords with observed actions or expressed beliefs, one has a prima facie successful explanation (see Stueber [2006] for a recent overview and discussion).
As noted, the representatives of each approach have been diverse. Still, we can appreciate that while proponents of these approaches differ over what makes for explanation, they commonly understand that interpretation epistemically depends on explanation. (It is worth adding that most would also recognize that the information supposed in an explanation itself depends on interpretations, so that there is ultimately a holistic epistemic interdependence here.) A unified understanding can then be had by drawing on a very general understanding of explanation. When we do this, elements of the diverse interpretive/explanatory approaches can be understood as complementary epistemologies of explanation, rather than as competing accounts of what it is to explain thought and action. Let me explain by drawing on what has come to be called the erotetic account of explanation (see Salmon 1989; van Fraassen 1980). This label refers to the logic of questions and answers, erotetic logic. On this approach, an explanation is an answer to a question, typically a why-question or a how-question. In either case, the resources for answering the question allow one to understand a pattern of dependencies. The resources for answering a why-question (let us focus on these) should allow us to appreciate that what is done or said was dependent on certain standing states and events. If we can do this, we can answer a range of associated what-if-things-had-been-different questions exploring these dependencies (see possible worlds semantics).

To illustrate, economic theory might inform our explanation
and interpretation of an individual's verbal exchange with his or her broker as the reception and risky use of insider information to avert a loss, interpreting and explaining this as an expectable form of profit maximization in light of the agent's understanding of situated risks of detection. (Perhaps the agent said, "That is unacceptable exposure to risk." The explanation envisioned supports interpreting this as an instruction to sell, rather than as a merely general point about levels of risk in certain portfolios.) Without drawing on theory, one might imaginatively put oneself in the agent's shoes (under tentative interpretation) and see that one would do the same, as the simulation theorist envisions. In both cases, one appreciates that with different information and antecedent beliefs and values, the agent would have done systematically otherwise. Thus, generalizations about economic agents and imaginative simulation might support a single understanding and explanation of the agent, answering the why-question by attention to the same dependencies (under interpretation).
If explanation is understood as a matter of answers to how- and why-questions (and associated what-if-things-had-been-different questions), then what have been taken to be competing accounts of explanation can be understood in a fashion that renders them compatible and affords us a multifaceted understanding of the explanatory practice associated with interpretation. The pivotal move is to abandon each tradition insofar as it attempts to delimit what counts as an explanation, treating the erotetic understanding as the more generic and acceptable account of explanation. Then, approaches such as Woodward's (appealing to subsumption under generalizations with significant invariance) and the simulation theorist's (appealing to the imaginative use of our off-line cognitive capacity) can each be recognized as part of a full epistemic story. Subsumption and simulation each have their role in coming to appreciate why the agent or agents acted thus and what they would have done if things had been relevantly different. Thus, my suggestion is that explanation is the mark of a good interpretation, that explanation typically itself supposes a tentative interpretation, that the generic understanding of explanation is that provided by erotetic logic, and that it commonly is attained by some amalgamation of the epistemic resources that have been of concern in traditions that have wrongly claimed to provide the account of explanation in the human sciences.
David Henderson
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Beattie, J. H. M. 1970. On understanding ritual. In Rationality, ed. Bryan
Wilson, 240–68. Worcester, UK: Basil Blackwell.
Collingwood, R. G. 1946. The Idea of History. Oxford: Clarendon.
Davidson, Donald. 1984. Belief and the basis of meaning. In Inquiries
into Truth and Interpretation, 14154. Oxford: Oxford University
Press.
Hempel, C. 1965. Aspects of Scientific Explanation and Other Essays. New
York: The Free Press.
Henderson, David. 1993. Interpretation and Explanation in the Human
Sciences. Albany: State University of New York Press.
Horton, Robin. 1970. African traditional thought and Western science. In Rationality, ed. Bryan Wilson, 131–71. Worcester, UK: Basil
Blackwell.

Interpretive Community
. 1982. Tradition and modernity revisited. In Rationality and
Relativism, ed. Martin Hollis and Steven Lukes, 201–60. Cambridge,
MA: MIT Press.
Leach, E. R. 1954. The Political Systems of Highland Burma. London: Bell.
Risjord, Mark. 2000. Woodcutters and Witchcraft. Albany: State University
of New York Press.
Salmon, Wesley. 1989. Four decades of scientific explanation. In
Scientific Explanation, Minnesota Studies in the Philosophy of Science.
Vol. 13. Ed. P. Kitcher and W. Salmon, 3–219. Minneapolis: University
of Minnesota Press.
Stolberg, Sheryl Gay. 2005. In rare threat, Bush vows veto of stem cell bill. New York Times, 21 May.
Stueber, Karsten. 2006. Rediscovering Empathy. Cambridge, MA: MIT Press.
Turner, Stephen. 1980. Sociological Explanation as Translation.
Cambridge: Cambridge University Press.
van Fraassen, Bas. 1980. The Scientific Image. Oxford: Oxford University
Press.
Woodward, James. 2000. Explanations and invariance in the special
sciences. British Journal for the Philosophy of Science 51: 197–254.
INTERPRETIVE COMMUNITY
In 1976, literary critic Stanley Fish used this term to describe the unspoken (often unknown) alliances among readers who share similar strategies for determining what a text means. This theory of pragmatics, he says, is the explanation for both the stability of interpretations among different readers (they belong to the same community) and for the regularity with which a single reader will employ different interpretive strategies and thus make different texts (he belongs to different communities) (Fish 1980, 171).
The notion of interpretive community insists upon the primacy of situated readers, and it can be thought of as a theory of creative reading. Fish says that a set of general assumptions about how one ought to interpret a text precedes every act of interpretation; thus, a reader always perceives a given text within an already in-place hermeneutical framework. One does not read the words on a page and then decide what those words mean, because no temporal separation exists between acts of perception and interpretation. Instead, one's community conditions how its members read those words in the first place. As such, readers actually write a text for themselves as they read, for they have a tool kit of interpretive strategies always at work determining what certain words will mean should they arise in a given context. Readers using the same tool kit belong to the same community.
One can see interpretive communities at work in Christian typology, a mode of biblical exegesis that aims to square Old Testament texts with the events recounted in the New Testament. For the typologist, the belief that Jesus was God combines with other assumptions in order to form the exegete's set of interpretive strategies. Other readers who share these strategies make up this exegete's community even if they do not know one another, which explains how two Christians might independently interpret some events in the Old Testament as prophecies of Jesus Christ. Of course, a Jew, Gnostic, or pagan produces a much different meaning of those same Hebrew texts because he or she works from a community that reads/writes those texts differently. And finally, a look at Paul of Tarsus demonstrates how the same person writes two different texts for himself when reading from different interpretive communities, for he understands the Hebrew texts as prophecies of Jesus Christ only after his conversion at Damascus.
Fish's theory has been criticized for making words have no meaning. He responds with just the opposite: Words always have meaning, in fact many meanings, all of which are constructed by situated readers in various communities. Fish adds that his theory is sociological, not normative; that is, it describes only what people say (or think) a text means; it does not prescribe how we ought to interpret texts. Finally, to the objection that some authors use certain techniques to ensure that their texts convey certain meanings, he responds that those meanings come to fruition only if the reader belongs to the same interpretive community as that author.
Jeffrey R. Wilson
WORK CITED
Fish, Stanley Eugene. 1980. Is There a Text in This Class? The Authority of Interpretive Communities. Cambridge: Harvard University Press.
INTERTEXTUALITY
Building on Mikhail Bakhtins (1981) discussion of the dialogic
nature of language, Julia Kristeva (1986) coined the term intertextuality for the multiple ways in which texts refer to and draw
on other texts. This notion highlights the interconnectedness of
texts and challenges deep-rooted literary values, such as autonomy, uniqueness, and originality (Allen 2000, 56). An intertextual perspective views text production as a social practice in
which different texts, genres, and discourses are drawn upon and
text consumption as a process in which readers may bring additional texts (not only those that have shaped production) into the interpretation process (Fairclough 1992, 84–5). The study
of intertextuality does not focus solely (or even primarily) on
the specific prior texts that are brought into play in a given text;
rather, it also examines the implicit texts underlying production
and interpretation (e.g., presuppositions, genre conventions)
(Culler 1976, 1388). Thus, a newspaper crime report has intertextual links not only to eyewitnesses' accounts and previous
reports on the same and/or similar events but also to newswriting conventions, propositions that the journalist takes as given,
and even the journalist's/reader's understanding of crimes in
general.
Reported speech, a prime example of intertextuality, has been
extensively studied in sociolinguistics. Reporting speech is
always a reformulation of the original act. Even if prior speech is
reported verbatim, the reporting speaker may use prosodic features like stress and intonation to indicate his/her interpretation of the utterance, or he/she may frame the reported speech
in such a way as to manipulate the addressee's perception of the
reported speaker. In some cases, material represented as reported
speech is not spoken by anyone at all. These observations have
led Deborah Tannen (1989) to conclude that reported speech is
primarily the creation of the reporting speaker and serves to create a sense of interpersonal involvement between the reporting
speaker and the addressee in the reporting context.

Reported speech is an example of what Norman Fairclough
(1992) calls manifest intertextuality. Manifest intertextuality
refers to the way in which specific other texts are overtly drawn
upon within a text. In addition to reported speech, it covers such
phenomena as irony, negation, and presupposition. In
contrast, constitutive intertextuality (also known as interdiscursivity) refers to the way in which texts draw on abstract sets of
conventions like genres and styles. In their research on interdiscursivity, several critical discourse analysts have
noticed a widespread appropriation of conversational styles in
public discourse. Focusing on a consumer feature about lipstick
from a British teenage magazine, Mary Talbot (1995) examines
how the text producer exploits features of conversational speech
(e.g., the pronoun "you," expressive punctuation) to establish an
informal, friendly relationship with the reader. This practice,
however, is far from benign. Under the guise of offering sisterly advice, the consumer feature serves as covert advertising
and trains teenage readers to become consumers of cosmetic
products.
Generic intertextuality, a notion developed by anthropologists
Richard Bauman and Charles Briggs, can be viewed as a particular kind of interdiscursivity. They define generic intertextuality as
the construction of the relationship between a text and a genre,
and they are interested in how and for what purposes this relationship is established in communicative practice (Briggs and
Bauman 1992). They see genre as a speech style that serves as an
orienting schema for the production and reception of discourse.
Genre interacts with such factors as the interactional context and
the speaker's/writer's communicative goals in shaping a given
text. In turn, these factors may lead to the selective adoption of
the constituent features of the generic framework and create an
intertextual gap between the text and its generic model. Briggs
and Bauman argue that research on strategies for manipulating
intertextual gaps between texts and their generic schemas can
shed light on issues of power, ideology, and political economy.
In cultures with traditional genres that are invested with great
power, speakers/writers often minimize the distance between
their texts and these genres. This serves as a powerful strategy
for creating textual authority. At times, however, speakers/writers may maximize the intertextual gaps between texts and their
generic models. They may do so to resist the hegemony of established genres or to claim authority in cases where creativity is
highly valued.
Appropriation, a specific case of intertextuality, refers to the
practice of adopting words, expressions, or ways of speaking
that are generally thought to belong to someone else. Many
white American teenagers use elements of African-American
vernacular English (AAVE) in their speech so as to align themselves with hip-hop culture and/or to project an urban youth
identity by exploiting certain connotations of AAVE (e.g., toughness). Appropriation, however, may also serve disaffiliating or
even denigrating purposes. In the United States, some monolingual Anglos use what anthropologist Jane Hill (1993) calls "Mock Spanish," that is, a subregister of colloquial English made up of (pseudo-)Spanish expressions (e.g., mañana) to project a congenial persona. Yet to make sense of Mock Spanish, one also requires access to certain racist beliefs about Spanish speakers. Mañana works as a humorous substitute for "later" only because of the stereotype of Spanish speakers as lazy and procrastinating.
Several issues continue to dominate sociolinguistic research
on intertextuality. One, to which I have alluded, focuses on
the relation of intertextuality to power. Others are concerned
with the conditions that make decontextualization and recontextualization possible, as well as the semantic and functional
changes that texts undergo as a result of recontextualization.
Intertextuality also raises interesting issues about authorship.
If all texts are created out of prior texts and conventions, what
is an author, and who is responsible for what is said/written?
These issues are likely to be worked out differently in different
cultures.
Andrew Wong
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Allen, Graham. 2000. Intertextuality. London: Routledge.
Bakhtin, Mikhail. 1981. The Dialogic Imagination. Ed. Michael Holquist,
trans. Caryl Emerson and Michael Holquist. Austin: University of Texas
Press.
Bauman, Richard. 2004. A World of Others' Words: Cross-Cultural
Perspectives on Intertextuality. Malden, MA: Blackwell.
Briggs, Charles, and Richard Bauman. 1992. Genre, intertextuality, and
social power. Journal of Linguistic Anthropology 2: 131–72.
Culler, Jonathan. 1976. Presupposition and intertextuality.
MLN: Modern Language Notes 91: 1380–96.
Fairclough, Norman. 1992. Discourse and Social Change. Cambridge,
UK: Polity.
Hill, Jane. 1993. Hasta la vista, baby: Anglo Spanish in the American
Southwest. Critique of Anthropology 13: 145–76.
Kristeva, Julia. 1986. Word, dialogue, and the novel. In The Kristeva
Reader, ed. Toril Moi, 34–61. New York: Columbia University Press.
Talbot, Mary. 1995. A synthetic sisterhood: False friends in a teenage
magazine. In Gender Articulated, ed. Kira Hall and Mary Bucholtz,
143–65. New York: Routledge.
Tannen, Deborah. 1989. Talking Voices. New York: Oxford University
Press.

INTONATION
This term refers to the fundamental frequency (or its perceptual correlate, pitch) contour associated with phrases and other
large prosodic units. Language communities use intonation to
serve a wide range of functions, both grammatical and discourse
based. For example, intonation is used to signal prosodic boundaries: The ends of utterances are characteristically associated
with terminal pitch excursions, either a rise or a fall depending
on semantic factors. Another function of intonation is to cue
many types of semantic distinctions, such as the difference
between yes/no questions and neutral declarative statements
in many languages. Intonation is also used to convey emotional
(see emotion and language) and expressive states, as well as
pragmatic information.
Intonation is a universal property in that speakers of all languages manipulate pitch to communicate linguistic and paralinguistic functions (see paralanguage and phonology,
universals of). Even languages that use pitch to differentiate
individual lexical items, for example, tone languages such as
Mandarin Chinese and pitch accent languages such as Swedish,

also have intonation systems that are evident when words are
grouped into larger prosodic constituents or uttered in isolation.
Intonation systems may vary substantially, however, from language to language and also potentially from speaker to speaker.
Thus, while a yes/no question is associated with a terminal rise in
pitch in many languages (e.g., German, Japanese, and Korean),
there are other languages (e.g., Finnish and Chickasaw) that
mark yes/no questions with a final pitch fall. One nearly universal property, however, is the lowered pitch characterizing the
end of semantically neutral declarative utterances.
The study of intonation has witnessed many important theoretical advances over the last 30 years. Whereas certain schools
of intonation (e.g., the British school) described pitch contours
in terms of their overall shape or gestalt, many current models
of intonation capture changes in pitch in terms of discrete tonal
sequences, thereby bringing the study of intonation into line with
the analysis of segmental and word-level phonological phenomena. This type of approach assumes that peaks and troughs in a
fundamental frequency contour are attributable to phonological
high and low tones aligned with various phonological elements.
Actual surface phonetic intonation patterns result from interpolation between these high and low tonal targets.
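The target-and-interpolation idea can be rendered computationally. The following sketch is purely illustrative (the target times and Hz values are invented, not drawn from any analysis cited in this entry): a sparse sequence of tonal targets is filled in by straight-line interpolation to yield a dense fundamental frequency contour.

```python
# Illustrative toy: deriving a surface F0 contour by linear interpolation
# between sparse tonal targets, in the spirit of autosegmental-metrical
# models. Times are in seconds, F0 values in Hz; all figures are invented.

def interpolate_contour(targets, step=0.01):
    """Given (time, f0) tonal targets, return a dense (time, f0) contour
    obtained by straight-line interpolation between successive targets."""
    contour = []
    for (t0, f0), (t1, f1) in zip(targets, targets[1:]):
        t = t0
        while t < t1:
            frac = (t - t0) / (t1 - t0)
            contour.append((round(t, 4), f0 + frac * (f1 - f0)))
            t += step
    contour.append(targets[-1])  # include the final target itself
    return contour

# A hypothetical rise to a high pitch accent, then a fall to a low boundary:
targets = [(0.0, 120.0), (0.3, 200.0), (0.8, 90.0)]  # L, H*, L%
contour = interpolate_contour(targets)
```

Everything between the phonological targets is phonetic filler: only the three (time, Hz) pairs are linguistically specified, and the rest of the contour follows mechanically.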
As one of the pioneering works in this school of intonation,
Janet Pierrehumbert (1980) developed an analysis of English
intonation in which a wide range of pitch contours conveying
several different semantic and pragmatic functions are captured
as a sequence of phonological high and low tones associated
with hierarchically arranged prosodic constituents. A fundamental insight of Pierrehumbert's analysis is that tones may be
classified into two groups: those that are associated with prominent, that is, stressed, syllables and those that are associated with the periphery, especially the end, of prosodic domains.
Tones that are associated with stressed syllables are termed pitch
accents. Pitch accents differ according to whether they consist of
a single tone, high or low, or a sequence of tones, such as H + L
or L + H, which phonetically yield a tone fall or rise, respectively.
In addition to pitch accents on certain stressed syllables, tonal
excursions are often observed at the end (and potentially the
beginning) of phrases. These phrase-level tonal movements are
attributed to boundary tones, which may be associated with relatively small phrases, termed intermediate phrases, or with larger
phrases, termed intonation phrases or intonation units. Like
pitch accents, boundaries may be characterized by a single tone
or a sequence of tones. For example, intonation phrase boundaries in Korean consist of as many as five tonal targets, for example,
LHLHL, which conveys a sense of annoyance on the part of the
speaker (Jun 2005a).
One of the challenges facing linguists interested in the
typological investigation of intonation is the relative dearth
of reliable descriptions of intonation on a broad cross section of languages. Fortunately, recent years have witnessed
a dramatic expansion of cross-linguistic studies of intonation.
Daniel Hirst and Albert Di Cristo (1998) studied intonation
in 20 languages, including several non-Indo-European languages. Jun (2005b) compiled investigations of 13 languages,
all analyzed within a Pierrehumbert-type framework. A typologically and geographically diverse array of languages is discussed in this work, including languages with word-level stress

(Dutch, English, and German), tone languages (Cantonese and
Mandarin Chinese), pitch accent languages (Japanese, Swedish,
and Serbo-Croatian), languages lacking word-level stress, lexical tones, or pitch accents (Korean and French), and indigenous languages of North America and Australia (Chickasaw and
Bininj Gun-wok, respectively).
An issue common to the intonation systems of all languages
is the mapping between intonational tunes and their meanings.
Because intonation is used to convey many subtle differences in
meaning, often in gradient fashion, it is a challenge to determine
which differences in intonation merit different phonological analyses. Phonological distinctions between intonational tunes must
be captured as differences either in the sequence of tones comprising the tunes or in the alignment of those tones with words.
Yet another topic inextricably linked to intonation is prosodic constituency. Since pitch excursions are often observed
at prosodic boundaries, a comprehensive analysis of intonation hinges on the characterization of the types of constituents
that make up utterances. Research has shown that prosodic constituency and the mapping between constituency and intonation
vary from language to language. For example, some languages
divide utterances into groupings of words that are characterized
by a tonal template. Thus, phrases in Chickasaw (Gordon 2005)
are associated with an LHHL sequence, whereby the first and the
last low tonal targets associate with the beginning and the end of
phrases, respectively, and the two high tones associate with the
second syllable (or the first syllable if it contains a long vowel or
ends in a sonorant consonant) and the beginning of the final syllable, respectively.
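The docking rule just described is explicit enough to state schematically. The sketch below is my own toy rendering, not code from Gordon's analysis; the syllable strings and function name are invented for illustration.

```python
# Toy rendering of the Chickasaw LHHL phrasal template described above:
# the first L docks to the phrase-initial syllable, the final L to the
# phrase-final syllable, the first H to the second syllable (or the first,
# if it has a long vowel or ends in a sonorant), and the second H to the
# beginning of the final syllable.

def chickasaw_lhhl(syllables, first_has_long_vowel=False,
                   first_ends_in_sonorant=False):
    """Map the four LHHL tonal targets to 0-based syllable indices."""
    last = len(syllables) - 1
    first_h = 0 if (first_has_long_vowel or first_ends_in_sonorant) else 1
    return {"L1": 0, "H1": first_h, "H2": last, "L2": last}

# Hypothetical four-syllable phrase (syllables invented):
tones = chickasaw_lhhl(["sa", "la", "ka", "tok"])
```

Note that H2 and L2 share the final syllable: the high tone targets its beginning and the low tone its end, which the index alone cannot distinguish; a fuller model would add sub-syllabic alignment.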
A number of researchers have published books providing
overviews of these and other issues in the study of intonation,
including Ladd (1996), Cruttenden (1997), and Gussenhoven
(2004).
Matthew Gordon
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Cruttenden, Alan. 1997. Intonation. Cambridge: Cambridge University
Press.
Gordon, Matthew. 2005. Intonational phonology of Chickasaw. In Jun
2005b, 301–30.
Gussenhoven, Carlos. 2004. The Phonology of Tone and Intonation.
Cambridge: Cambridge University Press.
Hirst, Daniel, and Albert Di Cristo, eds. 1998. Intonation Systems: A Survey
of Twenty Languages. Cambridge: Cambridge University Press.
Jun, Sun-Ah. 2005a. Korean intonational phonology and prosodic transcription. In Jun 2005b, 201–29.
Jun, Sun-Ah, ed. 2005b. Prosodic Typology: The Phonology of Intonation
and Phrasing. New York: Oxford University Press.
Ladd, D. R. 1996. Intonational Phonology. Cambridge: Cambridge
University Press.
Pierrehumbert, Janet. 1980. The phonology and phonetics of English
intonation. Ph.D. diss., Massachusetts Institute of Technology.
Reproduced by the Indiana University Linguistics Club, Bloomington,
1987.

IRONY
There are several types of irony described in the literature, all of
which rely on an incongruity or discrepancy between appearance

and reality. In dramatic irony, for instance, the incongruity is created by having the audience aware of information about which a
character in a play is ignorant (such as in Sophocles' Oedipus Rex).
In situational irony and irony of fate, the disconnect is between
ideal expectations of justice and actual (or idealized) outcomes,
such as would occur if Bill Gates, one of the world's wealthiest individuals, won a lottery, or as exemplified by Beethoven's inability, on going deaf, to hear his own musical masterpiece.
The form of irony most studied in the language sciences
is verbal irony, traditionally conceptualized as a trope in
which the meaning one intends to communicate is opposite
of that expressed literally by the words that are used. Thus,
in Shakespeare's Julius Caesar, when in his famous funeral oration (Act 3, Scene 2) Antony states, "For Brutus is an honorable man; / So are they all, all honorable men," it is understood that Antony is emphasizing the opposite of being honorable,
namely, the dishonorable action of Brutus and the other conspirators. In principle, with irony one should be able to convey
negative attitude by expressing something positive (as in the
Shakespearean example above) or positive attitude by stating
something negative.
There is also a general recognition that because the expressed
utterances are literally plausible, the recovery of ironic intent is
highly context dependent and facilitated by signals fashioned (or unintentionally employed) by the ironist.
Tone of voice is one such hint in spoken language, but because
irony can be detected even when ironic intonation is not
employed (such as when reading text), this cue is not necessary.
Other cues include hyperbole, understatement, and excessive
politeness, but it is generally agreed that there is no signal that
points exclusively to irony.
The context necessary for the recovery of ironic intent traditionally has been limited to the discourse in which the irony is
embedded, but in more recent years, the concept of context has
been widened, even for verbal irony, to encompass an ironic
environment that includes social-cultural factors, such as those
dependent on discursive communities that share knowledge,
beliefs, values, and communicative strategies (e.g., Hutcheon
1994).
Over the past three decades, the standard pragmatic
approach to nonliteral language processing (see metaphor)
has framed much of the discussion on irony. Based on the pragmatics of conversation and on speech-act theory, irony is
described accordingly as the outcome of a conversational
implicature initiated by a violation, or exploitation, of H. P. Grice's (1975) Maxim of Quality: The recognition
that the literal expression does not make sense in the context
in which it is produced leads one to initiate a search for a context-appropriate interpretation in which the literal sense of the
expression is denied, suppressed, and replaced by the logical
opposite.
Over the years, the traditional emphasis on first processing the literal expression and then substituting its opposite meaning has been challenged, on both
logical and empirical grounds. Consider, for instance, the analysis of a passage from Voltaire's Candide given by Dan Sperber and Deirdre Wilson: "When all was over and the rival kings were celebrating their victory with Te Deums in the respective camps" (1995, 241). One cannot claim that the opposite counterpart to the literal expression is that the kings were not celebrating with Te Deums, or that the irony can be substituted by a literal expression of the opposite, namely, that the kings were bewailing their defeat with lamentations.
Empirically, it has been shown that the processing of statements in a discourse context that emphasizes the irony is not
slower than that observed for the same statement in a discourse
that is consistent with its literal sense, a finding incompatible
with predictions arising from the standard pragmatic approach.
Moreover, there is empirical evidence incompatible with the
notion of rejection and substitution of the literal, demonstrating instead that the difference between the literal and nonliteral
sense is important in determining the magnitude of the perceived irony (the tinge hypothesis of Ellen Winner) and that recognition of irony requires the processing of both of the opposing
meanings in order to determine that the two messages are in an
ironic relation (the indirect negation hypothesis of Rachel Giora
[2003]).
Other theories have de-emphasized the importance of the
literal expression as well and accord greater importance to psychological factors. With pretense theory, there is the recognition
that the ironist is taking on the role of a person who holds the
opinion expressed in the irony, thus mocking both the opinion and the people who would hold it. Two competing theories
(echoic mention and echoic reminder) share the assumption that
a verbal utterance can be seen as a mention of, or an allusion to, the expression, expectations, or cultural norms. In
this way, the ironist communicates his or her attitude about the
actual and expected state of affairs (see Gibbs 1994, Chapter 8,
for a review).
Extant theories have rightly been criticized for their inability to encompass all types of irony and for a theoretical vagueness that makes them of questionable scientific utility.
The importance accorded background context or identification
of an ironic environment is especially problematic for process
and computational models of irony comprehension, given the
failure to identify any signal of irony that is both necessary and
sufficient, though recent models based on constraint satisfaction or graded saliency principles are encouraging (see Colston
and Katz 2004).
Albert N. Katz
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Colston, H., and A. Katz. 2004. Figurative Language Processing: Social and
Cultural Influences. Mahwah, NJ: Erlbaum.
Gibbs, R. 1994. The Poetics of Mind. Cambridge: Cambridge University
Press.
Giora, R. 2003. On Our Mind: Salience, Context and Figurative Language.
Oxford: Oxford University Press.
Grice, H. P. 1975. Logic and conversation. In Syntax and
Semantics: Speech Acts, ed. P. Cole and J. Morgan, 41–58. New
York: Academic Press.
Hutcheon, L. 1994. Irony's Edge. London: Routledge.
Katz, A., ed. 2000. Uses and processing of irony and sarcasm. Metaphor
and Symbol 15 (Special Issue): 1–116.
Sperber, D., and D. Wilson. 1995. Relevance: Communication and
Cognition. 2d ed. Oxford: Blackwell.


L
LANGUAGE, NATURAL AND SYMBOLIC
The paradigm of language is natural language, a naturally evolved
system of human communication using spoken or signed words
according to the various ways they can be combined. But in an
extended sense, many species of nonhuman animals also have
language, in this case, a means of communication through inarticulate sounds; and humans similarly have various nonverbal means of expression and communication through facial
expressions, gestures, and body language more generally, and
through music, dance, and the imitative arts (see animal communication and human language; art, languages of; communication, prelinguistic). For more than a thousand years, we have also had the positional Arabic numeration
system, together with its algorithms for the basic arithmetical
operations, a symbolic language that was extended in the seventeenth century to include the literal notation of algebra as
well. The twentieth century, finally, saw the development of very
powerful formal languages, for example, the language of mathematical logic and programming languages, both of which are
constituted by a collection of signs together with rules governing
their use, that is, by a syntax but not, at least not essentially, a
semantics (see artificial languages).
Although natural language is the paradigm of language, the
foregoing suggests a continuum of sorts: first with systems of
animal and human nonverbal communication, then natural
languages, followed by the symbolic language of arithmetic and
algebra, and finally programming and other formal languages.
Given their centrality, both along this continuum and in our
lives, the focus here is on natural language and the symbolic language of arithmetic and algebra.
Natural language and the symbolic language of arithmetic
and algebra share two fundamental, and related, features. Both
are vehicles of inquiry and knowing, and both are a medium for
the expression of what we know. Not only do we learn from experience as animals can but we also, in virtue of our immersion in
natural language, can reflect on what we learn, wonder at how
we learn, and question whether we really know what we seem to
know. Because we reason not only implicitly and involuntarily as
other animals do but also explicitly, in words, we can question
our reasons and strive to discover better ones. Just the same is
true of the language of arithmetic and algebra by means of which
we discover, for example, negative, rational, real, and complex
numbers, explore their manifold natures, and display in familiar
symbols that which we know. Though for very different reasons,
neither systems of nonverbal communication nor merely formal
languages can serve in this way as a vehicle of critically reflective
inquiry.
Natural language and the symbolic language of arithmetic
and algebra are also very different, however. Natural language
is, for example, primarily spoken (or signed) and a vehicle of
communication between a speaker and a hearer; the symbolic
language of mathematics is instead essentially written and serves
primarily as a vehicle of reasoning. Spoken natural language is
intelligible independent of written natural language; symbolic language is not. No one without the idea of reading and writing could learn the language of mathematics. Symbolic language is
essentially written.
A second, related difference between the two sorts of language
is that natural language is enormously versatile. One can do all
sorts of things in and with natural language. Symbolic languages,
by contrast, are specifically designed for particular purposes and
are useless for others. One cannot tell a joke or a story, describe
a room (or a language), or even greet a friend in the formula language of arithmetic.
Natural language is also constantly evolving through its use;
it is inherently social and deeply historical. Symbolic language is
instead self-consciously created, often by a single individual, and
has no inherent impulse to change with use. Although French
has changed considerably over the past four centuries, the language of algebra that Descartes introduced in 1637, and which
every schoolchild learns today, has changed not at all.
A subtler difference concerns the characteristic sorts of concepts of the two languages. The concepts of natural language
are paradigmatically object-involving and, for that reason, sensory: we talk, first and foremost, of the objects met with in everyday experience, for example, of the things we eat, navigate by,
enjoy, and fear. Although our concepts of such things do involve
much more than the way they appear sensorily to us in appropriate circumstances, they could not be understood in abstraction
from those appearances, that is, from the subjective character of
our experience of them. The concepts of natural language (those
owing nothing to the development of symbolic language) are concepts of sensory entities, of things that look, feel, taste, smell, and
sound in characteristic ways (compare embodiment). Insofar
as they are, they resist expression in a symbolic language.
Consider, for example, the notion of a sphere. According to
Aristotle, a sphere is a common sensible; it is an object that
contrasts with a proper sensible, such as the color red or a
certain bitter taste, in being perceptible not merely through one
sense organ (say, by sight as color is, or by taste as bitterness is)
but through more than one. Red things have a characteristic
look; spheres have a characteristic look and a characteristic feel.
Aristotle's concept of a sphere is not the modern, mathematical, and nonsensory concept of a two-dimensional surface all points of which are equidistant from a center; it is the concept of a three-dimensional, essentially sensory object. The modern mathematical concept of a sphere can be expressed in the symbolic language of mathematics (namely, as x² + y² + z² = r²); Aristotle's essentially sensory concept cannot.
We have seen that a symbolic language, unlike a natural language, is self-consciously created and expresses concepts that
can be fully understood in abstraction from our sensory experience. It is unsurprising, then, that the rules governing the use of
signs in a symbolic language can be explicitly formulated and,
hence, that it is relatively easy to build a machine that correctly
manipulates those signs. Because natural language is instead
social, historical, essentially sensory and object-involving, making the rules of its use explicit (in order, perhaps, to build a
machine capable of correctly manipulating its signs) is an altogether different, and much harder, possibly intractable, problem. It may be that natural language users can only be grown or
raised, that they cannot be built.



The fact that natural language embodies a particular sensory
view of the world, one that is inextricably tied to human biology, has other implications as well. Our common experience,
grounded in our common (biological) form of life, explains, for
example, the intertranslatability of all human natural languages
and predicts the untranslatability of the natural languages of
creatures evolved to have radically different (biological) forms
of life. Given the role of acculturation in the acquisition of natural language, any being capable of learning a particular natural
language must share at least some sense modalities with other
speakers of the language; but nothing in the very idea of acculturation into a natural language requires that there be some
favored sense modalities. Symbolic language is not similarly
rooted in our bodily being. Not only are symbolic languages
universal to all people whatever their natural language; they
are in principle, at least along this dimension, accessible to any
rational being.
Why, then, is translation from one natural language to
another so difficult? Ludwig Wittgenstein suggests an answer
in his Philosophical Investigations. As he points out, although
some of our everyday concepts can be adequately understood by
reference to a standard or paradigm case, most (including language) are correctly applied to a range of things that exhibit only
a "family resemblance." Whereas in the former instance all
correct applications refer back to the one standard or paradigm
case, in the case of concepts that exhibit only a family resemblance, similarities between correct applications need only overlap, like fibers twisted one on another over the length of a thread.
Such similarities are essentially historically conditioned: Some
applications will, at a particular moment in the evolution of the
language, seem appropriate, natural; others will not work so
well, or at all. And because what works crucially depends on previous successful applications, different natural languages come
to employ common words in quite different ways. There is nothing like this in the case of symbolic language. Although the words
may be borrowed from natural language (e.g., "group," "ring," and "field" in abstract algebra), the concepts of a symbolic language have a fixed content and univocal application. It is only
our understanding of those concepts that develops and deepens
in the course of inquiry. The nature and structure of meaning is
quite different in the two sorts of language.
The two sorts of language tend to differ, finally, in their
predicative structure: a natural language is constitutively
narrative, a language within which to tell what happens; a
symbolic language is not. And natural language is narrative
because the everyday world is a world of becoming, of change.
To speak of it, then, requires two modes of predication, that
marking what something is and that marking properties a thing
can acquire and lose and also (as a matter of its form, not merely
in its content) tense. Symbolic languages generally involve neither different modes of predication nor tense. The language of
mathematics, for example, speaks timelessly of what is: 7 + 5 = 12,
(a + b)² = a² + 2ab + b².
Natural language is primarily spoken and communicative,
narrative, essentially sensory and object-involving, and historical, that is, constantly evolving with use. A symbolic language,
such as the language of arithmetic and algebra, is essentially
written, non-narrative, nonsensory, and self-consciously created
with no inherent impulse to change. The two sorts of language
are, then, quite different. And yet most work in the language sciences is pursued on the assumption that symbolic language differs from natural language only in its degree of clarity, rigor, and
perspicuity. We may, as a result, fail adequately to understand
either. By importing considerations relevant only to the workings of a symbolic language, we are liable to falsify the workings of natural language; and by taking natural language as our
paradigm, we risk radically misconceiving the nature of symbolic
language. (It has, for example, long been assumed that the conception of generality, or quantification, that is needed in
mathematics is identical to that employed in natural language,
but perhaps this is just not so.) If natural and symbolic languages
are essentially different, then the language sciences need to take
those differences into account, showing how they do, or do not,
matter to research programs in those sciences.
There are further ramifications for psychology and for pedagogy. For example, given the differences between natural and
symbolic language, it is reasonable to ask whether one reads
symbolic languages differently than one reads written natural
language, whether one looks at the page of marks differently in
the two cases. If one does, then perhaps what prevents some
students from thriving in mathematics is that they try to read its
symbolic language as if it were a written natural language. Maybe
students who are adept at modern, symbolic mathematics are, in
fact, primarily adept at catching on, without explicit instruction,
to the way of writing and reading that it involves. These are testable
hypotheses, but they are hypotheses we will think to test only if
we comprehend the differences between natural and symbolic
languages.
Danielle Macbeth
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Gupta, Anil. 1980. The Logic of Common Nouns. New Haven, CT: Yale University Press. Argues for the two sorts of predication needed in speaking about change.
Macbeth, Danielle. 2004. "Viète, Descartes, and the emergence of modern mathematics." Graduate Faculty Philosophy Journal 25: 87–117. Explores differences between premodern, nonsymbolic and modern, symbolic mathematical understanding.
Macbeth, Danielle. 2005. Frege's Logic. Cambridge, MA: Harvard University Press. Argues that Gottlob Frege's language, unlike the language of mathematical logic, which is merely formal, without content, is a fully contentful symbolic language, a vehicle of inquiry.
Wittgenstein, Ludwig. 1953. Philosophical Investigations. Oxford: Blackwell. Deeply insightful investigation into the nature of natural language.

LANGUAGE ACQUISITION DEVICE (LAD)


This term refers to Noam Chomsky's early proposal of what would be necessary for the construction of a language acquisition model (1965, 30–3). More precisely, this early proposal for a
language acquisition device (LAD) provides a logical analysis
of the components that would be minimally necessary for any
such model to account for language acquisition. Its components
are summarized in Figure 1. These can be viewed as specifying logically and abstractly what would be minimally necessary to the child as language learner in the initial state, that is, before language experience.

(i) Technique for representing input signals
(ii) Way of representing structural information about these signals
(iii) Some initial delimitation of a class of possible hypotheses about language structure
(iv) Method for determining what each such hypothesis implies with respect to each sentence
(v) Method for selecting one of the (presumably infinitely many) hypotheses that are allowed by (iii) and are compatible with the given primary linguistic data

Figure 1. LAD components (Chomsky 1965, 30).
These components, which were proposed as necessary for
language learning, were formulated as accounting for the mapping from primary language data (data which is necessarily
finite and inherently variable) presented at the initial state to
complex, infinite, and systematic language knowledge in the end
state. They characterized the foundations of the specific innate
abilities that make this achievement possible for the language
learner (Chomsky 1965, 27).
At the same time, the components in Figure 1 were to indicate
what a linguistic theory would have to treat in order to support
an acquisition model and thus attain explanatory adequacy
(see descriptive, observational, and explanatory
adequacy). For example, such a theory would need to provide a representation of "possible sentence" in order to support (i), a definition of "structural description" for (ii), "generative grammar" for (iii), a method for determining structural descriptions for (iv), and an evaluation metric for (v).
The original Chomsky proposal is often interpreted as though
it described a realistic device. However, it is best viewed as providing an explication of the logical foundations that any comprehensive model for language acquisition would have to presuppose
and account for, with the precise nature of each of the components to be subsequently determined as an empirical matter
(Chomsky 1965, 37). Perhaps at least partially because of this
divergence in interpretation, subsequent language acquisition
research over the last decades has frequently pursued divergent
paths, one pursuing the "logical problem of language acquisition," consisting of linguistic analyses of potential data mapping
(e.g., Baker and McCarthy 1981), another pursuing a realistic
approach consisting of empirical studies of language development (initiated largely by Roger Brown and his students [1973]).
In fact, certain components in Figure 1 proved particularly
challenging to a realistic model. For example, if there were a realistic device, it would have to include a mechanism for (i) and (ii).
If (iii) were to be implemented in terms of an enumeration of
the class G1, G2, … of possible generative grammars (Chomsky 1965, 31), then there is the de facto impossibility of predefining given grammatical hypotheses across 6,000 to 7,000 actual, and potentially innumerable, human languages, as well as the risk
of begging the question of language acquisition, that is, how the
individual grammars arise. Similarly, the nature of an evaluation

measure for (v) required specification. The absence of a temporal or developmental component was especially challenging, given
that an instantaneous view of language acquisition is obviously false (Chomsky 1975, 119, 121).
Subsequent formulations of linguistic theory that pursue
a generative approach now offer a number of developments
of the original LAD proposal. In general, they seek to define
a theory of universal grammar (UG) as a theory of the initial state. Defined as "an abstract partial specification of the genetic program that enables the child to interpret certain events as linguistic experience and to construct a system of rules and principles on the basis of this experience" (Chomsky 1980, 187) and as "of course, not a grammar, but rather a system of conditions on the range of possible grammars for possible human languages" (Chomsky 1980, 189), UG opens the possibility for developing future integration of logical and realistic approaches to language acquisition and language development. A principles and parameters version of UG, for example, formulates parameters as providing a specific mechanism for (i)–(v). (See Table 4.4 in Lust 2006 for examples of
current approaches to modeling language acquisition within
this framework.)
Combining these and subsequent theoretical advances with
significant recent empirical advances concerning infant mapping to language data (e.g., Jusczyk 1997; Morgan and Demuth
1996), which informs us on actual mechanisms in (ii) and (iii),
may now allow the field to approach the more comprehensive
model of language acquisition that LAD was first intended to
underlie (see discussion in Chomsky 2000). See Chapter 4 in Lust
2006 for further discussion of this LAD-based model and a review
of several proposed alternatives to the LAD, derived from opposing theoretical paradigms.
Barbara Lust
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Baker, C. L., and J. McCarthy, eds. 1981. The Logical Problem of Language
Acquisition. Cambridge, MA: MIT Press.
Brown, R. 1973. A First Language. Cambridge, MA: Harvard University Press.
Chomsky, N. 1965. Aspects of the Theory of Syntax. Cambridge, MA: MIT Press.
Chomsky, N. 1975. Reflections on Language. New York: Pantheon.
Chomsky, N. 1980. Rules and Representations. New York: Columbia University Press.
Chomsky, N. 2000. The Architecture of Language. Ed. N. Mukherji, B. Narayan Patnaik, and R. Agnihotri. Oxford: Oxford University Press.
Jusczyk, P. 1997. The Discovery of Spoken Language. Cambridge, MA: MIT
Press.
Lust, B. 2006. Child Language: Acquisition and Growth.
Cambridge: Cambridge University Press.
Morgan, J., and K. Demuth, eds. 1996. Signal to Syntax: Bootstrapping
from Speech to Syntax in Early Acquisition. Mahwah, NJ: Lawrence
Erlbaum.

LANGUAGE CHANGE, UNIVERSALS OF


All aspects of language can change, and with every aspect we see
broad tendencies that have inspired the search for universals of
change. In some areas, powerful universals have emerged, suggesting that many of the synchronic similarities we see across
languages may have their source in shared cognitive and processing mechanisms. Researchers have identified cross-linguistically
similar paths of change in sound change, in morphological change, and in grammaticalization.
Given any two related items, A and B, one could in principle
expect a change from A > B or from B > A to be equally probable. However, the facts of change show that this is not the case.
Across the languages of the world, we find that one direction
predominates and the other is rare. Thus, we speak of unidirectionality in language change. Claims about unidirectionality are
sometimes controversial because apparent counterexamples
do emerge. For this reason, it is important to define types of
change clearly before making proposals about the directionality
of change.
The importance of such unidirectional trends cannot be overstated. Universals attested in the diachronic plane appear to be
more powerful than those that can readily be stated for the synchronic plane. Thus, diachronic typology is a theory of universals
that proposes that the synchronic patterns are not in themselves
the universals but, rather, the result of strong diachronic trends
(Greenberg 1969; Greenberg, Ferguson, and Moravcsik 1978).
The structure of language is created by change that is ongoing in
language use (see usage-based theory); because languages
are used by human beings in very similar ways across cultures,
languages tend to be similar to one another.
An important question to ask concerns the sense in which
there can be universals of change. No change has to occur; there
are many changes that could occur. The universals, then, specify
only similar paths of change that can be found in different languages. If these languages are not closely related genetically,
then the similar paths of change cannot be attributed to a shared
history but must be viewed as independent developments. Thus,
even though we cannot say that a change has to occur, nor can
we predict when it will occur, there are still many substantive
predictions that we can make about change.
In examining trends in language change, it is important to
consider how well documented a particular change is. Some
changes are reconstructed on the basis of a comparison of
related languages (see comparative method ). Since known
trends in change are often used in this reconstruction, such
changes cannot be taken as evidence for trends in change. Only
well-attested changes are valid inputs to a theory of universals
of change.


Sound Change
The symmetry of the terms used to describe sound changes (assimilation/dissimilation, weakening/strengthening, deletion/insertion) makes it seem as though all directions of change were
equally probable. The facts now known from a wide array of languages show instead that assimilation, weakening, and deletion
are vastly more common than dissimilation, strengthening, and
insertion. If sound change is rigidly defined as language internal,
phonetically motivated, and lexically general, then the less common types of change show up even more rarely.
Common assimilation changes involve consonants taking
on the properties of adjacent vowels, for instance, palatalization before front vowels, labialization or rounding before round
vowels. Or vowels can take on properties of consonants, as
when vowels become nasalized in the same syllable with nasal
consonants. For some of these changes, more detailed hierarchies of contexts for change can be established. Palatalization
of coronal consonants (such as [t], [d], [s]) occurs first and most
commonly before a high front glide ([j]), occurs next before a
high front vowel ([i]), and then progresses to the lower front
vowels.
Paths of reductive change can also be established. For
instance, a common consonant reduction involves the loss of [p]
via the pathway p > f > h > ∅. Parts of this path are documented
in different languages: Japanese has undergone a change that
reduced all prevocalic instances of /p/ to a fricative that assimilates to the place of articulation of the following vowel. Spanish
and other Romance languages have undergone a change that
reduced word-initial [f] to [h] and further to ∅. The positions in
which such reductions occur also show regularity: Syllable-final
position favors the reduction of consonants, as does intervocalic
position. Word-initial position is least likely to condition reduction of consonants.
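The unidirectional, stepwise character of such a reductive path can be sketched in a few lines of Python. This is an illustration, not part of the entry: the word forms are hypothetical, and the model simply walks a segment one step at a time along the p > f > h > ∅ chain.

```python
# Illustrative sketch of the unidirectional lenition path p > f > h > 0 (loss).
# The word forms below are hypothetical, chosen only to show the mechanism.

LENITION_PATH = ["p", "f", "h", ""]  # "" represents loss (zero)

def lenite(form: str, segment: str) -> str:
    """Apply one step of the p > f > h > zero path to every
    occurrence of `segment` in `form`."""
    i = LENITION_PATH.index(segment)
    if i == len(LENITION_PATH) - 1:
        return form  # already lost; no further change possible
    return form.replace(segment, LENITION_PATH[i + 1])

# A hypothetical word passing through successive historical stages:
stages = ["pata"]
for seg in ["p", "f", "h"]:
    stages.append(lenite(stages[-1], seg))
print(stages)  # ['pata', 'fata', 'hata', 'ata']
```

Because the chain can be traversed in only one direction, the model cannot produce a strengthening such as h > f, mirroring the rarity of such reversals in attested sound change.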
It is commonly assumed that such changes are caused by
phonetic tendencies. In the articulatory domain, some general
principles of change have been proposed that deal with the way
articulatory gestures change during production. The primary
directions in phonetic change are the reduction of the magnitude
of the gestures, leading to reduction or loss, and their increased
overlap, leading to assimilation (Browman and Goldstein 1992;
Mowrey and Pagliuca 1995).
As mentioned earlier, the importance of universals of language
change is that they can predict and thus explain synchronic patterns across languages. For instance, the presence of nasal vowels
in a language is almost always due to assimilation to a nasal consonant. Sometimes this consonant is lost, leading to phonemic nasal
vowels. The diachronic source explains why nasal vowels always
have a more restricted distribution and frequency compared to
oral vowels (Greenberg 1966). Similarly, the fact that some languages lack the phoneme [p] can be explained by its tendency to
undergo reduction. The restrictions against certain consonants
in syllable-final position can be attributed to their propensity for
loss in that position (Vennemann 1988, Blevins 2004).

After Sound Change: Morphologization


Another set of unidirectional changes involves the results of
sound change. Phonetically conditioned sound change creates
alternations that gradually acquire morphological or lexical
conditioning, as, for example, when vowel shortening before a
consonant cluster created an alternation in English keep/kept, sleep/slept, and weep/wept.
Another example is the alternation in nouns such as wife, wives.
At first, in Old English there were no voiced fricatives: /v/, /z/, and
/ð/ did not occur. Later, by a regular sound change, /f/, /s/, and /θ/ became voiced when they occurred between two vowels, as
in the plurals of house, wife, and wreath (at that time the plural
suffix had a vowel in it, putting the fricative between two vowels). Nowadays, the alternation is not phonetically productive. In
words such as classes, effort, and ether, voiceless fricatives occur
between vowels. Also /v/, /z/, and /ð/ have become phonemes
and they occur in all positions. So now the alternation still found
in wife, wives, and so on is associated with certain nouns and the
morphological category of plural. In this way, morphologically or
lexically conditioned alternations are created; such alternations
tend to be unproductive phonologically and to involve segments
that are contrastive elsewhere.

Morphological Change: Analogy


Morphological and lexical alternations that are created in the way
just described tend to undergo further change on the basis of the
patterns found in the paradigms of the language. Certain general trends in analogical change have been identified (Mańczak
1980). In analogical leveling, when an alternation is lost, it is
the high frequency form of the paradigm, such as the singular
in nouns or the present indicative in verbs, that serves as the
basis for new formations. Thus, the alternation weep/wept is leveled by the creation of a new past tense, weeped. Low-frequency
words are more likely to undergo analogical leveling than high-frequency words. Thus, wept is more likely to change than kept.
The productive pattern that prevails in leveling is the one that has
the largest number of members. One synchronic consequence
of these trends is that the irregular paradigms in a language are
usually among the most frequent.
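The frequency effect just described can be captured in a toy model. This sketch is illustrative only: the frequency counts are invented, and an actual study would use corpus counts for each form.

```python
# Toy sketch of the frequency effect in analogical leveling: among
# irregular past tenses, lower-frequency forms are the more likely to be
# regularized. The counts below are invented for illustration.

past_tense_frequency = {
    "kept": 500,   # high frequency: resists leveling
    "wept": 20,    # low frequency: prone to leveling (-> "weeped")
}

def leveling_candidates(freq: dict, threshold: int) -> list:
    """Return irregular forms below the frequency threshold,
    i.e., those most likely to be replaced by regular forms."""
    return sorted(f for f, n in freq.items() if n < threshold)

print(leveling_candidates(past_tense_frequency, 100))  # ['wept']
```

On these made-up counts, wept is flagged as the likelier candidate for leveling, while the frequent kept survives, matching the entry's example.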
Extension of an alternation to a paradigm that did not have it
before is less common than is leveling. It occurs where there is
strong phonological similarity to a productive pattern, as when
the regular verb string gets a new past tense strung due to its similarity to swing, swung.

Morpho-Syntactic Change: Grammaticalization


Most change in the morpho-syntax is a product of the general process of grammaticalization. Grammaticalization itself, however,
involves phonetic, semantic, and pragmatic change, in addition to
morpho-syntactic change. Various cross-linguistic paths of change
have been identified in recent research into grammaticalization.
First, there is a general path of change for the form of the grammaticalizing element as it progresses from lexical to grammatical:
content item > grammatical word > clitic > inflectional affix >
stem change > loss

This progression involves a loss of the properties originally


associated with the content word (its ability to occur as a noun
or verb) and a growing fusion of the element with lexical items
nearby. A good example is the development of the Spanish future
suffixes. In Latin, there was a construction that consisted of an
infinitive with the conjugated verb habere. For example, dicere + habeo meant "I have to say." In medieval Spanish, the verb had reduced to he and consistently followed the infinitive. Eventually the former verb (now a grammatical word) fused with the infinitive, giving deciré. In further changes, the stem of some verbs lost a syllable, in our example creating diré "I will say." The last stage (loss) is occurring in this case as a new future formed with ir a + verb "go to" + verb replaces the old future.
Parallel to this path of change are the many paths of semantic/
pragmatic change that have been identified as creating the grammatical morphemes of the languages of the world. The numeral
one with a noun tends to develop into an indefinite article; demonstratives develop into definite articles and complementizers; verb
constructions involving a verb meaning "want" or "go to" plus another verb develop into future markers; resultative constructions with have or be and a past or passive participle, such as I have gone, develop into perfects, pasts, and perfectives; constructions with locative verbs or movement verbs develop into progressives, which may go on to become imperfectives. Verbs meaning "know" or "be able" develop into auxiliaries indicating possibility. A preposition meaning "to" or "in order to" develops into an infinitive marker. Body-part terms such as head and back become spatial and later temporal adpositions. Passive constructions develop
into ergative constructions. All of these paths of change (and many
others) are documented as independent developments in unrelated languages (Bybee, Perkins, and Pagliuca 1994).
In comparing these specific paths, certain general patterns
are discernible: As a construction grammaticalizes, its meaning becomes more general and abstract; its form becomes more
reduced and dependent upon surrounding material; it undergoes an extreme frequency increase; and its category membership can change, say, from verb to auxiliary, from noun or verb
to adposition. The lexical material upon which grammaticalization works is similar across languages, as are the mechanisms of
change. The fact, for instance, that one can infer intention from
a construction of be going to VERB and then later infer future
from that intention seems not to be culture specific, inasmuch
as the development of such a construction into future occurs in
languages all over the world. Thus, the commonalities found in
grammaticalization point to interactive, cognitive, and processing mechanisms that are shared across cultures.
Joan Bybee
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Blevins, Juliette. 2004. Evolutionary Phonology. Cambridge: Cambridge
University Press.
Browman, Catherine P., and Louis M. Goldstein. 1992. "Articulatory phonology: An overview." Phonetica 49: 155–80.
Bybee, Joan, Revere Perkins, and William Pagliuca. 1994. The Evolution of
Grammar: Tense, Aspect and Modality in the Languages of the World.
Chicago: University of Chicago Press.
Greenberg, Joseph. 1966. Language Universals: With Special Reference to
Feature Hierarchies. The Hague: Mouton.
Greenberg, Joseph. 1969. "Some methods of dynamic comparison in linguistics." In Substance and Structure of Language, ed. Jan Puhvel, 147–203.
Berkeley: University of California Press.
Greenberg, Joseph H., Charles Ferguson, and Edith Moravcsik, eds. 1978. Universals of Human Language. Vols. 1–4. Stanford, CA: Stanford
University Press.
Mańczak, Witold. 1980. "Laws of analogy." In Historical Morphology, ed. J. Fisiak, 283–88. Berlin: Mouton de Gruyter.
Mowrey, Richard, and William Pagliuca. 1995. "The reductive character of articulatory evolution." Rivista di Linguistica 7.1: 37–124.
Vennemann, Theo. 1988. Preference Laws for Syllable Structure and the
Explanation of Sound Change. Berlin: Mouton de Gruyter.

LANGUAGE FAMILIES
A language family is a set of languages that developed from the
same ancestral language. The best-known example is the Indo-European family, which comprises more than a hundred languages that, even in premodern times, extended from the Indian
subcontinent to northwestern Europe. This family is well known
not only because it contains many of the world's most widely
spoken languages, such as Bengali, English, French, German,
Hindi, Portuguese, and Russian, but also because it was the main
focus of research in the nineteenth century, when linguistics
was established as a modern science. However, many other language families are subjects of intense research today, such as the
following:
Afro-Asiatic, which includes Arabic, Hebrew, and several
languages of northern Africa, including Ancient Egyptian,
Hausa, and Somali
Algic, which includes several native languages of North
America, such as Wiyot, Yurok, Cheyenne, Ojibwa, and
Shawnee
Austronesian, which includes more than a thousand languages spoken from Madagascar to Polynesia, such as Bahasa
Indonesia, Fijian, Tagalog, and Tahitian
Dravidian, which includes most of the non-Indo-European
languages of India, such as Malayalam, Tamil, and Telugu
Niger-Congo, which includes most of the languages of sub-Saharan Africa, including Igbo, Swahili, and Zulu
Sino-Tibetan, which includes Burmese, Chinese, and Tibetan
Tupi, comprising several languages of South America, including Guarani
Uralic, which includes most of the non-Indo-European languages of Europe, such as Estonian, Finnish, Hungarian,
Nenets, and Sami (Lapp)
This list is only a sample of the hundreds of known language
families and of the languages included in each one. For a comprehensive listing, see Gordon (2005).
Familial metaphors are the standard terms of art. Languages in
the same family are said to be genetically related to one another.
A language from which other languages developed is called an
ancestor, or parent, of those languages. Words that descend from
the same form in an ancestral language are related, or cognate.
This homey terminology is undergoing some competition with
that of modern cladistics as used in biology, but certain linguistic
concepts do not translate well. In biology, it is understood that
all biological taxa are related to one another, and family is but
a midlevel taxon. Linguists assume that relationships between
languages must be proved, and a language family is a maximal
taxon. In principle, even isolates (languages that do not group with other languages) can trivially be considered families by
reinterpreting their dialects as separate languages.

The common ancestor of an entire language family is assigned a name by prefixing Proto- to the name of the family, as in Proto-Indo-European and Proto-Afro-Asiatic. The place where the protolanguage was spoken is called the homeland of the language
family.
The study of language families is part of historical linguistics and is contextualized within a particular model of
language change: divergence. When innovations in one part
of a language community fail to spread to other parts, differences
accumulate until the community can be said to speak different
languages. It is this historical process that language-family theory is meant to model. But perhaps because language families
are commonly illustrated by showing similarities between languages (e.g., English mouse is cognate with Latin mus), the idea
arises that relatedness is about similarity between languages. In
fact, there is no requirement that cognates be similar at all (e.g.,
English two is related to Armenian yerku), and many sources of
similarity are disavowed as being irrelevant to the model. These
include borrowing (see contact, language), onomatopoeia,
universals (absolute and statistical universals), and
chance similarities.
The study of language families typically involves one or more
of the following enterprises:
Demonstrating that languages are related
Reconstructing the common protolanguage
Subgrouping the languages by hypothesizing intermediate
ancestors
Associating linguistic data with historical and archaeological
data
The following sections first describe the traditional and still-dominant methods for pursuing these tasks and then sketch and
evaluate some new methodologies.

Traditional Methods
The traditional technique is the comparative method. The
linguist studies characteristics that rarely recur across languages,
such as grammatical paradigms and the associations between
sound and meaning in morphemes. Efforts are made to discard
loans and onomatopoeia, although the former is a difficult and
often intractable problem. Matching morphemes across languages by meaning, one looks for recurrent sound correspondences. For example, English f corresponds to Latin p in father
= pater, feels = palpat, few = pauca, and many other words. If a
large number of recurrent correspondences are found, the languages are related. The recurrences are also used to reconstruct
the protolanguage (see historical reconstruction).
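The bookkeeping at the heart of the comparative method, tallying recurrent sound correspondences across meaning-matched words, can be sketched roughly as follows. The word pairs reuse the English/Latin examples above; comparing only word-initial sounds is a deliberate simplification of real practice, which aligns whole morphemes segment by segment.

```python
from collections import Counter

# Rough sketch of the comparative method's bookkeeping: tally
# word-initial sound correspondences across meaning-matched pairs.
# Pairs reuse the entry's English/Latin examples; restricting the
# comparison to initial consonants is a deliberate simplification.

pairs = [("father", "pater"), ("feels", "palpat"), ("few", "pauca")]

correspondences = Counter((en[0], la[0]) for en, la in pairs)
print(correspondences.most_common(1))  # [(('f', 'p'), 3)]
```

In practice a linguist would collect hundreds of such pairs, discard likely loans and onomatopoeia, and treat only correspondences recurring across many pairs as evidence of relatedness.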
After a language family is identified, the next step is subgrouping, identifying the branches or groups within the family. Subgrouping seeks to uncover the history of the divergence
(cladogenesis) of a language family. If the family contains three
or more languages, the linguist looks for evidence that some
proper subset of those languages may have descended from
an intermediate common ancestor. This is done by looking for
shared innovations (synapomorphies): sound changes or new
words or grammatical constructions that were not in the ancestor language but are found in two or more of the descendant
languages. For example, the fact that English, German, Swedish,
and several other languages have f where Proto-Indo-European
had p is a shared innovation that indicates that those languages
may have a shared intermediate ancestor that underwent this
change; otherwise, we would have to assume that each of those
languages separately innovated the change of p to f or borrowed
the innovation from another language. In fact, the preponderance of evidence supports such an intermediate language and
a branch (clade) of languages descending from it: the Germanic
languages. Other branches of Indo-European include the Balto-Slavic (including Bosnian, Lithuanian, Polish, and Russian),
Celtic (including Breton, Irish, and Welsh), Italic (including Latin
and the Romance languages), Indo-Iranian (including Bengali,
Farsi, Pashto, and Urdu), and the extinct Anatolian branch,
which included Hittite, Luvian, and Lycian. Several other languages, including Greek, Albanian, Armenian, and half a dozen
extinct languages (see extinction of languages), do not
share an agreed-upon intermediate branch at all.
Associating language history with external facts entails pinning
the protolanguage to a particular time and place, its homeland,
and demonstrating how it spread from there. The time depths
under consideration mean that written records are rarely, if ever,
available. The primary linguistic tool is to look for words found
in multiple branches of a language family and exhibiting all of
the regular sound correspondences; they are assumed to date
back to the protolanguage and, therefore, to name objects found
in its environment. For example, a pan-Indo-European word for
wheel suggests that the protolanguage split up no earlier than
the invention of the wheel, some six thousand years ago (Mallory
1989). Another technique is to look for areas of greatest linguistic diversity. The fact that the Austronesian languages are much
more diverse in Taiwan than anywhere else supports the theory
that they developed there longest, that is, that Taiwan was the
homeland for the Austronesian family (Blust 1999). A third technique is to seek archaeological evidence of population movements that may have disseminated a language family. In the
case of Austronesian, knowledge of how people spread through
the Pacific and Indian Oceans is consistent with the theory of
a Taiwan homeland. In the case of Indo-European, it has often
been noted that early adopters of horse-drawn wheeled chariots
would be in an ideal position to spread their languages throughout much of Europe and Asia. A well-received theory points to
the chariot users who lived in the Pontic-Caspian region about
six thousand years ago (Mallory 1989).

Challenges to the Traditional Method


The traditional comparative method is still the basic framework
within which language families are researched, but it is not perfect. It is a complicated process that demands a great deal of
knowledge about all of the relevant languages. It can be misled
by loanwords, and it offers little guidance in distinguishing true
shared innovations (synapomorphies) from independent identical innovations (homoplasies). The linguist must constantly
decide whether multiple languages could have undergone a
particular change independently and how likely they would be
to have borrowed it. In reality, of course, anything that happens
once can happen twice, and there is nothing that is not subject to
borrowing (Thomason and Kaufman 1988). The true solution is
probabilistic, but hard numbers are lacking, and the investigator

is often left with unlikely cladograms like the 15-way branching tree of Indo-European.
Another disappointment is that little progress has been made
in the past century in pinning down the Indo-European homeland or proving that additional languages are related to English (topics of recurrent interest among linguists, archaeologists, and enthusiasts alike). More disappointing is that when linguists have claimed that language families such as Uralic are related to Indo-European (such groupings often being given the Eurocentric name Nostratic, "our family"), the methodology has given no
firm guidance as to how significant the evidence is, with the
result that many linguists find themselves uncomfortably agnostic on whether Nostratic has been proved or not. Unlike in modern experimental sciences, there are no statistical techniques for
estimating the probability that the number of correspondences
found is due to a real relationship between languages rather
than to chance. Rules of thumb were developed to provide some
guidance; a typical piece of advice is to treat words as potentially
cognate only if at least three of their consonants are found in
recurrent correspondence sets. But such rules are very approximate, not tailored to the specific structures of the languages at
hand, and they have discouraged linguists from applying the
method to languages with short morphemes.
Joseph H. Greenberg addressed several of these concerns
with a technique called multilateral comparison. Tables are constructed listing the translation equivalents for many concepts in
many different languages. It is claimed that the tabular layout
itself makes the relationships among the languages, even their
correct subgrouping, patent. Using this technique, Greenberg
presented an analysis of the languages of Africa (1963), which is
now considered standard, then went on to hypothesize language
families that lumped established families together into much
larger families, what became known as deep linguistic relationships. The dozens of families and isolates of the Americas were
reduced to three families (Greenberg 1987); Indo-European,
Uralic, Japanese, and several other families were lumped into a
family called Eurasiatic (Greenberg 2002). Multilateral comparison has proven popular among enthusiasts, in part because it
requires no special language expertise, in part because it appears
to reveal many new, deep relationships. Unfortunately, there is
no way to evaluate a methodology that simply calls for contemplating raw data until patterns emerge.
Several researchers have shown, however, that some of
Greenberg's key ideas can be transformed into algorithmic
(reproducible) methodologies that introduce to language family research the benefit of statistical significance testing. Robert
L. Oswalt's procedure (1998) minimized experimenter bias by
requiring that a specific concept list be used and that one specify
in advance specific criteria for measuring degree of similarity
between two languages. William H. Baxter and Alexis Manaster
Ramer (2000) added reliable significance-testing procedures
based on randomization tests. Brett Kessler and Annukka
Lehtonen (2006) adapted the technique to handle multiple languages in a single test, informally confirming Greenberg's claim
that such large-scale comparisons are inherently more powerful than two-language comparisons. Don A. Ringe (1992; see
Kessler 2001 for extensive discussion and methodological refinements) measured not similarity but the number of recurrent sound correspondences. This has the advantages both of being
closer to the traditional comparative method and of generating
correspondences useful for subgrouping and reconstruction.
Disappointingly, however, none of these neo-Greenbergian
techniques found evidence for the deep relations that were
advertised for the original, impressionistic method.
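The randomization logic these algorithmic methods share can be sketched in a few lines of Python. Everything in this sketch is invented for illustration: the two word lists are hypothetical, and the similarity criterion (matching initial consonants) is only a crude stand-in for the metrics Oswalt and later authors actually specify. The observed similarity between two aligned concept lists is compared against what shuffling one list produces, yielding an empirical estimate of how often chance alone does as well:

```python
import random

# Hypothetical translation equivalents for the same ten concepts in two
# languages (invented forms; a real test would use Swadesh-style lists).
lang_a = ["kan", "pel", "tum", "sor", "mik", "nat", "gol", "dev", "ruk", "fes"]
lang_b = ["kon", "pil", "tam", "sar", "mok", "lun", "ber", "dov", "rik", "fos"]

def similarity(list1, list2):
    """Count concepts whose words begin with the same consonant."""
    return sum(w1[0] == w2[0] for w1, w2 in zip(list1, list2))

def permutation_p_value(list1, list2, trials=10_000, seed=0):
    """Estimate how often chance alignments match or beat the observed score."""
    rng = random.Random(seed)
    observed = similarity(list1, list2)
    shuffled = list(list2)
    hits = 0
    for _ in range(trials):
        rng.shuffle(shuffled)  # break the concept-wise alignment
        if similarity(list1, shuffled) >= observed:
            hits += 1
    return observed, hits / trials

obs, p = permutation_p_value(lang_a, lang_b)
print(f"observed matches: {obs}, empirical p = {p:.4f}")
```

On these made-up lists the observed score of 8 initial matches is essentially never reached by shuffled alignments, which is precisely the kind of quantitative statement the traditional method could not make.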
Other new techniques have concentrated on subgrouping.
Lexicostatistics (Swadesh 1955) was an early attempt to facilitate
subgrouping and also assign dates to protolanguages. The idea
was that if languages replace a constant number of words per
century with new words, then by measuring the percentage of
a list of words that is cognate between languages, one could calculate when the languages diverged and even construct a family
tree. Although these assumptions were mostly wrong and were
therefore rejected by most linguists, many people still use lexicostatistical techniques as a rough indication of a language's history in the absence of more compelling data.
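The arithmetic behind lexicostatistical dating is simple enough to state directly. If each language independently retains a fixed proportion r of the test list per millennium, two languages separated for t millennia should share a proportion c = r^(2t) of cognates, so t = ln c / (2 ln r). A Python sketch (the cognate share is invented; 0.86 is the conventional retention figure for the 100-word list):

```python
import math

def glottochronological_age(cognate_share, retention_rate=0.86):
    """Swadesh-style dating: t = ln(c) / (2 ln(r)), in millennia.

    Assumes each language independently retains `retention_rate` of the
    test list per millennium -- precisely the constant-rate assumption
    that most linguists reject.
    """
    return math.log(cognate_share) / (2 * math.log(retention_rate))

# Invented example: two languages sharing 74 percent cognates.
t = glottochronological_age(0.74)
print(f"estimated divergence: about {t:.2f} millennia ago")
```

Plugging in a 74 percent cognate share yields roughly one millennium of divergence; the formula's neatness is what made lexicostatistics attractive, and its assumptions are what sank it.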
Arranging many shared innovations into a binary tree is an
extremely laborious undertaking, especially given the possibility that some identical innovations are independent (homoplastic). The recent development of computational cladistic
methods similar to those used in biology (e.g., Ringe, Warnow,
and Taylor 2002) is a tremendous advance in helping the linguist find optimal trees. In addition, several solutions to the
problem of borrowing have emerged in the form of programs
that construct networks instead of trees. Shared innovations
that cannot be cleanly attributed to a shared ancestor are taken
as evidence of contact, obviating somewhat the need to make a
priori judgments about whether borrowing was involved (e.g.,
Bryant, Filimon, and Gray 2005; Nakhleh, Ringe, and Warnow
2005).
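The parsimony criterion behind such cladistic methods can be illustrated with a toy computation. In this sketch the languages, traits, and candidate trees are all invented; the Fitch small-parsimony count, borrowed from biological cladistics, scores each tree by the minimum number of independent innovations it requires, so the preferred tree is the one that treats the fewest identical innovations as homoplastic:

```python
# Invented data: four languages scored for presence (1) / absence (0) of
# four hypothetical shared innovations. Trees are nested tuples.
traits = {
    "A": (1, 1, 0, 0),
    "B": (1, 1, 0, 1),
    "C": (0, 0, 1, 1),
    "D": (0, 0, 1, 0),
}

def fitch_length(tree, site):
    """Minimum number of state changes `tree` needs for one trait (Fitch)."""
    def walk(node):
        if isinstance(node, str):              # a leaf: a language name
            return {traits[node][site]}, 0
        (lset, lcost), (rset, rcost) = walk(node[0]), walk(node[1])
        common = lset & rset
        if common:                             # daughters can agree: no change
            return common, lcost + rcost
        return lset | rset, lcost + rcost + 1  # disagreement: one change
    return walk(tree)[1]

def tree_length(tree):
    """Total changes over all traits: lower means more parsimonious."""
    return sum(fitch_length(tree, site) for site in range(4))

tree1 = (("A", "B"), ("C", "D"))  # groups A with B and C with D
tree2 = (("A", "C"), ("B", "D"))  # cross-cutting alternative
print(tree_length(tree1), tree_length(tree2))
```

On these invented data the first tree needs five changes and the second eight, so parsimony favors the first grouping; the real implementations cited above search large tree spaces over many more characters.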
The problems of finding homelands and tracing the spread
of languages still require one to resort to data that are often
suggestive but not definitive. Colin Renfrew (1987) added a new
perspective when he theorized that languages may be spread
by the movement of culture, rather than by the movement of
people. He suggested that Indo-European languages were
spread from Anatolia along with the adoption of agriculture.
Most linguists have not accepted this theory, in part because it
is incompatible with such linguistic data as an Indo-European
word for the wheel, which postdates the spread of agriculture
by millennia. Recently, further data are afforded by genetic
analyses of populations (genes and language). The presence of a Pontic genetic component in Europe is compatible
with the idea that invaders from the Pontic–Caspian region
brought Indo-European languages into Europe (Cavalli-Sforza,
Menozzi, and Piazza 1994).

Prospects
Recent computer techniques add simplicity, reproducibility,
and quantitative rigor to methodologies for proving relationships between languages, but so far there has been no noticeable increase in power over what experts are able to do by hand.
Failure to corroborate the sort of deep relationships conceived
by Greenberg may mean that better techniques need to be developed, or that the languages are not in fact related, or that the
answer is unknowable. Because languages are always changing,
they constantly lose information that links them to their relatives; there must come a point at which any remaining commonalities between languages are indistinguishable from chance levels.
But even if the more pessimistic predictions are true and new
methods are unlikely to greatly expand intensively studied families like Indo-European, they may greatly ease new analyses of
lesser-known languages.
New computerized cladistic methods are, likewise, already
aiding the analysis of complex language families and are providing Indo-Europeanists food for thought. However, the development and application of such algorithms could benefit from the
compilation and deployment of data about the probability of
various types of linguistic innovations and borrowings.
To date, the new methodologies have not been adopted
by most practitioners. While it is easy to fault established
researchers for conservatism, it is also true that quantitative methods typically cannot take into account the diverse
types of information that linguists are accustomed to reasoning with. Fortunately, the emerging partnerships between linguists and cladists should help bridge the gap between old and
new approaches and lead to the widespread adoption of hybrid
methodologies.
Brett Kessler
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Baxter, William H., and Alexis Manaster Ramer. 2000. Beyond lumping and splitting: Probabilistic issues in historical linguistics. In Time
Depth in Historical Linguistics, ed. Colin Renfrew, April McMahon,
and Larry Trask, 167–88. Cambridge, UK: McDonald Institute for
Archaeological Research.
Blust, Robert. 1999. Subgrouping, circularity and extinction: Some
issues in Austronesian comparative linguistics. In Selected Papers
from the Eighth International Conference on Austronesian Linguistics,
ed. E. Zeitoun and P. J. K. Li, 31–94. Taipei: Academia Sinica.
Bryant, David, Flavia Filimon, and Russell D. Gray. 2005. Untangling
our past: Languages, trees, splits and networks. In The Evolution of
Cultural Diversity: A Phylogenetic Approach, ed. Ruth Mace, Clare J.
Holden, and Stephen Shennan, 69–85. London: UCL Press.
Cavalli-Sforza, Luigi Luca, Paolo Menozzi, and Alberto Piazza. 1994. The History and Geography of Human Genes. Princeton, NJ: Princeton University Press.
Gordon, Raymond G., Jr., ed. 2005. Ethnologue: Languages of the World.
15th ed. Dallas: SIL International. Content also available online
at: http://www.ethnologue.com/
Greenberg, Joseph H. 1963. The languages of Africa. International
Journal of American Linguistics 29.1 (Supplement): Part 2.
. 1987. Language in the Americas. Stanford, CA: Stanford University
Press.
. 2002. Indo-European and Its Closest Relatives: The Eurasiatic
Language Family: Lexicon. Stanford, CA: Stanford University Press.
Kessler, Brett. 2001. The Significance of Word Lists. Stanford, CA: Center
for the Study of Language and Information.
Kessler, Brett, and Annukka Lehtonen. 2006. Multilateral comparison and
significance testing of the Indo-Uralic question. In Phylogenetic Methods
and the Prehistory of Languages, ed. P. Forster and C. Renfrew, 33–42.
Cambridge, UK: McDonald Institute for Archaeological Research.
Mallory, J. P. 1989. In Search of the Indo-Europeans: Language,
Archaeology and Myth. London: Thames and Hudson.
Nakhleh, Luay, Don Ringe, and Tandy Warnow. 2005. Perfect phylogenetic networks: A new methodology for reconstructing the evolutionary history of natural languages. Language 81: 382–420.

Oswalt, Robert L. 1998. A probabilistic evaluation of North Eurasiatic
Nostratic. In Nostratic: Sifting the Evidence, ed. J. C. Salmons and B. D.
Joseph, 199–216. Amsterdam: Benjamins.
Renfrew, Colin. 1987. Archaeology and Language: The Puzzle of Indo-European Origins. London: Pimlico.
Ringe, Don A., Jr. 1992. On Calculating the Factor of Chance in Language
Comparison. Philadelphia: American Philosophical Society.
Ringe, Don, Tandy Warnow, and A. Taylor. 2002. Indo-European and
computational cladistics. Transactions of the Philological Society
100: 59–129.
Swadesh, Morris. 1955. Towards greater accuracy in lexicostatistic dating. International Journal of American Linguistics 21: 121–37.
Thomason, Sarah Grey, and Terrence Kaufman. 1988. Language Contact,
Creolization, and Genetic Linguistics. Berkeley: University of California
Press.

LANGUAGE-GAME
At the beginning of the Philosophical Investigations, Ludwig
Wittgenstein questions the Augustinian or traditional picture of
the essence of human language, according to which the meaning
of a word is the object for which it stands, so that a word to which
no object corresponds has no meaning. Learning language consists in giving names to objects, and the association between
the word and the object is established by ostensive teaching of
words. Wittgenstein attacks this conception as both confused
and reductive. It makes naming seem like a "queer connection of a word with an object," a kind of "baptism" of the object, as if meaning existed separately from the word and was attached to it by a mental process: a "remarkable act of mind" or "occult process."
Moreover, making the correspondence between name and object
a condition of meaning has the absurd consequence that when
the object no longer exists, the word no longer has any meaning.
Here, the meaning of a name is confounded with the bearer of
the name, whereas in fact "when Mr. N. N. dies, one says that the bearer of the name dies, not that the meaning dies" (Wittgenstein [1953] 1997, 40). The Augustinian picture of language is also
reductive. Even if we were to correct its conception of ostensive
definition (e.g., by replacing the occult process with training), it
would still err in being an oversimplification: Ostensive definition is too narrow; it does not describe everything that we call language. We think that language consists in giving names to objects, whereas in fact we do the most various things with our sentences: "Think of exclamations alone, with their completely different functions. Water! Away! Ow! Help! Fine! No!" ([1953] 1997, 27).
The functions of words, suggests Wittgenstein, are as diverse
as the functions of tools in a toolbox ([1953] 1997, 11). And
these are countless. But his toolbox analogy does more than
underscore the diversity of the uses of language; it also suggests
that words work much like tools in that their meaning resides in
their use: "We remain unconscious of the prodigious diversity of all the everyday language-games because the clothing of our language makes everything alike" ([1953] 1997, 224), and so it is use that is the distinguishing mark. In reaction to the mentalist conception of meaning, which sees it unilaterally as a mental
connection between words and objects, Wittgenstein affirms
that the meaning of a word or sentence resides in the use we
make of it. He introduces the term language-game to highlight

the interplay, in the determination of meaning, between language and the actions into which it is woven and to bring into
prominence the fact that "the speaking of language is part of an activity, or of a form of life" ([1953] 1997, 23; see forms of life).
In fact, Wittgenstein nowhere provides a well-rounded definition of language-game because his employment of the term
evolves and because it is what he calls a family resemblance
concept. He employs the term
i. to circumscribe various more or less broad domains of language. Here, we can speak of single and specific language-games, such as those we play in our use of particular words or concepts (e.g., "fear," "game," "hand," "knowing") or in specific activities (e.g., lying, thanking, cursing, making a joke, following a rule, giving orders and obeying them); but these specific language-games are subsumed under more general uses of the term: either a generic use (he calls the whole, "consisting of language and the actions into which it is woven," the language-game [(1953) 1997, 7]) or an anthropological use, that is, what he calls "the human language-game" (1977, 554) (as opposed, say, to the language-game of alien tribes).
ii. to describe the degree of sophistication of language use. Here, Wittgenstein speaks of primitive or complicated language-games.
iii. to describe how language works.
It is in its encapsulation of how language works that the
expression language-game is most eloquent. The expression
is due to the analogy Wittgenstein makes between language and
games, which supersedes the calculus analogy of the Tractatus,
thereby signaling the switch from his own conception of language
as a fixed and timeless symbolism to a conception of language as
essentially embedded in human practice, language as essentially in use. In the Blue Book, he begins to question the idea that to speak a language is, in all cases, to apply a calculus according to strict and exact rules; rather, using language is much like playing a game. The game analogy is more fitting than the calculus
analogy in several ways:
1. Like "game," "language" is a family resemblance concept. Just
as there is a multiplicity of games with nothing common to all,
there is nothing common to all of our uses of language that
makes them into language or parts of language, but they are
related to one another in many different ways, and it is because
of this relationship that we call them all language ([1953]
1997, 65).
2. Language is an activity, and it is essentially connected with
practice or use: "our language-game is a piece of behaviour" (1980, 151); a language-game incorporates both language and the actions into which it is woven ([1953] 1997, 7). To use language meaningfully is analogous to making a move in a game;
to understand a word is to know how to use it. Just as we learn
how to play a game by learning what the permitted moves are
in the game, we learn the meaning of words by learning what
is accepted as a meaningful use of the word. And here, it is not
the application of explicit rules but training (1970, 186) and
repeated exposure that are needed to play a game properly or
to use a word meaningfully. The concept of language-games
highlights the idea that the mastery of language is an acquired
skill or know-how, not a systematic (innate or acquired)

417

Language-Game
application of rules: "To understand a language means to be master of a technique" ([1953] 1997, 199).
3. Like games, languages are rule governed, but this does not mean that there are strict and precise rules for each language-game: Just as there are not rules to legislate for everything in a game (e.g., there are no rules for how high or how hard one throws the ball in tennis), language, too, is not everywhere circumscribed by rules. And as in games, the learning of explicit rules is not always necessary: someone can have learned a game "without ever learning or formulating rules" ([1953] 1997, 31), and so, too, in language can the game "be learned purely practically, without learning any explicit rules" (1977, 95).
The constitutive rules of language are those of grammar
(1974, 184). What Wittgenstein means by grammar is not, however, what grammarians mean by grammar: It is neither a taxonomy of the structural features of a language nor a science
that describes or prescribes the correct or standard usage of
words or arrangement of words. Wittgensteinian grammar is a
generic term for the publicly determined (though this determination is not due to a concerted, but to an unconcerted, consensus) conventions or conditions (1974, 138, 88) that govern
our meaningful use of words or expressions. Languages are rule
governed, but the rules that govern them are not metalinguistic
norms that exist in advance of use; learning the meaning of a
word is learning how the word is used. Moreover, if grammatical rules determine what it makes sense to say, they cannot
themselves belong to the language-game: a grammatical rule
is a preparation (1993, 72) for a language-game, except in
heuristic language-games (e.g., pedagogical language-games;
language teaching) where the formulation of some rules is the
object of the language-game (the distinction here is an instance
of the use and mention distinction). To learn grammatical
rules is to learn what the conventional conditions and constraints of the uses of a word are, which linguistic moves are
meaningful and which are not. Just as the rules of a game constitute the game and its allowable moves, grammar constitutes
language and its allowable moves.
4. Like games, language is embedded in our social, cultural, and natural ways of living, that is, in our form of life.
Languages cannot be abstracted from the context in which they live: words "have meaning only in the stream of life" (1982, 913). Language is a normative practice, but it is also a
social practice. Any language is founded on convention ([1953]
1997, 355); it is embedded in the shared activities of the language users in a given community: "To obey a rule, to make a report, to give an order, to play a game of chess, are customs (uses, institutions)" ([1953] 1997, 199). To understand a language-game, one must be either immersed in the community
in which it is embedded or knowledgeable about that community's practices: Someone not accustomed to, or aware of, the practice in some Semitic cultures of saying "Hamsa!" ("Five!"), a verbal conjuring of the five fingers of the hand that protects against the evil eye, would not understand the purpose of the utterance. For a language to emerge or be possible, there
has to be something shared. What is shared is a distinct form
of life: the particular biosocial conditions and activities that
make particular languages possible. The human form of life
could not have produced a feline language, nor a feline form
of life a human language. Language and form of life are internally related: "To imagine a language means to imagine a form of life" ([1953] 1997, 19), and to imagine a human language is
necessarily to imagine a human form of life, a human way of
being and acting that essentially involves both our biological
makeup and our social behavior. For our language-games are
impacted by the facts of our natural history, such as our biological human constraints; for example, the language-game
with colors is characterized by what we can and what we cannot do (1970, 345). Therefore, if certain very general facts of
nature were different from what they are, so would our concepts and language-games be. But to say that our language-games are conditioned by certain facts (1977, 617) is not to
say that our language-games are justified by, or answerable to,
these facts.
5. Like games, language and language-games are autonomous. By this, Wittgenstein means that although our language-games are rooted in our form of life, they are not accountable
to, or rationally grounded in, any reality: "The language-game is not based on grounds. It is not reasonable (or unreasonable). It is there like our life" (1977, 559). Rather than speak of symbols,
words, or sentences as the primary or elementary units of meaning (as logicians, including Wittgenstein himself in the Tractatus,
had done), the later Wittgenstein views the language-game as
the basic unit in linguistic activity; he urges us to look on "the language-game as the primary thing" ([1953] 1997, 656), that which does not have to be explained by any fact. Here, he can be viewed as broadening Gottlob Frege's context principle: The context necessary for meaning is not the proposition but the language-game (e.g., a sound is an expression only in a language-game [(1953) 1997, 261]).
In his last work, On Certainty, Wittgenstein dwells on the
importance of unmoving foundations for the possibility of
language-games: It is essential for our language-games that no doubt appears at certain points (cf. 1977, 524). He argues that some basic certainties, such as "The world exists" or "This is a hand" or "Cats don't grow on trees," are necessary, unmoving foundations of our language-games (cf. 1977, 403, 411), that the whole language-game rests on this kind of certainty (cf. 1977, 446). This kind of certainty, however, is nonepistemic; he views it as a kind of trust: "I really want to say that a language-game is only possible if one trusts something" (1977, 509).
Danièle Moyal-Sharrock
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Baker, G. P., and P. M. S. Hacker. 1980. Wittgenstein: Understanding
& Meaning: An Analytical Commentary on the Philosophical
Investigations. Vol. 1. Oxford: Basil Blackwell.
Canfield, John V. 1993. The Living Language: Wittgenstein and the Empirical Study of Communication. Language Sciences 15.3: 165–93.
Wittgenstein, Ludwig. [1953] 1997. Philosophical Investigations. Trans.
G. E. M. Anscombe. 2d ed. Oxford: Blackwell.
. 1970. Zettel. Ed. G. E. M. Anscombe and G. H. von Wright, trans.
G. E. M. Anscombe. Berkeley: University of California Press.
. 1974. Philosophical Grammar. Ed. R. Rhees, trans. A. Kenny.
Oxford: Blackwell.

. 1977. On Certainty. Ed. G. E. M. Anscombe and G. H. von
Wright, trans. D. Paul and G. E. M. Anscombe. Amended 1st ed.
Oxford: Blackwell.
. 1980. Remarks on the Philosophy of Psychology. Vol. 1. Ed.
G. E. M. Anscombe and G. H. von Wright, trans. G. E. M. Anscombe.
Oxford: Blackwell.
. 1982. Last Writings on the Philosophy of Psychology. Vol. 1. Ed.
G. H. von Wright and Heikki Nyman, trans. C. G. Luckhardt and
M. A. E. Aue. Oxford: Blackwell.
. 1993. Moore's Wittgenstein lectures in 1930–1933. In Philosophical Occasions: 1912–1951, ed. J. C. Klagge and A. Nordman, 46–114. Indianapolis: Hackett.

LANGUAGE-LEARNING ENVIRONMENT
This term refers to the linguistic and sociocultural environment
in which children learn to talk and, in particular, to the language
to which they are exposed. The first systematic studies of the language-learning environment were conducted in response to the
claim that, like the language used among adults, the language
heard by children was grossly defective (full of false starts, grammatical errors, and misleading pauses) and as such represented
a very poor sample of the language that the child must eventually
learn. These studies showed that speech addressed to children
was largely clear, well formed, and semantically and syntactically simpler than speech addressed to adults, leading
some researchers to argue that in simplifying their speech, parents were presenting their children with graded language lessons
that could bear at least some of the burden of explanation for the
child's remarkably swift progress in language learning (see Pine
1994 for a review).
In fact, there are a number of problems with this view. First,
it confuses the notions of facilitating interaction and facilitating
acquisition. Thus, some of the adjustments made by parents to
facilitate interaction (e.g., the extensive use of questions) probably have the effect of increasing the complexity of the language
to which children are exposed during the early stages. Second,
it is unclear to what extent the adjustments made by Western
middle-class parents generalize across cultures. Indeed, ethnographic researchers have identified cultures in which parents
react with mirth or horror to the idea of holding conversations
with young language-learning children (see Lieven 1994 for a
review). Third, the notion that the language-learning environment somehow facilitates acquisition is theoretically rather vacuous in the absence of a reasonably well-specified theory of how
the child is learning from this environment.
Cognizant of this last problem, more recent work has focused
on the kind of information that is present in the environment
and the way in which this might be exploited by the child's
language-learning mechanisms. Thus, work in computational
modeling has shown that there is a great deal of information in
the statistical structure of human languages that could, in principle, be used to solve particular language-learning problems.
For example, work on segmentation has shown that it is possible to use information about the stress pattern, the phonotactics, and the transitional probabilities between syllables in
a language to identify boundaries between words (Brent 1999),
and work on the acquisition of syntactic categories has shown
that it is possible to categorize words quite successfully on

the basis of their co-occurrence statistics (Redington, Chater,


and Finch 1998) or their occurrence in frequent frames (Mintz
2003). Moreover, studies using modern infant techniques have
shown that very young children are sensitive to all of these
potential sources of information (Jusczyk 1999; Gomez and
Gerken 2000).
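The transitional-probability cue for segmentation can be demonstrated in a few lines. In this Python sketch the three-syllable "words" are invented, modeled only loosely on the artificial-language stimuli used in infant studies; the forward transitional probability P(y | x) = count(xy) / count(x) is computed over an unsegmented syllable stream, and it drops sharply at word boundaries:

```python
import random
from collections import Counter

# Invented three-syllable words, loosely modeled on infant-study stimuli.
words = [("pa", "bi", "ku"), ("ti", "bu", "do"), ("go", "la", "tu")]

rng = random.Random(1)
stream = []
for _ in range(300):                 # an unsegmented stream of 900 syllables
    stream.extend(rng.choice(words))

pairs = Counter(zip(stream, stream[1:]))
firsts = Counter(stream[:-1])

def tp(x, y):
    """Forward transitional probability P(y | x) = count(xy) / count(x)."""
    return pairs[(x, y)] / firsts[x]

within = tp("pa", "bi")                        # syllable pair inside a word
across = max(tp("ku", w[0]) for w in words)    # pairs spanning a word boundary
print(f"within-word TP = {within:.2f}, highest boundary TP = {across:.2f}")
```

Within a word the next syllable is fully predictable (TP of 1.0), while across a word boundary any of the three word-initial syllables may follow (TP near one-third), so positing boundaries at TP dips recovers the words.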
An interesting feature of this kind of work is the extent to which
it sheds new light on classical arguments from the poverty of the
stimulus. For example, John Lewis and Jeffrey Elman (2001) have
shown that it is possible for a simple recurrent network to learn
to obey structure-dependent rules such as those involved in the formation of complex yes/no questions (e.g., "Is the boy who is dancing singing?") in the absence of exposure to complex yes/no questions. This occurs because the knowledge that the network has acquired by processing simple yes/no questions and
complex declaratives constrains the way in which it deals with
complex yes/no questions.
A final strand of research has investigated the relation between
variation in language development and variation in the language
to which children are exposed. For example, research on children learning English has found a relation between their auxiliary development and mothers' use of yes/no questions (e.g.,
"Can you kick the ball?") in which the auxiliary occurs in stressed
utterance-initial position (see Richards 1990 for a review). One
problem with this kind of research is that covariance in patterns
of within-language variation often makes it difficult to distinguish
empirically between alternative explanations of the effects that
are found. One way of avoiding this problem is to focus on the
relation between cross-linguistic differences in patterns of development and cross-linguistic variation in the language to which
children are exposed. For example, Daniel Freudenthal and his
colleagues (2007) have recently shown that it is possible to simulate cross-linguistic variation in children's tendency to use nonfinite verb forms in finite contexts in English, Dutch, German,
and Spanish as a function of the interaction between one identical mechanism that learns from the right edge of the utterance
(MOSAIC, Model Of Syntax Acquisition In Children) and the
statistical properties of child-directed speech in these four languages. MOSAIC produces high proportions of nonfinite verb
forms in finite contexts in Dutch, German, and English because
in these languages, the verb forms occurring in utterance-final
position in the input are much more likely to be nonfinite than
finite. However, it produces much lower proportions of nonfinite
verb forms in finite contexts in Spanish because in this language,
finite verb forms are much more likely than nonfinite verb forms
to occur at the right edge of the utterance.
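The input statistic that this account turns on can be shown with a toy computation. This sketch is not MOSAIC itself, and the mini-corpora of finiteness-tagged utterances below are invented purely for illustration; it simply asks how often the utterance-final verb form is nonfinite in each language's child-directed speech:

```python
# Invented toy "corpora": each utterance is a list of verb-form tags, and
# the last tag marks the verb form at the right edge of the utterance.
# Purely illustrative, not real child-directed speech.
dutch_like = [
    ["FIN", "NONFIN"],   # e.g., finite modal plus utterance-final infinitive
    ["FIN", "NONFIN"],
    ["FIN"],
    ["FIN", "NONFIN"],
]
spanish_like = [
    ["FIN"],             # finite verb typically at the right edge
    ["NONFIN", "FIN"],
    ["FIN"],
    ["FIN"],
]

def nonfinite_final_rate(corpus):
    """Proportion of utterances whose final verb form is nonfinite."""
    return sum(utt[-1] == "NONFIN" for utt in corpus) / len(corpus)

print(nonfinite_final_rate(dutch_like), nonfinite_final_rate(spanish_like))
```

A learner biased toward the right edge of the utterance will thus pick up mostly nonfinite forms from the Dutch-like input and mostly finite forms from the Spanish-like input, which is the asymmetry the cross-linguistic simulations exploit.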
Julian Pine
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Brent, Michael. 1999. Speech segmentation and word discovery: A computational perspective. Trends in Cognitive Sciences 3: 294–301.
Freudenthal, Daniel, Julian Pine, Javier Aguado-Orea, and Fernand
Gobet. 2007. Modelling the developmental patterning of finiteness
marking in English, Dutch, German and Spanish using MOSAIC.
Cognitive Science 31: 311–41.
Gomez, Rachel, and LouAnn Gerken. 2000. Infant artificial language learning and language acquisition. Trends in Cognitive Sciences 4: 178–86.

Jusczyk, Peter. 1999. How infants begin to extract words from speech.
Trends in Cognitive Sciences 3: 323–8.
Lewis, John, and Jeffrey Elman. 2001. Learnability and the statistical
structure of language: Poverty of stimulus arguments revisited. In
Proceedings of the Twenty-Sixth Annual Boston University Conference
on Language Development, 359–70. Somerville, MA: Cascadilla.
Lieven, Elena. 1994. Crosslinguistic and crosscultural aspects of
language addressed to children. In Input and Interaction in
Language Acquisition, ed. Clare Gallaway and Brian Richards, 56–73.
Cambridge: Cambridge University Press.
Mintz, Toby. 2003. Frequent frames as a cue for grammatical categories
in child-directed speech. Cognition 90: 91–117.
Pine, Julian. 1994. The language of primary caregivers. In Input and
Interaction in Language Acquisition, ed. Clare Gallaway and Brian
Richards, 15–37. Cambridge: Cambridge University Press.
Redington, Martin, Nick Chater, and Steven Finch. 1998. Distributional
information: A powerful cue for acquiring syntactic categories.
Cognitive Science 22: 425–69.
Richards, Brian. 1990. Language Development and Individual
Differences: A Study of Auxiliary Verb Learning. Cambridge: Cambridge
University Press.

LANGUAGE OF THOUGHT
This is a special language that has been postulated by a number
of writers, among them G. Harman (1972) and J. Fodor (1975, 1987), to explain
how humans and many animals represent and think about the
world. The language of thought (LOT) is claimed to be coded,
or "entokened," in their brains, rather in the way certain formal languages are entokened in the circuitry of a computer.
What makes the LOT a language is that it possesses semantically
valuable, causally efficacious logico-syntactic structure: That is, it
consists, for example, of names, predicates, variables, quantifiers ("all," "some"), logical connectives ("not," "and," "only if"), and operators ("possibly," "probably") that are combined to form
complex sentences that can be true or false.
The LOT need not be the natural language (e.g., English,
Chinese), if any, that a creature speaks, although some writers
have supposed that in adult humans the two may coincide (providing an interesting perspective in the Sapir-Whorf hypothesis
that thought is determined by language). Indeed, given that the
relevant sorts of intelligent behavior are displayed by many infralinguistic creatures infants, chimpanzees its postulation need
not be confined to natural language users. Nor need the LOT be
in the least conscious or introspectible: A persons thought processes might be explained by a LOT, while introspectively his or
her mental life might seem to consist wholly of images, feelings, and inarticulate impulses. Most importantly, processing a
LOT need not require any sort of intelligent creature to read and
understand the sentences being processed in the brain. Along
lines set out by Alan Turing, the processing of the LOT symbols
can be executed purely mechanically.
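The purely mechanical character of such processing can be illustrated with a toy sketch. The tuple encoding and the rule name below are invented for illustration (this is not Fodor's or Rey's formalism): a single inference rule is applied by comparing symbol shapes alone, without any interpretation of what the symbols mean.

```python
# Toy illustration (encoding invented): sentences of a miniature "LOT"
# are nested tuples, and inference operates on their syntax alone.

def modus_ponens(premises):
    """Derive q from p and ('IF', p, q) by pure symbol matching.

    The procedure never interprets the symbols; it only compares
    shapes, much as a Turing machine compares marks on a tape."""
    derived = set(premises)
    changed = True
    while changed:
        changed = False
        for s in list(derived):
            # A conditional is any 3-tuple whose first symbol is 'IF'.
            if isinstance(s, tuple) and len(s) == 3 and s[0] == 'IF':
                _, p, q = s
                if p in derived and q not in derived:
                    derived.add(q)   # add the consequent, mechanically
                    changed = True
    return derived

beliefs = {('RAINS',), ('IF', ('RAINS',), ('WET', 'street'))}
print(('WET', 'street') in modus_ponens(beliefs))  # True
```

Nothing in the procedure "understands" rain or wetness; the conclusion is reached by shape-matching alone, which is the sense in which LOT processing need not presuppose an intelligent inner reader.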
As in discussing any language, there are two issues raised by
such a postulation: i) There are syntactic issues regarding the
character of the actual symbols and the computations defined
over them. This is the kind of issue regularly addressed in rich
detail by programmers dealing with artifactual computers, and by psychologists dealing with naturally occurring, living creatures, in their concern with data structures and algorithms for
dealing with them, for example, for vision or reasoning; and


ii) there are semantic issues regarding the meaning or interpretation of the symbols and data structures. In the case of artifactual
computers, this issue is usually conveniently settled by stipulation: The artifactor gets to say what the symbols represent (e.g.,
bank balances, chess moves). In the case of natural creatures,
there is, of course, no relevant artifactor, and so the meaning
of the symbols must be determined by some natural facts or
relations.
There are two (not necessarily exclusive) kinds of candidates
for a theory of meaning of a LOT:
i. Meaning is spelled out in terms of some of the symbols' crucial causal/conceptual roles in relation to other symbols, mirroring patterns of inference. This is a natural suggestion for logical symbols, such as "and" and "not": A symbol, "@," might mean and because states entokening sentences "p" and "q" separately might cause and in turn be caused by a state entokening "p@q" by itself (see Block 1986; Peacocke 1994).
ii. Meaning is spelled out in terms of causal relations that
the individual symbols bear to phenomena in the world. For
example, a symbol S might be entokened in a creature's brain in a way that covaries with the presence of a certain shape before the creature's eyes in various circumstances, such as under ideal
conditions, under evolutionarily selective ones, or under ones
that display a certain counterfactual structure (see Dretske 1988;
Stalnaker 1984; Fodor 1987).
The chief rivals to the LOT hypothesis are either one or another form of interpretativism, according to which a creature has propositional attitudes only because ascription of such states permits the most rational interpretation of its behavior (see Davidson [1973] 1984; Dennett 1987), and/or the view that the brain structures underlying that ascription are of a radically connectionist, nonsyntactically structured sort (see Smolensky 1988). It's sometimes also thought that some kind of system of imagistic representations would not only be truer to introspection but also explain various response-time results that suggest that people think in images (see Kosslyn 1994).
The main reason for believing in a LOT as opposed to these rivals is that it would explain a number of interesting phenomena associated with the mind. Salient among these are the following:
1. The propositional structure of attitudes: The standard
object of, for example, a thought, belief, desire, hope, or expectation is some kind of truth-valuable object, most perspicuously
expressed by a sentence; neither images nor connectionist
networks are able to systematically represent logically complex thoughts, involving, for example, negations and nested
quantifiers (what image could express "Not everyone loves someone"?).
2. The causal efficacy of thought: A thought can cause bodily
states and movements because it is a structure entokened in the
brain.
3. The productivity of thought: People can potentially understand an infinitude of different thoughts formed by logical combinations of simpler ones, for example, "It's possible for every cat to chase some rat that eats some cheese that lives in the house that Jack built," since the LOT entokened in their brain
can (under reasonable idealization) produce a corresponding
infinity of sentences.

4. The systematicity of thought: If people can think some
thought p, then they can also think all logical permutations of
p; for example, people can think "If John leaves, then someone insulted Mary" if and only if they can think "If someone leaves, then Mary insulted John," since they can mechanically recombine the
simple expressions of a sentence.
5. The intensionality of thought: People can think of things
in one way without thinking of those very things in another; for
example, they can think that the morning star is Venus without
thinking that the evening star is, and they can think about different nonexistent things, such as Zeus and Santa Claus. Indeed,
the LOT can explain the hyperintensionality of thought: People
can think about things that are even necessarily identical, as
when one thinks that Mark Twain but not Sam Clemens is
funny, because they employ correspondingly different LOT
symbols.
6. The multiple roles of attitudes: Different attitudes can be
directed upon the very same thought; for example, people can
believe, desire, suspect, doubt the same thought, that God exists.
Fodor (1975, 1987) and Georges Rey (1997) have argued that
these and other phenomena cannot be explained by the rival
views without substantial, additional ad hoc assumptions, for
example, that certain images or networks express logically complex thoughts and are causally related in the aforementioned
ways. For these reasons, the language of thought is to be preferred on empirical grounds (for more of an a priori argument
see, e.g., Davies 1991).
Georges Rey
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Block, N. 1986. Advertisement for a conceptual role semantics. In Studies in the Philosophy of Mind, ed. P. French, T. Uehling, and H. Wettstein, 615-78. Minneapolis: University of Minnesota Press.
Davidson, D. [1973] 1984. Radical interpretation. In Inquiries into Truth
and Interpretation. Oxford: Clarendon.
Davies, M. 1991. Concepts, connectionism, and the language of thought. In Philosophy and Connectionist Theory, ed. W. Ramsey et al., 229-56. Hillsdale, NJ: Erlbaum.
Dennett, D. 1987. The Intentional Stance. Cambridge, MA: MIT Press.
Dretske, F. 1988. Explaining Behavior: Reasons in a World of Causes. Cambridge, MA: MIT Press.
Fodor, J. 1975. The Language of Thought. New York: Crowell.
———. 1987. Psychosemantics. Cambridge, MA: MIT Press.
Harman, G. 1972. Thought. Princeton, NJ: Princeton University Press.
Kosslyn, S. 1994. Image and Brain: The Resolution of the Imagery Debate.
Cambridge, MA: MIT Press.
Peacocke, C. 1994. A Study of Concepts. Cambridge, MA: MIT Press.
Rey, G. 1997. Contemporary Philosophy of Mind. Oxford: Blackwell.
Smolensky, P. 1988. On the proper treatment of connectionism. Behavioral and Brain Sciences 11: 59-74.
Stalnaker, R. 1984. Inquiry. Cambridge, MA: MIT Press.

LANGUAGE POLICY
Definition and History
Language policy studies the regular choices among varieties and variants within a speech community: the language practices of members of the community, their beliefs about the values to be assigned to the varieties, and efforts by individuals or groups with or claiming authority to modify the practices or beliefs of other speakers. Practices, beliefs, and management can be studied separately, although they turn out to be interdependent.
Early language managers were the Sanskrit and Arabic grammarians who guarded the purity of sacred texts, the medieval
European rulers who switched from Latin to the vernacular for
legal matters, and the language nationalists in the nineteenth
century who made their national language different from that of
their previous ruler.
Einar Haugen (1966) described how rival political ideologies supported competing invented varieties of Norwegian and
compromised by requiring schoolchildren to learn both. In the
1960s, scholars became interested in the language-planning
problems of postcolonial Africa and Asia (Fishman, Ferguson,
and Das Gupta 1968). They focused on status planning (making one variety of language official or designating it for school
use) and corpus planning (changes in the language itself, such
as standardizing it or providing it with a writing system or new
terminology) (Kloss 1966). Others concentrated on the cultivation and standardization of developed European languages
(Nekvapil 2007).
In status policy, the problem was to decide between the
demands of contending varieties; as a decision depended on
nonlinguistic values, such as the power of social or ethnic or economic groups, language planners had little real influence. There
was work to do in language academies or in writing textbooks
to purify the linguistic usage of schoolchildren. In the 1970s,
some scholars tried to evaluate the effect of corpus planning,
but it proved easier to keep doing it than to study its effectiveness (Fishman 1977). Language policy expanded when Robert
L. Cooper (1989) added a third area, language-acquisition planning, or language-education management, the effort to increase
the number of speakers. Related is language diffusion, governments working to spread their language outside their political
boundaries (Cooper 1982).

Rights and Theories


In the last half century, a further development has been the
study and promotion of human or civil rights associated with
language (Laitin and Reich 2003; May 2005; Skutnabb-Kangas,
Phillipson, and Rannut 1995). Building on principles first proposed after World War I, several international covenants of language rights for minorities have been formulated, and some have
been adopted by international bodies such as UNESCO and the
European Community; a smaller number have been ratified by
nation-states, and a few have been implemented.
There is no generally accepted theory of language policy.
Thomas Ricento (2001), in fact, argues that there cannot be one.
However, Joshua A. Fishman (1991) has presented a model of
reversing language shift, which includes a graded intergenerational disruption scale intended both to describe the state of a
language and its likelihood of being maintained and to suggest
how to resist further loss or reestablish earlier strength. Bernard
Spolsky (2004) has proposed that language policy has three
components (language practices, language ideology or beliefs,
and language management), and has sketched a theory based
on this proposal. Jiří Nekvapil (2006), following Bjoern Jernudd and J. V. Neustupný (1987), has put forward a theory of language management, ranging from individual self-correction to organized management at all micro and macro levels. Detailed descriptions of language policy such as Grenoble (2003), Kaplan and Baldauf (2003), and Zhou (2004) have started to clarify the complex dimensions that a theory must handle.

The Politics of Language Policy


One of the most critical facts or beliefs about language varieties concerns their power. There are many nation-states
that assume monolingualism to be ideal and combine this
assumption with a belief in the value, beauty, efficiency,
and desirability of their own national variety. This is true of
English-speaking nations, although it is challenged by counterassertions in South Africa, which has a long tradition of
claims for Afrikaans and has recently extended nominal recognition to nine African languages, and in Canada by the language-related claims of Quebec for independence. The belief
was first manifested in Spain, which continued its search for
purity after the expulsion of Moors and Jews with a proclamation carried to the New World of the value of Spanish; this
resulted in the virtual destruction of Native American languages. The belief in the importance of a single national variety was adopted by the Jacobins during the French Revolution
and gradually implemented in France and French territories
(Ager 1999): The difficulty of its implementation continues
to be demonstrated by the need to pass new laws and regulations. German Romanticism and nationalism (Fishman 1973)
provided an ideological base with the proclamation of the
truth of one nation, one language. Another example of ideological monolingualism is Japan, which during its period of
colonial expansion required conquered peoples to switch to
Japanese, and which has only recently taken note of minority
languages (Katsuragi 2005).
Commonly, the existence of two or more major languages
within a single nation-state or confederation is associated with
political conflict. One resolution is to favor a single variety, either
that spoken by the majority or that controlled by the dominant
elite. In the Soviet Union, a Lenin-inspired policy of recognizing
minority languages to speed up the spread of communism was
replaced under Stalin by a Russification policy (Grenoble 2003;
Lewis 1972). After the collapse of the Soviet Union, most of the
newly independent states reasserted the significance of their
territorial languages, so that currently each of the former Soviet
states (including Russia itself) appears to be working toward monolingualism in the territorial language (Landau and Kellner-Heinkele 2001; Ozolins 2003).

Territoriality
A second solution to problems associated with having multiple
major languages within a single nation-state is territoriality.
India tried to base its internal political divisions on language.
The partition into a Hindu-dominated India and a Moslem-dominated Pakistan was paralleled by a language-management effort to divide what was previously considered a single language, Hindustani, into two: Hindi written in Devanagari script and Urdu in Perso-Arabic script (Annamalai 2001). The splitting up of India into states reflected major language differences,
although it could not capture the complexity of a nation with 2,000 varieties. Central Europe and the Balkans repeated this process, as the partition of Czechoslovakia has led to official status for Czech and Slovak (Neustupný and Nekvapil 2003), and the division of Yugoslavia has now led to efforts to distinguish Serbian and Croatian (Pranjković 2001).
Belgium and Switzerland use territoriality to resolve language
conflict. Externally, both are believed to be bilingual. In fact,
Belgium is divided into language regions, some of which are officially French-speaking, others officially Dutch-speaking, and a
few officially German-speaking. Only Brussels is officially bilingual. The varieties spoken in these regions are neither Dutch nor
French but regional dialects; as a result, 40 percent of Belgian
high school students report that they are taught in a language
that they do not speak at home (Aunger 1993). In Switzerland,
each canton establishes its own language policy, choosing among
German, French, Italian, and Romansch. Knowledge of a second
language (other than the expanding use of English associated
with globalization) is no better than in other European countries
(Harlow 2004; Hrdegen 2001).
The special language problems of Africa were produced by
the fact that the borders drawn by colonizing European powers
in the nineteenth century did not coincide with ethnic, tribal, or
linguistic boundaries. After independence, African states had
to choose among a variety of languages, most of which were
also spoken in neighboring states (Bamgbose 2000). Colonial
educational policy had favored the use of European metropolitan languages, absolutely in the case of French and Portuguese
colonies and, after initial use for a few years of local vernaculars,
in British colonies. Partly because choice of any one vernacular
would provide excessive power to its speakers, partly because
the elite already spoke the metropolitan language, and partly
because of inertia, postindependence efforts to establish the status of African languages have generally failed (Phillipson 1992).

Globalization and Local Resistance


Globalization has a major impact on language policy. One effect
has been the unparalleled diffusion of English, the most widely
used second language in most of the world. English is the favored
first foreign language in all European countries, spreading also
into former Soviet nations. In Asia, English is the lingua franca
for intercommunication among Japanese, Chinese, Koreans, and
Thais. International corporations, even those located in European
countries, tend to prefer English. Foreign language teaching is a
topic of interest mainly in English-speaking countries; elsewhere,
the major concern is English language teaching.
The protection of endangered languages is a recent concern
of many language policy scholars. They have noticed the rapid
loss of smaller minority languages, estimating that most of the
current 6,000 languages in the world will disappear in the next
hundred years (Krauss 1991). The threat comes not just from
world languages like Spanish (which has virtually denuded
Latin America of its rich linguistic diversity) or French (with its
strong monolingual ideology) or English (universally feared as
the exemplar of linguistic imperialism) but also from stronger
local languages like Swahili. Fishman (1990) provides a set of
benchmarks for studying loss and suggests how to reverse it. So
far, the most successful efforts at reversal have been associated
with grants of political autonomy, as in Spain (Hoffmann 1995),

the United Kingdom (Ó Laoire 1996; Coupland et al. 2006), and
Canada (Bourhis 2001). There are also efforts in New Zealand
(Spolsky 2005; May and Hill 2005), in South America (Hornberger
and King 2001), and among other indigenous peoples (McCarty
2003; Omoniyi 2003).
Speakers of major languages also fear language loss. This
can be seen in Spain, with the sensitivity of its Academy to language change; in France, with its growing number of regulations
and language agencies; in Russia, with its refusal to recognize
non-Cyrillic alphabets for minority languages and its claim of
defending the language rights of Russian-speakers in former
Soviet states; and even in the United States, where an English-only movement is struggling against what it sees as the threat of
Spanish and other immigrant languages to the survival of what
most people believe to be the strongest language in the world
(Baron 1990).
Language policy is a new and rapidly developing field, the urgency and seriousness of which have resulted in activism by
scholars who feel responsible for correcting what they see as
injustices or blindness to the potential loss of linguistic diversity,
as well as in academic attempts to develop theories to explain
data from increasingly detailed descriptions of situations and
policies.
Bernard Spolsky
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Ager, Dennis E. 1999. Identity, Insecurity and Image: France and Language.
Clevedon and Philadelphia: Multilingual Matters.
Annamalai, E. 2001. Managing Multilingualism in India: Political and
Linguistic Manifestations. New Delhi: Sage.
Aunger, Edmund A. 1993. Regional, national and official languages in Belgium. International Journal of the Sociology of Language 104: 31-48.
Bamgbose, Ayo. 2000. Language and Exclusion: The Consequences of Language Policies in Africa. Beiträge zur Afrikanistik. Münster and Hamburg: LIT Verlag.
Baron, Dennis E. 1990. The English Only Question. New Haven, CT: Yale
University Press.
Bourhis, Richard Y. 2001. Reversing language shift in Quebec. In Can Threatened Languages Be Saved? ed. J. A. Fishman, 101-41. Clevedon and Avon: Multilingual Matters.
Cooper, Robert L. 1989. Language Planning and Social Change.
Cambridge: Cambridge University Press.
Cooper, Robert L., ed. 1982. Language Spread: Studies in Diffusion and
Social Change. Bloomington: Indiana University Press.
Coupland, Nicholas, Hywel Bishop, Betsy Evans, and Peter Garrett. 2006. Imagining Wales and the Welsh language: Ethnolinguistic subjectivities and demographic flow. Journal of Language and Social Psychology 25.4: 351-76.
Fishman, Joshua A. 1973. Language and Nationalism: Two Integrative
Essays. Rowley, MA: Newbury House.
———. 1977. Comparative study of language planning: Introducing a survey. In Language Planning Processes, ed. J. Rubin, B. H. Jernudd, J. Das Gupta, J. A. Fishman, and C. A. Ferguson, 31-40. The Hague: Mouton.
———. 1990. What is reversing language shift (RLS) and how can it succeed? Journal of Multilingual and Multicultural Development 11.1/2: 5-36.
———. 1991. Reversing Language Shift: Theoretical and Empirical Foundations of Assistance to Threatened Languages. Clevedon: Multilingual Matters.

Fishman, Joshua A., Charles A. Ferguson, and Jyotirinda Das Gupta. 1968.
Language Problems of Developing Nations. New York: Wiley.
Grenoble, Lenore A. 2003. Soviet Language Policy. Dordrecht, the
Netherlands: Kluwer Academic Publishers.
Harlow, Ray. 2004. Switzerland. In Encyclopedia of Linguistics, ed.
P. Strazny. London: Taylor and Francis.
Haugen, Einar. 1966. Language Conflict and Language Planning: The Case of Modern Norwegian. Cambridge, MA: Harvard University Press.
Hoffmann, Charlotte. 1995. Monolingualism, bilingualism, cultural pluralism and national identity: Twenty years of language planning in contemporary Spain. Current Issues in Language and Society 2.1: 59-90.
Hornberger, Nancy H., and Kendall A. King. 2001. Reversing language shift in South America. In Can Threatened Languages Be Saved? ed. J. A. Fishman, 166-94. Clevedon and Avon: Multilingual Matters.
Hrdegen, Stephan. 2001. The Fribourg linguistic case: Controversy about the language of instruction in schools in the light of freedom of language and equal educational opportunities in Switzerland. European Journal for Educational Law and Policy 5: 73-82.
Jernudd, Bjoern, and J. V. Neustupný. 1987. Language planning: For whom? In Proceedings of the International Colloquium on Language Planning, ed. L. LaForge, 71-84. Quebec: Presses de l'Université Laval.
Kaplan, Robert B., and Richard B. Baldauf. 2003. Language and
Language-in-Education Planning in the Pacific Basin. Dordrecht, the
Netherlands: Kluwer Academic Publishers.
Katsuragi, Takao. 2005. Japanese language policy from the point of view of public philosophy. International Journal of the Sociology of Language 175/176: 41-54.
Kloss, Heinz. 1966. German-American language maintenance efforts. In Language Loyalty in the United States, ed. J. Fishman, 206-52. The Hague: Mouton.
Krauss, Michael. 1991. The world's languages in crisis. Language 68.1: 4-10.
Laitin, David D., and Robert Reich. 2003. A liberal democratic approach to language justice. In Language Rights and Political Theory, ed. W. Kymlicka and A. Patten, 80-104. Oxford: Oxford University Press.
Landau, Jacob, and Barbara Kellner-Heinkele. 2001. Politics of Language in the Ex-Soviet Muslim States: Azerbaijan, Usbekistan, Kazakhstan, Kyrgyzstan, Turkmenistan and Tajikistan. London and Ann Arbor: C. Hurst & Co. and University of Michigan Press.
Lewis, E. Glyn. 1972. Multilingualism in the Soviet Union. The
Hague: Mouton.
May, Stephen. 2005. Language rights: Moving the debate forward. Journal of Sociolinguistics 9.3: 319-47.
May, Stephen, and Richard Hill. 2005. Māori-medium education: Current issues and challenges. International Journal of Bilingual Education and Bilingualism 8.5: 377-403.
McCarty, Teresa L. 2003. Revitalising indigenous languages in homogenising times. Comparative Education 29.2: 147-63.
Nekvapil, Jiří. 2006. From language planning to language management. Sociolinguistica 20: 92-104.
———. 2007. Language cultivation in developed contexts. In Handbook of Educational Linguistics, ed. B. Spolsky and F. M. Hult, 251-65. Oxford: Blackwell.
Neustupný, J. V., and Jiří Nekvapil. 2003. Language management in the Czech Republic. Current Issues in Language Planning 4.3/4: 181-366.
Ó Laoire, Muiris. 1996. An historical perspective of the revival of Irish outside the Gaeltacht, 1880-1930, with reference to the revitalization of Hebrew. In Language and State: Revitalization and Revival in Israel and Eire, ed. S. Wright, 51-75. Clevedon and Avon: Multilingual Matters.


Omoniyi, Tope. 2003. Local policies and global forces: Multiliteracy and Africa's indigenous languages. Language Policy 2.2: 133-52.
Ozolins, Uldis. 2003. The impact of European accession upon language policy in the Baltic States. Language Policy 2.3: 217-38.
Phillipson, Robert. 1992. Linguistic Imperialism. Oxford: Oxford University Press.
Pranjković, Ivo. 2001. The Croatian standard language and the Serbian standard language. International Journal of the Sociology of Language 147: 31-50.
Ricento, Thomas, ed. 2001. Ideology, Politics and Language Policies: Focus
on English. Amsterdam and Philadelphia: John Benjamins.
Skutnabb-Kangas, Tove, Robert Phillipson, and Mart Rannut. 1995.
Linguistic Human Rights: Overcoming Linguistic Discrimination. Berlin
and New York: Mouton de Gruyter.
Spolsky, Bernard. 2004. Language Policy, Key Topics in Sociolinguistics.
Cambridge: Cambridge University Press.
———. 2005. Māori lost and regained. In Languages of New Zealand, ed. A. Bell, R. Harlow, and D. Starks, 67-85. Wellington: Victoria University Press.
Zhou, Minglang, ed. 2004. Language Policy in the People's Republic of China: Theory and Practice since 1949. Dordrecht, the Netherlands: Kluwer Academic Publishers.

LAWS OF LANGUAGE
The Concept of Law
The philosophy of science defines the term "scientific law" as a meaningful universal hypothesis that is systematically connected to other hypotheses in the field and, at the same time, well corroborated on relevant empirical data (cf. Bunge 1967). A law is
called universal because it is valid at all times, everywhere, and
for all objects of its scope.
A system of laws is called a theory. The construction of a theory is the highest and most demanding goal of scientific research
and can be undertaken only if and when a number of interrelated laws have been found. There is much confusion about the
term "theory," especially in linguistics, where all kinds of formalisms, thoughts, approaches, descriptive tools, definitions, and concepts are called theories. The philosophy of science distinguishes two kinds of theories: 1) the axiomatic theories of logic
and mathematics and 2) the empirical theories in the factual sciences. While the first ones make statements only within a given
axiomatic system and can be used only to construct analytical
truths, the latter ones make statements about parts of the world.
The truth of an empirical theory and of its elements, the laws,
depends not only on internal correctness but also on the correspondence with the facts of reality, although every empirical theory must have an axiomatic kernel.
The value of theories and their components, the laws, lies not
only in their role as the containers of scientific knowledge but
also in the fact that there can be no explanation without at least
one law: A valid scientific explanation (the so-called deductive-nomological explanation; cf. Hempel and Oppenheim 1948) is a subsumption under laws, taking into account boundary conditions. Laws must not be confused with rules, which are either prescriptive or descriptive tools without any explanatory power; hence, grammars and similar formalisms also cannot explain anything. Another significant difference is that rules can be violated; laws (in the scientific sense) cannot.


Laws in the Study of Language and Text


In quantitative linguistics, the exact science of language
and text, distributional and functional kinds of laws are known.
The first kind takes the form of probability distributions; that is,
it makes predictions about the number of units of a given property. A well-known example of this kind is the Zipf-Mandelbrot
Law. The status of the corresponding phenomenon has been
discussed since the days of George K. Zipf, who was the first to
systematically study quantitative properties of language from
a scientific point of view. The law relates a) the frequency of a
word in a given text (in any language) to the number of words
with the given frequency (called frequency spectrum) and b) the
frequency of a word in relation to its rank (called rank-frequency
distribution). The first formulation by Zipf was later modified
and corrected by Benoit Mandelbrot, who derived the law from
the assumption that languages optimize their lexicons with
respect to code-production effort in the long run. This resulted
in the famous formula (1), which has the form of a rank-frequency distribution: If the words are arranged according to their
frequency, the most frequent word is assigned rank one, and so
on. The formula gives the frequency that a word should have at
a given rank:
(1)    f(r) = K / (b + r)^γ

where f(r) is the frequency, r the rank, b and γ empirical parameters, and K a normalizing constant that makes the probabilities sum up to 1.0.
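As a sketch, the rank-frequency side of the law can be computed for any text and set against the values predicted by formula (1). The toy text and the parameter values K, b, and γ below are illustrative assumptions, not fitted estimates; in practice the parameters are estimated from the observed frequencies.

```python
# Sketch of a rank-frequency distribution and the Zipf-Mandelbrot
# prediction f(r) = K / (b + r)**gamma. Text and parameters are
# illustrative only; real studies fit K, b, gamma to the data.
from collections import Counter

def rank_frequency(text):
    """Return [(rank, word, frequency)], rank 1 = most frequent word."""
    counts = Counter(text.lower().split())
    ordered = sorted(counts.items(), key=lambda kv: -kv[1])
    return [(r, w, f) for r, (w, f) in enumerate(ordered, start=1)]

def zipf_mandelbrot(r, K=0.1, b=1.0, gamma=1.0):
    """Predicted relative frequency at rank r (assumed parameters)."""
    return K / (b + r) ** gamma

text = "the cat saw the rat and the rat saw the cheese"
for r, w, f in rank_frequency(text):
    print(r, w, f, round(zipf_mandelbrot(r), 4))
```

On a text this short the fit is of course meaningless; the point is only the shape of the computation: order the words by frequency, assign ranks, and compare each observed frequency with f(r).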
Since the seminal works of Zipf and Mandelbrot, numerous
laws have been found. Other examples of distributional laws
are (in morphology and lexicon) the distribution of length,
polysemy, synonymy, age, part of speech (see word classes), and so on; (in syntax) the frequency distribution of syntactic constructions, the distribution of their complexity, depth of
embedding, information, and position in mother constituent; (in
semantics) the distribution of the lengths of paths in semantic
networks (see also semantic fields), semantic diversification, and so on. Any property and any linguistic unit studied so
far displays a characteristic probability distribution.
The second kind of law is called the functional type, because
these laws link two (or more) properties. An illustrative example
of this kind is Menzerath's Law (also called the Menzerath-Altmann Law), which relates the size of linguistic constituents to the size
of the corresponding construct. Thus, the (mean) length of the
syllables of a word depends on the number of syllables the
word consists of; the (mean) length of the clauses in a sentence
depends on the length of the sentence (measured in terms of the
number of clauses it consists of). The most general form of this
law is given by formula (2):
(2)

y = A x^b e^(-cx)

where y is the mean length of the constituents, x the length of the construct, and A, b, and c are parameters. (This law predicts the function [2] but not the values of its parameters; they are estimated empirically on the data under analysis. Future research may provide an enhanced version of the law that also determines these parameters.) Experience shows that the parameters are determined mainly by the level of the units under study. They increase from the level of sound length gradually to the sentence and suprasentence level. Figure 1 gives an impression of a typical curve. Other examples are the dependence of word (or morph) frequency on word (or morph) length, and the dependence of the frequency of syntactic constructions on their complexity, of polysemy on length, of length on age, and so on.

Figure 1. The functional dependence of mean syllable length (y-axis) on word length (x-axis) in Hungarian. The line represents the prediction; the marks show the empirical data points.

A special variant of a functional law is the developmental one. Here, a property is related to time. The best-known example is the Piotrowski Law, which represents the development (increase and/or decrease) of the proportion of new units or forms over time. This law describes a typical growth process and can be derived from a simple differential equation with the solution (3):

(3)

p = c / (1 + a e^(-bt))

where p is the proportion of new forms at time t, c is the saturation value, and a and b are empirical parameters. Figure 2 shows the increase of the forms with /u/ at the cost of the older form with /a/ in the German word ward > wurde (/vart/ > /vurde/) in the time period from 1445 to 1925. As the graph shows, the replacement was very limited for the first 200 years, but even a much shorter time span can provide enough information to predict the development over the next several hundred years.

Figure 2. Typical curve representing the replacement of a linguistic unit by a new one.

Another variant of this third kind of law is based on (discrete) linguistic instead of (continuous) physical time. The simplest way to operationalize linguistic time is reference to text position. In oral texts, there is a direct correspondence of the sequence of linguistic units to physical time intervals. Several linguistic characteristics can be investigated using indices that relate their frequency to the current text position, among them the type-token ratio (TTR). At each text position, the number of types that have occurred up to that point is counted, which yields a monotonically increasing curve, because the number of types used before a given text position cannot decrease in the course of the rest of the text. A straightforward theoretical derivation of this law was given by Gustav Herdan (1966), represented by the simple formula (4):

(4)

y = a x^b

where y is the number of types, x the number of tokens (= text position), and b a text characteristic. The parameter b is also an indicator of the morphological type of the language under study if word forms are considered, because morphologically rich languages display a faster increase in word-form types than isolating languages.

A problem of the TTR, if used for text comparison, is that it is not independent of the overall text length. Therefore, more complicated formulae are used to take this influence into account, or quite different models (cf. Popescu and Altmann 2006a, 2006b) are applied.

Recent investigations have found that other linguistic units show a similar behavior in their text dynamics (letters, morphs, syntactic constructions, syntactic function types, etc.). However, depending on the size of their inventory in language (which may vary over several orders of magnitude; compare, e.g., the size of an alphabet or a phoneme system to the size of a lexicon), different models have to be used. The TTR of syntactic units, for example, is shown in Figure 3.
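The TTR curve and formula (4) can be illustrated with a short script. This is a sketch: the toy text and the log-log least-squares estimate of b are for illustration only; serious studies use large corpora and the more refined models cited above.

```python
# Sketch: compute a type-token-ratio (TTR) trajectory and fit Herdan's
# law y = a * x**b by least squares on log y = log a + b log x.
import math

tokens = ("the cat sat on the mat and the dog sat by the cat "
          "while the mat lay under the dog").split()

types_at = []          # number of distinct word forms up to each position
seen = set()
for tok in tokens:
    seen.add(tok)
    types_at.append(len(seen))

# Least-squares slope b in log-log coordinates.
xs = [math.log(i + 1) for i in range(len(tokens))]
ys = [math.log(y) for y in types_at]
n = len(xs)
mean_x, mean_y = sum(xs) / n, sum(ys) / n
b = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / \
    sum((x - mean_x) ** 2 for x in xs)

print(types_at[-1], round(b, 2))  # final type count and fitted exponent
```

As the article notes, the fitted b lies between 0 and 1 for such monotonically but sublinearly growing curves; for word forms it would be larger in morphologically rich languages than in isolating ones.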

Theory Construction

Currently, there are two approaches to the construction of a linguistic theory (in the sense of the philosophy of science): 1) synergetic linguistics and 2) Gejza Wimmer and Gabriel Altmann's unified theory.

The basic idea behind synergetic linguistics (cf. Köhler 1986, 2005) is the aim to integrate the separate laws and hypotheses found so far into a complex model that not only describes the linguistic phenomena but also provides a means to explain them. This is achieved by introducing the central axiom that language is a self-regulating and self-organizing system. An explanation of the existence, properties, and changes of linguistic, and more generally semiotic, systems is not possible without the aspect of the (dynamic) interdependence of structure and function. The genesis and evolution of these systems must be attributed to repercussions of communication upon structure (cf. Bunge 1998 and Köhler and Martinková 1998).

Laws of Language

Figure 3. The TTR of syntactic constructions in a text (types, y-axis, against tokens, x-axis). The smooth line corresponds to the prediction; the irregular line represents the empirical data.

Synergetic modeling in linguistics starts from axiomatically assumed requirements that a semiotic system must meet: the coding requirement (semiotic systems have to provide a means to create meaningful expressions) and the requirements of coding and decoding efficiency, of memory saving, of transmission security, of minimization of effort, and many others.
The other approach to theory construction in linguistics is Wimmer and Altmann's unified theory. Integration of separately existing laws and hypotheses starts from a very general differential (alternatively: difference) equation, as well as two very general assumptions: 1) If y is a continuous linguistic variable (i.e., some property of a linguistic unit), then its change over time or with respect to another linguistic variable will be determined in any case by its momentary value. Hence, a corresponding mathematical model should be set up in terms of its relative change (dy/y). 2) The independent variable x that has an effect on y also has to be taken into account in terms of its relative change (i.e., dx/x). The discrete approach is analogous; one considers the relative difference Δy_x / y_x. Hence, the general formulas are dy/y = g(x) dx and Δy_(x-1) / y_(x-1) = g(x). The solutions of these equations are readily interpretable linguistically and yield the same results as the synergetic approach. The great majority of laws known up to now can be derived from these equations.
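To make this concrete, here is a sketch of how two of the laws above fall out of the continuous equation dy/y = g(x) dx for simple choices of g; these particular forms of g are the standard ones in this literature, and the constants a and A absorb the constants of integration:

```latex
% Herdan's law, formula (4): take g(x) = b/x
\frac{dy}{y} = \frac{b}{x}\,dx
\;\Rightarrow\; \ln y = b \ln x + C
\;\Rightarrow\; y = a x^{b}

% Menzerath-Altmann law, formula (2): take g(x) = b/x - c
\frac{dy}{y} = \Bigl(\frac{b}{x} - c\Bigr)\,dx
\;\Rightarrow\; \ln y = b \ln x - c x + C
\;\Rightarrow\; y = A x^{b} e^{-cx}
```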
Both models, the unified and the synergetic, turn out to be
two representations of the same basic assumptions. The synergetic model allows easier treatment of multiple dependencies for
which partial differential equations must be used in the unified
model.
Reinhard Köhler
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Altmann, Gabriel. 1980. Wiederholungen in Texten. Bochum, Germany: Brockmeyer.
Bertalanffy, Ludwig von. 1968. General System Theory: Foundations, Development, Applications. New York: George Braziller.
Bunge, Mario. 1967. Scientific Research I, II. Berlin and Heidelberg: Springer.
———. 1998. Semiotic systems. In Systems: A New Paradigm for the Human Sciences, ed. Gabriel Altmann and Walter A. Koch, 337–49. Berlin and New York: Walter de Gruyter.
Haken, Hermann. 1978. Synergetics. Berlin and Heidelberg: Springer.
Haken, Hermann, and R. Graham. 1971. Synergetik. Die Lehre vom Zusammenwirken. Umschau 6: 191.
Hempel, Carl G., and P. Oppenheim. 1948. Studies in the logic of explanation. Philosophy of Science 15: 135–75.
Herdan, Gustav. 1966. The Advanced Theory of Language as Choice and Chance. Berlin: Springer.
Hřebíček, Luděk. 1997. Lectures on Text Theory. Prague: Oriental Institute.
Köhler, Reinhard. 1986. Zur linguistischen Synergetik: Struktur und Dynamik der Lexik. Bochum, Germany: Brockmeyer.
———. 1990. Elemente der synergetischen Linguistik. Glottometrika 12: 179–88.
———. 1995. Bibliography of Quantitative Linguistics = Bibliographie zur quantitativen Linguistik = Bibliografija po kvantitativnoj lingvistike. Amsterdam: Benjamins.
———. 2005. Synergetic linguistics. In Quantitative Linguistik: Ein internationales Handbuch [Quantitative Linguistics: An International Handbook], ed. Reinhard Köhler, Gabriel Altmann, and Rajmund G. Piotrowski, 760–75. Berlin and New York: Walter de Gruyter.
Köhler, Reinhard, and Zuzana Martinková. 1998. A systems theoretical approach to language and music. In Systems: A New Paradigm for the Human Sciences, ed. Gabriel Altmann and Walter A. Koch, 514–46. Berlin and New York: Walter de Gruyter.
Mačutek, Ján, and Gabriel Altmann. 2007. Discrete and continuous modeling in quantitative linguistics. Journal of Quantitative Linguistics 14: 81–94.
Popescu, Ioan-Iovitz, and Gabriel Altmann. 2006a. Some aspects of word frequencies. Glottometrics 13: 23–46.
———. 2006b. Some geometric properties of word frequency distributions. Göttinger Beiträge zur Sprachwissenschaft 13: 87–98.
Wimmer, Gejza, and Gabriel Altmann. 2005. Unified derivation of some linguistic laws. In Quantitative Linguistik: Ein internationales Handbuch [Quantitative Linguistics: An International Handbook], ed. Reinhard Köhler, Gabriel Altmann, and Rajmund G. Piotrowski, 760–75. Berlin and New York: de Gruyter.
Zipf, George Kingsley. [1935] 1968. The Psycho-Biology of Language: An Introduction to Dynamic Philology. 2d ed. Boston: Houghton Mifflin. Cambridge, MA: MIT Press.
———. 1949. Human Behaviour and the Principle of Least Effort. Reading, MA: Addison-Wesley.

LEARNABILITY
generative grammar shifted the foundations of theoretical linguistics away from discovery procedures (the automatic construction of an optimal grammar) to the problem of language learnability, the question of how a natural language could be learned in principle. This statement of the problem is too vague to be useful, so let us break the question into subparts and consider them in turn.
The study of language learning might proceed from the observation and investigation of actual human children engaged in the
process of learning their mother tongue or tongues. However fascinating and useful this approach is, it is fraught with a number
of difficulties. Real children are engaged in a number of different
tasks and are changing along a number of different dimensions
while in the process of learning their first language. The developmental psycholinguist must, then, be careful of all these different
factors.
The investigation of learnability seeks to circumvent these difficulties by considering language learning as a problem in computational logic. The researcher in language learnability seeks to
construct an explicit algorithm that will produce a grammar for
a target language after finite exposure to evidence from that language. Such a researcher takes the rarefied view of language as
a set of strings, corresponding to the grammatical sentences
of that language. He or she supposes that the learner is actually
an algorithm that takes as input a text, an infinite sequence of
sentences. The text is constructed by drawing strings from the
language and presenting them, one at a time, to the learner. In
this case, the learner is presented with positive-only evidence;
he/she is given information about sentences that are in the language but no information about sentences that are outside the
language. An alternative learning setting would be to allow the
learner to be tutored by giving him/her strings that are marked
for grammaticality. Such information demonstrably simplifies the learning task, but in real learning, the child is unlikely
to receive systematic evidence about grammaticality. As a result,
learnability research has generally proceeded from the assumption of positive-only evidence.
After each example is presented to the learner, that learner
makes a guess about the grammar of the target language. A
learner is said to converge to a grammar for a language just in
case the learner hypothesizes that grammar after finite exposure
to the text and never alters the hypothesis after that. If the grammar is correct, then the learner is said to have learned the target
language; in other words, learning a language means that the learner has
hit the correct grammar and never changes his/her mind after
that. We only require, at this point, that the grammar generate
the correct set of sentences; we have not said anything about
how the grammar does so, and so we place no constraints on the
structural descriptions assigned to sentences.
A language is learnable if there exists a learner who, upon
finite exposure to the language, learns the language in the aforementioned sense. A set of languages is learnable if the leaner can
learn every language in the set.
We can turn to an early learnability result from E. M. Gold (1967; see also Osherson, Stob, and Weinstein 1986). Imagine a set of languages defined as follows: We define a language, call it L0, which is an infinite set consisting of the symbol a repeated an arbitrary number of times:

L0 = { a, aa, aaa, ... }

Otherwise, a language Li in the set consists of all strings of a's shorter than, and including, a repeated i times. Learning a language in this set just means converging to the i that is the index for the language Li.
It is easy to see that this set of languages is not learnable from
positive-only evidence. Suppose that the longest string that the
learner has seen to date is of length n. Nothing about the text will
allow the learner to distinguish between L0 and Ln, and so the
learner will be incapable of converging. Thus, this class of languages is not learnable.
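The failure to converge can be made concrete with a small simulation. The "guess the longest string seen so far" learner below is a hypothetical but consistent strategy for this class: it identifies every finite Li in the limit, yet on a text for L0 it must change its hypothesis forever.

```python
# Sketch of Gold's example: a conservative learner for the class
# {L0, L1, L2, ...}, where L_i = {a, aa, ..., a^i} and L0 is infinite.
def conservative_learner(text):
    """Yield a hypothesis after each example: the index i of the guessed
    language L_i. Guessing the maximum length seen so far is consistent
    with every positive-only text for this class."""
    longest = 0
    for s in text:
        longest = max(longest, len(s))
        yield longest

# A text for L0 that enumerates ever-longer strings of a's.
text_for_L0 = ("a" * n for n in range(1, 101))

hypotheses = list(conservative_learner(text_for_L0))
changes = sum(1 for p, c in zip(hypotheses, hypotheses[1:]) if p != c)
print(changes)  # 99 changes in 100 steps: the learner never settles
```

No finite prefix of the text rules out the finite languages L_n, so any learner (not just this one) that succeeds on every L_i must keep revising on a text for L0, which is exactly Gold's point.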
Gold's result might seem to spell disaster for the learnability project. The derivational machinery needed for the languages that Gold used to prove his theorem is much simpler than what would be required for the set of natural languages, yet Gold's set is unlearnable. Gold seems to have shown that the natural languages are not learnable in the sense outlined here.
We needn't fret for too long over this particular bugbear, since another way of thinking about Gold's result is near at hand. In particular, the set of learnable languages does not contain the set that Gold constructed for his proof. Some have found it tempting to massage Gold's result into an argument that the learner is equipped with prior information (innate knowledge) about the set of natural languages, although this goes beyond the actual content of the result.
There are a number of interesting responses to Gold's result. Complexity bounds can be placed on the grammars that the learner can consider. M. Kanazawa (1998) has shown that restricted sets of categorial grammars are string learnable, that is, learnable using the kind of text presentation that we have considered. Gold's result entails that the entire class of categorial grammars cannot be learned using this kind of evidence. Note that Kanazawa's result does not conflict with Gold's theorem, since it holds only of a particular subset and not the entire class, the latter case being what Gold's theorem excludes.
Alternatively, learners might receive more evidence about the
target language than is present in a positive-only text. K. Wexler
and P. Culicover (1980) developed a proof of the learnability of
the set of rules in the transformational component of a 1970s-style transformational grammar. In their system, the learner is presented with pairs consisting of the surface syntactic string along with a base structure. This base structure is akin to the level of deep structure (see underlying structure and surface structure), where the syntactic representation is a kind of mentalese, the language of thought, and would be invariant
across languages (the Universal Base Hypothesis). The learner
is presented occasionally with both a grammatical sentence and
its meaning. The proof shows both that the transformational rule
component could be learned and that a complexity bound could
be placed on the input evidence that the learner needed in order
to converge (their Degree 2 Learnability result).
S. Pinker (1984) considered the case in which the learner has
access to the string and a representation of its semantic content.
This process, called semantic bootstrapping, uses a set of heuristic rules to link semantic categories to syntactic categories.

Costa Florencio (2003) has shown that the full range of categorial
grammars can be learned if the learner is presented with unlabeled structures.
Another response to Gold's result is to impose different types of constraints on the learner's hypothesis space. For example, the idea that universal grammar consists of a set of invariant principles whose expression is regulated by a finite set of parameters has played a seminal role in linguistic theory over the past quarter of a century (see principles and parameters theory and language acquisition). The learner is taken as being faced with the finite task of discovering the correct value of each parameter, where each parameter can take on one of a finite set of values, given a text consisting of simple grammatical sentences.
The parametric approach usually assumes that the learner
is given positive-only input. After a sentence is presented to
the learner, it produces a hypothesis, possibly by changing the
value(s) of one or more parameters. It is obvious that even a relatively small set of parameters could produce an enormous space
of languages. The space of languages might, for example, contain local maxima. A local maximum might look correct to the learner, since it would always yield a structural analysis for any input sentence, but the analysis it assigns would be systematically incorrect. If this happened, then the set of languages defined by that parameter space would not be learnable relative to a simple learning device with positive-only evidence.
In response to this problem, a number of different algorithms
were proposed. R. Clark (1992) proposed using a kind of artificial
evolution to converge to the target. A population of grammars
would be exposed to the input text, with the best performers
allowed to combine and produce offspring that had inherited
properties from the parent grammars. This approach uses the
parallelism implicit in a population to avoid the problem of local
maxima. This approach to learning falls into the class of probably
approximately correct (or PAC) learning; in this framework, the
learner is guaranteed to converge within a margin of error, where
the margin of error can be made arbitrarily small, but not zero.
Clark and I. Roberts (1993) extended this work to try to account
for language change. P. Niyogi (2006) has developed a more
sophisticated computational approach to this problem (see also
Yang 2002).
E. Gibson and Wexler (1994) tried to develop a learner that
used an algorithm that tests to see whether resetting a parameter to
a new value actually improves the learner's performance on the
input example. In order to avoid local maxima, they proposed
ordering the parameters according to a maturational sequence.
Readers should consult Frank and Kapur (1996) for an extensive
critique. S. Kapur (1991) developed a learning algorithm that
avoids local maxima by using a statistical model of indirect negative evidence. This algorithm, once again, falls clearly within the class of
PAC learners.
Others have proposed using triggering evidence to set parameters (Dresher and Kaye 1990). On this model, each value of a
parameter would be associated with the abstract description of
a piece of triggering evidence that would cause the parameter
to be set to that value. The learner would scan the input text,
searching for examples that matched the description of a trigger; only then would the learner set the parameter to the exemplified value.
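As a toy illustration of this trigger-driven scheme: the two parameters, their default values, and the trigger predicates below are hypothetical, and real proposals state triggers over abstract structural descriptions rather than surface category strings.

```python
# Sketch of trigger-based parameter setting. Sentences are lists of
# category labels; each parameter value is paired with a predicate that
# describes its triggering evidence.
TRIGGERS = {
    # parameter: (value set by the trigger, predicate over a sentence)
    "verb_second": (True, lambda s: len(s) > 1 and s[1] == "V" and s[0] != "Subj"),
    "null_subject": (True, lambda s: "Subj" not in s),
}

def set_parameters(text, defaults):
    """Scan the input text; a parameter is reset only when an example
    matching its trigger description is encountered."""
    grammar = dict(defaults)
    for sentence in text:
        for param, (value, matches) in TRIGGERS.items():
            if matches(sentence):
                grammar[param] = value
    return grammar

defaults = {"verb_second": False, "null_subject": False}
text = [["Subj", "V", "Obj"],   # matches no trigger
        ["Obj", "V", "Subj"],   # non-subject first, verb second: V2 trigger
        ["V", "Obj"]]           # no overt subject: null-subject trigger
print(set_parameters(text, defaults))
# -> {'verb_second': True, 'null_subject': True}
```

Note that such a learner only moves when a trigger appears, which is one way to avoid the wandering behavior that causes trouble with local maxima.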
Charles Yang (2002) has imported a number of mathematical
tools from population biology to develop a sophisticated model
of parameter setting that is clearly inspired by evolutionary theory. Equally interesting work has been done on the learnability of
optimality theory (Tesar and Smolensky 2000) using techniques drawn from statistical machine learning. Although no full
proof of the learnability of parametric approaches yet exists, the statistical approaches, as well as work in conventional machine learning (see Manning and Schütze 1999), promise to yield new insights
into language learning, language variation, and language change.
Robin Clark
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Clark, R. 1992. The selection of syntactic knowledge. Language Acquisition 2: 83–149.
Clark, R., and I. Roberts. 1993. A computational model of language learnability and language change. Linguistic Inquiry 24: 299–345.
Dresher, E., and J. Kaye. 1990. A computational learning model for metrical phonology. Cognition 34: 137–95.
Florencio, C. C. 2003. Learning categorial grammars. Ph.D. diss., Universiteit Utrecht.
Frank, R., and S. Kapur. 1996. On the use of triggers in parameter setting. Linguistic Inquiry 27: 623–60.
Gibson, E., and K. Wexler. 1994. Triggers. Linguistic Inquiry 25: 407–54.
Gold, E. M. 1967. Language identification in the limit. Information and Control 10: 447–74.
Kanazawa, M. 1998. Learnable Classes of Categorial Grammars. Stanford, CA: CSLI Publications.
Kapur, S. 1991. Computational learning of languages. Ph.D. diss., Cornell University.
Manning, C. D., and H. Schütze. 1999. Foundations of Statistical Natural Language Processing. Cambridge, MA: MIT Press.
Niyogi, P. 2006. The Computational Nature of Language Learning and Evolution. Cambridge, MA: MIT Press.
Osherson, D., M. Stob, and S. Weinstein. 1986. Systems That Learn: An Introduction to Learning Theory for Cognitive and Computer Scientists. Cambridge, MA: MIT Press.
Pinker, S. 1984. Language Learnability and Language Development. Cambridge, MA: Harvard University Press.
Tesar, B., and P. Smolensky. 2000. Learnability in Optimality Theory. Cambridge, MA: MIT Press.
Wexler, K., and P. Culicover. 1980. Formal Principles of Language Acquisition. Cambridge, MA: MIT Press.
Yang, C. D. 2002. Knowledge and Learning in Natural Language. Oxford: Oxford University Press.

LEFT HEMISPHERE LANGUAGE PROCESSING


Some History
The left hemisphere (LH) has been considered to be the primary
locus of language-specific processing for centuries. We now
know that the right hemisphere (RH) has considerable language abilities and should no longer be considered the minor
hemisphere. Recent research has demonstrated that both the left
and right hemispheres contribute to varying aspects of language
processing (Beeman and Chiarello 1997). Even so, historical and
current work still regards the left hemisphere as having a primary
and significant role in language processing.

One of the first functional accounts of brain-language relations geared to the LH, the Wernicke-Lichtheim model, separated language into activities such as listening, reading, writing, and speaking. Based on classic lesion localization efforts, these activities were thought to be localized in different LH brain regions. Geschwind's (1965) model proposed that broca's area, a region located in the left inferior frontal gyrus (LIFG) at the foot of the motor strip near regions controlling mouth and tongue movements, was the seat of speech production, while wernicke's area, in the posterior superior temporal regions adjacent to the primary auditory cortex, was the seat of auditory comprehension. While this simple model was appealing, it has been clear since at least the 1970s that this view of language is likely inaccurate. For example, individuals diagnosed with Broca's aphasia have auditory comprehension deficits that are exposed on simple experimental probing and have production deficits that go well beyond those described by fluency measures (Zurif and Caramazza 1976; Friedmann 2006).
Beginning in the 1970s, efforts were made to characterize language in the LH by reference to linguistic levels of analysis along
the lines of syntax, semantics, and phonology. The LIFG
has been suggested to be critical for syntax, while the temporal
lobe has been suggested to be important for the normal functioning of word-level semantics. Moreover, the posterior superior temporal gyrus (pSTG) has been suggested to play a critical
role in phonology. Even within these linguistic divisions, efforts
have been made to discover exquisitely detailed neurological
instantiations. The trace deletion hypothesis (Grodzinsky 2006),
for example, has taken a minimalist position on syntax-brain relations (see syntax, neurobiology of) and has suggested that
only sentence constructions that are derived from the displacement of an argument (e.g., a noun phrase) and that yield a trace (see also movement) rely on an intact Broca's area, and that other hypothesized aspects of syntax rely on more widely distributed anatomical regions. Other accounts of the relation between syntax and the brain suggest that only those constructions that are defined as complex rely on an intact LIFG. Alternative theories suggest that syntax, broadly defined, requires a neuroanatomical language network consisting of Broca's region as well as STG, the middle temporal gyrus (MTG), and the white matter fiber tracts (arcuate fasciculus) connecting these regions.

A Real-Time Perspective
The most current approach to brain-language relations as we
near the end of the first decade of the twenty-first century is formulated in terms of processing metaphors such as activation
and maintenance that require a real-time analysis. It has been
suggested that within the LH, Brocas area is required for fastacting, relatively automatic and reflexive processing routines;
more frontal areas are critical for executive functions underlying language processing, including selecting among alternatives;
left posterior temporal areas seem important for activating and
maintaining argument structure, an aspect of lexical-semantics
or conceptual structure.
Take, for example, the activation of meaning-related word
forms during sentence comprehension. It has been argued that
unimpaired individuals initially activate multiple meanings of
ambiguous words, regardless of the context of the sentence.

There is evidence that this exhaustive access is controlled immediately by the LH (Burgess and Simpson 1988). Soon after, a
lexical choice is made on the basis of frequency and context.
Evidence from aphasia also supports the role of the left anterior
frontal cortex in lexical access. For example, individuals with Broca's aphasia appear to show a slow rise time of the initial activation of multiple meanings, while those with Wernicke's aphasia evince normal patterns (Prather et al. 1991). Other activation accounts suggest that individuals with Broca's aphasia (with damage to LIFG) underactivate lexical forms, while those with Wernicke's aphasia (with damage to STG) overactivate them (Blumstein and Milberg 2000).
The role of LIFG in real-time processing also appears to extend
to the comprehension of sentences that contain displaced arguments or those with filler-gap dependencies, for example, in
object relative (OR) constructions (e.g., The audience liked the
wrestler that the priest condemned *____ for foul language) where
a direct object argument or filler (e.g., wrestler) has been displaced from its canonical, post-verb position or gap (noted by *).
Individuals with Broca's aphasia do not activate the filler at the gap in real time, unlike what is observed for neurologically intact individuals and those with Wernicke's aphasia or RH lesions (Swinney et al. 1996). Thus, a real-time processing deficit may underlie the inability of individuals with Broca's aphasia to ultimately comprehend these constructions when they are probed with simple sentence-picture matching tasks or grammaticality judgments.

Variability
Much of the work detailing the role of the LH in language processing has been based on descriptions of neuroanatomy conducted in the latter third of the nineteenth and the early twentieth centuries. K. Brodmann (1909) suggested that the most functionally relevant parcellation of the brain is by cytoarchitectonics (cellular composition), but the map of Brodmann's Areas is based on manually drawn borders of a single brain. K. Amunts
and colleagues (1999) examined 10 postmortem brains, and the
borders for each brain were automatically drawn and superimposed on a template to produce a group cytoarchitectonic
map. Large intersubject variability was uncovered, perhaps partially explaining why so much variability exists in the mapping
between behavior and anatomy from both lesion and functional
imaging studies.
Another possible contributor to intersubject variability is the
assumption of dead tissue only in and around the lesion. For
many years, investigators have assumed that structural lesions
(and the ischemic penumbra surrounding the lesion) were the
primary loci contributing to language deficits. With the advent
of more refined neuroimaging technology, such as perfusion
weighted and diffusion tensor imaging, researchers have been
investigating areas of the brain that are found to be structurally
intact yet not receiving an optimal supply of blood flow. These
hypoperfused regions give way to functional lesions inside seemingly intact neural tissue (Hillis 2007; Love et al. 2002).

The Role of Functional Neuroimaging in the Investigation of Left Hemisphere Language
Lesion studies alone must be interpreted with caution as these
can only provide information regarding a specific (i.e., the damaged) neural region's necessity to perform a particular
language task. Functional neuroimaging patterns, on the other
hand, describe the level of recruitment of specific area(s), not
the necessity of only the lesioned area for the process itself (Hillis
2007). It is the fusion of these and other methodologies that best
aids in the understanding and modeling of the brain basis of,
and networks involved in, language processing.
Neuroimaging research has demonstrated that the LH is particularly well suited for language processing, regardless of the
modality of language input (auditory or visual, as is found in
sign languages; Hickok, Love-Geffen, and Klima 2002). The
literature has demonstrated a LH bias for the neural circuitry
involved in the processing of complex versus simple sentence
constructions. More specifically, it has been argued that there
exists active recruitment of BA 44 and BA 45 (pars opercularis
[Stromswold et al. 1996] and pars triangularis [Caplan and
Waters 1999]) of the LH during the parsing of complex sentence
constructions (e.g., the filler-gap constructions described earlier). Yet other reports examining sentence comprehension have
found anterior temporal cortex activation, including STG and
MTG (e.g., Humphries et al. 2005; Stowe et al. 1999). It is quite
likely that the discrepancies found in the imaging literature are
due to varying methods of presentation, and differing behavioral
requirements of the participants, as well as experimental design
issues and analysis procedures.

The Integration of Language Processing in the Left Hemisphere
Work from multiple tasks and methodological techniques has
been integrated to form the basis of neurocognitive models of
language processing. These models capture the choreographed
workings of neural regions during language processing. One
such model posited by A. Friederici (2002) argues for an LH
biased temporofrontal network. According to this model, identification of a word into a grammatical category (e.g., noun, verb,
determiner, etc.) begins at about 200 milliseconds after the word
is encountered in the speech stream, localized in the regions surrounding the anterior superior temporal cortex. At this temporal
point, such grammatical categories are placed into a hierarchical
syntactic form, which relies on the regions surrounding Broca's
area. At about 300–500 milliseconds after a word is encountered, the lexical entry is accessed, which allows for subsequent syntactic integration via Broca's area and semantic integration via
temporal lobe regions.

Translational Research
Finally, the investigation of language processing in the LH has
yielded a growing enterprise devoted to mapping recovery of
function in aphasia following brain damage. Converging evidence from clinical studies along with functional neuroimaging studies have demonstrated that, depending on individual
factors such as size and extent of lesion, premorbid handedness, and so on, recovery of language function may include the
undamaged regions of the LH language processing network or
homologous RH regions (Heiss et al. 1999; Kinsbourne 1971).
Treatment of language disorders for individuals with brain
damage, and the subsequent behavioral and neural changes
that are observed, may help illuminate brain-behavior mapping in both the left and the right hemispheres (e.g., Thompson
and Shapiro 2007).
Tracy Love and Lewis P. Shapiro
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Amunts, K., Axel Schleicher, U. Bürgel, Hartmut Mohlberg, Harry Uylings, and Karl Zilles. 1999. Broca's region revisited: Cytoarchitecture and intersubject variability. Journal of Comparative Neurology 412 (August): 319–41.
Beeman, M., and C. Chiarello, eds. 1997. Right Hemisphere Language Comprehension: Perspectives from Cognitive Neuroscience. Hillsdale, NJ: Lawrence Erlbaum.
Blumstein, S., and W. Milberg. 2000. Comprehension in Broca's and Wernicke's aphasia: Singular impairment. In Language and the Brain, ed. Y. Grodzinsky, L. P. Shapiro, and D. Swinney, 167–83. San Diego: Academic Press.
Brodmann, K. 1909. Vergleichende Lokalisationslehre der Großhirnrinde in ihren Prinzipien dargestellt auf Grund des Zellenbaues. Leipzig: J. A. Barth.
Burgess, C., and G. Simpson. 1988. Cerebral hemispheric mechanisms in the retrieval of ambiguous word meanings. Brain and Language 33 (March): 86–103.
Caplan, D., and G. Waters. 1999. Verbal working memory capacity and language comprehension. Behavioral and Brain Sciences 22.1: 77–94.
Friederici, A. 2002. Towards a neural basis of auditory sentence processing. Trends in Cognitive Sciences 6.2: 78–84.
Friedmann, N. 2006. Speech production in Broca's agrammatic aphasia: Syntactic tree pruning. In Broca's Region, ed. Y. Grodzinsky and K. Amunts, 63–82. New York: Oxford University Press.
Geschwind, N. 1965. Disconnection syndromes in animals and man. Brain 88: 585–644.
Grodzinsky, Y. 2006. A blueprint for a brain map of syntax. In Broca's Region, ed. Y. Grodzinsky and K. Amunts, 83–107. New York: Oxford University Press.
Heiss, W., J. Kessler, A. Thiel, et al. 1999. Differential capacity of left and right hemispheric areas for compensation of poststroke aphasia. Annals of Neurology 45.4: 430–8.
Hickok, G., T. Love-Geffen, and E. Klima. 2002. Left temporal lobe supports sign language comprehension. Brain and Language 82.2: 167–78.
Hillis, A. 2007. Magnetic resonance perfusion imaging in the study of language. Brain and Language 102.2: 165–75.
Humphries, C., T. Love, D. Swinney, and G. Hickok. 2005. Response of anterior temporal cortex to syntactic and prosodic manipulations during sentence processing. Human Brain Mapping 26.2: 128–38.
Kinsbourne, M. 1971. The minor cerebral hemisphere as a source of aphasic speech. Archives of Neurology 25.4: 302–6.
Love, T., D. Swinney, E. Wong, and R. Buxton. 2002. Perfusion imaging and stroke: A more sensitive measure of the brain bases of cognitive deficits. Aphasiology 16.9: 873–83.
Prather, P., L. Shapiro, E. Zurif, and D. Swinney. 1991. Real-time examinations of lexical processing in aphasics. Journal of Psycholinguistic Research 20.4: 271–81.
Stowe, L., A. Paans, A. Wijers, et al. 1999. Sentence comprehension and word repetition: A positron emission tomography investigation. Psychophysiology 36.6: 786–801.
Stromswold, K., D. Caplan, N. Alpert, and S. Rauch. 1996. Localization of syntactic comprehension by positron emission tomography. Brain and Language 52.3: 452–73.
Swinney, D., E. Zurif, P. Prather, and T. Love. 1996. Neurological distribution of processing operations underlying language comprehension. Journal of Cognitive Neuroscience 8.2: 174–84.
Thompson, C., and L. P. Shapiro. 2007. Complexity in treatment of sentence deficits in aphasia. American Journal of Speech-Language Pathology 16: 30–42.
Zurif, E., and A. Caramazza. 1976. Psycholinguistic structures in aphasia: Studies in syntax and semantics. In Studies in Neurolinguistics, ed. N. Avakian-Whitaker and H. Whitaker, 260–92. New York: Academic Press.

LEGAL INTERPRETATION
The judicial practice of legal interpretation provides the model for other versions of legal interpretation, such as the interpretation of law by practicing lawyers. Interpretation of three types of texts (prior judicial decisions, which serve as precedents for a present interpretive exercise; statutes; and constitutions) provides the model for interpretation of other texts, such as regulations and treaties. Legal interpretation has a temporal and an institutional dimension. Decision makers interpret texts written earlier, sometimes substantially earlier, and produced by institutions different from the ones engaged in the interpretive enterprise (if only because of changes in personnel). These characteristics generate some problems common to all three forms of legal interpretation.

Case Law and the Interpretation of Precedents


Prior judicial decisions play an important part in every system of legal interpretation. Civil law systems, such as those in France and Germany, are committed to a legal ideology in which judges revert to statutory texts directly, without reference to any prior judicial decisions. Judicial decisions in such systems rarely refer to prior decisions, but even in such systems precedent plays an important role before courts issue their opinions (Lasser 2004). The common law system of unwritten law, the foundation of law in Great Britain and the United States, was developed by the courts themselves to regulate large portions of the law of property, contracts, and torts (accidents, among other topics). The common law is "unwritten" only in the sense that the texts on which it is based are prior judicial decisions rather than legislative enactments.
Interpreting a judicial decision in order to apply it to a new problem involves several analytic steps. Typically, a decision will describe a case's facts and articulate several legal rules that the court says lead it to its conclusion. The interpreter later must distinguish the decision's holding from any obiter dicta found in the decision. On standard accounts, the holding is the rule or rules necessary to support the conclusion; dicta are any rule or rules that could be eliminated from the court's discussion without altering its conclusion (Marshall 1997). Later courts do not always disregard dicta, however, sometimes finding that they provide useful, though not binding, guidance.
The prior decision's holding, once identified, must be applied to the new problem. Again typically, that problem will differ in some respects from the one presented by the precedent. Courts apply the precedent's rule in two ways. For much of the nineteenth century, and to some extent today, courts applied precedents formalistically. They took the rule supporting the precedent's holding to follow deductively from some set of higher or more general principles. The answer to the present problem then could be deduced from those principles. The deductive process incorporates an effort to ensure that the application of a precedent in the case at hand coheres with the entire set of abstract principles that, taken together, constitute the common law. (Interpretation of the codes in civil law systems is said to follow the same model, except that the general principles are found in the codes' provisions rather than extracted by interpretation from prior decided cases.)
The formalist approach to precedent was subjected to withering criticism in the twentieth century, primarily by the "rule skeptics" associated with American legal realism (Rumble 1968). They argued, and to much of the legal community demonstrated, that the purported deductions never satisfied minimal standards of deductive reasoning. The legal realists argued that what courts actually did was to interpret precedents with an eye to the policies advanced by the rules articulated by the courts: The present problem would be resolved by determining how the policies embodied in the rules articulated in the precedents would best be advanced. In the nineteenth century, for example, courts barred an employee injured by the negligence of another employee from recovering damages from their joint employer, in part because the injured employee was said to be in a good position to notice whether the other employee was a careful worker. Later courts had to decide whether that policy was applicable where the negligent worker labored in a different department or was the injured worker's supervisor. Critics of the strongest versions of legal realism wondered why the policy-oriented approach should be described as involving interpretation at all. Policy-oriented decisions, they argued, were entirely forward looking, and the precedents did no more than provide a convenient heuristic to guide thinking about the present problem.
The temporal dimension of legal interpretation is apparent on the surface when courts interpret prior decisions. The institutional dimension is revealed when we ask, particularly of the policy-oriented interpreter, "Why should a court today give any weight to the rules articulated by courts in the past?" Answers vary, but most combine a Burkean ideal that judges today should not be overly confident that they know better than their predecessors what good policies are, with a pragmatic sense that some degree of reliance on prior decisions conserves judicial effort.

Statutory Interpretation
Questions of statutory interpretation as such arise only when someone (an enforcement official or a judge, for example) has some question about what a statute's terms mean. Where statutory terms are thought to be unambiguous, officials simply apply the statutes, an operation that to them seems preinterpretive (see also philology and hermeneutics). Application rather than interpretation is likely to be more common soon after a statute's adoption, because people will generally be familiar with what the statute's enactors were trying to do, unless, as happens with some frequency, the adopters deliberately left specific provisions in the new statute unclear.
There are three prominent approaches to statutory interpretation in the United States, with parallels in other legal systems. (For an overview of the contemporary discussion in the United States, see Vermeule 2006.) The textualist approach interprets a statute's terms by asking what the words would mean to an ordinary reader (usually, a reader at the time the statute was enacted) who is reasonably well informed about the meaning of the technical terms and about the entire statutory environment within which the contested term is located.
Textualism is a rather bare-bones interpretive approach, which to its critics requires the interpreter to ignore real and accurate information about what a statute is designed to do. Proponents of textualism often claim more clarity for the outcomes they reach than there actually is. Where ambiguity persists after considering the sources to which textualists limit themselves, some other basis is needed for resolving the controversy. The most prominent candidate emerges from the assumption that ambiguous legislation should not be taken to disturb the status quo. This assumption is sometimes expressed as a canon of construction that ambiguous statutes should not be construed to change the common law (that is, the background rules that would apply if the legislature took no action). An alternative defense of textualism is comparative: that it reaches better outcomes overall than alternative approaches that ask interpreters to assess information that they are not well equipped to handle, even though in particular cases the use of one or the other approach might produce a better result than textualism.
The intentionalist approach shifts the focus from the reasonable reader to the enacting legislature. It asks what the legislature intended to accomplish by enacting the statute. In its least controversial version, the intentionalist approach directs the interpreter's attention to the problem the statute was designed to solve, producing an interpretation that, in the judge's view, solves the problem as well as possible within the bounds set by the statute's words as reasonably understood. Intentionalists in the United States, more than in the United Kingdom, are willing to consult documents produced as the statute proceeded through the enactment process (the statute's "legislative history"), such as reports by the committees that considered the legislation and statements by the statute's supporters and opponents, to determine what the legislature meant by the terms it used.
Textualists criticize these more expansive versions of intentionalism. Most narrowly, they note that materials drawn from the legislative history are readily deployed strategically by advocates, who present only the materials that support the interpretation that will yield the result they favor, and selectively by judges, who refer only to those parts of the legislative history that favor the result the judges prefer for reasons independent of the interpretive enterprise. Critics also observe that referring to legislative history gives some degree of authority to committees and individual members, whereas only the entire legislature has any authority to enact law. Finally, critics question the coherence of invoking intentionalist terms with respect to multimember legislative bodies. Some legislators might have favored the adoption of a statutory provision because they thought that it solved an important public policy problem, others because their constituents favored it, still others because important contributors to their campaigns did so. How can these varying states of mind be aggregated into an "intent of the legislature"? Continental legal theorists elide this question by referring to "the legislator" in the singular when discussing the institutions that adopt statutes, but it cannot be avoided so easily.
The final prominent approach, usually called purposivism, implicitly shifts the focus from the enacting legislature to the interpreter. In a classic formulation, the purposivist assumes that the legislature was composed of reasonable people seeking to pursue reasonable purposes reasonably. Ambiguous statutory terms are to be interpreted so that the goals imputed to the legislature are most likely to be achieved. Purposivism avoids most of the problems associated with intentionalism because, although its proponents ordinarily refer to the legislature's purposes, they are not truly concerned with a real institution staffed by real people. Rather, purposivists construct an idealized legislature to which they impute purposes that they then seek to implement. Yet purposivism typically lacks an account of whether the interpreter should posit abstract or more concrete purposes.
Purposivism makes the institutional dimension of legal interpretation clear. It allocates effective decision-making authority to
courts, at least once the legislature does something that licenses
the courts to engage in the interpretive enterprise. Its proponents
believe that purposivism contributes to the good functioning of
the government overall, as courts and legislatures collaborate in
accomplishing good for the society. Critics respond with some
skepticism about the very idea of the public good as something
independent of the choices made by legislatures, and with the
observation that what the purposivists are doing cannot fairly be
described as interpretation. Rather, they suggest, the judges are
reading into the law their own policy preferences and then attributing those purposes to the statute.
Statutory interpretation also involves the use of canons of statutory construction, which might be thought as well to constrain the judges' power to interpret statutes merely to advance their policy preferences. One example is the rule of lenity, according to which criminal statutes should be construed, where fairly possible, to limit the scope of criminal liability. Another example, in legal systems with some form of judicial review for constitutionality, is the canon that statutes should be construed, again where fairly possible, to make them consistent with the constitution or basic human rights. Scholars divide canons of construction into two groups, substantive and legislative-intent canons. Substantive canons embody policies that courts seek to pursue independent of what legislatures actually sought to accomplish in enacting particular statutes. The rule of lenity and the rule that statutes should be construed to limit their impact on background law are examples. Legislative-intent canons are rebuttable presumptions about what legislatures seek to accomplish in enacting particular statutes. The canon dealing with avoiding constitutional questions can be justified on the ground that courts should not assume that legislators sought to enact unconstitutional statutes. Karl Llewellyn offered a classic critique of canons of interpretation, suggesting that for each "thrust" built into one canon, there was a "parry" from another, equally well-established canon of statutory interpretation (Llewellyn 1950). So, for example, the canon "Every word and clause must be given effect" was parried by the canon "If inadvertently inserted or if repugnant to the rest of the statute, they may be rejected as surplusage."

Legal Interpretation
Canons of interpretation can fit into each of the interpretive
approaches. For the textualist, the canons are part of the general
background the ordinary reader is assumed to know as he or she
reads a contested statutory text. The intentionalist can defend the
substantive canons on the ground that legislatures typically do
not intend to infringe on the policy goals embodied in substantive canons. The purposivist has an easy time with substantive
canons, which directly embody judgments about good policy,
and can treat the other canons as similarly reflecting good policy
judgments, rather than imputations of legislative intent.

Constitutional Interpretation
Questions of statutory interpretation typically involve specific and detailed provisions of complex statutes. Constitutional interpretation, in contrast, typically involves the application of general and abstract constitutional terms, such as "freedom of speech," "due process of law," and "equal protection of the laws" (to take examples from the U.S. Constitution), to specific problems. Here, too, there are two primary families of approaches. The first family includes varieties of originalism. One version holds that constitutional provisions should be interpreted to conform to the intent of the constitution's adopters. As with intentionalism in statutory interpretation, original-intent approaches run into many difficulties, such as the problem of aggregating individuals' intentions. In addition, when, as with the U.S. Constitution, major provisions were adopted two centuries earlier, the task of identifying what any particular individual understood a provision to mean is extremely difficult.
Finally, the abstract terms that constitutions use pose an additional problem: Should the provisions be interpreted according to the abstract or the concrete understandings of their adopters? Ronald Dworkin uses the term "concept" to describe the abstract understanding and "conception" the concrete one (Dworkin 1977). Consider, for example, a constitutional provision dealing with equality. Does that provision enact into fundamental law the particular understandings the adopters had about equality, such as the understanding that laws could treat men and women differently while still providing equality, or does it enact equality itself, that is, the best understanding an interpreter can devise at the moment of interpretation?
Some proponents of originalism responded to these and
other problems with an original-intent approach by arguing that
constitutional provisions should be interpreted according to the
original public meaning of their terms. The interpreter should
identify what the terms meant to a reasonable and well-informed
member of the public when the provisions were adopted. This
does not completely eliminate the evidentiary problems associated with original-intent approaches, because one is still searching for what the terms meant to individuals, but it substantially
expands the range of relevant materials to include uses of the
terms in newspaper discourse and the like. Similarly, it shifts terminology about abstract versus concrete intentions to references
to abstract versus concrete public meanings.
The temporal problem is perhaps the most difficult one facing originalist approaches. The problem is that the adopters or the general public years ago could have had no intentions about, or understanding of, what the terms they used meant in connection with developments they did not and could not anticipate. One can rely on concrete intentions to rule out the possibility that a practice they understood to be constitutionally permissible would later be found to be constitutionally impermissible, but even then the reliance on concrete intentions or understandings requires a defense that goes outside the terms set by originalism itself. An alternative, similar to that offered by canons of statutory interpretation, is to hold a practice constitutionally permissible unless it is clearly precluded by the constitution as originally understood. The justification for this alternative is institutional: The constitution of a liberal democracy, taken as a whole, should be understood to commit decision-making authority to democratically elected legislatures unless the constitution clearly takes that authority away from the legislatures and gives it to the courts.
The second family has no standard name but can probably best be described as including varieties of perfectionism. According to this group of approaches, general and abstract constitutional provisions should be interpreted in accordance with some overarching principles of good government and individual liberty. These principles can be relatively modest, as in a commitment to democratic self-governance (Ely 1980), or more robust, as in a commitment to justice broadly understood (Dworkin 1996). Germany's constitutional court finds the perfectionist approach to interpretation embodied in its constitution's commitment to what it calls a "basic order of values." (The Muslim concept of ijtihād might be thought to have a similar underlying structure.)
Perfectionist approaches to constitutional interpretation
resemble purposivist approaches to statutory interpretation.
Interpreters, it appears, are to rely on their own best understanding of what the "basic order of values" is. This, some believe, is
inconsistent with democratic self-governance because it allows
judges to substitute their judgments about what justice or equality requires for the judgments made by elected representatives.
Critics suggest, in this setting as well, that what perfectionists
do cannot be called interpretation. The sting of that observation
might be reduced by responding that a constitutions text as such
has no authority anyway; only the long-standing practices that
people have come to accept have authority, and perfectionist
practices have been widely accepted for many years.
Another response, suggested by Stanley Fish (1994), is that
perfectionist interpretation is not as unconstrained as its critics
think. Judges are part of an "interpretive community" whose
shared understandings place significant limits on what even the
most willful judge will take as a responsible interpretation of a
constitutional provision. One important version of this view
describes a "common law constitution," in which what judges
interpret is not primarily the constitution as written but the prior
decisions interpreting the constitution (Strauss 1996). Here,
constitutional interpretation reproduces common law interpretation. It is worth noting that these two defenses of perfectionist
interpretation do not preclude interpretations that are normatively unattractive, if the people or the interpretive community
settle on unattractive practices.
Consistent with the idea that judges are members of interpretive communities, judges in different nations take different
approaches to interpreting their constitutions (Goldsworthy
2006). The practice in the United States is quite eclectic, with judges using originalism and perfectionism relatively unsystematically; the practice in Germany is more perfectionist, and that
in Australia is formalist.
Legal theorists have recurrently been attracted to the idea
that law, and legal interpretation, could become a science. The
direct invocation of sciences, such as linguistics, psychology, and
more recently neuroscience, to understand legal interpretation
has produced relatively little enlightenment, to the point where
it seems more likely than not that whatever science of law there
might eventually be, it will not be a science on the model of the
physical or biological sciences.
Mark Tushnet
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Dworkin, Ronald. 1996. Freedom's Law: The Moral Reading of the American Constitution. Cambridge: Harvard University Press.
———. 1977. Taking Rights Seriously. Cambridge: Harvard University Press.
Ely, John Hart. 1980. Democracy and Distrust: A Theory of Judicial Review. Cambridge: Harvard University Press.
Fish, Stanley E. 1994. There's No Such Thing as Free Speech: And It's a Good Thing, Too. New York: Oxford University Press.
Goldsworthy, Jeff. 2006. Interpreting Constitutions: A Comparative Study. New York: Oxford University Press.
Lasser, Mitchel de S.-O.-l'E. 2004. Judicial Deliberations: A Comparative Analysis of Judicial Transparency and Legitimacy. New York: Oxford University Press.
Llewellyn, Karl. 1950. Remarks on the theory of appellate decision and the rules or canons about how statutes are to be construed. Vanderbilt Law Review 3: 395–406.
Marshall, Geoffrey. 1997. What is binding in a precedent. In Interpreting Precedents: A Comparative Study, ed. Neil MacCormick and Robert S. Summers, 503–17. Brookfield, VT: Ashgate/Dartmouth.
Rumble, Wilfrid E. 1968. American Legal Realism: Skepticism, Reform, and the Judicial Process. Ithaca, NY: Cornell University Press.
Strauss, David A. 1996. Common law constitutional interpretation. University of Chicago Law Review 63: 877–935.
Vermeule, Adrian. 2006. Judging Under Uncertainty: An Institutional Theory of Legal Interpretation. Cambridge: Harvard University Press.

LEXICAL ACQUISITION
Children start to produce their first recognizable words between
12 and 18 months of age, and typically understand more than
they produce. This asymmetry is a lifetime effect. The forms of
their earliest words often depart radically from the adult versions
(consider "ga" for "squirrel") and may be hard to understand. But
as their pronunciation becomes more skilled (see phonology,
acquisition of), children add rapidly to the vocabulary at their
disposal. They add words for people, animals, everyday objects,
toys, food, and various activities, and by age two they generally produce between 200 and 800 distinct words.
Researchers have taken one of two main approaches to the study of word acquisition in the last two decades. With the first approach, they have postulated built-in constraints that would limit the hypotheses children entertain about possible meanings, typically limited to noun meanings only (Markman 1989); these constraints must later be overridden, since they are incompatible with semantic relations, such as inclusion, overlap, and partonomy, as well as with the meanings of verbs, adjectives, and prepositions. Or, alternatively, they have argued that children rely on many of the same pragmatic principles as adults in making inferences in context about possible meanings. Under this view, children's initial inferences are limited only by what they know and the words they have already acquired (Bloom 2000; Clark 1993).
How do children assign some meaning to an unfamiliar word? Once adult and child are both attending to the same object or action, for example, the child can infer that that object or action, at their locus of joint attention in the physical context, is the adult's intended referent. That is, the child draws on both physical and conversational context in assigning meanings to unfamiliar words, regardless of word class (Clark 2009).
Moreover, once an object or action has been labeled, the child
can often infer that subsequent utterances are also relevant to
the newly identified referent. And these utterances, in turn, may
supply added information about properties (size, texture; manner of motion), relations (role of the object as agent, location, or
entity-affected, say; see thematic roles), function (common
uses, use on that occasion), and so on. The inferences children
make about meanings are guided by adult usage (a way of finding out the conventional way to designate each category) and by
the fact that new words must contrast in meaning with whatever
vocabulary they already know.
As children acquire more vocabulary, they build up semantic domains (words for food, clothing, cars, and animals; types of motion and location; and relations in space, for example), and they organize and reorganize each domain as they add new members. Members of a domain are typically linked by semantic relations like "X is a kind of Y," "Z is part of A," "B is made of C," or "D is used for E." But not all relations hold in every domain. They depend on the meanings of individual lexical items. Among verbs, for instance, the relations typically include their argument roles. A locative verb like put is accompanied by three arguments (an agent, an object, and a location), as in Miranda put the cup on the shelf. But a verb of motion like run requires only one argument role, the doer or agent, as in Robert ran fast (Clark 2003).
Building up each semantic domain also involves identifying
words that co-occur (dogs bark, but horses neigh), and common
collocations (compare disappearing ink and vanishing cream). It
requires working out the semantic relations that link such terms
as tiger, predator, and mammal, or tree, aspen, and gingko, on
the one hand, and throw, toss, twirl or break, tear, and cut, on the
other. It also requires that children learn the terms for parts and
wholes (thumb, finger, hand), for groups (flock, pod, herd; crowd,
reunion, meeting), for complex events (circus, opera, play), for
cycles (days of the week, months of the year), for relations (in,
above, behind; before, after; if, because), for abstractions (justice, equality, goodness), and much more.
Learning words is also the first step in learning constructions.
Many constructions are linked initially to specific verbs, and only
later extended to others that can take the same construction.
Children may learn want first with a direct object, as in I want
that. Then they start to use nouns in place of demonstrative that
(I want the ball, I want a spoon), and only sometime later do they
start to use want with a to-complement, as in I want to go out.
They take even longer to add a subject to the complement, as in I
want Anna to come. Constructions often appear to be built up on
single lexical items on a one-by-one basis. This takes time (Clark
and Kelly 2006; Tomasello 2003).
Finally, adult usage plays a crucial role in acquisition.
Children track the frequencies of constructions in parental
speech and acquire first those constructions that occur most
often. It is adult speakers who model word use, who offer children conventional terms for talking about types of objects,
activities, and relations. And it is adults who continually check
up with young children on just what meanings the children
intended to convey.
Eve V. Clark
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Bloom, Paul. 2000. How Children Learn the Meanings of Words.
Cambridge, MA: MIT Press. This book reviews how meaning acquisition is linked to speaker intentions and theory of mind.
Clark, Eve V. 1993. The Lexicon in Acquisition. Cambridge: Cambridge
University Press. This book examines children's word learning and
their ability to coin words to fill gaps in their current vocabulary.
. 2009. First Language Acquisition. 2nd ed. Cambridge: Cambridge
University Press. A review of first language acquisition, and how cognitive and social factors interact in a usage-based approach.
Clark, Eve V., and Barbara F. Kelly, eds. 2006. Constructions in Acquisition.
Stanford, CA: CSLI. This book reports on studies of how children
acquire constructions.
Markman, Ellen M. 1989. Categorization and Naming in Children.
Cambridge, MA: MIT Press. This book presents a constraints-based account
of lexical acquisition.
Tomasello, Michael. 2003. Constructing a Language. Cambridge: Harvard
University Press. This book presents a usage-based approach to early
syntactic acquisition (see syntax, acquisition of).

LEXICAL-FUNCTIONAL GRAMMAR
Lexical-functional grammar (LFG) is what is known as a constraint-based, parallel correspondence architecture for a theory of language (Bresnan 1982, 2001; Dalrymple 2001). It was called "lexical" because certain relations between elements, like that between an active and a passive verb, were dealt with in the lexicon, as a relation between lexical items. This contrasts with the approach in transformational theories. "Functional" ambiguously refers
to grammatical relations, which are prominent in the theory, and
mathematical functions, which are used in the LFG-formalism.
The LFG-formalism can be mathematically modeled and, hence,
analyses expressed within it are susceptible to computational testing (see Dalrymple et al. 1995). LFG is often referred to as a syntactic theory, but like many other syntactic theories, it is actually
a framework within which theories of language can be expressed.
As one of the founders of LFG puts it: "[T]he formal model of LFG is not a syntactic theory in the linguistic sense. Rather, it is an architecture for syntactic theory. Within this architecture, there is a wide range of possible syntactic theories and sub-theories, some of which closely resemble syntactic theories within alternative architectures, and others of which differ radically from familiar approaches" (Bresnan 2001, 43).

A crucial underpinning idea is that any meaningful linguistic


element has associated with it different types of linguistic information, for instance, information about prosodic structure, about
category and constituent structure (see phrase structure),
about grammatical relations (also referred to as functions), and
about semantic structure (see semantics). It is furthermore
assumed that the organizational principles for these dimensions
may vary and that the formalisms used to represent the different
types of information should capture this variation. The information is represented in different dimensions, for instance c-structure (for category and constituent), f-structure (for functional),
a-structure (for argument structure), and i-structure (for information structure). Each dimension operates with its own fundamental categories and principles. The different dimensions are
related through mapping principles.
C-structure is represented in terms of a version of x-bar syntax, employing both lexical categories, such as noun, adjective,
verb, and preposition, and functional categories, such as complementizer, inflection, and determiner. Quite a restrictive approach
to functional categories tends to be taken; they are used for elements expressing crucial functional features whose distribution
is limited to a certain position within a phrase. Hence, a category
such as I, assumed in some transformational theories to form a
part of every clause in every language, is motivated within LFG
for languages where an inflected verbal element occupies a particular structural position. One example would be verb second
languages, where the finite verb occurs in the second position in
the clause and, hence, finiteness can be associated with this position. This can then be captured through a functional category IP,
headed by the finite verb in I and the initial element placed in the
specifier position. In English, finite auxiliaries have properties
that motivate the use of a functional category I (see Dalrymple
2001, 53–4).
A principle of economy of expression applies to constituent
structure to yield trees that look rather unorthodox from a standard X-bar perspective. This principle states that any constituent
introduced by a rule is optional unless some separate principle
requires its presence. This can be illustrated by the analysis of
finite auxiliary verbs in English. The category I, to which these
verbs belong, is assumed to be introduced by a rule I′ → I VP;
however, in sentences that do not contain a finite auxiliary, the I
is not present, giving a tree such as that in (1).
(1)       IP
         /  \
       DP    VP

There is assumed to be a fair amount of typological variation in c-structure among languages. A language such as
Wambaya, which has relatively free word order apart from the
constraint that an auxiliary-like element has to appear in second
position, is assumed to have a functional category IP, where the I takes an exocentric category S as its complement (for a discussion of the relevant data and potential constraints on the typological variation in constituent structure, see Nordlinger 1997).
F-structure is assumed to be reasonably invariant among
languages. It takes the shape of feature-value matrices, where
the features capture grammatical relations and functional features. The simplest features are those with atomic values, such as
[number plu]. Grammatical relations such as subj(ect), obj(ect)
or adj(unct) are represented as features that take f-structures as
their values. Each element that has lexical semantic content has
associated with it a feature pred, which has a semantic form as
its value. A verb such as tickle, for instance, has the feature-value
pair [pred 'tickle <(subj) (obj)>'], that is, in the pred value of this
verb, tickle requires a subject and an object (for more details
on semantics within LFG, see especially Dalrymple 2001, 217–54,
who develops an approach to semantic composition called glue
semantics). The pred feature also captures selectional properties,
which are based on functions and functional properties, rather
than on syntactic categories; a transitive verb selects for a subject
and an object, not for two noun phrases. An f-structure for the sentence Oscar tickled the cat can be found in (2).

(2)  [ PRED   'tickle <(SUBJ) (OBJ)>'
       SUBJ   [ PRED  'Oscar'
               GEND  masc     ]
       OBJ    [ PRED  'cat'
               NUM   sg
               SPEC  def      ]
       TENSE  past            ]

Three well-formedness conditions apply to f-structures. The general uniqueness condition requires each feature to have a
unique value. The completeness and coherence conditions ensure
compatibility between the requirements of a pred feature and its
local f-structure. Completeness requires that all functions specified by an element's pred feature be present in the f-structure
built up around that element; if, for instance, there had been
no obj in (2), completeness would have been violated. It also
requires those functions to have a semantic value, which prevents
an argument position from being filled by an expletive pronoun,
for instance. The coherence condition requires that all functions
present in a local f-structure be licensed by another element's
pred feature. If, for instance, there had been an obl(ique) in (2),
coherence would have been violated since no such function is
licensed by tickle. This is one example of the way that constraints
accounted for in terms of structure in other approaches are
expressed through f-structure in LFG.
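The completeness and coherence conditions lend themselves to a direct computational statement. The following Python sketch is purely illustrative (the dict encoding, the GOVERNABLE set, and the function names are assumptions of this sketch, not notation from the LFG literature); it checks an f-structure like (2) against the functions its pred value subcategorizes for:

```python
# Illustrative sketch: f-structures as plain dicts; a PRED value lists the
# grammatical functions it subcategorizes for, as in 'tickle <(SUBJ)(OBJ)>'.

GOVERNABLE = {"SUBJ", "OBJ", "OBL", "COMP", "XCOMP"}  # governable functions

def complete(f):
    """Every function required by PRED is present and has semantic content."""
    required = f["PRED"]["args"]          # e.g. ["SUBJ", "OBJ"]
    return all(fn in f and "PRED" in f[fn] for fn in required)

def coherent(f):
    """Every governable function present is licensed by the PRED feature."""
    required = set(f["PRED"]["args"])
    present = GOVERNABLE & set(f)
    return present <= required

# f-structure for "Oscar tickled the cat", cf. (2)
fs = {
    "PRED": {"rel": "tickle", "args": ["SUBJ", "OBJ"]},
    "SUBJ": {"PRED": "Oscar", "GEND": "masc"},
    "OBJ":  {"PRED": "cat", "NUM": "sg", "SPEC": "def"},
    "TENSE": "past",
}
print(complete(fs), coherent(fs))   # True True

del fs["OBJ"]                       # drop the object: completeness now fails
print(complete(fs), coherent(fs))   # False True
```

Adding an unlicensed function such as an obl to fs would, conversely, make coherent return False while leaving completeness satisfied.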
The information captured in the pred feature is not a primitive of the theory. The syntactic valency is, in fact, derived from
the semantic roles associated with the verb; hence, this aspect
of the f-structure is derived from the a-structure of the element.
The relation between semantic and syntactic valency is specified through lexical mapping theory (LMT). LMT works in terms
of two features [o(bject)] and [r(estrictive)]. The feature [o]
captures the fact that certain thematic roles cannot fill an
object function, for instance agents, which can be subjects or
obliques, but not objects. The feature [r] distinguishes those
functions that are not restricted as to which thematic roles can fill them from those that are restricted in this way; a subject, for
instance, can be associated with a large number of thematic
roles, whereas an oblique is restricted as to which thematic
role can fill it. LMT associates feature values to thematic roles
intrinsically (for example, an agent is intrinsically associated with [−o]) and by default (for instance, the highest thematic role according to a typologically motivated thematic hierarchy is associated with [−r]). Grammatical functions are then defined
in terms of these two features; a subject, for instance, is [-o] in
not being object-like and [-r] in not being restricted to any particular thematic role, whereas an oblique is [-o] but [+r]. Two
well-formedness constraints apply to the mapping: the function-argument bi-uniqueness condition, which states that each thematic role must be associated with exactly one function and each function with exactly one thematic role, and the subject condition, which states that every predicate must have a subject. For examples of how LMT works and how it can analyze
constructions such as locative inversion or complex predicates
in interesting ways, see, for instance, Dalrymple (2001, Chap. 8)
or Bresnan (2001, Chap. 14).
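A much-simplified version of this feature decomposition can be sketched in code. Everything below is an illustrative assumption of the sketch (the FUNCTIONS table, the mini thematic hierarchy, and the intrinsic/default assignments cover only the cases described above; the full LMT mapping principles are richer):

```python
# Illustrative LMT sketch: grammatical functions defined by [±o]/[±r].
FUNCTIONS = {
    ("-o", "-r"): "SUBJ",       # not object-like, thematically unrestricted
    ("-o", "+r"): "OBL",        # not object-like, thematically restricted
    ("+o", "-r"): "OBJ",
    ("+o", "+r"): "OBJ-theta",
}

# Toy thematic hierarchy, highest role first (illustrative only).
HIERARCHY = ["agent", "beneficiary", "experiencer", "instrument",
             "patient", "locative"]

# Intrinsic classifications from the text: agents cannot be objects ([-o]).
INTRINSIC = {"agent": {"o": "-o"},
             "patient": {"r": "-r"}}

def map_roles(roles):
    """Assign a grammatical function to each thematic role of a predicate."""
    ordered = sorted(roles, key=HIERARCHY.index)
    out = {}
    for i, role in enumerate(ordered):
        feats = dict(INTRINSIC.get(role, {}))
        if i == 0:
            feats.setdefault("r", "-r")   # default: highest role is [-r]
        feats.setdefault("o", "+o")       # illustrative fallback values
        feats.setdefault("r", "+r")
        out[role] = FUNCTIONS[(feats["o"], feats["r"])]
    return out

# For a verb like tickle <agent, patient>:
print(map_roles(["agent", "patient"]))   # {'agent': 'SUBJ', 'patient': 'OBJ'}
```

The sketch captures the division of labor described above: intrinsic classifications constrain what a role can become, while the default for the highest role steers it toward subjecthood.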
The mapping between c-structure and f-structure can be
structurally defined or identified through morphological elements. English is an example of a language wherein functions
are determined through their hierarchical position; the subject appears in the specifier of the IP. This is then captured
formally through the phrase structure rule in (3a), where the up arrow (↑) should be read as "the f-structure associated with my mother node" and the down arrow (↓) as "the f-structure associated with this node." The resulting tree can be found in
(3b), where indices have been inserted to identify f-structures;
the f-structure associated with the IP is referred to as f1, and
so on.
(3)
a.   IP  →   DP              I′
             (↑ SUBJ) = ↓    ↑ = ↓

b.            IP f1
             /     \
        DP f2       I′ f3
    (↑ SUBJ) = ↓    ↑ = ↓

The arrows can now be replaced by the indices, and we get the
equations in (4).
(4)  (f1 SUBJ) = f2
     f1 = f3

These equations refer to three f-structures and define their relations; the f-structure f1 contains a feature subj which has as its
value the f-structure f2. The second equation states that f1 is identical to f3, which means that any feature-value pair associated with either of the two will also be associated with the other; the two nodes IP and I′ will then be associated with one f-structure. In fact, the categories that form the clausal backbone (CP, IP, and VP) will always share f-structure, so that feature-value pairs
introduced to any of them will also be shared by the others. These categories are referred to as co-heads (Bresnan 2001, 102). The
equations in (4) give rise to the partial f-structure in (5).
(5)  f1, f3  [ SUBJ  [  ] f2 ]

As further elements are added, the information contributed by their lexical entries or by functional equations associated with
structure will be inserted into the f-structure as dictated by the
functional equations. This mapping procedure from c-structure
to f-structure, like all LFG mapping relations, has the property
of monotonicity; information can be added but never deleted,
moved, or changed.
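The monotonic character of this mapping can be made concrete with a small sketch. The dict encoding and the unify helper below are illustrative assumptions of this sketch, not an implementation from the LFG literature; the sketch solves the equations in (4) by merging f-structures, adding information but never deleting or changing it:

```python
# Illustrative sketch: monotonic unification of dict-encoded f-structures.

def unify(f, g):
    """Merge g into f; information may be added, never changed (monotonicity)."""
    for key, val in g.items():
        if key not in f:
            f[key] = val
        elif isinstance(f[key], dict) and isinstance(val, dict):
            unify(f[key], val)
        elif f[key] != val:
            # two conflicting values for one feature: uniqueness is violated
            raise ValueError(f"uniqueness violated at {key}")
    return f

f1, f2, f3 = {}, {}, {}
f1["SUBJ"] = f2        # (f1 SUBJ) = f2: f2 is the value of f1's SUBJ feature
unify(f1, f3)          # f1 = f3: IP and I' share one f-structure (co-heads)

# Information contributed by lexical entries is inserted monotonically.
unify(f2, {"PRED": "Oscar", "GEND": "masc"})
unify(f1, {"TENSE": "past"})
print(f1)   # {'SUBJ': {'PRED': 'Oscar', 'GEND': 'masc'}, 'TENSE': 'past'}
```

Attempting a further unify(f1, {"TENSE": "pres"}) would raise an error, mirroring the uniqueness condition: a feature may gain a value but never change one.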
For a language like Latin, there are no arguments for an
elaborate hierarchical clause structure, but rather a flatter exocentric structure is appropriate. In such a language, functions
are not defined structurally and there is no structural equation
of the kind illustrated in (3a). Instead, functions are identified through case marking, and this is captured directly in LFG
through an association between the value for the feature case
and a function. For Latin, there would then be a global equation
as in (6).
(6)  (↓ CASE) = nom  ⇒  (↑ SUBJ) = ↓

This equation can be inserted at any noun phrase node and is read as "if the f-structure associated with this node contains the feature-value pair [case = nom], then the f-structure associated with the node above contains the feature subj, and the f-structure associated with this node is the value of that subj feature." Or, in less formal language, if this node is nominative, then it is the subject of the node above.
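This case-driven identification can be sketched as a small conditional rule. Everything here is an illustrative assumption of the sketch (the dict encoding, the function names, and the accusative-to-obj pairing, which the text does not state but which is the analogous equation):

```python
# Illustrative sketch of the global equation in (6): if an NP node is
# nominative, its f-structure becomes the SUBJ value of the mother's.

def attach_by_case(mother_f, np_f):
    """Apply case-to-function equations of the kind in (6)."""
    case_to_function = {"nom": "SUBJ",
                        "acc": "OBJ"}   # acc->OBJ is an assumed analogue
    fn = case_to_function.get(np_f.get("CASE"))
    if fn is not None:
        mother_f[fn] = np_f

clause = {"PRED": "laudat <(SUBJ) (OBJ)>"}                  # Latin 'praises'
attach_by_case(clause, {"PRED": "puella", "CASE": "nom"})   # 'girl'
attach_by_case(clause, {"PRED": "poetam", "CASE": "acc"})   # 'poet'
print(clause["SUBJ"]["PRED"], clause["OBJ"]["PRED"])        # puella poetam
```

Because the equation consults only the case value, the NPs can be attached in any order, which is what makes this style of analysis suitable for a free-word-order language like Latin.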
The mapping principles permit non-one-to-one correspondences between dimensions of information. For instance, an
f-structure can contain a subj function without there being
a noun phrase in the corresponding c-structure. This is how
constructions generally referred to as pro-drop are analyzed,
though in LFG they are more appropriately named "pronoun incorporation." The verb in the Italian sentence Rido 'I laugh', for instance, is analyzed as consisting of just a verb. This verb
contains in its f-structure description the type of information contributed by the subject pronoun in the corresponding
English sentence. The verb form rido would be associated with
equations such as those in (7).

(7)  (↑ PRED) = 'laugh <(SUBJ)>'
     (↑ SUBJ PRED) = 'pro'
     (↑ SUBJ PERS) = 1
     (↑ SUBJ NUM) = sg
The crucial part of (7) is the equation that introduces a pred feature with a pronominal value for its subject. This means
that the principle of completeness is satisfied by the verb
itself.
Some of the fundamental properties of LFG have been illustrated here mainly through reference to f- and c-structure.



Information about dimensions of information not discussed
here can be found in publications listed in the bibliography on
the LFG Web site. At this site, information can also be found on
extensions and applications of LFG, such as combining LFG with
optimality theory (Bresnan 2000), as well as work by Rens
Bod and Ron Kaplan (2003), which combines linguistic theory
and statistical methods to create an exemplar-based theory of
syntax.
Kersti Börjars
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Bod, Rens, and Ron Kaplan. 2003. DOP model for lexical-functional
grammar. In Data-Oriented Parsing, ed. Rens Bod, Remko Scha, and
Khalil Simaan, 21133. Stanford, CA: CSLI Publications.
Bresnan, Joan. 2000. Optimal syntax. In Optimality Theory: Phonology,
Syntax and Acquisition, ed. Joost Dekkers, Frank van der Leeuw, and
Jeroen van de Weijer, 33485. Oxford: Oxford University Press.
Bresnan, Joan. 2001. Lexical-Functional Syntax. Oxford: Blackwell.
Bresnan, Joan, ed. 1982. The Mental Representation of Grammatical
Relations. Cambridge, MA: MIT Press.
Dalrymple, Mary. 2001. Lexical Functional Grammar. San Diego,
CA: Academic Press.
Dalrymple, Mary, Ronald M. Kaplan, John T. Maxwell III, and Annie
Zaenen. 1995. Formal Issues in Lexical-Functional Grammar. Stanford,
CA: CSLI Publications.
Falk, Yehuda. 2001. Lexical-Functional Grammar: An Introduction to
Parallel Constraint-Based Syntax. Stanford, CA: CSLI Publications.
Nordlinger, Rachel. 1997. Constructive Case: Evidence from Australian
Languages. Stanford, CA: CSLI Publications.
The LFG Web site is located online at: http://www.essex.ac.uk/
linguistics/external/LFG/.

LEXICAL LEARNING HYPOTHESIS


According to this hypothesis, children's grammatical development is incremental and driven by the learning of lexical elements (see Pinker 1984; Clahsen 1996; and Eisenbeiss 2000,
2003, 2009 for overviews and references). The lexical learning hypothesis was developed by proponents of generative
grammar in order to address the poverty-of-the-stimulus
argument: In order to produce and understand new sentences,
children must generalize beyond individual input utterances.
However, they do not have reliable access to systematic corrections that would allow them to reject incorrect generalizations
about the target language. Therefore, generative linguists have
postulated an innate language acquisition device, universal grammar (UG), that constrains children's hypothesis
space. According to the principles and parameters theory, UG contains i) principles that constrain all grammatical
representations and ii) open parameters that provide a finite
set of values, that is, options from which learners can choose
(Chomsky 1981). For instance, generative linguists assume
that all sentences contain subjects, but that languages may differ with respect to the positioning of subjects and their overt
realization (e.g., optional subjects in Italian versus obligatory
subjects in English). In such a model, language acquisition
only involves i) setting parameters to their target values and ii)
acquiring the lexicon.



If one assumes such a powerful acquisition device, one must
explain why children need several years to acquire their target grammar and initially produce non-target-like sentences, for example, subjectless sentences in English. Faced with this
developmental problem, proponents of the lexical learning
hypothesis argue that UG is available from the onset of grammatical development, but in order to set parameters, children still
need to learn the grammatical properties of the lexical elements
associated with these parameters.
These assumptions are in line with lexicalist generative
models: Initially, parameters referred to a heterogeneous set of
linguistic properties, for example, subject omissions, word order,
or morphological marking. However, cross-linguistic (parametric)
variation is closely linked to lexical properties, in particular to
properties of grammatical morphemes (see, e.g., Manzini and
Wexler 1987). For instance, Germanic languages with postverbal
negation exhibit a morphological distinction between first and
second person. Proponents of lexicalist models argue that this suggests a relationship between parameter values for word order and
the person specifications of subject-verb-agreement markers.
In recent generative models, such markers or function words (e.g.,
auxiliaries) are analyzed as realizations of functional categories
that project to phrases, just like the lexical categories verb and
noun. For instance, subject-verb-agreement markers are viewed
as realizations of the functional category INFL (Chomsky 1986).
Proponents of lexical learning regard functional categories
as the only source of parametric variation (Chomsky 1989), and
they argue that children should fix parameters and build up projections of functional categories by learning the properties of the
lexical elements that encode the respective functional categories.
Hence, one should find developmental correlations between the
acquisitions of lexical items and the acquisition of the syntactic
properties associated with the projections of the corresponding
functional categories. Such correlations have been documented; for instance, a correlation between the acquisition of the German
subject-verb-agreement paradigm and the target-like ordering
of subjects, verbs, and negation (Clahsen 1996). Moreover, if one
assumes incremental phrase-structure building, one can explain
developmental dissociations between realizations of different
functional categories, for instance, the observation that German
children master the use of agreement markers associated with
INFL before they consistently produce complementizers, that is,
realizations of the functional category COMP.
Children show even more complex dissociations, however
(Eisenbeiss 2003): First, they start to realize different features of
the same category at different points. For instance, for the category case, German children mark the nominative/accusative
distinction before the accusative/dative distinction. Second, children do not acquire all instantiations of the same features simultaneously. For example, German children show case distinctions
on pronouns earlier than on articles. Third, children's realizations of functional categories show lexeme-specific restrictions. For instance, German children initially restrict the possessive -s to some familiar names (e.g., Mamas 'mommy's').
These observations can be captured in feature-based, lexicalist versions of the lexical learning hypothesis (see Eisenbeiss
2003, 2009 for discussion): In these models, cross-linguistic variation is not so much related to functional categories as such, but to their individual grammatical features (e.g., tense), which are stored in lexical entries for grammatical morphemes and
project to phrases whenever these morphemes are combined.
According to such models, children should be able to acquire
individual features independently of one another, integrate them
into lexical entries for individual lexical/morphological elements
in an item-by-item fashion, and project each of these features
into phrases when these elements are combined. Thus, whether
or not a child's utterance involves a realization of a particular
grammatical feature and the corresponding syntactic operations
does not depend on a global parameter value. Rather, it depends
on the individual lexical items that the child has acquired so far.
Hence, developmental dissociations between individual lexical
items and individual features are expected. For instance, definite and indefinite articles are different lexical realizations of the
functional category determiner, and German children acquire
indefinite articles before definite articles. Similarly, when they
start producing definite articles, German children use feminine
forms correctly, but then incorrectly combine masculine forms
of articles with both masculine and neuter nouns. This suggests
that German children acquire the [FEMININE] distinction
before they instantiate the feature [MASCULINE] that distinguishes masculines from neuters.
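The feature-based lexicalist picture, on which grammatical features live in entries for individual morphemes and are acquired item by item, can be sketched as a simple data structure. All names and data below are illustrative assumptions of the sketch, not claims from the acquisition literature:

```python
# Illustrative sketch: grammatical features stored per lexical item and
# acquired item by item, so dissociations need no global parameter.

child_lexicon = {}   # item -> set of grammatical features instantiated so far

def learn(item, feature):
    """Add one feature to one item's entry (item-by-item acquisition)."""
    child_lexicon.setdefault(item, set()).add(feature)

def realizes(item, feature):
    """Whether the child can realize this feature on this particular item."""
    return feature in child_lexicon.get(item, set())

# Dissociation: the same case feature acquired on a pronoun before an article.
learn("er", "CASE:nom/acc")   # German pronoun 'he/him' (illustrative data)
print(realizes("er", "CASE:nom/acc"), realizes("der", "CASE:nom/acc"))
# True False -- whether a feature is realized depends on the individual item
```

Because the feature is a property of the entry rather than of a grammar-wide parameter, the model predicts exactly the item-level and feature-level dissociations described above.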
Thus, in sum, the lexical learning hypothesis, that is, the idea
that syntactic development is driven by lexical development,
can provide accounts for the incremental nature of syntactic
development, as well as for the observed correlations between
lexical and syntactic development and the developmental dissociations that have been observed in childrens grammatical
development.
Sonja Eisenbeiss
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Chomsky, Noam. 1981. Lectures on Government and Binding. Dordrecht,
the Netherlands: Foris.
Chomsky, Noam. 1986. Barriers. Cambridge, MA: MIT Press.
Chomsky, Noam. 1989. Some notes on economy of derivation and representation. In MIT Working Papers in Linguistics 10, 43–74. Cambridge, MA: Massachusetts Institute of Technology.
Clahsen, Harald, ed. 1996. Generative Perspectives on Language
Acquisition: Empirical Findings, Theoretical Considerations and
Crosslinguistic Comparisons. Amsterdam: Benjamins. With relevant
contributions by Harald Clahsen, Sonja Eisenbeiss, and Martina
Penke; Jürgen Meisel and Maria-Jose Ezeizabarrena; Andrew Radford;
and Thomas Roeper.
Eisenbeiss, Sonja. 2000. The acquisition of the determiner phrase in
German child language. In The Acquisition of Syntax: Studies in
Comparative Developmental Linguistics, ed. M.-A. Friedemann and
L. Rizzi, 2662. London: Longman.
Eisenbeiss, Sonja. 2003. Merkmalsgesteuerter Grammatikerwerb. Eine Untersuchung
zum Erwerb der Struktur und Flexion der Nominalphrase. Available
online at: http://www.ub.uni-duesseldorf.de/home/etexte/diss/
show?dissid=1185.
Eisenbeiss, Sonja. 2009. Generative approaches to language learning. Linguistics 47.2: 273–310.
Manzini, Rita, and K. Wexler. 1987. Parameters, binding theory, and
learnability. Linguistic Inquiry 18 (July): 413–44.
Pinker, Steven. 1984. Language Learnability and Language Development.
Cambridge: Harvard University Press.


LEXICAL PROCESSING, NEUROBIOLOGY OF


The lexicon is the store of words in the mental dictionary.
A typical English-speaking high school graduate knows about
60,000 words, a literate adult perhaps twice that number (Miller
1991, 138). A word can be regarded as a long-term memory
association of semantic, syntactic, phonological, and
orthographic structures. For example, the lexical entry for rose
includes the following components, with the semantic component symbolized by a picture for convenience:
rose
  meaning:         [picture of a rose]
  part of speech:  noun
  phonology:       /roz/
  orthography:     ROSE

During the past two decades, there has been remarkable progress in understanding the neural substrates of lexical processing,
mainly because of advances in two complementary approaches for
investigating the functions of specific brain structures: 1) the lesion
method, which, when used with ample numbers of patients who
are carefully studied both neuropsychologically and neuroanatomically, can yield indispensable insights about the neural systems that
are necessary for particular abilities; and 2) functional imaging techniques, such as fMRI, which allow researchers to identify with more
fine-grained spatial resolution the brain structures that are engaged
during the normal performance of certain tasks (see neuroimaging). Much more has been learned about the neural substrates of
lexical processing than can be summarized here, and so this review
concentrates on cortical regions that have been linked with the recognition and production of spoken and written word forms.

Neural Substrates of Spoken Word Recognition and Production
It is well established that the sensorimotor aspects of spoken
word processing depend on the left perisylvian cortex, and
there is growing evidence that both the posterior superior temporal (auditory-related) and the posterior inferior frontal
(motor-related) sectors of this large anatomical territory contribute to both speech perception and speech production (Imada
et al. 2006; Okada and Hickok 2006; Pulvermller et al. 2006;
Skipper et al. 2008). These two regions interact not only via direct
connections but also via an indirect pathway mediated by the
inferior parietal lobule (Catani, Jones, and Ffytche 2005).
To understand spoken words, listeners must first use the auditory input to activate stored representations of lexical-phonological form. It is only after this process of lexical access has been
achieved that the semantic and syntactic properties of words can
be activated and used to construct higher-level representations of
the utterance. Numerous behavioral studies suggest that speech
information is continuously projected to the lexicon, so that an
initial sequence like bla will activate all the words in the listener's lexicon that begin with those sounds (black, bland, blanket,
etc.); as the input accumulates, the set of activated words diminishes until only a single one matches the input, at which point
recognition can be said to occur (McQueen, Dahan, and Cutler
2003). Pseudowords (e.g., blash) also activate partially matching
candidate words, but ultimately no winner is selected.
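This incremental winnowing of the candidate set can be sketched directly. The toy lexicon below and the function name are illustrative assumptions (a realistic model would operate over phonemes and graded activations, not orthographic prefixes):

```python
# Illustrative sketch of incremental lexical activation: as input accumulates,
# the set of compatible words shrinks until one uniquely matches.

LEXICON = {"black", "bland", "blanket", "blast", "cat"}   # toy lexicon

def cohort(input_so_far):
    """Words still compatible with the accumulated speech input."""
    return {w for w in LEXICON if w.startswith(input_so_far)}

for prefix in ("bla", "blan", "bland"):
    print(prefix, sorted(cohort(prefix)))
# 'bland' leaves a single candidate: the recognition point.

# A pseudoword like 'blash' activates candidates early but never wins:
print(sorted(cohort("blash")))   # []
```

The recognition point falls where the candidate set first shrinks to one; for a pseudoword the set simply empties, so no winner is ever selected, as described above.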

A broad perspective on the neural correlates of spoken word recognition is provided by P. Indefrey and A. Cutler (2004), who
report a meta-analysis of 55 experiments in which subjects passively listened to tones, pseudowords, words, or sentences. It
was found that all of the different types of auditory stimuli reliably
activate overlapping, as well as partially differentiated, central
and posterior regions of the superior temporal gyri in both hemispheres. In addition, the following hierarchical organization was
observed: As the linguistic complexity of the stimuli increases,
there is recruitment of progressively more anterior regions of the
left superior temporal sulcus. Thus, moving anteriorly, there is
first an area responsive to pseudowords but not tones, then an
area responsive to words but not pseudowords, and finally an
area responsive to sentences but not words. The anterior area
that is selectively activated by words may contribute to the resolution of the lexical competition process described here; however, it is also conceivable that this operation is subserved by one
of the more posterior word-specific areas (Orfanidou, Marslen-Wilson, and Davis 2006). After the phonological form of a word
has been recognized, its semantic and syntactic components are
retrieved. As summarized by Indefrey and Cutler (2004), these
processes may be executed by a wide distribution of predominantly left hemisphere brain regions, including most notably
the middle and inferior temporal gyri and the posterior inferior
frontal gyrus.
Turning to spoken word production, one of the most influential theories is that proposed by W. J. M. Levelt, A. Roelofs,
and A. S. Meyer (1999). According to their model, the production of spoken content words depends on multiple processing
stages, each of which generates its own characteristic output
representation (Figure 1). First, conceptual preparation involves
identifying the meaning of the word to be produced. Second,
lexical selection involves activating the lemma for the word, that is, a unit that intervenes between semantics and phonology
and that serves as the gateway to syntactic features (e.g., grammatical category, number, tense, etc.; these features are not
shown in Figure 1). Third, form retrieval involves calling up the
phonological code for the word. Fourth, syllabification involves
determining segmental clusters and metrical assignments. Fifth,
phonetic encoding involves transforming syllabic units into
motor instructions. And sixth, articulation involves the final programming of overt speech.
The neural correlates of the first stage, conceptual preparation, remain mysterious, largely because this stage constitutes
the complex interface between language and thought and is also
heavily influenced by social-cognitive perspective-taking abilities; for example, the same piece of real estate can be called the coast, the shore, or the beach, depending on one's communicative
goals (Tomasello 1999, 119). Future research may show that conceptual preparation is subserved by widespread cortical structures that underlie semantic processing (Kemmerer 2010; see
semantics, neurobiology of). The next two stages, lemma
selection and phonological form retrieval, both involve core
lexical processes, and their neural correlates are beginning to be
understood. In a meta-analysis of 58 functional imaging studies (including several studies employing magnetoencephalography, which has excellent temporal resolution), Indefrey and Levelt
(2004) found that lemma selection is linked with the midsection of the left middle temporal gyrus and typically occurs during a time window of 150–225 milliseconds (ms) post-stimulus in oral picture-naming tasks (see Color Plate 4).

Figure 1. The LRM (i.e., Levelt, Roelofs, and Meyer) model of spoken word production. Left column: Word production tasks involving lead-in processes that enter the central word production architecture at different stages. Middle column: Core processes of word production and their characteristic output. Right column: Example fragments of outputs generated at each stage. Reprinted by permission from Elsevier, copyright 2004, from P. Indefrey and W. Levelt, "The spatial and temporal signatures of word production components," Cognition 92: 101–44.

They also found that
phonological form retrieval is linked with the posterior portions
of the left middle and superior temporal gyri and occurs during
a time window of either 200–400 or 275–400 ms, depending on
the studies that are considered. The three postlexical stages of
spoken word production are known to rely on a variety of motor-related brain structures; however, the exact neural correlates of each stage are not yet clear (Bohland and Guenther 2006).
Independently of Levelt, Roelofs, and Meyer's (1999) model,
a great deal of neuroscientific research has focused on the process of mapping the meanings of words onto their corresponding phonological forms during speech production. One
important line of work, conducted by Hanna Damasio, Daniel
Tranel, and their colleagues (2004), suggests that this process is subserved by intermediary units that are analogous to lemmas insofar as they function as relays, taking lexical-semantic
structures as input and then pointing to the appropriate lexicalphonological structures. It is interesting that these intermediary
units may be neurally organized according to both semantic and
grammatical principles. For example, lesion data suggest that,
contrary to Indefrey and Levelt's (2004) proposal, the retrieval
of nouns for different categories of concrete entities may hinge
on intermediary units that do not reside in the left middle temporal gyrus but, rather, in the left temporal pole (TP) and inferotemporal (IT) cortices. Specifically, studies in which oral
picture-naming tasks have been administered to large cohorts of
brain-damaged patients have shown that 1) impaired access to
proper nouns for unique persons (e.g., Jennifer Aniston) is associated with left TP lesions, 2) impaired access to common nouns
for animals (e.g., horse) is associated with damage to the anterior
sector of left IT, and 3) impaired access to common nouns for
tools (e.g., hammer) is associated with damage to the posterior
sector of left IT, a region called IT+ (Damasio et al. 1996; Damasio
et al. 2004). Crucially, the patients have intact object recognition
and conceptual knowledge since they can accurately describe
the entities they cannot name; in other words, the disorders are
purely anomic. Furthermore, functional imaging data indicate
that the same cortical regions are activated in normal subjects
in the same category-specific ways when concrete entities are
orally named from either pictures (Damasio et al. 1996, 2004)
or characteristic sounds (Tranel et al. 2003; Tranel et al. 2005).
There is also increasing evidence from several methodologies
that the process of retrieving action verbs engages a quite different neural pathway that includes the left ventrolateral premotor/
prefrontal cortex (Damasio et al. 2001; Shapiro and Caramazza
2004; Tranel et al. 2001; Tranel et al. 2008). This region is reliably
activated when action verbs are accessed, and damage to it frequently impairs the production of verbs but not nouns.

Neural Substrates of Written Word Recognition and Production
Reading and writing are recent inventions in human history
and must be explicitly taught. For literate individuals, however, word representations include not just a phonological
component but also an orthographic component that is efficiently processed by neural circuits that are gradually being
elucidated.
The activity of reading recruits numerous brain regions in
the temporal, parietal, and frontal lobes (Hillis and Rapp 2004;
Hillis and Tuffiash 2002). Perhaps the most controversial region,
however, has been the visual word form area (VWFA), located
in the left occipitotemporal sulcus bordering the fusiform gyrus
(McCandliss, Cohen, and Dehaene 2003; Dehaene 2005). This
area responds more strongly to printed words than to other types
of visually presented objects, such as faces, animals, and tools.
Also, disruption of the input projections to this area can induce
pure alexia without agraphia, a disorder in which reading can be
accomplished only in a laborious letter-by-letter manner, while
writing and all other linguistic skills are unaffected. Despite these
findings, the question of whether the VWFA plays a genuine causal
role in reading has been hotly debated (e.g., Price and Devlin
2003). Recently, however, a compelling case study supporting

the VWFA was reported by R. Gaillard et al. (2006; see also Martin
2006). In brief, prior to surgery for intractable epilepsy, the patient
exhibited normal single-word reading, including a lack of increase
in reading time for common words varying in length from three to
eight letters; moreover, fMRI revealed his VWFA to have normal
functional-anatomical characteristics, and local field potentials
recorded from implanted electrodes showed that this area was
sensitive to word frequency but not word length, again within
normal parameters. After excision of tissue just posterior to the
VWFA, the patient's epileptic seizures were successfully eliminated, but his reading was markedly slow and inaccurate, with reading times increasing linearly with word length (i.e., letter-by-letter reading). In addition, the VWFA no longer responded
to printed words, even when they were contrasted with a simple
fixation point. This study, therefore, provides powerful new evidence that the VWFA is in fact necessary for access to the stored
orthographic forms of words during reading.
Writing also depends on a large network of widely distributed
brain regions (Hillis and Rapp 2004; Rapcsak and Beeson 2002).
Information about the neural basis of lexical access during written word production comes primarily from patients with lexical
agraphia, a disorder in which words with regular mappings
between phonology and orthography are spelled correctly, but
words with irregular mappings (e.g., choir) are misspelled. The
errors are usually phonologically plausible (e.g., circuit written as serkit)
and affect low-frequency words more than high-frequency
ones. Lexical agraphia is typically caused by damage to the left
temporo-parieto-occipital junction (Brodmann areas 37 and/or
39), although in some cases, there is involvement of the left ventral occipitotemporal region, close to, if not encompassing, the
VWFA. Several functional imaging studies with normal subjects
provide further support for a role of these cortical regions in written word production (e.g., Petrides, Alivisatos, and Evans 1995;
Nakamura et al. 2000).

Conclusion
When people recognize and produce the spoken and written
forms of words, they usually concentrate on the meanings being
expressed and remain blithely unaware of the complex computations being executed by their brains in order to rapidly and
effectively process the lexical structures themselves. Cognitive
neuroscience is beginning to reveal the intricacies of these neural
systems, and dramatic advances are likely to happen in the coming years. Exciting new discoveries are appearing in the literature
almost daily, and this explosion of research will undoubtedly
provide fresh insights into the neurobiology of lexical processing, with significant implications for understanding and treating
disorders that result from brain injury.
David Kemmerer
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Bohland, J. W., and F. H. Guenther. 2006. An fMRI investigation of syllable sequence production. NeuroImage 32: 821–41.
Catani, M., D. K. Jones, and D. H. Ffytche. 2005. Perisylvian language
networks of the human brain. Annals of Neurology 57: 8–16.
Damasio, H., T. J. Grabowski, D. Tranel, R. D. Hichwa, and A. R. Damasio.
1996. A neural basis for lexical retrieval. Nature 380: 499–505.
Damasio, H., T. J. Grabowski, D. Tranel, L. L. B. Ponto, R. D. Hichwa, and A. R. Damasio. 2001. Neural correlates of naming actions and of naming spatial relations. NeuroImage 13: 1053–64.
Damasio, H., D. Tranel, T. Grabowski, R. Adolphs, and A. R. Damasio. 2004. Neural systems behind word and concept retrieval. Cognition 92: 179–229.
Dehaene, S. 2005. Evolution of human cortical circuits for reading and
arithmetic: The neuronal recycling hypothesis. In From Monkey
Brain to Human Brain, ed. S. Dehaene, J.-R. Duhamel, M. Hauser, and
G. Rizzolatti, 133–58. Cambridge, MA: MIT Press.
Gaillard, R., L. Naccache, P. Pinel, S. Clemenceau, E. Volle, D. Hasboun,
S. Dupont, M. Baulac, S. Dehaene, C. Adam, and L. Cohen. 2006.
Direct intracranial, fMRI, and lesion evidence for the causal role of
left inferotemporal cortex in reading. Neuron 50: 191–204.
Hillis, A. E., and B. C. Rapp. 2004. Cognitive and neural substrates of
written language: Comprehension and production. In The Cognitive
Neurosciences. Vol. 3. Ed. M. Gazzaniga, 775–87. Cambridge, MA: MIT
Press.
Hillis, A. E., and E. Tuffiash. 2002. Neuroanatomical aspects of reading.
In The Handbook of Adult Language Disorders: Integrating Cognitive
Neuropsychology, Neurology, and Rehabilitation, ed. A. E. Hillis, 15–26.
Philadelphia: Psychology Press.
Imada, T., Y. Zhang, M. Cheour, S. Taulu, A. Ahonen, and P. K.
Kuhl. 2006. Infant speech perception activates Broca's area: A developmental magnetoencephalographic study. NeuroReport 17: 957–62.
Indefrey, P., and A. Cutler. 2004. Prelexical and lexical processing.
In The Cognitive Neurosciences. Vol. 3. Ed. M. Gazzaniga, 759–74.
Cambridge, MA: MIT Press.
Indefrey, P., and W. J. M. Levelt. 2004. The spatial and temporal signatures of word production components. Cognition 92: 101–44.
Kemmerer, D. 2010. How words capture visual experience: The perspective from cognitive neuroscience. In Words and the Mind: How
Words Capture Human Experience, ed. B. Malt and P. Wolff, 289–329.
Oxford: Oxford University Press.
Levelt, W. J. M., A. Roelofs, and A. S. Meyer. 1999. A theory of lexical
access in speech production. Behavioral and Brain Sciences 22: 1–38.
Martin, A. 2006. Shades of Dejerine: Forging a causal link between the visual word form area and reading. Neuron 50: 173–5.
McCandliss, B. D., L. Cohen, and S. Dehaene. 2003. The visual word form
area: Expertise for reading in the fusiform gyrus. Trends in Cognitive
Sciences 7: 293–9.
McQueen, J. M., D. Dahan, and A. Cutler. 2003. Continuity and
gradedness in speech processing. In Phonetics and Phonology in
Language Comprehension and Production: Differences and Similarities,
ed. N. Schiller and A. Meyer, 37–76. New York: Mouton de Gruyter.
Miller, G.A. 1991. The Science of Words. New York: Freeman.
Miozzo, M., and A. Caramazza, eds. 2008. Lexical processing. Cognitive
Neuropsychology 25.4 (Special Issue).
Nakamura, K., M. Honda, T. Okada, T. Hanakawa, K. Toma, H. Fukuyama,
J. Konishi, and H. Shibasaki. 2000. Participation of the left posterior
inferior temporal cortex in writing and mental recall of kanji orthography: A functional MRI study. Brain 123: 954–67.
Okada, K., and G. Hickok. 2006. Left posterior auditory-related cortices participate both in speech perception and speech production: Neural overlap revealed by fMRI. Brain and Language 98: 112–17.
Orfanidou, E., W. D. Marslen-Wilson, and M. H. Davis. 2006. Neural
response suppression predicts repetition priming of spoken words
and pseudowords. Journal of Cognitive Neuroscience 18: 1237–52.
Petrides, M., B. Alivisatos, and A. C. Evans. 1995. Functional activation
of the human ventrolateral frontal cortex during mnemonic retrieval of
verbal information. Proceedings of the National Academy of Sciences
92: 5803–7.
Price, C. J., and J. T. Devlin. 2003. The myth of the visual word form area.
NeuroImage 19: 473–81.
Pulvermüller, F., M. Huss, F. Kherif, F. M. del Prado Martin, O. Hauk,
and Y. Shtyrov. 2006. Motor cortex maps articulatory features of
speech sounds. Proceedings of the National Academy of Sciences
103: 7865–70.
Rapcsak, S. Z., and P. M. Beeson. 2002. Neuroanatomical correlates of spelling and writing. In The Handbook of Adult Language
Disorders: Integrating Cognitive Neuropsychology, Neurology, and
Rehabilitation, ed. A. E. Hillis, 71–100. Philadelphia: Psychology
Press.
Rapp, B., and M. Goldrick. 2006. Speaking words: Contributions of
cognitive neuropsychological research. Cognitive Neuropsychology
23: 39–73.
Shapiro, K., and A. Caramazza. 2004. The organization of lexical
knowledge in the brain: The grammatical dimension. In The Cognitive
Neurosciences. Vol. 3. Ed. M. Gazzaniga, 803–14. Cambridge, MA: MIT
Press.
Skipper, J. I., V. van Wassenhove, H. C. Nusbaum, and S. L. Small. 2008.
Hearing lips and seeing voices: How cortical areas supporting speech
production mediate audiovisual speech perception. Cerebral Cortex
18: 2439–49.
Tomasello, M. 1999. The Cultural Origins of Human Cognition.
Cambridge, MA: Harvard University Press.
Tranel, D., R. Adolphs, H. Damasio, and A. R. Damasio. 2001. A neural
basis for the retrieval of words for actions. Cognitive Neuropsychology
18: 655–70.
Tranel, D., H. Damasio, G. R. Eichhorn, T. J. Grabowski, L. L. B. Ponto,
and R. D. Hichwa. 2003. Neural correlates of naming animals from
their characteristic sounds. Neuropsychologia 41: 847–54.
Tranel, D., T. J. Grabowski, J. Lyon, and H. Damasio. 2005. Naming
the same entities from visual or from auditory stimulation engages
similar regions of left inferotemporal cortices. Journal of Cognitive
Neuroscience 17: 1293–1305.
Tranel, D., K. Manzel, E. Asp, and D. Kemmerer. 2008. Naming static and
dynamic actions: Neuropsychological evidence. Journal of Physiology,
Paris 102: 80–94.

LEXICAL RELATIONS
Lexical relations are ways in which words, or lexemes, share
something in common. This broad definition includes relations
based on phonological relatedness, such as rhyming, and
morphological relatedness, like being the range of tensed
forms of a verb. However, in most contexts, the term is used to
refer specifically to semantic relations among words and, most
frequently, to paradigmatic semantic relations among words,
including synonymy, hyponymy, and antonymy. Such relations
are sometimes called sense relations (Lyons 1977), as it is usually
a single denotative sense of a word rather than every sense and
every aspect of the lexical item that is relevant to the relation.
Thus, we expect the "postal," "rubber," and "stomp" senses of stamp to enter into lexical relations with different sets of words.

Paradigmatic and Syntagmatic Relations


Semantic relations are generally divided into two types, usually called paradigmatic and syntagmatic. Syntagmatic relations are relations of combination; that is to say, they hold between words that fill different slots in a phrase, like book and read or delicious and food. These can be grouped into relational types like modifier–modified or event verb–agent. Some theories of lexical semantics build such relations into lexical entries, for example in the selectional restrictions of Jerrold J. Katz and Jerry A.
Fodor (1963) and the lexical functions of meaning-text theory
(Mel'čuk 1996).
Paradigmatic relations are relations of substitutability; the
words in a semantic paradigm are different options for filling the
same phrasal slot. For example, red/yellow/blue are three options
for subject position in X is a primary color. Paradigmatic relations
are studied because of their role in logical relations among sentence meanings, such as entailment, and because of what they
might tell us about how the mental lexicon is organized.

Semantic versus Lexical Relations


The term lexical relation is ambiguous, in that it can refer either
a) to [semantic] relations among words or b) to [semantic] relations among words that are represented in the mental lexicon,
as information in or links between the lexical entries for those
words. Some authors reserve lexical relations for the b meaning and use semantic relations for the a meaning. For example, Derek Gross, Ute Fischer, and George A. Miller (1989) claim
that while large and little are semantically opposite, they are not
directly related to each other as lexical antonyms, whereas large
and small are both semantically and lexically related. In other
words, large and small are not only semantically opposed; we
have also learned through linguistic experience that the words
themselves are opposed. This distinction between lexical and
nonlexical relations is intended to explain why some word sets
are particularly strongly linked to each other, both in terms of co-occurrence in corpora and in psycholinguistic behavior, for example, in word association experiments.

Types of Paradigmatic Relations


The most studied paradigmatic lexical-semantic relations are
synonymy, hyponymy, and antonymy/contrast, because a) substitution of members of these sets typically results in regular consequences for truth-conditional semantics, and thus b)
they are central organizational principles in many lexicological
theories (see semantic fields). While each of these relations
is easily exemplified, definitions and the role of the relations in
the mental lexicon are still the subject of debate. Traditionally,
definitions that depend on the logical consequences of substitution have been used (e.g., Lyons 1977). More recently, D. A.
Cruse (1994) has proposed prototype-based definitions of
these relations, and M. Lynne Murphy (2003) has proposed a
pragmatic approach. Next, we look at the relations in turn and
highlight some research issues associated with each.
SYNONYMY. Synonymy, or sameness of meaning, is usually
defined in terms of a substitution test. If word X can be substituted for a particular sense of word Y in any sentence with no
change to the truth conditions of the sentence, then X and Y are
synonyms. However, this definition does not include many of the
things that are called synonyms in thesauruses and everyday
discourse, for a few reasons.
First, languages generally avoid synonymy since it is economical (both in lexical acquisition and in any interaction) to
assume that different forms signal different meanings (see, e.g.,
Clark 1992). When a language variety chances on a pair of perfect synonyms, for example, when the same object is named independently by two subcommunities, either one word falls into disuse or one or both of the words become specialized to a slightly
different sense or context type. English, through its history of
contact with other languages, has come to have many such
near-synonyms, such as rise–ascend, smart–intelligent, and dead–deceased. While these are substitutable for each other in many
contexts, they all differ in meaning and use. For example, while
a balloon could ascend or rise, a person rises, rather than ascends,
from a chair. And while people can be dead or deceased, plants
can only be said to be dead. Many other cases of not-quite-synonymy involve words that refer to similar, but not identical, things: for example, tapas and hors d'oeuvres or shovel and scoop.
The second problem with a truth-conditional definition of
synonymy is that it allows as synonyms items that have different non-truth-conditional content. For example, the nouns dog,
doggy, and pooch may all be truthfully applied to a particular
animal, but choosing pooch implies different things about the
animal and the speaker's relation to it than dog does. Thus, some
apply a more restrictive substitution test that takes into account
connotational and social aspects of meaning, as well as the truth-conditional. To the extent that goodness of synonym relations
can be affected by non-truth-conditional issues like register
and morphological complexity, synonymy can be considered a
lexical relation, as well as a semantic relation.
Synonymy currently receives much attention in computational linguistics, as language generators and machine
translators require principled means for selecting the most
appropriate word for a context from a lexicon filled with near-synonyms. Such studies can be particularly concerned with discerning ways in which synonyms or near-synonyms can differ
(see, e.g., DiMarco, Hirst, and Stede 1993).
HYPONYMY. Hyponymy, the "type of" relation, is the relation of
sense inclusion, although it is often defined in terms of referential, or categorial, inclusion. In sense terms, beer is a hyponym
of beverage because the sense of beer includes all the information that is in the sense of beverage, plus additional information
that identifies beers as special types of beverages. In categorial terms, everything that beer denotes is included in the set of
everything that beverage denotes. Hyponym relations are, thus,
linguistic reflexes of categorial inclusion relations, and, as such,
some theorists consider them to be semantic, but not lexical,
relations.
Unlike synonymy, hyponymy is an asymmetrical relation, in
that beer is a type of beverage but beverage is not a type of beer.
Hyperonymy is the term for the converse relation from beverage
to beer. Hyponymy is usually said to be a transitive relation. For
example, if beer is a hyponym of beverage and beverage is a hyponym of liquid, then beer is a hyponym of liquid. However, transitivity holds only in cases of proper sense inclusion, not in all
cases that pass the is a type/kind of test (thus proving that the test
can be misleading). For example, speed-reading is a type of reading, and reading is a type of leisure activity, but speed-reading is
not usually considered to be a leisure activity. While reading can
function as a leisure activity, it is not defined in terms of being
leisurely, and thus reading is not a hyponym of leisure activity in
the logical sense of the term.
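The asymmetry and transitivity just described can be made concrete in a short sketch. The following toy Python fragment (the three-word taxonomy is invented for illustration and is not part of the original text) computes hyponymy as the transitive closure of direct "type of" links:

```python
# Toy illustration (hypothetical data): each word maps to its direct
# hyperonym, and hyponymy is computed as the transitive closure of
# those "type of" links.
TAXONOMY = {
    "beer": "beverage",
    "beverage": "liquid",
}

def is_hyponym(word, candidate):
    """Return True if `candidate` is reachable by walking up the
    chain of direct hyperonyms from `word`."""
    current = word
    while current in TAXONOMY:
        current = TAXONOMY[current]
        if current == candidate:
            return True
    return False

print(is_hyponym("beer", "beverage"))   # True: direct hyponymy
print(is_hyponym("beer", "liquid"))     # True: transitivity
print(is_hyponym("beverage", "beer"))   # False: the relation is asymmetric
```

Note that a sketch like this captures only categorial inclusion; it would wrongly license the speed-reading/leisure activity case unless the links encoded proper sense inclusion rather than mere "is a kind of" judgments.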

Inclusion relations, along with contrast relations, are central
to most approaches to lexical meaning, particularly in the treatment of noun meaning. For other word classes, hyponym relations are less frequent or less clearly paradigmatic. For example,
the adjectives happy and sad describe states that belong to the
category designated by emotion, but since emotion is a noun, it is
not substitutable for happy and sad.
ANTONYMY/CONTRAST. While perfect synonymy involves words
that always refer to the same thing, logical incompatibility
involves words that never overlap in reference. This logical relation contributes to two lexical-semantic relations, antonymy and
contrast.
Binary opposition seems to have special status in language
and conceptualization since even where more than one incompatible alternative is available, binary relations may be discerned. For example, although there are many emotions (happy,
sad, angry, afraid, disgusted), happy is generally understood as
having a single antonym, sad.
Logical approaches to antonymy distinguish various types.
Complementary (or contradictory) antonyms perfectly bisect
a semantic domain. For example, in relation to electrical
items, if something is not on, then it is necessarily off and vice
versa. Contrary antonyms designate the extremes of a scale.
For instance, something that is large is necessarily not small,
but something that is not small is not necessarily large. (Some
authors, including Lyons 1977 and Cruse 1986, restrict the term
antonym to just contraries.) Converse antonyms indicate different perspectives on the same relationship: for example, send/receive, teacher/student, north/south. So, if X is north of Y, then Y
is south of X. Other types of antonymy have received some attention, but these tend not to show the same kinds of logical relations as those mentioned earlier. For example, kinship terms can
be opposed on the basis of gender (brother/sister) or generation
(mother/daughter).
In the broad sense of the term, antonymy is often defined as
a relationship of minimal difference; that is, antonymous words
share most of their semantic content, but for one difference that
makes the two terms incompatible. On such a definition, mother
and daughter are opposites because they only differ in the generation they refer to, while mother and son are not opposites
because they differ in both gender and generation.
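This minimal-difference definition lends itself to a simple feature-based sketch. In the toy Python fragment below (the feature inventory is invented for illustration, not taken from the original text), two words count as opposed only when their feature sets differ in exactly one value:

```python
# Toy illustration (hypothetical features): antonymy as minimal
# difference over shared semantic features.
KIN_TERMS = {
    "mother":   {"gender": "female", "generation": "parent"},
    "daughter": {"gender": "female", "generation": "child"},
    "son":      {"gender": "male",   "generation": "child"},
}

def opposed(word1, word2):
    """Words are opposed iff their features differ in exactly one place."""
    f1, f2 = KIN_TERMS[word1], KIN_TERMS[word2]
    differing = sum(1 for feature in f1 if f1[feature] != f2[feature])
    return differing == 1

print(opposed("mother", "daughter"))  # True: differ only in generation
print(opposed("mother", "son"))       # False: differ in gender and generation
```

On this scheme, daughter/son also come out as opposed (they differ only in gender), matching the brother/sister case mentioned above.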
As mentioned, the lexical relation antonymy is sometimes
contrasted to the semantic relation opposition, and antonym
pairs present the best evidence in favor of the position that relations between lexical items, not just senses, are represented in
the mental lexicon. Antonym pairings are particularly strong in
experiments such as word association tasks and lexical priming, and antonyms co-occur frequently in text, leading some
(e.g., Jones 2002) to question whether antonymy is also a syntagmatic relation.
Larger sets of incompatible items exist, for example, solid/liquid/gas, but more linguistic-semantic attention is usually paid
to the not-necessarily-incompatible relation of co-hyponymy
or contrast, which, along with hyponymy, is basic to the taxonomic organization at the basis of semantic field and network
approaches. Such relations can also be defined in terms of minimal difference. For example, the basic color terms are similar

444

Lexical Semantics
in all being direct hyponyms of color but differ in the part of
the spectrum they designate. Still, they are not truly incompatible since they may overlap; some shades of turquoise may be
considered to be both blue and green. The fact that most people
would insist that it must be one or the other, however, indicates
our preference for acting as if contrast sets are incompatible.
OTHER RELATIONS. The foregoing types of paradigmatic relation are generally held to be the most important to lexical and
semantic theories, but many more have been noted in the literature. The most discussed of these is meronymy, the "part of" relation, though this can be thought of as a cover term for many other types of relation, such as segment–whole (slice–cake), material–whole (wood–table), leader–organization (captain–team), and so on.
Whether a precise taxonomy of relations is possible or necessary is an open question. While semantic field views of the lexicon rely on a small number of well-defined relations, theories
employing looser semantic networks or semantic domains might
allow for as many types of lexical relations as there are possible
relations among entities in the world. As the subtypes of antonymy and meronymy show, there is also the question of the level
of specificity that needs to be employed in representing these
relations in linguistic theory.
M. Lynne Murphy
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Clark, Eve V. 1992. Conventionality and contrast. In Frames, Fields, and
Contrasts, ed. Adrienne Lehrer and Eva Feder Kittay, 171–88. Hillsdale,
NJ: Erlbaum.
Cruse, D. A. 1986. Lexical Semantics. Cambridge: Cambridge University
Press.
———. 1994. Prototype theory and lexical relations. Rivista di Linguistica 6: 167–88.
DiMarco, Chrysanne, Graeme Hirst, and Manfred Stede. 1993. The semantic and stylistic differentiation of synonyms and near-synonyms. Proceedings, AAAI Spring Symposium on Building Lexicons for Machine Translation, Stanford, CA: 114–21.
Gross, Derek, Ute Fischer, and George A. Miller. 1989. The organization of adjectival meanings. Journal of Memory and Language 28: 92–106.
Jones, Steven. 2002. Antonymy. London: Routledge.
Katz, Jerrold J., and Jerry A. Fodor. 1963. The structure of a semantic
theory. Language 39: 170–210.
Lyons, John. 1977. Semantics. 2 vols. Cambridge: Cambridge University
Press.
Mel'čuk, I. A. 1996. Lexical functions: A tool for the description of
lexical relations in a lexicon. In Lexical Functions in Lexicography
and Natural Language Processing, ed. Leo Wanner, 37–102.
Amsterdam: Benjamins.
Murphy, M. Lynne. 2003. Semantic Relations and the Lexicon.
Cambridge: Cambridge University Press.

LEXICAL SEMANTICS
Lexical semantics is often loosely described as the study of word
meaning, but both word and meaning require more precise
definition in this context. The term is usually used to describe the
study of lexical, or content, words or lexemes (including nouns, verbs, adjectives), rather than grammatical words (conjunctions,
determiners), which are more usually studied in the context of
sentential (or propositional) semantics. Lexical semantics can
also refer to the semantics of non-word lexical items, such as
idioms. The meaning aspect of lexical semantics most often
refers to denotative sense in particular; that is, determining
what such words can and cannot refer to, as opposed to their
connotation or social import. Some of the main issues that concern lexical semanticists are the following:
How is the meaning of a word best represented in a model of the mental lexicon?
Are different types of representation required for different kinds of meaning?
How should multiple interpretations of a single word be described and explained?
How are different words' meanings related to one another?

Lexical Semantics in Linguistic Theory: Historical Context

Word meaning is a long-standing area of interest in philosophy and, of course, lexicography, but attention to it in theoretical linguistics has varied by time and place. In recent decades, lexical semantics (and lexicology more generally) has experienced revitalization after a slow period in the early to mid-twentieth century. For instance, Noam Chomsky (1965, 84) described the lexicon as simply "an unordered list of all lexical formatives," and Leonard Bloomfield claimed that we have no way of defining most meanings and therefore the linguist cannot define meanings, but must appeal for this to students of other sciences (1933, 144–6). Of course, influential lexical semantic work was pursued in this period, but much of the work in the generative tradition (e.g., Katz and Fodor 1963 and generative semantics in the 1970s) ran into problems of internal consistency or explanatory insufficiency. Other semantic work was pursued in Europe by practitioners of structuralism and functional linguistics or by philosophers of language.

Nowadays, the lexicon is central to most major theories of language. There are several reasons (presented here in no particular order) for the (re)emergence of the lexicon and lexical semantics in linguistic study: 1) Psychological experimentation in the 1970s (particularly by Eleanor Rosch) provided evidence that word meaning is represented in the mind quite differently from the way that it is represented on a dictionary page: that some lexical meanings might be based on prototype representations. Since the goals of general linguistics were, by this time, mostly concerned with the mental representation of language, such evidence needed to be integrated with linguistic theory more generally. 2) Since the 1980s, theories of grammar have become more lexically driven (e.g., head-driven phrase structure grammar, lexical-functional grammar, and minimalism). In such theories, the main constraints on sentence structure come from the syntactic and semantic requirements of the lexemes in the sentence, and lexical structures and rules account for the types of things that transformations did in earlier Chomskyan approaches (cf. transformational grammar). 3) Around the same time, the group of theoretical approaches called cognitive linguistics was emerging. Cognitive linguistic theories hold that semantic considerations are a major force in determining the form of a language and that lexicon and grammar do not constitute completely separate types of linguistic knowledge. Thus, such approaches are usually lexico-centric. 4) Lexical concerns have also been at the forefront of computational linguistics, in part because lexically driven approaches to grammar have proved most computationally practicable. Furthermore, the goals of most computational linguistics programs involve the creation of systems that can use language meaningfully, thus requiring models of how meanings might be represented and means to acquire and use such semantic information. 5) Meanwhile, advances in computer hardware and software led to the growth and development of corpus linguistics, which is particularly suited to the study of words and their use and has become one of the major methodological tools of lexical semantics and lexicography.

This confluence of diverse motivations and assumptions has contributed to the variety of approaches to lexical semantics and to contrary positions on major questions in the field.

Major Distinguishers of Theoretical Approaches


Any semantic theory must say something about the representation of lexical meaning, and theories of lexical meaning must fit
into theoretical accounts of the meaning and grammar of larger
constituents. Thus, in a sense, there is no such thing as a freestanding lexical semantic theory, but instead there are theories
of meaning that pay more or less attention to representation at
the lexical level.
Theories thus differ in the extent to which they take the lexical
or sentential meanings as their starting point. Those that start at
the lexical level of analysis, for example, Anna Wierzbicka's (e.g.,
1996) Natural Semantic Metalanguage approach, are sometimes
criticized for lack of attention to the way that word meanings
combine in order to create sentence meanings. On the other
hand, those that start with complete propositions in mind, for
example, Ray Jackendoff's (e.g., 1990) Conceptual Semantics, tend
to focus on the aspects of word meaning that interact with each
other in sentential contexts (particularly the relation between
predicate and argument), but not with more detailed
nuances of meaning, as would interest a lexicographer.
One of the most basic issues for a lexical semantic approach
is the issue of whether lexical meanings should be distinguished
from concepts. That is to say, is the meaning of a content word,
like apple, different from our conceptualization of the category
apple? Is knowing about apples different from knowing the
meaning of apple? Those theorists who think that lexical meaning
is different from conceptualization generally make the distinction between definitional and encyclopedic aspects of meaning.
On this view, only definitional meaning (that which is sufficient to identify a referent and allow for grammatical interpretation of the sentential context) is relevant to lexical semantics. To use a simple example, the definition of girl would be "young female human." Encyclopedic information, on the other hand, includes other information that comes from our experience of the things and situations referred to by words. For girl, this might include information such as "may wear pigtails" and "associated with the color pink." Jerrold J. Katz and Jerry A. Fodor's (1963) desiderata for a theory of linguistic semantics take the position that encyclopedic information should not be represented as part of linguistic (including lexical) meaning. However, many, if not most, theories of meaning have moved away from this assumption and treat lexical semantics as involving the interface between
language and conceptualization. Such approaches are less likely
to hold that meanings are represented in the mental lexicon (i.e.,
the repository of linguistic information about words), but instead
see lexical meaning as represented in the conceptual realm. In
this case, lexical semantic theories become intertwined with theories regarding the representation of concepts and must explain
a) whether/how lexicalized concepts differ from nonlexicalized
concepts, and b) how the formal aspects of language interact
with the conceptual representations of meaning.
Another issue that has divided lexical semanticists (and lexicologists) is whether word meanings can be defined on their own
or whether lexical meaning is derived (at least in part) through
semantic relations (lexical relations) among words. That
is, does a word's meaning emerge (to some degree) through the relations between words in a language's lexicon? Lexical field theorists and some computational linguists (e.g., the WordNet project; Fellbaum 1998) take the position that meaning emerges
from relations, whereas most working within a componential
framework presume that lexical semantic relations should be
explained by a lexical semantic theory, rather than being primitive elements of the theory.
Finally, there is the very big question of how the senses of
words are to be represented in a linguistic model. The most common approach is to devise a componential semantic metalanguage, which provides a limited and precise vocabulary for
representing elements of meaning and some form of grammar
for combining those elements into more complex meanings. The
form that such metalanguages take varies considerably among
theories, and some cognitive linguistics theories eschew metalanguages as such in favor of representations (for example, the
image schema) with more visual-schematic elements than linguistic ones. What all of these approaches have in common is the
aim to represent meaning using a restricted set of meaning elements. That is, meanings are composed from smaller meaningful parts (often semantic primitives [primes]), and the set of
meaningful parts available to a semantic theory is smaller than
the set of lexical items that could be described by such a representational system (cf. language of thought).
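For illustration only, here is a toy sketch of the componential idea just described: a small, fixed inventory of meaning elements composes the senses of a larger set of lexemes. The feature labels are invented for this example and do not correspond to any particular theory's metalanguage.

```python
# Toy componential lexicon: each lexeme's sense is a set of primitive
# meaning components, and the inventory of primitives is smaller than
# the set of lexemes it can describe.
PRIMITIVES = {"HUMAN", "FEMALE", "MALE", "ADULT", "YOUNG"}

LEXICON = {
    "girl":  {"HUMAN", "FEMALE", "YOUNG"},
    "woman": {"HUMAN", "FEMALE", "ADULT"},
    "boy":   {"HUMAN", "MALE", "YOUNG"},
}

def shared_components(a, b):
    """Return the meaning components two lexemes have in common."""
    return LEXICON[a] & LEXICON[b]

print(shared_components("girl", "woman"))  # components common to both
```

On such a scheme, lexical relations like near-synonymy or antonymy fall out of feature overlap and contrast rather than being primitives of the theory.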

Conclusion: Lexical Semantics Today


The range of lexical semantic research today is extremely varied
in the topics studied, the methodologies applied, and the theoretical assumptions behind them. Unlike some other subdisciplines of linguistics, no particular theoretical approach can be
said to be the clear leader in the field. The advent and development of corpus linguistics means that much lexical work is now
based on empirical rather than just introspective evidence,
and continuing developments in that field strengthen the value
of that evidence.
M. Lynne Murphy
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Bloomfield, Leonard. 1933. Language. New York: Holt, Rinehart, and
Winston.

Chomsky, Noam. 1965. Aspects of the Theory of Syntax. Cambridge, MA: MIT Press.
Cruse, D. A. 1986. Lexical Semantics. Cambridge: Cambridge University
Press.
Fellbaum, Christiane, ed. 1998. WordNet: An Electronic Lexical Database.
Cambridge, MA: MIT Press.
Jackendoff, Ray. 1990. Semantic Structures. Cambridge, MA: MIT Press.
Katz, Jerrold J., and Jerry A. Fodor. 1963. The structure of a semantic
theory. Language 39: 170–210.
Pustejovsky, James. 1995. The Generative Lexicon. Cambridge, MA: MIT
Press.
Ravin, Yael, and Claudia Leacock, eds. 2000. Polysemy: Theoretical and
Computational Approaches. Oxford: Oxford University Press.
Wierzbicka, Anna. 1996. Semantics: Primes and Universals. Oxford: Oxford
University Press.

LEXICOGRAPHY
While lexicography is often thought of as a subfield of linguistics, it is a scholarly discipline in its own right with its own principles and practices. This discipline is divided into two subfields,
practical lexicography and theoretical lexicography. Practical
lexicography is concerned with compiling, writing, and editing
dictionaries, which serve a double function: as a record of the
vocabulary of the language and as a reference work to meet the
needs of users for information about words, their usage, and
their spelling. Theoretical lexicography is concerned with the
definition of general principles governing the compilation of
dictionaries.
Dictionaries differ in their selection of vocabulary and other
items that the editors believe merit inclusion, given the size
and purpose of the volume. While most dictionaries use alphabetized word lists, certain others, such as Roget's Thesaurus of English Words and Phrases, are arranged by topic. It lists words
according to the ideas that they express, for example, abstract
relations (existence, relation, quantity, etc.), space (generally,
dimensions, etc.), and intellect (formation of ideas, communication of ideas, etc.), among others. Words and phrases are listed in
the main body of a thesaurus according to their word class, but
without a definition or any information about pronunciation or
etymology.
Dictionaries come in various formats. Besides traditional
print dictionaries, online dictionaries and dictionaries on solid
state media (such as CD or flash memory) have become increasingly popular during the last quarter of the twentieth century
because they facilitate rapid access to information, cross-referencing, and immediate updates with the latest vocabulary.
General-purpose monolingual dictionaries are organized
alphabetically and use the same language for both the object and
the means of description. While all dictionaries aim for comprehensiveness, the number and structure of lexical entries depend
upon the target audience, as well as constraints in funding and
time to complete the dictionary. General-purpose dictionaries
focus on the description of a standard language, aim to provide
an exhaustive coverage of the words in a language (abridged
dictionaries focus on somewhat shorter lists), and are typically
more linguistic than encyclopedic.
Dictionaries are usually divided into three parts: an introduction (including instructions on how to use the dictionary), the body (the alphabetically ordered list of entries), and appendices (other information, such as weights and measures, punctuation, etc.). The arrangement of the entries of a dictionary
is referred to as its macrostructure. Dictionaries differ in the
placement of homonyms, derived words, compounds, and
phrases, which can be given independent entries or included
in an entry. The layout and organization of the individual entry
is referred to as the microstructure of the dictionary. Each dictionary differs in its conventions for structuring lexical entries.
Typically, the headword at the beginning of an entry is in bold
and indented by a few spaces. Bold or italic typeface may be
used to mark the part of speech at the beginning of a lexical
entry. Some dictionaries include the standard pronunciation of
headwords and spelling variants. When a word is polysemous,
its senses are often numbered, with the most frequently occurring sense first. Similarly, when a sense or a group of senses
belong to a different word class or subclass, the sense(s) are
labeled accordingly, in combination with definitions, examples, and usage notes. More technical, archaic, or obsolete
senses and idiomatic phrases usually appear toward the end of
the lexical entry.
The headword of a lexical entry consists of a lemma (the basic
word form) that conventionally represents all of the inflected
forms of the unit. Following the headword, lexical entries provide a definition in the form of a semantically equivalent paraphrase (in the same language). Lexicographers
aim to offer definitions that are simpler than the word itself.
Another goal is to avoid circularity of definitions, that is, to not
define two or more lexemes in terms of each other. The most
common type of definition is the analytic definition since it
aims at maximal inclusion and independence from the context.
Synonyms or antonyms are sometimes used as alternatives to
analytic definitions because they are short. However, they may
require the dictionary user to look up other definitions to understand their meanings. Depending on the scope and aim of the
dictionary, lexical entries also provide examples that illustrate
the headword's syntactic behavior or offer additional semantic
information.
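The microstructure just described can be pictured as a nested record. The following is a minimal sketch in Python; the field names are invented for illustration, and real dictionary projects use far richer schemas.

```python
from dataclasses import dataclass, field

@dataclass
class Sense:
    definition: str                 # paraphrase, ideally simpler than the headword
    examples: list = field(default_factory=list)
    usage_note: str = ""

@dataclass
class Entry:
    lemma: str                      # basic form standing for all inflected forms
    part_of_speech: str
    pronunciation: str = ""
    senses: list = field(default_factory=list)  # most frequent sense first

entry = Entry(
    lemma="girl",
    part_of_speech="noun",
    pronunciation="/ɡɜːl/",
    senses=[Sense(definition="a young female human",
                  examples=["The girls walked to school."])],
)
print(entry.senses[0].definition)
```

Ordering the `senses` list by frequency, with technical, archaic, and obsolete senses last, mirrors the conventional layout of an entry.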
Metalexicography aims to improve the structure
and content of dictionaries. One way of achieving this goal is
to be involved in research about lexicography. There are several professional organizations and academic journals devoted
to metalexicography. Another way is to suggest methods and
criteria for reviewing and evaluating dictionaries. Evaluating
and assessing dictionaries, also known as dictionary criticism,
is difficult because it is not always clear what types of criteria
should be applied. One solution to this problem has been to
take large sets of dictionary reviews and determine the range
of criteria applied by different reviewers in their evaluations.
These include reversibility, alphabetization, directionality,
coverage, reliability, currency, redundancy, retrievability, and
equivalents.
Another difficulty of dictionary criticism is the large number
of entries. Reviewers analyze only a small sample of entries or
focus on particular features of dictionaries. To begin with, they
typically focus on the preface of the dictionary that explains how
to use it, who the intended audience is, and what types of information are included in the main body. After flipping through the dictionary to get an idea of the microstructure of lexical entries,
reviewers generally conduct an arbitrary sampling of dictionary entries. Depending on the time constraints, this procedure
may, for example, lead the reviewer to scrutinize every 10th main
entry on every 20th page for completeness, clearness, accuracy,
simplicity, and modernity. To ensure that the review is based on
a representative sample, it is necessary to check that the different parts of speech are adequately represented, that polysemy is
taken into account, and that there is a balance between words
from both the general and specific domains. In reviewing dictionaries and devising methods for improving the structure of
dictionaries, the reviewer notes the presentation of the text on
the page, as it plays a significant role in influencing the accessibility of the content. The range of vocabulary is also important
since users typically expect dictionaries to offer the latest words
from the domains of fashion, technology, and business, among
others, while at the same time including variants. Polysemy, the
structure of definitions, usage notes, examples, and etymological information are also important criteria for the evaluation of a
dictionary's content.
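The sampling procedure described above is a form of systematic sampling, which can be sketched as follows. The step sizes (every 10th entry, every 20th page) are the article's illustrative figures, not fixed requirements.

```python
def sample_entries(pages, entry_step=10, page_step=20):
    """Systematically sample entries for review: every entry_step-th
    entry on every page_step-th page. `pages` is a list of pages,
    each a list of entry headwords in order."""
    sampled = []
    for page in pages[page_step - 1::page_step]:
        sampled.extend(page[entry_step - 1::entry_step])
    return sampled

# A mock dictionary: 40 pages of 30 numbered entries each.
pages = [[f"p{p}e{e}" for e in range(1, 31)] for p in range(1, 41)]
print(sample_entries(pages))
```

A reviewer would then check each sampled entry for completeness, clearness, accuracy, simplicity, and modernity, and verify that the sample balances parts of speech, polysemous entries, and general versus domain-specific vocabulary.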
Hans C. Boas
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Atkins, Beryl T. S., and A. Zampolli. 1994. Computational Approaches to
the Lexicon. Oxford: Oxford University Press.
Béjoint, Henri. 2000. Modern Lexicography: An Introduction.
Oxford: Oxford University Press.
Hartmann, R. R. K., and G. James. 1998. Dictionary of Lexicography.
London and New York: Routledge.
Zgusta, Ladislav. 1971. Manual of Lexicography. The Hague: Mouton.

LINGUISTIC RELATIVISM
Linguistic relativism refers to the idea that language influences
thought and worldview (see also language of thought). In
essence, thinking and worldview are relative to the language one
learns to speak in childhood. Language, thought, worldview, and
influence are thus key concepts in linguistic relativism.
Various understandings of language have led to proposals for several levels of relativism, from semiotic relativity when
referring to the general faculty of language, to structural relativity when referring to the grammatical properties of languages,
and to functional relativity when referring to communicational
patterns of interaction within and across languages (Lucy 1997;
Hymes 1966). Most past and current research has concentrated
on structural relativity.
Several levels of thought have been posited as potentially
under the influence of language, including the neurological, cognitive, and propositional levels. To date, little work has
investigated neurological variation across speakers of different
languages (Gilbert et al. 2006). Most work addresses conceptualization by examining cognitive processes, such as memory,
categorization, inference, analogy, and emotions (Lucy
1992; Levinson 2003; Boroditsky, Schmidt, and Phillips 2003).
Very few studies, if any, have proposed that language may influence the actual content, or propositional level, of thought. This
idea, known as linguistic determinism, is commonly discredited
as scientifically untenable.

Concerning the relationship between language and thought,
language has variably been suggested to influence, impact,
shape, mould, condition, limit, or channel thinking.
An important issue, then, concerns the scope of these effects.
Current research has been asking whether language effects are
on-line in the process of producing and comprehending language (see Slobin 1996 on "thinking for speaking") or whether
language effects pervade human cognition (i.e., effects exist
whether or not one is engaged in linguistic acts). Most research
has assumed the latter type of language effects on thought.
Finally, note that linguistic relativity was originally formulated
as a scientific principle by Benjamin Lee Whorf in 1940 (1956,
214, 221). The principle has since been relabeled the Sapir-Whorf hypothesis, following an article by Harry Hoijer (1954)
referring to Edward Sapir and Whorf, who contributed to the
early development of linguistic relativity in the 1920s and 1930s
(Sapir 1985; Whorf 1956).
Stéphanie Pourcel

WORKS CITED AND SUGGESTIONS FOR FURTHER READING


Boroditsky, Lera, Lauren Schmidt, and Webb Phillips. 2003. Sex, syntax
and semantics. In Language in Mind, ed. Dedre Gentner and Susan
Goldin-Meadow, 61–79. Cambridge, MA: MIT Press.
Gilbert, Aubrey, Terry Regier, Paul Kay, and Richard Ivry. 2006. Whorf
hypothesis is supported in the right visual field but not the left.
Proceedings of the National Academy of Sciences 103: 489–94.
Hoijer, Harry. 1954. The Sapir-Whorf hypothesis. In Language in
Culture, ed. Harry Hoijer, 92–105. Chicago: University of Chicago
Press.
Hymes, Dell. 1966. Two types of linguistic relativity. In Sociolinguistics,
ed. William Bright, 114–57. The Hague: Mouton.
Levinson, Stephen. 2003. Space in Language and Cognition: Explorations
in Cognitive Diversity. Cambridge: Cambridge University Press.
Lucy, John. 1992. Grammatical Categories and Cognition.
Cambridge: Cambridge University Press.
Lucy, John. 1997. Linguistic relativity. Annual Review of Anthropology 26: 291–312.
Sapir, Edward. 1985. Selected Writings in Language, Culture and
Personality. Berkeley: University of California Press.
Slobin, Dan. 1996. From thought and language to thinking for speaking. In Rethinking Linguistic Relativity, ed. John Gumperz and
Stephen Levinson, 70–96. Cambridge: Cambridge University Press.
Whorf, Benjamin Lee. 1956. Language, Thought, and Reality. Cambridge,
MA: MIT Press.

LITERACY
Prominent Research Frameworks
Written symbolic codes of languages may be alphabetic, syllabic, morphosyllabic (Perfetti 2003; Ho et al. 2007), or alphasyllabic (Mishra and Stainthorp 2007). In each case, children
must first learn how the written code of their language embodies
spoken language units (Perfetti 2003, 17). Children must also
have language socialization experiences that promote thinking
and talking in more literate ways if they are to achieve academic
language proficiency (Wilkinson and Silliman 2000) or linguistic literacy (Ravid and Tolchinsky 2002). Contemporary research
frameworks and their related studies reflect different aspects of the code and sociocultural emphases in the cross-linguistic study of literacy learning.
The code approach is the most prominent research framework because of its focus on the universal and language-specific features that can explain the cognitive-linguistic processes underlying:

decoding: mastering a language's written code relative to its spoken language units. For English, this means that children must become aware of how letter patterns (graphemes) correspond to the smallest segment of their spoken language, the phoneme, as the means for achieving automatic and fluent word-level recognition.

comprehension: deriving an overall interpretation of an author's intended meanings as actively constructed through interactions with the textual medium.

composition: the generation and organization of one's own ideas as expressed through interactions with the textual medium.
A contentious debate concerns whether proficient word-decoding abilities must occur before reading comprehension skills can
develop (known as the simple view of reading; Vellutino et al.
2007) or whether reading comprehension develops concurrently
with general spoken language comprehension (Cain and Oakhill
2007).
A second perspective integrates facets of the code and socialization frameworks in highlighting purposes and types of literacy,
especially for alphabetic knowledge. The basic level of literacy is
alphabetic and functional. Individuals who break the alphabetic
code are able to negotiate daily activities that involve recognizing and accessing known meanings from their spoken language,
such as reading street signs or writing familiar food items for a
grocery list. Functional literacy is inadequate for meeting current
standards in either educational or workplace contexts. In contrast, critical literacy stresses proficiency. Individuals must be
capable of using literacy tools competently for learning how to
learn. Proficiency includes knowing how to analyze critical linkages among one's prior knowledge, the meaning or significance of a read or written text relative to perspectives expressed, and, at the highest level, integrating this information with other texts
as the process for generating new questions. This ability to draw
on and contrast multiple sources of information to formulate
new understandings entails intertextuality.
A third stance broadens the concept of literacy proficiency
from the traditional code and socialization emphases to multiple literacies. This construct, rooted in the profound sociocultural changes in communication brought about by the digital
age, encompasses computer literacy, information literacy, and
digital media literacy as components.
These three literacy frameworks are not mutually exclusive.
Moreover, notions of being literate and their associated standards will continue to evolve as outcomes of sociocultural interactions with new technologies.
Since literacy knowledge originates from spoken language
knowledge, the study of literacy crosses multiple disciplines and
subareas. Language studies range from language-learning
environment; theory of mind and language acquisition; word meaning; the mental lexicon; and acquisition of syntax (see syntax, acquisition of), to language variation and second language acquisition. Literacy learning
processes are also subjects of developmental language study
from differing viewpoints. Subareas include phonological
awareness, reading, composition, spelling, distinctions
between the spoken and written communication modes, and the
effectiveness of teaching reading and teaching writing
in educational programs that, internationally, span alphabetic
and nonalphabetic languages. Furthermore, literacy research
has expanded to incorporate neuroimaging in order to identify neurobiological correlates of dyslexia and effects on brain
function of scientifically based reading (SBR) interventions.
Behavioral studies have examined associations among oral language impairment, dyslexia, and text comprehension and related
disorders of reading and writing (for reviews, see Cain
and Oakhill 2007; Scarborough 2005).

Modern History
Research on literacy learning is relatively new, initiated in
the 1970s. Since its inception, one major principle has guided
this work: Children should have significant home and school
opportunities for the integration of oral and written language
experiences. These experiences support the development of literate stances in comprehension, speaking, reading, and writing.
Becoming literate is a social process, influenced largely by children's search for meaning. Prior to school-based reading and writing, children's engagement in literacy-like actions in play,
such as scribbling on paper with crayons, and with adults, for
example, storybook reading, forms the foundation for later literacy learning.
Initial research on literacy was strongly shaped by studies of
classroom language. Nearly four decades ago, sociolinguists
launched a new direction for inquiry into language and literacy
learning, focusing on oral language use in classrooms. The first
research concentrated on language functions, the communicative demands of classrooms, individual differences, and the
social basis and social integration necessary for learning. Initial
reading studies addressed assessments of reading comprehension, whereas current studies emphasize effective instructional
models of decoding and reading comprehension. In the United
States with the 2001 passage of the No Child Left Behind Act
(NCLB), federal educational policies for the first time exerted a
profound influence on the way that beginning reading is taught.
The expectation was that explicit SBR instruction would guide
reading practices and curriculum development in phonemic
awareness, decoding, and fluency.
A broader array of language-related features is implicated in
more literate spoken and written language uses beyond phonemic awareness, however. These include the scope and density of
vocabulary knowledge, command of more advanced syntactic
constructions applied to diverse reading and writing purposes,
and familiarity with a variety of narrative and expository dialogue structures and their organization. Non-language factors
also contribute, such as working memory capacity for different kinds of information-processing demands, the motivation to
learn, inferencing and integration, and metacognitive strategies
for the self-regulation of literacy learning. The causal relationship of these language and non-language-related variables and

their interactions for proficient literacy learning remain unidentified (Cain and Oakhill 2007).

Current State of Research


OVERVIEW OF THEORY. Contemporary research on literacy has
been catalyzed by two general theoretical traditions on the
human capacity for knowledge: sociocultural science and cognitive science. Each involves an effort to build a comprehensive and coherent account of human knowledge capacity, but the two differ in their views of how knowledge accrues to individuals.
The sociocultural tradition, which encompasses the language
socialization framework, counts its origins in American pragmatism, such as that of William James and John Dewey. The
key concept is that human knowledge is embedded in the social
and physical context; socioculturalists view the individual within
a specific social context as the fundamental unit of analysis for
studying human learning and development. Different strands of
the sociocultural tradition emphasize alternate ways of social/
cultural analysis (e.g., interpersonal exchanges versus broad cultural patterns versus local sociopolitical power hierarchies). Two
of these variations, social constructivism and participation/practice theory, have played significant roles in language and literacy
research. Sociolinguistic approaches have been prominent in
academic discourse studies, whereas critical theory, which advocates for social change and empowerment, has played a major
role in literacy studies.
The cognitive science tradition traces its origins to the nineteenth-century studies of individual differences in perceptual
processing (e.g., Wilhelm Wundt). Modern-day approaches,
however, highlight the mechanisms by which people process
and integrate information, and the unit of analysis tends to be the
individual. The cognitive science tradition also represents a variety of conceptual approaches sharing the premise that the individuals real-time information processing should be the focus
of inquiry. Different approaches vary from a strong nativist (see
innateness and innatism) and representational perspective to the primacy of the emergence of knowledge from system-experience interactions, as stated by the connectionists. For
example, neuroimaging studies range from information processing, which emphasizes the processing constraints and multilevel
integration of information, to connectionism, which investigates
parallel processing and the extraction of inherent regularities
from the input.
ELABORATION OF APPROACHES. The cognitive science approach
in which the code framework is embedded has been most influential in revealing a) the precursory phonological awareness
knowledge needed for beginning reading across alphabetic
and nonalphabetic languages and necessary language-specific
knowledge (e.g., how the consonant cluster patterns of spoken
Czech influence phonemic awareness development) (Caravolas
and Bruck 1993); b) the instructional design and content that
best facilitates beginning reading in struggling readers; and
c) neurobiological signatures of dyslexia. The focus in instructional
studies is on experimental investigations as the scientific basis for
determining the treatment validity of instruction to prevent reading problems in grades 1 to 3. These studies, conducted primarily
in the United States, employ randomized controlled field trials, often using a response to intervention model to determine the efficacy of outcomes in alphabetic reading. While differences exist
in the form of response to intervention designs, all involve a hierarchical process of alphabetic instruction and ongoing reading-related assessments. Minimal responsiveness may mean that a child
requires special education support to be successful.
The cognitive-experimentalist approach is not without criticism, particularly as this research is reflected in the NCLB goal
that all grade 3 children will read proficiently by 2014. One critique pertains to individual differences. Given the individual
diversity in neurobiological makeup and sociocultural experiences, it is not possible to erase normal variation in the distribution of reading ability. Instead, normal variation should be
treated as an asset to build upon and not as a liability (Berninger
and Richards 2002).
In contrast, sociocultural approaches converge on the belief
that literacy learning does not consist exclusively of recruiting
neurobiological and cognitive events inside the learners head.
A significant question is how academic discourse serves as the
social mechanism for children to learn how to "do literacy"
and advance their language learning as members of the larger
school literacy culture. Study designs are typically descriptive or quasi-experimental. A limitation of the sociocultural
tradition is that descriptive studies function best to generate
new hypotheses about causal mechanisms but, unlike randomized controlled trials, cannot yield broader generalizations.
However, future literacy research may lead to mixed-methods
approaches that combine the tools of the cognitive and sociocultural sciences.
AN EXAMPLE OF INDIVIDUAL DIFFERENCES. Literacy research in
both the cognitive and sociocultural sciences examines group
differences. However, classrooms consist of particularized differences. The translation of group-differences research into everyday practices that meet the learning needs of individual children
is far from an easy task. In any third grade in the United States,
some children are still struggling with fluent decoding and spelling; others have no problems with decoding but face significant
comprehension barriers when reading expository texts; and still
others may be exerting great effort to unravel the complexity of
academic discourse demands, which then impedes their language and literacy learning. No uniform set of reasons accounts
for these individual patterns. Some patterns may be grounded
primarily in sociocultural experience, such as less familiarity
with academic discourse; other patterns may involve complex
interactions of neurobiological, cognitive, social, linguistic, and
communicative factors.
While much has been learned about literacy processes, two
significant educational challenges persist: understanding variations in individual profiles and how to craft evidence-based practices that will assist individual children to become full members
of their larger literacy communities.
Elaine R. Silliman and Louise C. Wilkinson
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Berninger, Virginia W., and Todd L. Richards. 2002. Brain Literacy for
Educators and Psychologists. San Diego, CA: Academic Press.

Cain, Kate, and Jane Oakhill, eds. 2007. Children's Comprehension Problems in Oral and Written Language: A Cognitive Perspective. New York: Guilford.
Caravolas, Markéta, and Margaret Bruck. 1993. The effect of oral and written language input on children's phonological awareness: A cross-linguistic study. Journal of Experimental Child Psychology 55: 1–30.
Ho, Connie, David W. Chan, Kevin H. Chung, Suk-Han Lee, and Suk-Man
Tsang. 2007. In search of subtypes of Chinese developmental dyslexia. Journal of Experimental Child Psychology 97: 61–83.
Mishra, Ranjita, and Rhona Stainthorp. 2007. The relationship between phonological awareness and word reading accuracy in Oriya and English: A study of Oriya-speaking fifth-graders. Journal of Research in Reading 30: 23–37.
Perfetti, Charles A. 2003. The universal grammar of reading. Scientific Studies of Reading 7: 3–24.
Ravid, Dorit, and Liliana Tolchinsky. 2002. Developing linguistic literacy: A comprehensive model. Journal of Child Language 29: 417–47.
Scarborough, Hollis S. 2005. Developmental relationships between language and reading. In The Connections between Language and Reading
Disabilities, ed. H. Catts and A. Kamhi, 3–24. Mahwah, NJ: Lawrence
Erlbaum.
Silliman, Elaine R., Louise C. Wilkinson, and Maria R. Brea-Spahn.
2004. Policy and practice imperatives for language and literacy
learning: Who shall be left behind? In Handbook on Language and
Literacy: Development and Disorders, ed. C. Stone, Elaine R. Silliman,
B. Ehren, and K. Apel, 97–129. New York: Guilford.
Vellutino, Frank R., William E. Tunmer, James J. Jaccard, and RuSan
Chen. 2007. Components of reading ability: Multivariate evidence for
a convergent skills model of reading development. Scientific Studies
of Reading 11: 3–32.
Wilkinson, Louise C., and Elaine R. Silliman. 2000. Classroom language and literacy learning. In Handbook of Reading Research. Vol. 3, ed. M. Kamil,
P. Mosenthal, P. Pearson, and R. Barr, 337–60. Mahwah, NJ: Lawrence
Erlbaum.

LITERARINESS
This term refers to the perceived distinctive quality of the language of literary, as opposed to nonliterary, texts. If the linguistic study of literature attempts to understand how linguistic
form is adapted to literary purposes, then identifying a text as
literary based on its language is one of the central problems.
There is currently little agreement among scholars of literature,
linguists and literary critics alike, about the status of literature as language. Despite this lack of consensus, or maybe
because of it, the investigation of literature using linguistic
models has become a productive field within applied linguistics. The question of literariness is closely tied to the modern
concept of literature and its history, and so it seems best to proceed by examining first the history of the concept of literature
before turning to central issues that the question of literariness
has raised.
Although most cultures consider verbal art a separate,
recognizable class of speech, its rendering into print has subjected it to the transforming effects that mark the influence
of print on every aspect of modern culture (see print culture). Literature (from Latin littera, "letter") in its restrictive, modern English sense of imaginative writing in the main
genres of poetry, prose, and drama "which has claim to consideration on the ground of beauty of form or emotional effect"
(Oxford English Dictionary), arose in the nineteenth century in
conjunction with the increasing availability of authorship as a
profession. Before the advent of movable type printing, to write
or copy a book required a great investment of time and energy,
and so only highly valued items would have been widely circulated. Even after print technology began slowly to diffuse into
the wider culture, it was never doubted that only important
works would ever be printed, distributed, and saved. The rise of
industry and the middle class, with increased literacy rates and
an expanded market for writing, saw the normative concept of
literature emerge essentially as a means for differentiating traditionally sanctioned texts from those of supposedly ephemeral
quality. The reading and study of the superior texts were promoted in secondary and university curricula as profitable for
cultural improvement and development of national identity
(see nationalism and language).
Arguments for the suitability of literature for this project
depended for their success on demonstrating that the privileged
texts possessed certain inalienable qualities. Matthew Arnold,
an English school inspector and poet, argued in 1880 that "the
superior character of truth and seriousness in the matter and
substance of the best poetry, is inseparable from the superiority
of diction and movement marking its style and manner" ([1880]
1988, 416). For Arnold, and the liberal humanism he has come to
represent, the reading of literature functions to stabilize society because it educates citizens in supposedly universal sociobehavioral norms.
Like Arnold, most influential theorists in the first half of the
twentieth century never questioned the status of literature as
an identifiable and beneficial form of linguistic behavior. They
expected that the literary text could be differentiated from nonliterary texts by some constellation of intrinsic linguistic characteristics, the discovery of which, Roman Jakobson argued in
1921 when he coined the term "literariness," should be the goal of
literary linguistics. But when formalist investigators failed over
time to adduce a convincing set of characteristics necessary and
sufficient for identifying literature, attention turned to the role of
extrinsic factors, such as audience and medium, in establishing
literary distinctiveness. Many literary critics have resolved the
problem by historicizing the concept of literature itself, arguing
that the search for intrinsic features determinate of literariness
cannot be successful because wherever or whenever the category of literature arises, it does so from specific sociocultural
forces that situate readers differently with regard to the purposes
attributed to the literary texts within the cultural or theoretical
discourse that promotes the concept. Approaches grounded in
linguistics, however, prefer the term verbal art for their subject
because they also question the sufficiency of the traditional concept of literature to account for the variety of genres developed
around the world, by principally oral cultures, that function in literary ways for the culture in question. Most contemporary study
can thus be characterized as interactionist, refusing to privilege
intrinsic or extrinsic characteristics but understanding textual
genre categorization as a complex process involving interaction
among text, reader, performance situation, and various sociocultural practices.
Some intrinsic characteristics that have been recognized
as occurring across time and languages in texts that become
literary include iconicity and defamiliarization. Iconicity is an important element in the organicist metaphors of literary structure developed principally by Robert Penn Warren and other
so-called New Critics in the United States, though by no means
unique to them. Organicist approaches see the form of a text as
highly responsive to its meaning, and the New Critics especially
valued the ability of a text to reconcile within itself the various
strands of meaning that its language evokes. In semiotic terms,
an icon is a sign in which the signifier somehow resembles its
referent. Iconicity operates to a limited extent in language, for
example, in onomatopoeia, where a word sounds like the thing
it represents, as is often the case for linguistic representations
of animal sounds: English cows "moo" and sheep "baa" (but
see Haiman 1985 for arguments that iconic signs are widely
exploited in language). In poetry, sounds sometimes mimic the
thing being described, as in these lines from Tennyson's The
Princess:
The moan of doves in immemorial elms
And murmuring of innumerable bees (7.221–2)

The repeated nasal consonants suggest the hum of bees on a


summer afternoon.
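As an illustrative aside (not part of the entry's argument), the density of nasals in the couplet can be tallied in a few lines of Python; a proper phonological analysis would count phonemes rather than letters:

```python
# Crude orthographic count of nasal letters (m, n) per line, a rough
# proxy for the nasal phonemes that create the humming effect.
lines = [
    "The moan of doves in immemorial elms",
    "And murmuring of innumerable bees",
]

for line in lines:
    nasals = sum(line.lower().count(ch) for ch in "mn")
    print(nasals, line)  # each line contains 7 nasal letters
```

Even this letter-level count shows how unusually dense the lines are in nasals relative to their length.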
Iconicity often contributes to thematization, the acting out
by a text's words or other speech elements of a particular theme
given semantically in the text. The speaker of George Herbert's
"Deniall," for example, laments that his prayers are not being
heard by God:
When my devotions could not pierce
Thy silent eares;
Then was my heart broken, as was my verse:
My breast was full of fears
And disorder:

The final line of the stanza does not rhyme, nor does it match the
iambic rhythm established in the first four lines. The "disorder"
spoken of in the line is acted out by the line's lack of sonic fit,
implying in the process a parallel between the form of the prayer
and its failure to penetrate. In the final line of the final stanza the
prayer at last comes around:
O cheer and tune my heartlesse breast,
Deferre no time;
That so thy favours granting my request,
They and my minde may chime,
And mend my rime.

When the speaker's mind is aligned with God's wishes, his prayer
also becomes formally complete. Iconicity is one way that patterns of linguistic elements can contribute to larger patterns
of meaning. Poetic iconicity is a local phenomenon, however,
which depends on the immediate environment of the signs
involved. The phrase "and disorder" is not without rhythm, nor is
it unrhymable, but in the context Herbert has created, the phrase
stands out for its ill fit. Organicist approaches value the extent to
which the local patterns of significance can be reconciled to one
another, making each poem a coherent emotional and semantic
whole. In the final stanza of "Deniall," the local pattern of end
rhyme acts out an additive semantic logic, so that the notions
of "chime" and "time" together constitute the meaning of the word
"rime," and at the same time the words function sonically within
the rhyme scheme to act out the metonymy whereby "my rime"
refers to the poem as a whole.
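The recovered end rhyme in Herbert's closing stanza can be sketched with a crude orthographic check (an illustration, not part of the entry; real rhyme is phonological, and here the rhyming words happen to share final letters):

```python
# Group the line-final words of Herbert's closing stanza by their last
# two letters: a crude, spelling-based stand-in for rhyme detection.
finals = ["breast", "time", "request", "chime", "rime"]

groups = {}
for word in finals:
    groups.setdefault(word[-2:], []).append(word)

print(groups)
# {'st': ['breast', 'request'], 'me': ['time', 'chime', 'rime']}
```

The two groups correspond to the stanza's interlocking rhymes, with "time," "chime," and "rime" forming the chime the speaker asks for.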
The ideological content of the New Critical emphasis on
unity and coherence in literature is illuminated by comparison
to Viktor Shklovsky's influential theory of ostranenie ("defamiliarization"). Shklovsky was one of several scholars and writers who
met informally in Moscow and St. Petersburg. Often referred to
as the Russian Formalists, this group was the first to mix literary
analysis with an increasing awareness of the status of the literary text as a linguistic object and, therefore, subject to description by specifically linguistic tools (see Erlich 1955). Shklovsky
argued that the distinguishing characteristic of the literary text
was that it defamiliarized items or events the perception of which
had become automatized by the reader due to familiarity or
repeated exposure. Leo Tolstoy, as Shklovsky points out in several examples, makes the familiar seem strange by not naming
the familiar object, like the description in War and Peace of an
opera scene as "pieces of painted cardboard," or when the narrator of "Shame" describes the sequence of actions in a flogging
without using the word ([1917] 1988, 21). Shklovsky called these
and similar techniques of creative distortion priemy ("devices"),
and Jakobson went so far as to claim in 1921 that if literary history wants to become a science, it must recognize the artistic
device as its only concern (quoted in Erlich 1955, 57; see also
foregrounding).
The young Jakobson, whose specifically linguistic theory of
poetics eventually became widely influential, believed that he
could justify his admiration for the futurist poetry of Velimir
Khlebnikov by explicating in linguistic terms the complex, suggestive, phonemic, and morphemic patterning of poems heavy in
neologisms, like "Kuznechik" ("The Grasshopper") (Jakobson 1987,
252). Khlebnikov's poetry, and that of other so-called Russian
Futurists, so clearly eschewed traditional forms of poetic practice that there was no question of its being considered canonical. Like Arnold, the Futurists were interested in distinguishing
true literature from the products of the mass marketers, whom
they called "traitors," and they also identified a strong nationalistic purpose in reading and writing literature. Unlike Arnold,
however, they saw literature as having a revolutionary, rather
than a stabilizing, purpose (influenced perhaps by the differing cultural conditions prevailing under the unstable czarist
regime in Russia and the imperative to govern as a global power
in England).
As Jakobson's poetics matured, he increasingly saw the role
of the device in terms of a Peircean semiotics. The devices do not
function alone to interrupt "the direct awareness of the identity between sign and object" (1987, 378), but the whole text is
revealed as a system of systems of equivalences that, through
similarities and contrasts at all levels of linguistic organization,
up to and including the arrangement of the entire text, display
the text as primarily interested in the linguistic medium, the
materiality of its linguistic signs. This approach resembles the
organicist metaphors of the New Critics in that linguistic form
is motivated by poetic meaning, and devices are valued less for
their interruption of usual relationships of significance than for
the surplus of meaning that iconic relationships create.
In Jakobson's best-known literary-critical essays, however,
such as "Baudelaire's 'Les Chats,'" written with Claude Lévi-Strauss, he and his co-authors exhaustively catalog patterns of
linguistic or structural elements within a poem, often with no
attempt to discover motivations for individual patterns except to
observe their interaction as formal patterns. They aim to demonstrate that the whole poem is essentially a tissue of many such
overlapping and interlocking patterns, a complex and indivisible totality where a perpetual interplay of sound and meaning establishes an analogy between the two facets, a relationship
either paronomastic and anagrammatic, or figurative (occasionally onomatopoeic) (1980, 23).
Objections to internalist theories of literariness include skepticism about any reader's being able to detect all or even many
of a poem's linguistic patterns, as well as observations that some
passages of nonliterary prose contain as many patterns as poetry.
Indeed, the strongest reactions against internalist theories concern the genre of prose. Internalist theories of literariness often
elevate the importance of poetry because as a genre, it is maximally distinct from everyday language. As a result, objections
to these theories are often concerned with accounting for the
characteristics of prose genres. Prose generally has observably
fewer sonic foregrounding devices and is more likely than poetry
to make use of devices that also appear in everyday language,
such as irony, metaphor, or repetition. There is some evidence
that these devices occur with greater frequency in literary than in
nonliterary prose, and that there are some linguistic forms, such
as free indirect speech, that tend to occur only in literary prose
(see Miall 2006 and narrative). On balance, however, literary prose tends to require some attention to elements of literary
interaction that are extrinsic to the text in order to account for
literariness.
One important class of extrinsic approaches, often termed
reader response theories, focuses on the reader as the source of the
distinctiveness of literature. Perhaps the strongest version of this
extrinsic theory is offered by Stanley Fish (1980), who describes
an impromptu experiment (later executed to the same ends with
readers from three continents) in which he told undergraduates
that a list of linguists' names on a classroom chalkboard was a poem
and asked them to interpret it, which they had no trouble doing.
This demonstrated for Fish that literariness was wholly a function of prior reader commitment, rather than of anything within
the text. Readers learn interpretive practices from the communities of which they are members, and so what counts as literary is
what the community has determined to be so. Although Fish's
description of the interpretive community has not generally been
retained, his position on literariness became, until very recently,
the default standard within literary theory. Feminist and postcolonialist challenges to the hegemony of Western literary cultural
practice, demonstrating that texts created and valued by a dominant class will be read quite differently by less privileged classes,
helped establish that a precise mode or history of reception cannot be inferred from the text itself (Harrison 2005, 7). Evacuating
the text of any determining role in its own reading allows critics
to explore how the text participates in the various discourses of
power circulating at the time of its writing and reception.
Current theories of literariness that utilize linguistic tools for
analysis are unwilling to locate the determining characteristics
specifically within the text or the reader, but generally see textual
genre categorization as a complex process involving interaction
among text, reader, and various sociocultural practices. Reuven
Tsur's sophisticated analysis of religious trance poetry, for example, identifies not only textual devices that contribute to a hypnotic
reading but also different types of cognitive styles of individual
readers, who are psychologically predisposed to react to certain
environmental variables in different ways and so read texts differently. Nigel Fabb, on the other hand, adopts a modularist
approach, identifying two types of structure in poetry: inherent
and communicated structure. Inherent structure is determinate
and is processed, rather than interpreted, by the appropriate
module (such as syntax or meter). Communicated structure is
implied by textual evidence in the context of the reader's knowledge, so that "being a sonnet" holds of a text by inference, rather
than as an independent fact about the text. The important thing
for literariness is again that the text provides only some of the
cues necessary for categorizing it as literary. Communicated
structure is dependent on individual learning, so that responses
to any given textual characteristic might readily vary by individual. Fabb also argues that literary texts communicate more
about their form than verbal behavior generally does, and so in
a Jakobsonian sense draw more attention to form. Form in this
sense takes on the characteristics of meaning (because implied);
therefore, form is more likely to be ambiguous, indeterminate,
metaphorical, or ironic in literary texts.
Emotional responses to literature have always been important for the identification of literariness. Indian theories of
rasa, for example, categorized texts by the type of emotion they
prompted (see dhvani and rasa). Fresh empirical approaches
to the issue of literariness have focused on the role of emotion in
literary response. Miall identifies literariness as "a combination
of formal elements in the literary text and the array of responses
these initiate when read" (2006, 144). Readers who encounter
foregrounding devices experience them as defamiliarizing, a
cognitive response that has an associated affective dimension.
Subsequent changes in understanding seem to be guided largely
by the feeling evoked by defamiliarization, especially the feeling
of self-modification that, Miall argues, accompanies the recontextualization of the defamiliarized concept (see literature,
empirical study of).
In the last decade, literary critics interested in the ethical
dimensions of literary reading have begun to reconsider, to some
degree, the importance of the defamiliarizing effect of literature.
Derek Attridge (2004) has recuperated the concept of literariness
for literary studies by identifying it with the inventiveness that a
text manifests whenever it is enacted by a reader. For Attridge,
literariness inheres not in some fundamental unchanging core in
the work but in the inventiveness it shows over time, because it
remains open to change and porous to new contexts, continuing
to introduce what is unknown into the known. Readers experience that inventiveness anew when they perform the work, and
they desire to do justice to it by shifting the norms and habits
they use for dealing with the world.
As literature undergoes remediation into digital formats (and it should be noted that the Russian Formalists were
applying Shklovsky's concept of ostranenie to film analysis
already before 1920), the notion of literariness is undergoing concomitant refinement. Motivated in part by new media
forms of verbal art, Jerome J. McGann and Lisa Samuels
(2001) have proposed the expansion of the notion of reading to include the manipulation of poems to deform them
in ways that defamiliarize the poems and prompt new experiences of them, by reversing them (so that they can be read
backwards), rearranging the words or lines, even removing or
replacing parts of speech. Such formally interactive processes
of reading, increasingly built into hypertext, video games, and
other forms of electronic textuality, continue to obscure the
traditional text/reader division that figured so prominently in
early notions of literariness. Arguments for the literariness of
nontraditional (i.e., nonprint, nonverbal) formats usually rely
on some form of the strong sociocultural construction argument, yet there are also attempts to canonize some digitally
mediated works over others based on intrinsic characteristics,
suggesting that interactive definitions of literariness continue
to prevail for the time being.
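The deformative operations McGann and Samuels describe (reversing, rearranging, removing parts of speech) are algorithmic in spirit. The following Python sketch is hypothetical (theirs is a critical practice, not software), reusing the Tennyson couplet quoted earlier as sample input:

```python
import random

# Sample input: the Tennyson couplet quoted earlier in this entry.
poem = [
    "The moan of doves in immemorial elms",
    "And murmuring of innumerable bees",
]

def reverse_poem(lines):
    """Deform by reversal: last line first, each line read backwards word by word."""
    return [" ".join(reversed(line.split())) for line in reversed(lines)]

def shuffle_lines(lines, seed=0):
    """Deform by rearranging the lines; seeded so a deformation can be revisited."""
    out = list(lines)
    random.Random(seed).shuffle(out)
    return out

def remove_words(lines, drop={"the", "of", "in", "and"}):
    """Deform by removing a (crudely identified) part of speech: function words."""
    return [" ".join(w for w in line.split() if w.lower() not in drop)
            for line in lines]

print(reverse_poem(poem)[0])  # bees innumerable of murmuring And
```

Each function returns a new, estranged version of the poem, which is the point: the deformation defamiliarizes the text and prompts a fresh reading.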
Claiborne Rice
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Arnold, Matthew. [1880] 1988. The study of poetry. In The Critical
Tradition, ed. David H. Richter, 411–16. Boston: Bedford.
Attridge, Derek. 2004. The Singularity of Literature. London: Routledge.
Bauman, Richard. 1984. Verbal Art as Performance. Long Grove,
IL: Waveland.
Erlich, Victor. 1955. Russian Formalism. The Hague: Mouton.
Fabb, Nigel. 2002. Language and Literary Structure. Cambridge:
Cambridge University Press.
Fish, Stanley. 1980. Is There a Text in This Class? The Authority of
Interpretive Communities. Cambridge: Harvard University Press.
Haiman, John. 1985. Natural Syntax. Cambridge: Cambridge University
Press.
Harrison, Nicholas. 2005. Who needs an idea of the literary? Paragraph
28.2: 1–17.
Herbert, George. [1633] 1974. "Deniall." In The English Poems of George
Herbert, ed. C. A. Patrides, 96. London: J. M. Dent & Sons.
Jakobson, Roman. 1980. A postscript to the discussion on grammar of
poetry. Diacritics 10.1: 21–35.
Jakobson, Roman. 1987. Language in Literature. Ed. Krystyna Pomorska and
Stephen Rudy. Cambridge: Harvard University Press. This text collects
Jakobson's writings on literary topics throughout his career.
McGann, Jerome J., and Lisa Samuels. 2001. Deformance and interpretation. In Radiant Textuality, 105–36. London: Palgrave Macmillan.
Miall, David S. 2006. Literary Reading: Empirical and Theoretical Studies.
Frankfurt: Peter Lang.
Shklovsky, Viktor. [1917] 1988. Art as technique. Trans. Lee T. Lemon
and Marion J. Reis. In Modern Criticism and Theory, ed. David Lodge,
16–30. London: Longman.
Tompkins, Jane P., ed. 1980. Reader-Response Criticism: From Formalism
to Post-Structuralism. Baltimore: Johns Hopkins University Press.
Though somewhat dated, this remains an excellent introduction to the
range of ideas usually identified as reader-response.
Tsur, Reuven. 2003. On the Shore of Nothingness: A Study in Cognitive
Poetics. Exeter: Imprint Academic.

LITERARY CHARACTER AND CHARACTER TYPES


Discussions of character types date back to Aristotle and appear
within numerous brands of literary criticism. Within the context
of the language sciences, however, character types have particular relevance to two strands of narratology.
The first strand, which David Herman refers to as classical
narratology (see also generative poetics and narrative,
grammar and), first flourished in the 1960s. These linguistic
approaches to literature, heavily influenced by Vladimir Propp's
Morphology of the Folktale (originally published in 1928 but not
available in translation until some 30 years later), the works of
Algirdas Julien Greimas, and structuralism in general, treat
character types as functional units of meaning. Propp's seven
classifications of character types were determined by the roles
that characters commonly occupy in Russian folktales: the hero,
the villain, the helper, the donor (provider of magical agents),
the sought-for-person and her father, the dispatcher, and the
false hero. Building upon Propp's model, Greimas claimed that
characters were significant for the actantial roles they perform
within a narrative. In Structural Semantics ([1966] 1983, 198),
he described actants as embodying "a small number of roles
in the drama of discursive utterances." Greimas's three pairs
of actants (subject/object, sender/receiver, and helper/opponent) were intended to correspond to grammatical concepts.
Subjects (characters who do the action) and objects (characters
who undergo the action) are clearly related to the equivalently
named sentence constituents, while helper and opponent can be
regarded, in Mieke Bal's words, as "adverbial adjuncts" ([1985]
1997, 201). The category of sender/receiver proved most problematic: While it attempted to supplement Propp's dispatcher
with Roman Jakobson's distinction between the initiator of a
communication (e.g., a speaker) and the addressee of that communication (the receiver), the relationship between those linguistic concepts and the corresponding character types was unclear.
Critics have since expanded on the theoretical foundations laid
by Propp and Greimas, and such work has provided useful generalizations about characters as structural units of meaning within
narratives (see Schleifer and Velie 1987, for instance, for a typology of literary genres based upon Greimas's receiver-actant).
Furthermore, Bal's use of semantic axes and Michael Toolan's
semantic feature analysis offered ways to account for the less
generalizable details of characterization, such as specific physical and psychological qualities, which were outside the scope of
story grammars.
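For readers who find a schematic restatement useful, the role inventories just described can be written down directly; the sketch below is a hypothetical encoding for illustration, not a published implementation:

```python
from enum import Enum

class ProppRole(Enum):
    """Propp's seven character roles, as summarized in this entry."""
    HERO = 1
    VILLAIN = 2
    HELPER = 3
    DONOR = 4               # provider of magical agents
    SOUGHT_FOR_PERSON = 5   # together with her father
    DISPATCHER = 6
    FALSE_HERO = 7

# Greimas's three opposed actant pairs, intended to mirror grammar:
# subject/object like sentence constituents, helper/opponent like
# adverbial adjuncts (in Bal's gloss), sender/receiver like
# speaker/addressee.
GREIMAS_PAIRS = [
    ("subject", "object"),
    ("sender", "receiver"),
    ("helper", "opponent"),
]

assert len(ProppRole) == 7 and len(GREIMAS_PAIRS) == 3
```

The contrast between a flat list of seven roles and three opposed pairs makes visible how Greimas compressed and grammaticalized Propp's inventory.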
Within the second strand of narratology, or what Herman
terms postclassical narratology, considerably more attention has
been devoted to narrative reception. This change in focus has
occurred in tandem with the rise of cognitive linguistics
in the 1970s and 1980s, which focused more attention on the
mental processes behind grammatical systems; it also undoubtedly was influenced by reader response theories that considered
the impact of a reader's experience and expertise, literary and
otherwise, upon their interpretations of a text (see interpretive community and competence and performance,
literary). Recently, literary characters have been discussed as
mental models or schemas (see story schemas, scripts,
and prototypes): sets of expectations generated by exemplars, personal experience, stereotypes, and literary knowledge, which undergo continual modification as the reader
receives more information about a character (see Gerrig and
Allbritton 1990; Schneider 2001).
Theorists continue to mine the connections between linguistic models and narrative. Work done on literary universals
or narrative universals, which is modeled in part on the
study of language universals, focuses more on the construction
than the reception of texts. Such approaches tend to treat character types as narrative components that carry out action necessary for certain prototypical plot structures (see prototype). At
the same time, writers adopting this approach (e.g., Hogan 2003)
also consider the emotional effects elicited by different forms of
characterization in terms of empathy and identification. In addition, recent work by such writers as Toolan suggests that certain
contemporary grammars might provide better models of character types than the early generative grammar and transformational grammar upon which the theories of classical
narratology relied. For example, Toolan examines characters in
terms of the meaning-oriented grammar detailed by Michael
Halliday, which considers the different types of participants
in actions. These participants include a medium (the affected
participant), agent (a participant acting intentionally), force
(an inanimate agent), instrument (participant controlled by the
agent), beneficiary, and recipient (see also thematic roles).
Karen Renner
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Bal, Mieke. [1985] 1997. Narratology: Introduction to the Theory of
Narrative. Toronto: University of Toronto Press.
Culler, Jonathan. 1975. Structuralist Poetics: Structuralism, Linguistics,
and the Study of Literature. Ithaca, NY: Cornell University Press.
Contains an excellent survey of structuralist approaches to narrative,
including a section on character, pp. 2308.
Gerrig, Richard J., and David W. Allbritton. 1990. The construction of literary character: A view from cognitive psychology. Style 24: 38092.
Greimas, A. J. [1966] 1983. Structural Semantics: An Attempt at a
Method. Trans. Daniele McDowell, Ronald Schleifer, and Alan Velie.
Lincoln: University of Nebraska Press.
Herman, David. 1997. Scripts, sequences, and stories: Elements of a
postclassical narratology. PMLA 112: 104659.
Hogan, Patrick. 2003. The Mind and Its Stories: Narrative Universals and
Human Emotion. Cambridge: Cambridge University Press.
Propp, Vladimir. [1928] 1968. Morphology of the Folktale. Trans. Laurence
Scott. 2d ed. Austin: University of Texas Press.
Schleifer, Ronald, and Alan Velie. 1987. Genre and structure: Toward an
actantial typology of narrative genres and modes. Modern Language
Notes 102: 112250.
Schneider, Ralf. 2001. Toward a cognitive theory of literary character: The dynamics of mental-model construction. Style 35: 60740.
Toolan, Michael. [1988] 2001. Narrative: A Critical Linguistic Introduction.
2d ed. New York: Routledge.

LITERARY UNIVERSALS
In parallel with language universals (see laws of language,
universal grammar, and universals, nongenetic,
among other entries), literary universals are, generally speaking,
patterns and structures exhibited widely by works of literature
across various familiar literary boundaries, whether national,
generic, or historical. Literary universals apply to various domains
besides world literature as a whole: individual regional and
national traditions; discrete literary forms and genres (poetry,
drama, narrative, etc.); separate literary histories and periods;
foundational literary concepts (plot, character [see literary
character and character types], etc.); and common literary devices (e.g., metaphor), among others. Literary universals
may also be present as correlations across these various scalar
domains. Thus, Aristotle's observation that tragedies have a
beginning, a middle, and an end applies to literary works more
broadly.
A specific call for literary universals appeared in Hogan (1994;
as a subtype of aesthetic universals), but the general concept
can be traced to Goethe, if not Aristotle. The Aristotelian binaries of universal/particular and substantial/nonsubstantial,
when intersected, generate a four-category ontology (see Lowe
2006). Literary universals include both nonsubstantial universals (abstract properties and relationships) and substantial universals: literary kinds (genres) and literary morphology, very
generally considered. Examples of the latter include metrical
analysis, biblical form criticism, thematics of creation stories,
Freytag's pyramid, Proppian functions, Bakhtinian speech
genres, and so on. "Substantial universals" implies something
very different from the oxymoron "concrete universals," sometimes invoked by New Critics and others, to suggest the possibility that literature can entirely transcend the universal/particular
dichotomy. Even when substantial (i.e., measurable, able to
be cataloged), literary universals work outside of any particular
instantiation or touchstone.
This entry briefly discusses five areas of literary universals: their
rationale and origins, some basic terminology, their relationship
to dominant strands of literary theory, various successful findings and limitations, and possible lines of future investigation.
The focus remains on universals as they apply directly to literary study, as opposed to universalist models conceived as more
purely linguistic or cognitive, and which merely invoke literary
terms (e.g., story grammar). In contrast, literary universals
seek to explore what unifies the incredible richness, beauty, and
diversity of global literatures, past and present.

three or more, and general or universal literature examined all literatures as a whole (see Wellek and Warren 1977, 46–53).
Such distinctions correlated with the nineteenth-century evolutionary paradigm: "Comparative literature" is coined by analogy
with the sciences of comparative zoology, comparative anatomy, comparative philology, and so on (see philology and
hermeneutics).
The nature of literature as verbal art, and thus cognitively
founded in language, is not the sole spring of its universality
(see verbal art, evolution and, and verbal art, neuropsychology of). The roots of literary universals also lie in
the common stock of anthropological development and social
behavior, or human universals (Brown 1991). Art is one such
human universal and, if the prehistoric impulse of making special is the ultimate origin of all kinds of artistic production and
aesthetic appreciation (Dissanayake 1992), it follows that literature, as one of the arts, will also exhibit certain universal and
nontrivial patterns. To the extent that it stems from the mysterious biology of play, literature may re-present any or all of the four
fundamental types of human and biological play described by
Roger Caillois (1961): mimesis (dress-up and lets pretend/
mimicry and camouflage), alea (games of chance/random variation), agon (sports and contests/survival of the fittest), and vertigo (swings and slides/flight and chasing). It could also be that
literature somehow recapitulates the Darwinian drama of survival (Meeker 1997), and the analysis of basic plots in world literature (e.g., Polti [1921] 1977) suggests that the same social and
sexual competitions of early human life are repeatedly replicated
in literature. On the other hand, given the difficulty that sociobiology and evolutionary psychology have in explaining
why the human is so different from the rest of the natural world,
more promising ways of sourcing literary universals may lie in
the study of creativity as a universal human phenomenon. Only
the human seems to actualize the original sense of creature, or
the still-becoming-creation.

Rationale and Origins


As with language universals, the key criterion for literary universals is not that they occur in all known literatures (though this
is possible for absolute universals; see absolute and statistical universals), but that they are represented, more often
than chance alone would suggest, by literatures that are areally
and genetically distinct (see areal distinctness and literature), that is, free from the kinds of relations and influences that
are to be expected when literatures are linked by literary history
or geography (Hogan 2003, 1719). Thus, if epic can be called a
universal genre, it is not so because it was written both by Homer
and by Milton, who knew (i.e., read) Homer, but because it is a
literary form also recognizable in the Mahbhrata in India, the
Tale of the Heike in Japan, and the Popul Vuh in South America
all traditions that are areally and genetically unrelated.
One of the earlier precursors of literary universals is the idea
of Goethes world literature or Weltliteratur, by which Goethe
seems to have meant a broad cultural unity whose understanding could also lead to global social progress. Weltliteratur was
one of the inspirations for the discipline of comparative literature. Reflecting its scientific ambitions, this field was subdefined
so that comparative literature per se examined the relationships of two national literatures, world literature compared

Terminology
The basic vocabulary of literary universals stems from parallels in the theory of language universals. Absolute universals
are those that apply to all literatures, past and present. As with
absolute universals of language, these may be few in number
and difficult to substantiate completely since available information about both the languages and literatures of the world is far
from complete. Nevertheless, some absolute literary universals
do appear to exist. One simple absolute universal is that literature (including oral literature or orature) occurs in all known
cultures. Whether this is historically monogenetic, like language
in Homo sapiens sapiens may be, or polygenetic, like the invention of writing systems certainly was, remains an open question since oral literature long precedes the written record (see
oral composition). Another possible absolute is that all
literatures (eventually) develop fundamental generic differentiations, such as between poetry and prose (see poetic form,
universals of). Another content-oriented absolute is the universality of myth (stories of creation, flood, etc.) in the earliest
recorded traditions. Often, there are striking parallels between
very specific elements among even the most areally and genetically distinct myths, which may lend credence to the existence of a monogenetic "mother literature." For instance, it is possible to reconstruct a proto-line of Indo-European epic poetry like "he killed the dragon" (Watkins 1995), suggesting that heroic tales are a common origin for global literatures.
Of course, common origin may imply "prevalent" rather than "across-the-board." If universals are not absolute, they are statistical, that is, occurring more often than chance alone would predict. The common distinction between poetry, drama, and fiction is a statistical universal because these forms are widely but not universally distinguished in the literary traditions of the world. Universals that correlate (in ways also not influenced genetically or areally) are typological universals (see typology). One typological universal may be that if a tradition has a category for nonfiction, it will also (n.b. the awkward "non-") have a category for fiction, or that drama presupposes poetry. In other words, such literary categories may function like basic color terms (Berlin and Kay 1991), with traditions that differentiate fewer kinds of literature including the same few kinds.
Logical universals are typological universals that are logically entailed by the nature of the given literary phenomena: Thus, a narrative has only two options for recounting a plot sequence, either in temporal order or out of that order (e.g., with a flashback). This suggests, in turn, a less obvious but important statistical universal: Very few plots are atemporal, far from the half expected by random distribution.
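The notion of "more often than chance alone would predict" can be made concrete: for a two-option trait such as temporal versus non-temporal plot ordering, chance predicts a 50/50 split, so an observed count across areally and genetically unrelated traditions can be compared with a binomial baseline. A sketch with invented counts:

```python
# A statistical universal claims a feature occurs more often than chance
# would predict. For a two-option trait the chance baseline is 1/2, so
# compare an observed count with a binomial null. Counts are invented.
from math import comb

n, k = 40, 36   # 36 of 40 unrelated traditions favor temporal order
# One-sided tail probability P(X >= k) under Binomial(n, 1/2)
p_value = sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n
print(f"{p_value:.2e}")  # far below any conventional threshold
```

A tiny tail probability means the 50/50 "random distribution" baseline is a poor explanation of the observed preponderance, which is all the statistical-universal claim requires.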
Above all, literary universals are empirical: They can be (coldly) documented and are not the products of one cultural point of view imposed upon another. Thus, any of the common usages of "universal(ity)" in literary studies that imply normative, hegemonic, or totalizing judgments do not pertain to literary universals as discussed here but are instances of critical contamination that parallel genetic and areal influence. (For
further discussion of this issue, along with other terminology of
literary universals, see Hogan 2006.)

Universals and Literary Theory


One of the first universalist schemes in Western literary theory is presented in Plato's Ion, which probes the inability of oral poets to account for their activities. Socrates explains poetic "divine possession" by imagining a magnetic chain beginning with the Muse, the oracular lodestone, whose power suspends a descending series of iron rings: original composer, intermediate reciter, and final audience. Though meant in the Platonic scheme to belittle all of art as a derivative and irrational form of knowledge, this doctrine of inspiration has remained a fundamental myth of literary theory through the twentieth century and beyond, as for the Muse has been substituted a virtual series of pervasively powerful and likewise subconscious lodestones: psyche, economy, ideology, identity, empire, and so on.
Modern literary theory, however, has taken an attitude toward universals that is schizophrenic, tacitly assuming that universalizing theoretical modes are possible while vociferously denying that literary works are anything but contingent idiographic particulars. Two passages from René Wellek and Austin Warren's influential Theory of Literature, just a page apart, are symptomatic. The first distances itself from scientific universalism of any kind: "[N]o general law can be assumed to achieve the purpose of literary study: the more general, the more the concrete object of the work of art will elude our grasp" (1977, 18). In short, universal formulas have little purchase in any individual act of literary criticism. Yet, one page later, the fundamental need for a universal theory reasserts itself: "Like every human being, each work of literature has its individual characteristics; but it also shares traits with humanity. [Thus, the] characterization [of its individuality] can be accomplished only in universal terms, on the basis of literary theory" (1977, 19). The universal/particular paradox of literary studies was already in place at the field's birth around the turn of the twentieth century, when the nomothetic/idiographic divisions of the German university were dominant and all disciplines were preconceived in the category of Wissenschaften, "sciences."
Linguistic approaches to literature, inaugurated by Ferdinand
de Saussure alongside the birth of modern structural linguistics, also imply universals. Though an odd literary digression
from his linguistic theory, the mysterious anagrams Saussure
culled from Latin texts might serve to show that literature
involves an equally systematic (and compensatorially non-arbitrary?) selection of signs (even if authorially unintended; see
Starobinski 1979). In any case, the literary situation is an instance
of language, and thus the linguistic universals of the moment of
communication will apply and will also be available for artistic exploitation. Roman Jakobson (1960) famously delineated
how each of six components of the communicative situation
(addresser, message, contact, etc.) could be exploited for different linguistic and literary purposes (emotive, poetic, referential,
etc.), with different purposes predictably dominant in different
literary genres. Among other brilliant readings, Jakobson's perspicacious exploration of the neuropsychology of the metaphoric and metonymic poles of linguistic competence in relation to the divergent styles of various Russian novelists (1956) also suggests that literary universals (pace Wellek and Warren) can illuminate seemingly idiographic cases in literary criticism.
The coming of various "posts" in late-twentieth-century literary theory (poststructuralism, postmodernism, postcolonialism) has added new complexity and urgency to the search for universals in literature (see Carroll 1995). Even while prioritizing irreducible differences, polyvalent identities, and the local particulars of each text and reader, current literary theory nonetheless proceeds by placing literature in the context of grand sociopolitical, economic, or linguistic structures.

Findings and Limitations


The domains with the most advanced treatment of literary universals thus far are poetics (i.e., prosody and meter) and narrative. Literary universals in the study of narrative are treated
thoroughly elsewhere (see narrative universals, narratology, and story schemas). One recent discovery by Patrick
Colm Hogan is that narratives seem to fall into three prototype
stories, a significant improvement over existing, more reductive approaches (e.g., Joseph Campbell's androcentric "monomyth" of the hero's journey).
Well before "literary universals" was coined, Paul Kiparsky presciently argued for a universal metrics: "[A] theory of meter cannot restrict itself to one poetic tradition, any more than a theory of grammar can restrict itself to one language. We must make our theory account for metrical systems of other languages, and begin to construct a universal metrics" (1981, 266). Metrical schemes are by nature universal formulas for generating and judging individual poems: "[W]e can only begin to state invariant facts about the iambic pentameter line if we state them in terms of an abstract representation of the line, rather than by reference to any of the actual performances of the line" (Fabb 2002, 6). Although the stanza has received some attention, the focus of prosody is the verse line (or line group, e.g., the elegiac couplet), affording a vast corpus for analysis.
Other suggestive work on literary universals has been done
on the level of fundamental literary concepts. Evolutionary psychology has conflated Darwinian survival with literary conflict
(the primary problem or adversary to be overcome in a given
story), but this is only one of the bases of literature, being limited
primarily to narrative or dramatic works. In poetry, probably the
most ancient literary form, the dominant literary universals seem
to be imagery (regarding content) and meter (regarding form).
Some early formalists believed that imagistic language was the
root of all literature. However, literature is animated by both
imagistic tropes and also schemes (recalling the classical distinction within the rhetorical canon of style), that is, the deliberate patterning of existing linguistic components (i.e., the multitudes of parallelism: rhyme, chiasm, ascending cola, etc.). Schemes are, in Jakobson's vocabulary, either "poetic" or "phatic" (i.e., at play with the message itself or its instantiation in a linguistic code), rather than referential, iconic, or invoking deliberate visualization.
Tropes have found other universalizing uses since Peter
Ramus, Giambattista Vico, and other Enlightenment rhetoricians elevated four of these traditional figures of thought to the
status of master tropes: metaphor (see metaphor, universals
of), metonymy, synecdoche, and irony. Of these, metaphor
has been most ascendant (see, e.g., Kövecses 2005). The cognitivist theory of conceptual metaphor even suggests that the fundamental metaphoric principle, to understand one thing in terms of another, applies to all concepts, since the mind is universally embodied through the constant real-world experience of comparison (Lakoff and Johnson 1999). Kenneth Burke took a specific form, the drama, as a metaphor of all social action ([1945] 1969). Going a step beyond both Aristotle's triad and Jakobson's six factors of communication, Burke's pentad of "dramatism" (scene, act, agent, agency, purpose; with a sixth term, attitude, supplemented later) was conceived synecdochically so that its terms could pair into any of ten "ratios" (e.g., the pathetic fallacy is an instance of the scene-act ratio).
Even as such previously "literary" things as metaphor and drama show great power for comprehending human thought and society, however, there remains the problem of the way that such findings feed back to illuminate literature per se. If we all use metaphor continually in language, what then sets apart the poetic metaphor of a Shakespeare sonnet? Is all literature ultimately dramatic? As the study of literary universals moves forward, it must be remembered that, ever since Saussurean structuralism and Russian (and other Eastern European) formalisms, the fundamental theoretical agenda has been literariness (literaturnost), rather than the particular literary work (which remains the realm of criticism): the langue of literature, as it were, rather than the parole.

Future Directions
As the aforementioned paths are furthered, two broad new avenues of investigation present themselves. The first has to do with
defining and integrating the fundamental disciplinary unit of
analysis of literature. In each discipline, the unit of analysis drives
the research paradigm, such as event and its time-context for
history, or the atom and its forces in physics. Literary studies are
blessed, or cursed, with an array of merely informative terms,
such as character, theme, genre, and reading, of which none by
itself provides the overarching anecdote for literary studies. In
practice, moreover, the field is divided among such competing
intradepartmental interests as criticism, theory, creative writing,
rhetoric and composition, education, film studies, and journalism. There may nonetheless be a grand theory of literariness
that unites these disparate subfields; if so, it would also likely
reveal the fundamental interconnections among all the sister
arts (literature, visual art, performing art, and new media). Will
there be a new grammatike, the ancient word for literary study,
now a more limited purview of linguistics?
The second little-explored territory for literary universals
(even since Carroll 1995) is the area of diachronic universals
(cf. language change, universals of). For instance, do all
literatures begin with poetry and then proceed to prose? Does
myth typically diverge into history (fact) and fiction (imagination)? Are there other universals of the rise and fall of various
genres over time? The word "tradition," often used in this entry as synonymous with "a literature," implies an entire literary history, and the historical mode is one of the literary discipline's oldest and largest strands. Much of this standing evidence might be
mined for diachronic literary universals.
Christopher M. Kuipers
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Berlin, Brent, and Paul Kay. 1991. Basic Color Terms: Their Universality and Evolution. 2d ed. Berkeley: University of California Press.
Brown, Donald E. 1991. Human Universals. New York: McGraw-Hill.
Burke, Kenneth. [1945] 1969. A Grammar of Motives. Berkeley: University of California Press.
Caillois, Roger. 1961. Man, Play, and Games. Trans. Meyer Barash. New York: Free Press.
Carroll, Joseph. 1995. Evolution and Literary Theory. New York: Cambridge University Press.
Dissanayake, Ellen. 1992. Homo Aestheticus: Where Art Comes from and Why. New York: Free Press.
Fabb, Nigel. 2002. Language and Literary Structure: The Linguistic Analysis of Form in Verse and Narrative. Cambridge: Cambridge University Press.
Hogan, Patrick Colm. 1994. "The possibility of aesthetics." British Journal of Aesthetics 34.4: 337–50.
Hogan, Patrick Colm. 1997. "Literary universals." Poetics Today 18.2: 223–49.
Hogan, Patrick Colm. 2003. The Mind and Its Stories: Narrative Universals and Human Emotion. New York: Cambridge University Press.
Hogan, Patrick Colm. 2006. "What are literary universals?" Literary Universals Project. Available online at: <http://litup.unipa.it/docs/whatr.htm>.
Jakobson, Roman. 1956. "Two aspects of language and two types of aphasic disturbances." In Fundamentals of Language, ed. Roman Jakobson and Morris Halle, 55–82. The Hague: Mouton.
Jakobson, Roman. 1960. "Closing statement: Linguistics and poetics." In Style in Language, ed. Thomas Sebeok, 350–77. Cambridge, MA: MIT Press.
Kiparsky, Paul. 1981. "Stress, syntax and meter." In Essays in Modern Stylistics, ed. Donald C. Freeman, 225–72. New York: Methuen.
Kövecses, Zoltán. 2005. Metaphor in Culture: Universality and Variation. Cambridge: Cambridge University Press.
Lakoff, George, and Mark Johnson. 1999. Philosophy in the Flesh: The Embodied Mind and Its Challenge to Western Thought. New York: Basic Books.
Lowe, E. J. 2006. The Four Category Ontology: A Metaphysical Foundation for Natural Science. Oxford: Clarendon Press.
Meeker, Joseph. 1997. The Comedy of Survival: Literary Ecology and a Play Ethic. 3d ed. Tucson: University of Arizona Press.
Polti, Georges. [1921] 1977. The Thirty-Six Dramatic Situations. Trans. Lucille Ray. Boston: The Writer.
Starobinski, Jean. 1979. Words upon Words: The Anagrams of Ferdinand de Saussure. Trans. Olivia Emmet. New Haven, CT: Yale University Press.
Watkins, Calvert. 1995. How to Kill a Dragon: Aspects of Indo-European Poetics. New York: Oxford University Press.
Wellek, René, and Austin Warren. 1977. Theory of Literature. 3d ed. San Diego, CA: Harcourt Brace Jovanovich.

LITERATURE, EMPIRICAL STUDY OF


The broad field of empirical research over the last three decades
has included a range of topics and approaches. Sociologically
oriented researchers have taken up literary socialization, such as
the reputation of authors, and audience research; book historians have surveyed the experiences of readers, especially those in
the working class over the last two centuries, and the role of reading clubs; writers interested in aesthetic response have studied
individual literary experience and compared literary response to
experiences of other media, such as film, computer gaming, or
hypertext. At the heart of the empirical endeavor is the formation
of theories and narratives about the role and status of literature
based on actual data, either verbal or numeric: This may consist
of readers' memoirs or statistics for library borrowings, the study
of questionnaires elicited from readers, or evidence of reading
processes gathered during carefully controlled laboratory experiments. In this entry, the primary focus is on the formal aspects of
literary texts as reflected in studies of literary reception.
Reception studies cover a wide spectrum of topics, including style and narrative; readers' feelings and the relation of literary understanding to the self; individual differences in readers' preferences or the influence on reading of personality traits; cross-cultural differences in reading; and the relation of literary
cross-cultural differences in reading; and the relation of literary
experiences to other media. Some empirical studies attempt to
clarify or improve on the models of reading developed by discourse psychologists; others may represent an attempt to test a
particular claim about reading proposed by literary theorists or
narratologists. Either explicitly or implicitly, a number of
studies have raised the question of literariness: whether literary texts involve response processes that are measurably distinctive in some way.

Experimental Examples
In this section I discuss three themes that have been pursued
empirically and provide examples of the ways that readers
responses have been studied.
The term foregrounding refers to stylistically distinctive aspects of literary texts. These may be apparent at the level of sound (metrical effects or alliteration), syntax (such as ellipsis or inversion), or semantics (metaphor, hyperbole,
etc.). The Russian Formalist critic Victor Shklovsky, commenting
on the purpose of literary devices, argued that literary art "exists to make one feel things; its purpose is to increase the difficulty and length of perception" ([1917] 1965, 12). The immediate effect of foregrounding is to "make strange" (ostranenie), to defamiliarize. These ideas allow the empirical researcher to frame specific
hypotheses addressing the impact that foregrounded passages
have on readers.
First, as studied by Willie van Peer (1986), readers should find
that passages high in foregrounding are striking when compared
with passages low in foregrounding. To test this hypothesis, van
Peer chose four short poems and carried out a comprehensive
analysis of the foregrounding in each line at the three different
levels (sound, syntax, semantics). This enumeration enabled
the lines of the poems to be rank-ordered for density of foregrounding. Readers were then asked to respond to the poems by
underlining all words and phrases they found striking. In all of
the poems, the frequency of readers' underlinings was found to
correlate highly with the density of foregrounding.
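The rank-order correlation at the heart of van Peer's design can be sketched in a few lines: rank poem lines by foregrounding density and by underlining frequency, then compute a Spearman coefficient. The per-line scores below are invented for illustration, not van Peer's data:

```python
# Sketch of a rank-order analysis in the style of van Peer (1986):
# correlate per-line foregrounding density with the frequency of
# readers' underlinings. All scores are invented for illustration.

def ranks(xs):
    """Rank values 1..n, averaging tied positions."""
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    r = [0.0] * len(xs)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and xs[order[j + 1]] == xs[order[i]]:
            j += 1                      # extend over a tie group
        avg = (i + j) / 2 + 1           # mean of the tied rank positions
        for k in range(i, j + 1):
            r[order[k]] = avg
        i = j + 1
    return r

def spearman(x, y):
    """Pearson correlation of the two rank vectors."""
    rx, ry = ranks(x), ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)

foregrounding = [3, 1, 4, 0, 2, 5]    # per-line density (invented)
underlinings  = [9, 2, 11, 1, 5, 14]  # readers marking the line (invented)
print(round(spearman(foregrounding, underlinings), 2))
```

A coefficient near +1 corresponds to the reported finding that the most densely foregrounded lines were also the most frequently underlined.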
A similar study by David S. Miall and Don Kuiken (1994) took
up the additional suggestions of Shklovsky that art makes us feel
and that perception is lengthened. Working with three modernist short stories, they analyzed the presence of foregrounding in
each sentence. The stories were then presented a sentence at a
time on computer; reading times per sentence were recorded
while readers undertook a first reading at their normal reading
speed; they then read the story a second time while providing
a rating of each sentence. For all of the stories, after adjusting
for sentence length, the speed of reading correlated significantly
with foregrounding (highly foregrounded sentences took about
twice as long to read as sentences without foregrounding); and
readers' ratings for strikingness and intensity of feeling also correlated with foregrounding. The readers in both this study and
that by van Peer were university students from a wide range of
backgrounds, yet correlations with foregrounding were significant regardless of their expertise in literature; thus, response to
foregrounding appears to be independent of literary education.
These findings suggest a theory of text processing: The encounter with foregrounding is found striking by the reader, who then
slows down in order to gain a better apprehension of the unusual
textual features; the experience is defamiliarizing and arouses
feeling, and feeling may be the vehicle by which the reader elicits
an alternative framework for reconceptualizing the meaning of
the text at that moment. The main findings have been confirmed
in several later studies on the effects of foregrounding.
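The "adjusting for sentence length" step can be sketched as a simple residual analysis: fit reading time against sentence length by least squares, then correlate the length-adjusted (residual) times with foregrounding scores. All numbers are invented for illustration; this is not Miall and Kuiken's actual analysis:

```python
# Sketch of length adjustment in reading-time studies: remove the
# linear effect of sentence length on reading time, then correlate the
# residuals with foregrounding scores. All numbers are invented.

lengths = [8, 12, 20, 10, 16, 24]          # words per sentence
times   = [2.1, 3.4, 5.0, 3.9, 6.1, 7.8]   # seconds on first reading
foreg   = [0, 1, 0, 2, 3, 2]               # foregrounding devices counted

def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

# Least-squares fit of time ~ length, then take residuals
mx, my = sum(lengths) / len(lengths), sum(times) / len(times)
slope = (sum((l - mx) * (t - my) for l, t in zip(lengths, times))
         / sum((l - mx) ** 2 for l in lengths))
intercept = my - slope * mx
residuals = [t - (slope * l + intercept) for l, t in zip(lengths, times)]

print(round(pearson(residuals, foreg), 2))
```

A strong positive correlation of the residuals with foregrounding would mean that highly foregrounded sentences take longer to read than their length alone predicts, which is the pattern the study reports.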
Literary reading is often said to be engaging in its power to
evoke ideas and feelings about the self. A method for investigating this idea, termed self-probed retrospection, was first demonstrated in a study by Uffe Seilman and Steen F. Larsen (1989).
They proposed that a literary text was more likely to evoke personal resonance than a nonliterary text. Readers were given either
a short story or an expository text (on population growth), both
of about 3,000 words. While reading, readers put a pencil mark
in the margin whenever they were reminded of something from
their own lives; otherwise, reading occurred at a normal pace.
The two texts gave rise to the same number of remindings: about 13 for each text. After reading, readers reviewed their marks and
completed a short questionnaire on each reminding: its type,
vividness, emotional quality, and the like. The types of reminding were found to distinguish the two texts: For the literary text,
twice as many remindings involved a memory of the self as an
actor. The expository text, in contrast, invoked more memories of
things heard or read about. It was also noticed that remindings
in general were more frequent in the opening sections of both
texts (the downward trend was more marked in the literary text).
These findings suggest that readers of a literary text situate themselves by recruiting specific, self-related information, particularly
at the beginning of a text where an appropriate schema must be
developed, and that this information refers predominantly to the
active engagement of the reader in the world.
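The published comparison rests on the researchers' own statistics; purely as a hedged illustration, a two-by-two table of reminding types can be compared with a hand-computed chi-square statistic. The counts below are invented, chosen only to echo the reported pattern of roughly 13 remindings per text:

```python
# Illustrative chi-square on a 2x2 table of reminding types, in the
# spirit of Seilman and Larsen (1989). Counts are invented.

counts = {
    "literary":   {"self_as_actor": 9, "heard_or_read": 4},
    "expository": {"self_as_actor": 4, "heard_or_read": 9},
}

rows = list(counts)
cols = ["self_as_actor", "heard_or_read"]
total = sum(counts[r][c] for r in rows for c in cols)
row_sums = {r: sum(counts[r][c] for c in cols) for r in rows}
col_sums = {c: sum(counts[r][c] for r in rows) for c in cols}

# Chi-square: sum of (observed - expected)^2 / expected over all cells
chi2 = sum(
    (counts[r][c] - row_sums[r] * col_sums[c] / total) ** 2
    / (row_sums[r] * col_sums[c] / total)
    for r in rows for c in cols
)
print(round(chi2, 2))  # prints 3.85
```

With one degree of freedom, a statistic of this size sits near the conventional 0.05 cutoff (about 3.84), so whether the invented table would count as significant depends on the exact counts; the real studies report the self-as-actor asymmetry as reliable.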
Several subsequent studies have built on the remindings
method. Larsen and János László (1990) studied the cultural proximity of readers, working with Hungarian and Danish readers of a Hungarian story. Of the remindings produced, they found that markedly more event memories were produced by the Hungarians and, of these, significantly more were of experienced rather than reported events. László and Larsen (1991)
then extended the method to look at the implications of point
of view in fiction. Several passages of the Hungarian story
were rewritten to change point of view. As before, personal-event remindings were significantly more frequent among the
Hungarian readers; in addition, shifting to the inside point of
view of a character increased the percentage of such remindings from 55 percent to 75 percent for the Hungarians (but had
no effect on Danish readers). There was also some evidence that
inside point of view influenced readers toward more emotional
remindings, regardless of their cultural background.
Another variant of the remindings method was developed by
Keith Oatley (1999) to examine gender and personality differences
in readers. Instead of a simple mark, readers were instructed to
write a letter in the margin: an E for an emotion; M for an autobiographical memory; and T for a train of thought. Among readers of
short stories by Alice Munro and Carson McCullers, female readers produced overall significantly more emotions than the male
readers; in addition, male readers produced fewer emotions when
the protagonist was female, whereas gender of protagonist had
no influence on female readers. In a second study, the method
was used to examine the aesthetic distance of readers from a
short story and how their responses mirrored their adult attachment styles. Kuiken, Miall, and Shelley Sikora (2004) developed
the method of self-probed retrospection to elicit verbal commentaries by readers. The reader is invited to read a text and make
marginal marks whenever a passage seems striking or evocative;
the reader later returns to the marked passages and chooses (say)
five on which to provide a commentary. Readers are readily able
to recover the thoughts and feelings that occurred during reading,
giving access to at least some of the mental processes that appear
to make literary reading distinctive.
A third set of studies is focused on the role of the narrator in
fiction. Marisa Bortolussi and Peter Dixon (2003) elaborate a
theoretical framework that accounts for the reader's relation to
the narrator; in particular, they propose that readers treat the
narrator as a conversational participant and make inferences
about the narrator's personality and values that influence their reading. The authors have examined the relationship with the narrator both theoretically and empirically in a series of studies
involving aspects such as dialogue, plot, point of view, and characterization. For instance, in studying the effects of free indirect
discourse (where the narrator's voice represents the speech or
thought of a character without attribution), they found that this
style led readers to endow the narrator with the personality of the
character. Taking a story about a husband and wife, "Rope," by Katherine Anne Porter, that is related almost entirely in free indirect discourse focused on the male character, they constructed
several other versions of the story in which the character roles
were reversed and the dialogue was rewritten as direct quoted
speech. After reading the story, readers were asked several questions about their impressions of the narrator and the characters.
It was found, for example, that the rationality of a character was
rated higher when it was associated with the narrator through
free indirect discourse. Judgments about the likely gender of the
narrator were also aligned with the gender of the character represented through free indirect discourse.
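A version-manipulation design of this kind reduces, at its simplest, to comparing mean ratings across story versions. The sketch below uses invented ratings and is not Bortolussi and Dixon's actual analysis:

```python
# Illustrative comparison of mean character-rationality ratings across
# two rewritten versions of a story. Ratings (1-7 scale) are invented.

ratings = {
    # character's thought rendered in free indirect discourse
    "fid_on_character": [6, 5, 6, 7, 5],
    # same material rewritten as direct quoted speech
    "direct_quotation": [4, 5, 3, 4, 4],
}

means = {version: sum(r) / len(r) for version, r in ratings.items()}
effect = means["fid_on_character"] - means["direct_quotation"]
print(round(effect, 2))  # prints 1.8
```

A positive difference here would correspond to the reported finding that free indirect discourse raises ratings of the focalized character's rationality; a real analysis would of course add an inferential test over many readers.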

Prospects
Although studies of the kind just described are not yet well
known among literary scholars, they point to a basis for rethinking the nature of literary studies and education. In contrast to
claims voiced by critics from I. A. Richards to E. D. Hirsch that
the responses of ordinary readers are ill-informed or whimsical, empirical studies demonstrate the presence of significant regularities in readers' responses, enabling substantive conclusions to be drawn about the effects of literary language and form.
Empirical studies also shift the focus away from the interpretative
issues that have largely preoccupied literary scholars onto the
experiential aspects of literary reading. In this respect, they invite
a reconsideration of the role of literature in human culture.
David S. Miall
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Bortolussi, Marisa, and Peter Dixon. 2003. Psychonarratology: Foundations
for the Empirical Study of Literary Response. Cambridge: Cambridge
University Press. An integrated approach to narrative theory and
the empirical study of reading, including exemplary studies by the
authors.
Kuiken, Don, David S. Miall, and Shelley Sikora. 2004. Forms of self-implication in literary reading. Poetics Today 25: 171203.
Larsen, Steen F., and Jnos Lszl. 1990. Cultural-historical knowledge and personal experience in appreciation of literature. European
Journal of Social Psychology 20: 42540.
Lszl, Jnos, and Steen F. Larsen. 1991. Cultural and text variables in
processing personal experiences while reading literature. Empirical
Studies of the Arts 9: 2334.
Martindale, Colin. 1990. The Clockwork Muse: The Predictability of Artistic
Change. New York: Basic Books.
Miall, David S. 2006. Literary Reading: Empirical and Theoretical Studies.
New York: Peter Lang. Chapter 3 in this book provides an introduction
to the methods of empirical study, while Chapter 7 surveys the principal research topics in empirical reception studies.
Miall, David S., and Don Kuiken. 1994. Foregrounding, defamiliarization, and affect: Response to literary stories. Poetics 22: 389407.
Oatley, Keith. 1999. Meetings of minds: Dialogue, sympathy, and identification, in reading fiction. Poetics 26: 43954.
Seilman, Uffe, and Steen F. Larsen. 1989. Personal resonance to literature. Poetics 18: 165–77.
Shklovsky, Victor. [1917] 1965. Art as technique. In Russian Formalist Criticism: Four Essays, ed. and trans. L. T. Lemon and M. J. Reis, 3–24. Lincoln: University of Nebraska Press.
Steen, Gerard, and Dick Schram, eds. 2001. The Psychology and Sociology
of Literature: In Honour of Elrud Ibsch. Amsterdam: John Benjamins.
A wide-ranging collection, mainly illustrating recent empirical studies
of literature.
van Peer, Willie. 1986. Stylistics and Psychology: Investigations of
Foregrounding. London: Croom Helm.
van Peer, Willie, ed. 2007. Foregrounding. Language and Literature 16.2
(Special Issue). A recent collection of contributions to foregrounding,
including both theoretical and empirical studies.
van Peer, Willie, Frank Hakemulder, and Sonia Zyngier. 2007. Muses
and Measures: Empirical Research Methods for the Humanities.
Cambridge: Cambridge Scholars Publishing.
Zwaan, Rolf. 1993. Aspects of Literary Comprehension: A Cognitive
Approach. Amsterdam and Philadelphia: John Benjamins.
LOGIC AND LANGUAGE
Every language, suitably understood, has a logic, suitably understood. The suitable understanding is a common semantic conception of logic and language. On this conception, the logic of
a language is the so-called consequence relation, which, on the
semantic conception, essentially involves truth preservation. The
chief goal of this entry is to briefly convey the basic and very
common sense in which every language has a logic. (N.b.: for
space and simplicity reasons, this essay privileges the so-called
semantic, or model-theoretic, approach to logic. Moreover, this
essay, again for space reasons, only aims to convey basic ideas; it doesn't aim to be a history or even a survey of the semantic conception of logic and language.)
Languages and Truth Conditions
In specifying the logic of a language (or some fragment thereof),
one seeks precision. Much as physics idealizes away from the
messiness of physical reality, formal semanticists and logicians
(at least those concerned with natural languages) idealize away
from the messiness of linguistic reality. One such idealization is the assumption that all (declarative) sentences of a language have so-called truth conditions. (Another immediate idealization is that we can easily, and precisely, specify the target declarative sentences, the sentences that, in some sense, are used to make assertions. In what follows, "sentence" will be short for "declarative sentence.")
For present purposes, such truth conditions are best thought of as
truth-in-a-case conditions, that is, conditions that provide, for any
relevant case, what it takes for sentences to be true-in-that-case.
If one thinks of cases as possible worlds, then truth conditions
provide the conditions under which sentences are true-in-w, for
any possible world w. Similarly, if one thinks of cases as situations, then truth conditions provide the conditions under which
sentences are true-in-s, for any situation s. Moreover, if one thinks
of cases as Tarskian models, then truth conditions provide the conditions under which sentences are true-in-M, for any model M.
In addition to the assumption of truth conditions, another
idealization is that sentences may be cleanly, precisely carved
into the atomics and molecular sentences, where, in the present
context, the latter sentences contain at least one logical connective. In standard approaches, the truth conditions for molecular
sentences are given recursively, piggy-backing, as it were, on
the truth conditions for atomics. (An example follows.)
For present purposes, a language is a precise syntax (involving, among other things, a precisely defined set of sentences,
some of which are atomics, some molecular) coupled with truth
conditions, which, as noted, provide truth-in-a-case conditions
for all sentences. So, in addition to specifying a syntax, one's specification of a language involves specifying a class of cases in terms of which all sentences, provided by one's specified syntax, enjoy truth-in-a-case conditions. (An example follows.)
Logical Consequence Qua Truth Preservation
The consequence relation of a language (or fragment thereof)
is the chief concern of the field of logic, broadly understood. In
effect, a consequence relation yields what follows from what: which sentences of the language logically follow from which sentences. Given a language, as understood here, we define logical
consequence or validity (i.e., semantic validity) as follows, where
L is a language, and A and B sentences of L.
Val. B is a consequence of A in L if and only if there is no case in which A is true but B untrue.

If B is a consequence of A in L, we say that the argument from A to B is (semantically) valid in L, that B logically follows from A in L, and that A (semantically) implies B, with all such terminology being equivalent (for present purposes).
(Val), in turn, may be generalized. We say that a set X of
L-sentences is verified-in-a-case just if every member of X is
true-in-that-case. In turn, we say that the argument from X to A
is valid just if there's no case in which X is verified but A untrue.
Similarly, we say that a sentence A is logically true in L just if there
is no case in which A is untrue.
Sample Language and Logic
Consider an example of the foregoing ideas, in particular, a so-called classical propositional language. (Such languages are terribly simple; they have no quantifiers. To simplify even more,
our propositional language will contain no names or predicates!)
One motivation for the language is that we seem to have so-called
truth-functional connectives in English (and natural languages,
generally), and one might be interested in clearly specifying the
logic of that (truth-functional) fragment of our language. For
example, there seems to be a truth-functional usage of "and," one in which "and" expresses conjunction, where a conjunction is true
in a given case if and only if both conjuncts are true in the given
case. Similarly, a truth-functional usage of negation in English is
evident, one in which, for example, negation does no more nor
less than toggle truth values.
As noted, we first need to precisely specify a language. We
need to specify a vocabulary (in effect, the building blocks
of the language) and, in general, a full syntax, which contains a
(precisely defined) set of sentences; we then give our truth conditions. We proceed to define our language L as follows.
(1) Vocabulary: p, q, and r, with or without subscripts
(numerals for natural numbers), are our atomics. In addition
to the atomics, we have a set of punctuation marks, namely,
( and ). Furthermore, we have a set of connectives: & is a
binary connective (takes two sentences to make a sentence),
and ~ is a unary connective (takes one sentence to make a
sentence). Our three given sets of symbols are disjoint.
(2) The set S of L-sentences is defined recursively as
follows.
a) All atomics are L-sentences.
b) If A and B are L-sentences, then so too are ~A and (A&B).
c) Nothing is an L-sentence unless its being so follows from
(2a) and (2b).
(3) L-cases are (total) functions c from S into V = {1,0}, where V is our set of semantic values.
(4) Truth conditions: an L-sentence A is true in a case c iff
c(A) = 1.
a) An atomic sentence A is true in a case c iff c(A) = 1. (A is
false in c otherwise.)
b) A sentence of the form ~A is true in a case c iff c(A) = 0.
c) A sentence of the form (A&B) is true in case c iff c(A) = 1
= c(B).

With our language L so given, we can now see the sense in which every language, at least given suitable idealizations, has a logic.
Applying (Val), we immediately see that, for any L-argument, it is
either truth preserving, in which case valid in L, or not (in which
case, invalid in L).
EXAMPLE. Consider the L-argument from premise (p&q) to conclusion q. According to (Val), this argument is valid in L just if there's no case in which (p&q) is true but q untrue. In L,
our cases are functions from the L-sentences into {1,0}. Is there a
case in which (p&q) is true but q not true? No. To see this, just
consider the truth conditions on L-sentences. According to those
conditions, a sentence of the form (A&B) is true in a case just if
both A and B are true in the given case. So, for any case c, if c(p&q)
= 1, then c(p) = 1 and c(q) = 1, in which case c(q) = 1. Hence, there
is no case c in which (p&q) is true but q isn't true.
What one also notices, or would notice on reflection about L, is familiar truth-functional behavior for negation. For example, given the truth conditions in L, and given (Val), it is easy to
see that p is a consequence of ~~p, and vice versa. In other
words, the double negation of a sentence is logically equivalent
to the given sentence.
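The definition of L and the (Val) test can be mechanized directly. The following sketch is illustrative only (the tuple encoding and the helper names `true_in` and `valid` are mine, not the entry's); since a sentence's truth in a case depends only on the atomics occurring in it, validity can be checked by enumerating cases restricted to those atomics, with molecular values computed recursively per clauses (4b) and (4c):

```python
from itertools import product

# L-sentences as nested tuples: atomics are strings, ("~", A), ("&", A, B).
def atomics_in(s):
    if isinstance(s, str):
        return {s}
    if s[0] == "~":
        return atomics_in(s[1])
    return atomics_in(s[1]) | atomics_in(s[2])

def true_in(s, case):
    # Truth conditions (4): case maps atomics to 1 or 0; molecular
    # values follow clauses (4b) and (4c).
    if isinstance(s, str):
        return case[s] == 1
    if s[0] == "~":
        return not true_in(s[1], case)
    return true_in(s[1], case) and true_in(s[2], case)

def valid(premise, conclusion):
    # (Val): no case in which the premise is true but the conclusion untrue.
    atoms = sorted(atomics_in(premise) | atomics_in(conclusion))
    for values in product([1, 0], repeat=len(atoms)):
        case = dict(zip(atoms, values))
        if true_in(premise, case) and not true_in(conclusion, case):
            return False
    return True

assert valid(("&", "p", "q"), "q")        # (p & q) implies q
assert valid(("~", ("~", "p")), "p")      # ~~p implies p
assert valid("p", ("~", ("~", "p")))      # p implies ~~p
assert not valid("q", ("&", "p", "q"))    # q does not imply (p & q)
```

The same enumeration yields logical truth as the special case of a sentence true in every case.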
One may proceed (as an exercise) to record the other logical
forms that are valid in our artificial language L. Once recorded,
one has a precise account of the logical behavior of & and ~
in L. In turn, one may evaluate whether such logical behavior
accurately captures the behavior of corresponding connectives
in one's natural language. In this respect, the artificial L serves
the role of idealized models in natural sciences: It gives a clear
account of how the target phenomena (in this case, logical connectives) behave. Rivalry among logical theories (more on which
follows) turns on the extent to which L is an accurate model.
Of course, L is but one example of a language (i.e., an idealized, artificial language), and a very simple one at that. Still, it is not hard to see that, provided languages come equipped with precisely defined cases and each sentence enjoys truth-in-a-case conditions, (Val) quickly yields a logic for the language: a consequence relation, which carries the information about what follows from what in the given language.
Artificial Versus Natural Languages
One might agree that every artificial language, as understood
here, has a logic, that is, a consequence relation, specified via
(Val). What, though, of natural languages?
The question is a good one, but very complex. Natural languages appear to have arguments that are truth preserving in the strict sense of (Val): arguments such that there's no case in which the premises are true but the conclusion untrue. (Consider the limit example: the argument from A to A.) The trouble, of course, concerns the relevant cases involved in natural languages' truth conditions. Assuming (a not insignificant assumption) that all sentences of a natural language have truth conditions in the relevant sense, it remains unclear what counts as a relevant case in such truth conditions.
For present purposes, the right account of cases for natural
languages is not pressing. The pressing issue is whether, in the end, natural languages are sufficiently equipped with truth-in-a-case conditions, whatever the cases may be. If so, the chief point remains: Any such language, in virtue of (Val), enjoys a logic.
Logical Theories and Rivalry
A logical theory of a language (or fragment thereof) is a theory of
the consequence relation on that language (or fragment). One
way in which logical theories might disagree is on the choice
of logical connectives, but this is not necessary for disagreement. Two logical theories might agree on the class of (relevant) connectives in a language (fragment) while nonetheless disagreeing about the logical behavior of such connectives: a disagreement that, in general, will show up in rival truth conditions for the given connectives. (Such disagreement over
truth conditions often centers on what counts as a case in the
relevant truth conditions.) Suffice to say that rivalry currently
reigns in the field of logic, at least concerning the right logical
theory of natural language (or fragments thereof). Problems
of vagueness and consistency, truth, and paradox
are particularly fertile phenomena for contemporary rivalry
among logical theories.
J. C. Beall
SUGGESTIONS FOR FURTHER READING

Any textbook on classical logic (of which there are many) will be
suitable further reading. From there, one might turn to textbooks
on nonclassical or so-called intensional logics (of which there
are many). As a first step, one might profitably peruse entries
under logic in the Stanford Encyclopedia of Philosophy, edited
by Edward N. Zalta, available online at: http://plato.stanford.edu/.
LOGICAL FORM
The construction of systems in which valid inference can be characterized has been the central concern of logic since its inception with the ancients. Beginning with Gottlob Frege's seminal insights of the late nineteenth century (Frege 1879, 1893), it has
been understood that accomplishing this goal in a manner sufficiently rigorous that the inferred proposition can be taken to
be proven requires strict attention to the structural properties
of propositions, to their logical form. What was crystallized by Frege was that this form can be revealed only in a language that
differs in two key respects from the superficial form of natural
languages: i) The structure of propositions is function-argument,
not subject-predicate, and ii) expressions of generality, unlike
proper names, do not occur as arguments, but rather bind variables that do. Together, these two differences afforded the first
adequate account of multiple generalization; by distinguishing ∀x∃y(P(x,y)) from ∃y∀x(P(x,y)) in terms of the scopes of the universal and existential quantifiers, Frege was able to allay one of the central problems that had plagued traditional logic.
(Cf. Kneale and Kneale 1962, 483ff.)
Frege's insight, that grammatical form does not reliably reveal logical form, was taken up by Bertrand Russell, most notably in
the theory of descriptions (Russell 1905). Russell proposed that a definite description, as in "The present King of France is bald," is not to be understood in the manner of a proper name, that is, standing as an argument, but rather as a complex term of generality. Thus, the proper logical form is not B(k), but rather ∃x(K(x) & ∀y(K(y) → x = y) & B(x)); that is, there is one and only one present King of France and he is bald. By taking this to be
the proper logical form, Russell argued that a number of logical
issues could be directly addressed. For example, the ambiguity of "The present King of France is not bald" could be accounted for
by taking the negation as having scope either inside or outside
the existential quantifier; negation having broader scope brings
the case into conformance with the law of the excluded middle,
as Russell observed.
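Russell's truth conditions can be checked against a small model. The sketch below is purely illustrative (the domain, the extensions, and the helper name `sentence` are invented for the example): with an empty extension for K ("present King of France"), the quantified sentence comes out false, its wide-scope negation true, and the narrow-scope reading (negation inside the quantifier) also false, just as the scope account predicts:

```python
# A tiny model: a domain with nothing satisfying K ("present King of
# France") and some extension for B ("bald"). Names are invented.
domain = {"alice", "bob"}
K = set()        # no present King of France
B = {"alice"}

def sentence(pred):
    # Russell's form: there is an x that is K, is the only K, and is pred.
    # ∃x(K(x) & ∀y(K(y) → x = y) & pred(x))
    return any(
        x in K and all((y not in K) or y == x for y in domain) and pred(x)
        for x in domain
    )

narrow = sentence(lambda x: x not in B)   # "The King is non-bald": false
wide = not sentence(lambda x: x in B)     # negation outside: true

assert narrow is False
assert wide is True
```

That the two readings diverge in truth value is exactly the ambiguity the scope analysis resolves.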
Both Frege and Russell realized not only that the insights
about logical form being surveyed clarified the formal nature of
inference, but that these aspects of form also allowed for semantic elucidation, being directly related to an account of the conditions for the truth of propositions. For instance, for Russell,
a substantial virtue of the account of descriptions was that the intuition that "The present King of France is bald" is false can be
directly accommodated. But that we can proceed beyond intuitive elucidation to a formally and materially adequate definition
of truth, based on the sort of conception of logical form pioneered
by Frege and Russell, is due to Alfred Tarski ([1936] 1956, 1944).
In the case at hand of quantification, Tarski's semantic clauses run as follows: With respect to a universe of objects U, ∀x(P(x)) is true just in case every sequence of objects of U satisfies P(x); ∃x(P(x)) is true iff this is so for some sequence. (A sequence S satisfies an open formula P(x) iff there is an assignment of a value a of S to the variable x such that a is P.) Because Tarski's method iterates, it extends to multiple generalization, distinguishing the truth conditions of ∀x∃y(P(x,y)) from those of ∃y∀x(P(x,y)), thus providing semantic foundation for the syntactical insights of Frege.
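The scope distinction these clauses capture can be illustrated over a finite universe, where quantification reduces to all/any (the universe U and the relation P below are invented for the example):

```python
# ∀x∃y P(x,y) versus ∃y∀x P(x,y) over a finite universe U.
U = [0, 1, 2]

def P(x, y):
    return y == (x + 1) % 3   # every x is related to its "successor"

forall_exists = all(any(P(x, y) for y in U) for x in U)  # for each x, some y
exists_forall = any(all(P(x, y) for x in U) for y in U)  # one y for all x

assert forall_exists       # true: each x has its own witness y
assert not exists_forall   # false: no single y works for every x
```

The two orderings thus receive different truth conditions, which is what an adequate account of multiple generalization requires.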
Central to the importance of Tarski's formalization of semantics is that it paved the way to metalogic, the study of the properties of logical systems, centrally their consistency, soundness, and completeness. Tarski was clear to maintain that the definition of truth could be reliably applied only to those systems
whose propositions have the requisite logical form; like his
predecessors, Tarski was skeptical that natural languages are
such systems. Indeed, his recommendation was that we eschew
anything other than formal languages when engaging in scientific discourse. A more moderate view emerged, however, which
tried to show that natural languages, at least to a certain extent,
could be rendered in the logical idiom. Most closely associated with W. V. O. Quine (1950), the idea was that logic is to
be understood as schematic. On this view, there are no logical propositions, per se, but only propositional schemata; valid inference is characterized with respect to such schemata, and holds
for any instantiation of the schema. Natural languages can
then be regimented as instances of the schemata; a sentence
then has a certain logical form because it conforms to a propositional schema of a certain form. (A simple example: "John came and Mary left" is an instance of the schema φ & ψ, and so has the logical form "John came & Mary left," where "&" is the logical symbol for conjunction.) Quine's view nevertheless
is no departure from the tradition that distinguishes grammatical from logical form; it does depart, however, in holding that
systematic associations can be established between grammatical forms and logical forms, for significant aspects of natural
languages (1960).
Acceptance of the traditional separation of grammatical and logical form is not universal; rejection of it has been central within
linguistic theory since the mid-1970s. On this latter view, the
derivation of a sentence's logical form is an aspect of its syntactic derivation, hence, an aspect of its grammatical form. Again,
the central case is quantification; in pivotal work, Robert May
(1977) showed that the scope of quantifiers, including multiple quantifiers, can be represented, as in the manner noted,
by assuming that there is a transformational rule that moves
quantifier phrases, leaving a trace, interpreted as a variable
bound by the moved phrase. By hypothesis, the formulation
of May's rule QR requires theoretical resources no greater
than those independently needed within linguistic theory to
otherwise express transformational mappings, for instance,
wh-movement. The class of syntactic representations generated by transformational mappings, including those effected
by QR, is known as LF. Thus, sentences with multiple quantifiers, such as "Everyone loves someone," will have two distinct
LF-representations, roughly:
[everyonei [someonej [ti loves tj]]]

and

[someonej [everyonei [ti loves tj]]]
which can be defined as representing the differing scope orderings of the quantifiers, the traces of QR being interpreted
as variables bound by the quantifiers, so that in this regard,
LF-representations constitute logical forms. That grammars
of natural languages have the rule QR is now a widely (if not
universally) accepted assumption within linguistics (Fiengo
and May 1994; Fox 2000; Hornstein 1984, 1995; Larson and
Segal 1995; May 1985; Reinhart 1997); among the most wellknown independent arguments are those from weak crossover
(Chomsky 1976) and anaphoric binding more generally, and
antecedent contained deletion (May 1985). It has also become
a commonly accepted assumption within recent thinking in
philosophy of language (King 2001; Neale 1990; Stanley 2000; Ludlow 2002).
Robert May
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Chomsky, Noam. 1976. Conditions on rules of grammar. Linguistic Analysis 2: 303–51.
Fiengo, Robert, and Robert May. 1994. Indices and Identity. Cambridge,
MA: MIT Press.
Fox, Danny. 2000. Economy and Semantic Interpretation. Cambridge,
MA: MIT Press.
Frege, Gottlob. [1879] 1967. Begriffsschrift: A Formula Language Modeled Upon That of Arithmetic, for Pure Thought. Trans. Stefan Bauer-Mengelberg. In From Frege to Gödel, ed. Jean van Heijenoort, 5–82. Cambridge: Harvard University Press.
. [1893] 1964. The Basic Laws of Arithmetic. Trans. Montgomery
Furth. Berkeley and Los Angeles: University of California Press.
Hornstein, Norbert. 1984. Logic as Grammar. Cambridge, MA: MIT
Press.
. 1995. Logical Form: From GB to Minimalism. Oxford: Blackwell.
King, Jeffrey C. 2001. Complex Demonstratives: A Quantificational
Account. Cambridge, MA: MIT Press.
Kneale, William, and Martha Kneale. 1962. The Development of Logic.
Oxford: Oxford University Press.
Larson, Richard, and Gabriel Segal. 1995. Knowledge of Meaning.
Cambridge, MA: MIT Press.
Ludlow, Peter. 2002. LF and natural logic. In Logical Form and Language, ed. Gerhard Preyer and Georg Peter. Oxford: Oxford University Press.
May, Robert. 1977. The Grammar of Quantification. Ph.D. diss.,
Massachusetts Institute of Technology.
. 1985. Logical Form: Its Structure and Derivation. Cambridge,
MA: MIT Press.
. 1999. Logical form in linguistics. In The MIT Encyclopedia of the Cognitive Sciences, ed. Robert A. Wilson and Frank C. Keil, 486–7. Cambridge, MA: MIT Press.
Neale, Stephen. 1990. Descriptions. Cambridge, MA: MIT Press.
Quine, W. V. O. 1950. Methods of Logic. New York: Henry Holt.
. 1960. Word and Object. Cambridge, MA: Technology Press.
Reinhart, Tanya. 1997. Quantifier scope: How labor is divided between QR and choice functions. Linguistics and Philosophy 20: 399–467.
Russell, Bertrand. 1905. On denoting. Mind 14: 479–93.
Stanley, Jason. 2000. Context and logical form. Linguistics and Philosophy 23: 391–434.
Tarski, Alfred. [1936] 1956. The concept of truth in formalized languages. In Logic, Semantics, Metamathematics, trans. J. H. Woodger. Oxford: Oxford University Press.
. 1944. The semantic conception of truth. Philosophy and Phenomenological Research 4: 341–75.

LOGICAL POSITIVISM
Also known as logical empiricism, logical positivism was an
important philosophical movement in the first half of the twentieth century that reached its peak in the interbellum period and is
associated with the Vienna Circle (Wiener Kreis) and the Berlin
Circle (Berliner Kreis). The most prominent members of the
former were Moritz Schlick, Rudolf Carnap, Otto Neurath, Hans
Hahn, and Friedrich Waismann; of the latter, Hans Reichenbach,
Kurt Grelling, Carl Gustav Hempel, and Richard von Mises.
On the standard account (see, e.g., Alfred Ayer 1936), logical
positivism is committed to the following principles:
Firstly, formal logic, as it has been developed by Gottlob Frege, is seen both as an instrument for analysis and as an ideal language wherein all scientific knowledge is expressible. Many logical positivists also accepted Frege's logicism, namely, the view that mathematics is reducible to logic. Hence, they endorsed the view that
mathematics is a language, not a science like, for example, physics.
Secondly, it follows that a clear distinction can be made
between analytic and synthetic sentences. The former consist
of logical and mathematical tautologies, whereas the latter can
be either true or false and are therefore dependent on the way
things are; that is, they are empirical.
Thirdly, this leads to the principle of verifiability: If a sentence is meaningful, then it should be possible to determine its
truth value.
Straightforward consequences of these principles are, firstly,
that metaphysical statements are neither analytic, as they are not
tautologies, nor synthetic, as they do not refer to the empirically
accessible world; hence, they are meaningless. And, secondly, if all
of the sciences can be expressed in one and the same language, that
is, the language of mathematics, all sciences can be unified into a
single framework. Hence, the unity of science is a reachable goal.
It should be noted that although there are connections with
Ludwig Wittgenstein's views, as expressed in the Tractatus
Logico-Philosophicus, and although Wittgenstein attended some
of the meetings of the Vienna Circle, it would not be correct to
label him a logical positivist.
On a more refined account, qualification is needed. It suffices to look at the original manifesto of the Vienna Circle, the Wissenschaftliche Weltauffassung (The Scientific Worldview), to notice that the logical positivist program also included ethical-societal views. In recent years, many authors have made a strong case for taking a second and historically more nuanced look at logical positivism (see, e.g., Michael Friedman 1999).
It is generally accepted that both Karl R. Popper, the founding father of falsificationism, and Willard Van Orman Quine, the
founding father of naturalized epistemology, have been the most
important critics of logical positivism. The former questioned
the verifiability theory; the latter rejected the analytic-synthetic
distinction.
Jean Paul Van Bendegem
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Ayer, Alfred. 1936. Language, Truth and Logic. London: Victor Gollancz.
Friedman, Michael. 1999. Reconsidering Logical Positivism.
Cambridge: Cambridge University Press.
Janik, Allan, and Stephen Toulmin. 1973. Wittgenstein's Vienna. New York: Simon and Schuster. This book outlines the cultural setting
wherein logical positivism could arise.

M
MAPPING
In what follows, "mapping" is used in the general mathematical sense of a partial or total correspondence between elements, relations, and/or structures in two sets.
Much of the theoretical thinking in modern linguistics has
been strongly linked to the development since the 1950s of cognitive science, artificial intelligence, and neuroscience. The first
wave of cognitive science looked upon the brain as a sophisticated symbol-processing digital computer, and linguistic models
in the fifties and sixties took a largely algorithmic approach, with
a strong focus on syntax and logic.
In the 1970s and 1980s, there was a sharply different second
wave of thinking that launched a rigorous, empirically based
study of conceptual mappings: analogy, frames, metaphor, metonymy, grammatical constructions, and mental space projections. This original and ambitious research
program revisited from a modern point of view some fundamental issues that have been known since antiquity. It drew
on a powerful multidisciplinary mix of psychology, linguistics,
computational modeling, and philosophy. Names associated
with pioneering efforts in the new field of conceptual mappings include Douglas Hofstadter, Melanie Mitchell, Dedre
Gentner, and Keith Holyoak, for analogy; George Lakoff, Mark
Johnson, and Mark Turner, for metaphor and image schemas;
Erving Goffman and Charles Fillmore, for frames and frame
semantics; Ronald Langacker, Charles Fillmore, and Adele
Goldberg, for cognitive and construction grammars; Gilles
Fauconnier, Eve Sweetser, and John Dinsmore, for mental space
projections; and Geoffrey Nunberg for metonymic mappings
(pragmatic functions).
In the 1990s and up to the time of writing, there was substantial further evolution of our thinking on these issues. The creative
dimension of conceptual mappings was explored through the
study of conceptual blending and compression (Fauconnier
and Turner 2002; Coulson 2001) and through the modeling of
emergent structure in analogy (Hofstadter 1995; Hummel and
Holyoak 1997). The role of primary metaphors was discovered by
Joe Grady (1997); constraints on mappings were proposed within
metaphor theory and within blending theory.
Metaphor was once commonly viewed as literary, figurative, poetic: something exotic that we add to ordinary language to make it more colorful, vivid, and emotional. But since the inception of conceptual metaphor theory, it is widely acknowledged that metaphor is, in fact, central to thought and language
and necessary for human language in its many forms. In order
to talk and think about some domains (target domains), we use
the structure of other domains (source domains) and the corresponding vocabulary (see source and target). Some of these
mappings are used by all members of a culture, for instance, in
English, TIME as SPACE. We use structure from our everyday conception of space and motion to organize our everyday conception
of time, as when we say Christmas is approaching, The weeks go
by, Summer is around the corner, The long day stretched out with
no end in sight. Rather remarkably, although the vocabulary often
makes the mapping transparent, we are typically not conscious of
the mapping during use unless it is pointed out to us. Though cognitively active, such mappings are opaque: The projection of one
domain onto another is automatic. Metaphoric mappings may
also be set up locally, in context, in which case they are typically
perceived not to belong to the language but rather to be creative
and part of the ongoing reasoning and discourse construction.
Creative metaphors are often elaborations of conventional ones, as in the following typical literary example:
Perhaps time is flowing faster up there in the attic. Perhaps the
accumulated mass of the past gathered there is pulling time out
of the future faster, like a weight on a line. (McDonald 1992, 82–83)
Thought and language are embodied. Conceptual structure
arises from our sensorimotor experience and the neural structures that give rise to it. The properties of grammars are the properties of humanly embodied neural systems. Inference inherently
built into a source domain will be transferred by projection to an
abstract domain. For example, the conventional metaphors of
SEEING as TOUCHING (e.g., I couldn't take my eyes off her) and KNOWING as SEEING (e.g., I see what you're saying) combine
with one schema for the English preposition over to motivate
overlook: The line of sight travels over (i.e., above) the object;
hence, there is no contact; hence, it is not seen; hence, it is not
noticed or taken into account. In contrast, look over (she looked
over the draft) uses a related but different schema for over, a path
covering much of a surface, as in she wandered over the entire
field. This sense combines with the same mappings to produce
a very different abstract meaning the object this time is seen
and noticed.
Metonymic mappings link two relevant domains, which may
be set up locally. They typically correspond to two categories of
entities, which are mapped onto each other by a pragmatic function. For example, authors are matched with the books they write,
or hospital patients are matched with the illnesses for which they
are being treated. Metonymic mappings allow an entity to be
identified in terms of its counterpart in the projection. So, when
a nurse says The gastric ulcer in room 12 would like some coffee,
he/she uses the illness (the gastric ulcer) to identify the patient
who has it. Metonymy allows information to be compressed. If
Jack is the patient and if the nurse is addressing a physician, his/
her statement simultaneously conveys that Jack wants coffee and
that he has a gastric ulcer, which could be further intended as a
question to ask if coffee is permitted under the circumstances.
I'm in the phone book uses a metonymic mapping from people
to names. It says not only that my name is written in the phone
book but also that the number linked to my name is indeed my
phone number. So it really says something about me, not just
about my name: how to reach me, that I don't mind making my
number publicly available, and so on.
Metonymic and metaphoric mappings can combine to provide even greater compression, as in Martina is three points away
from the airport, said by a sports announcer of the tennis star
Martina Navratilova, who was about to lose a tournament match.
The points stand metonymically for the events of losing a point.
Three such events would lead to defeat. The events are on a metaphorical spatial scale to which the tennis player gets mapped.
On that scale, the player is metaphorically at a spatial distance
of three points from the end of the match, which would mean
defeat. A metonymic chain takes us from the end of the match to
defeat, then to exclusion from the rest of the tournament, then
to returning home. The airport (a place) stands metonymically
for an event (flying home) that starts in that place. Through the
metonymic chaining, flying home links to leaving the tournament, which links in turn to losing the match, itself caused by
the three lost points. Strikingly, very little of this is indicated
by the linguistic structure itself. It is constructed by means of the
cognitive models that we have for games, tennis, tournaments,
and travel and by applying to them the appropriate mappings.
The same sentence can take on completely different meanings if
we bring in different cognitive models.
Mental space projections link elements and relations in connected mental spaces. For instance, in saying Liz thinks her husband is tired, we build a mental space for Liz's reported beliefs,
with a counterpart for her husband and properties within that
space (tired) that may or may not be satisfied in connected
spaces: Liz thinks her husband is tired, but actually he's in great
shape. In saying Last year, Lizs husband was tired, we build a
mental space for last year, and in saying Liz thinks that last
year, her husband was tired, we build a space for last year embedded in a belief space, itself embedded in a base space. Presuppositions (such as Liz's having a husband) can spread across
spaces: In the last example, we infer that Liz has a husband, that
she thinks she has a husband, and that last year, she also had
this husband. But any of these presuppositions can be prevented
from projecting by an explicit overriding entailment.
In mental space projection, the access principle allows a
description of an element to identify its counterpart in another
mental space. For example, if Liz got married to Bob yesterday,
we can say Last year, Liz's husband was tired, identifying Bob in
the mental space last year by means of his counterpart (Liz's
husband) in the mental space now.
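As a rough sketch (with invented names and data, not from the entry), the access principle can be modeled as a lookup that falls back on a counterpart description valid in a connected space:

```python
# Illustrative sketch of the access principle across mental spaces.
# Spaces are modeled as simple dictionaries; names are invented.

# Base space "now": Liz married Bob yesterday, so the description
# "Liz's husband" picks out Bob here.
now = {"Liz's husband": "Bob"}

# The space "last year": Bob exists there, but the marriage does not,
# so the description is not directly valid in that space.
last_year = {}

def access(description, target_space, base_space):
    """A description valid in the base space identifies the same
    individual's counterpart in the target space."""
    return target_space.get(description) or base_space.get(description)

print(access("Liz's husband", last_year, now))  # -> Bob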
Conceptual blending generalizes the notion of conceptual
mapping to arrays of multiple mental spaces with the creation
of new blended spaces and the emergence of novel structure.
Such arrays of connected spaces are called integration networks.
Partial mappings link the mental spaces in such networks, and
selective projection maps the spaces onto novel blended spaces.
The mappings are supported by a small number of vital relations, such as analogy, change, identity, role-value, and cause-effect.
Compression is systematic in integration networks: A vital relation in one part of the network can be compressed into a different (or a scaled-down) vital relation in another part of the
network. Take, for example, My tax bill gets longer every year. The
inputs are the mental spaces corresponding to different years. In
each one, there is a tax bill. These input spaces are linked by the
vital relation of analogy: Each one is structured by the frame of
paying taxes in a particular year, and each tax-paying situation
is analogous to the others. The inputs are also linked by disanalogy: Each tax bill is different (longer than the previous one).
The analogous input spaces are integrated into a single blended
space, in which all the tax bills are fused into one: Analogy is
compressed into identity. Disanalogy is compressed into change.
In the blended mental space, there is a single tax bill that changes
over time.
Metaphors typically result from double-scope integration
networks, whereas metonymy turns out to be the compression of
one vital relation into another.
Conceptual mappings are not prompted only by spoken or
signed language. They are part of human thought, communication, and interaction quite generally; they are signaled through
multiple modalities (Alač 2006) and anchored by human cultural artifacts as part of socially distributed cognition
(Hutchins 1995).
Biologically, it is currently widely assumed that mappings are
effected by means of neural binding (Shastri 1996). Computational
models of such binding have been proposed within the neural
theory of language (Feldman 2006). Experimental techniques to
show the psychological reality of various mappings have been
devised by Lera Boroditsky (2000), Ray Gibbs (Gibbs et al. 1997),
and Seana Coulson (2001), among others.
Gilles Fauconnier
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Alač, Morana. 2006. How brain images reveal cognition: An ethnographic study of meaning-making in brain mapping practice. Ph.D.
diss., University of California, San Diego.
Boroditsky, L. 2000. Metaphoric structuring: Understanding time
through spatial metaphors. Cognition 75.1: 1–28.
Coulson, Seana. 2001. Semantic Leaps. Cambridge: Cambridge University
Press.
Fauconnier, Gilles. [1985] 1994. Mental Spaces. Cambridge: Cambridge
University Press.
———. 1997. Mappings in Thought and Language. Cambridge: Cambridge
University Press.
Fauconnier, Gilles, and Mark Turner. 2002. The Way We Think. New
York: Basic Books.
———. 2008. Rethinking metaphor. In Cambridge Handbook of
Metaphor and Thought, ed. Ray Gibbs, 53–66. Cambridge: Cambridge
University Press.
Feldman, Jerome. 2006. From Molecule to Metaphor. Cambridge,
MA: MIT Press.
Gentner, Dedre. 1983. Structure-mapping: A theoretical framework for
analogy. Cognitive Science 7: 155–70.
Gentner, Dedre, Keith Holyoak, and Boicho Kokinov, eds. 2001. The
Analogical Mind: Perspectives from Cognitive Science. Cambridge,
MA: MIT Press.
Gibbs, R., J. Bogdonovich, J. Sykes, and D. Barr. 1997. Metaphor in idiom
comprehension. Journal of Memory and Language 37: 141–54.
Goffman, E. 1974. Frame Analysis. New York: Harper and Row.
Grady, J. 1997. Foundations of meaning: Primary metaphor and primary
scenes. Ph.D. diss., University of California, Berkeley.
Hofstadter, Douglas. 1995. Fluid Concepts and Creative Analogies.
New York: Basic Books.
Hummel, J., and K. Holyoak. 1997. Distributed representations of structure: A theory of analogical access and mapping. Psychological Review
104: 427–66.
Hutchins, Edwin. 1995. Cognition in the Wild. Cambridge, MA: MIT
Press.
Lakoff, George, and Mark Johnson. 1980. Metaphors We Live By.
Chicago: University of Chicago Press.
Lakoff, George, and Rafael Núñez. 2000. Where Mathematics Comes
From: How the Embodied Mind Brings Mathematics into Being.
New York: Basic Books.
Liddell, Scott K. 2003. Grammar, Gesture, and Meaning in American Sign
Language. Cambridge: Cambridge University Press.
McDonald, Ian. 1992. King of Morning, Queen of Day. New York: Bantam
Books.
Mitchell, M. 1993. Analogy-Making as Perception. Cambridge, MA: MIT
Press.
Nunberg, G. 1978. The Pragmatics of Reference. Bloomington: Indiana
University Linguistics Club.
Núñez, Rafael. 2005. Creating mathematical infinities: Metaphor, blending, and the beauty of transfinite cardinals. Journal of Pragmatics
37: 1717–41.
Núñez, Rafael, and Eve Sweetser. 2006. Looking ahead to the
past: Convergent evidence from Aymara language and gesture in the
crosslinguistic comparison of spatial construals of time. Cognitive
Science 30: 401–50.
Shastri, Lokendra. 1996. Temporal synchrony, dynamic bindings, and
SHRUTI: A representational but non-classical model of reflexive reasoning. Behavioral and Brain Sciences 19.2: 331–7.
Sweetser, Eve. 1996. Reasoning, mappings, and meta-metaphorical
conditionals. In Essays in Semantics and Pragmatics, ed. Masayoshi
Shibatani and Sandra Thompson, 221–34. Amsterdam: John
Benjamins.
Turner, Mark. 1991. Reading Minds. Princeton, NJ: Princeton University
Press.
Williams, Robert. 2005. Material anchors and conceptual blends in time-telling. Ph.D. diss., University of California, San Diego.

MARKEDNESS
The original insight of markedness was that many linguistic phenomena consist of polar opposed pairs (for example, the phonological feature unvoiced-voiced and the grammatical relation
active-passive) and that typically there is an asymmetry, such that
one term is more general and thus unmarked (given first in the
examples) and the other is more constrained and thus marked.
Markedness was first developed in phonology as an explanation
for asymmetries in phonological systems based on cross-linguistic
comparisons, with evidence from typology and universals (for
example, more (unmarked) oral consonants than marked nasal
ones): Unmarked consonants occur in places of nonconditioned
neutralization (e.g., only unmarked voiceless consonants in word-final position in Russian). Later, markedness was used to study
grammatical semantics (where the unmarked term has a larger
semantic range than the marked term), to explain the order of phonological acquisition in child language (unmarked terms learned
before marked terms, e.g., stops before fricatives) and aphasia (marked terms lost before unmarked ones), and to identify
implicational universals, in which the presence of a marked
element implies the presence of the corresponding unmarked
element, but not vice versa (all of these in Jakobson 1990). Since
then, it has developed into an important (though controversial)
concept in other areas of linguistics, such as morphology, syntax, lexical semantics, historical linguistics, second language acquisition, stylistics, and so on.
Since about the 1960s, two substantially different approaches
to markedness have developed and with them different types of
evidence, and explanations, for markedness. The functional(-typological) approaches (e.g., Givón 1990; Croft 2003), based
on earlier work in typology and universals, depend on diagnostic
criteria, not only from linguistic systems but also from language
use, and these are related to functional criteria, communicative needs, processing efficiency, learnability, memory, and so
on. Criteria include zero or simple expression for the unmarked
term (isomorphism): unmarked singular with zero expression as
in cat versus marked plural with the marker -s in cats; text (token)
frequency (unmarked term more frequent); contextual distribution (unmarked category has greater freedom of occurrence);
and leveling toward the unmarked term in pidgins/creoles,
dialects, informal speech, and so forth. A further outgrowth has
to do with markedness hierarchies that are scalar in nature, such
as the noun phrase accessibility hierarchy for relativization (e.g., from less marked to more marked roles of the relative
pronoun in the relative clause: subject > direct object > indirect
object > prepositional object).
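As a rough illustration (my own sketch, not from the entry), the hierarchy and its implicational reading can be expressed as an ordered list: if a language can relativize a given role, it is predicted to relativize every less marked role above it.

```python
# Illustrative sketch: the noun phrase accessibility hierarchy,
# ordered from least to most marked relativized role.
HIERARCHY = ["subject", "direct object", "indirect object",
             "prepositional object"]

def less_marked(a: str, b: str) -> bool:
    """True if role a is less marked than role b on the hierarchy."""
    return HIERARCHY.index(a) < HIERARCHY.index(b)

def predicted_roles(most_marked: str) -> list:
    """Implicational universal: relativizing a role implies
    relativizing every less marked role above it."""
    return HIERARCHY[: HIERARCHY.index(most_marked) + 1]

print(less_marked("subject", "indirect object"))   # -> True
print(predicted_roles("indirect object"))
# -> ['subject', 'direct object', 'indirect object']
```

The same ordered-list encoding captures the typological prediction in one direction only: the presence of a marked element implies the unmarked ones, but not vice versa.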
The formal (generative) approaches, especially universal grammar (UG), the principles and parameters plus
minimalist approach to syntax (Chomsky 1995), and optimality theory in phonology (Prince and Smolensky 2004), focus
on competence and reject criteria for markedness related to
use (performance). Using cross-linguistic details of grammar,
UG has been posited; UG determines a set of possible core grammars for languages by setting parameters, so that systems that
fall within a core grammar constitute the unmarked phenomena
and more marked elements are found in the periphery (see core
and periphery) (Chomsky 1981). More recently, work has also
focused, for example, on expression of markedness relations by
constraints, on an explanation of markedness asymmetries
through constraint interaction, and the use of constraint forms
to express markedness hierarchies.
In both of these approaches, the unmarked category has also
at times been assimilated with the concept of naturalness, as in
natural phonology and natural morphology, as well as in optimality theory; some see it as overlapping with normality, regularity,
generality, and productivity; it has also been used, for example,
in studies of word order to define the basic, dominant, or preferred WORD ORDER (e.g., subject-verb-object in English); and
it has certain elements in common with the notion of prototype. While not everyone uses the term markedness and some
linguists think that it is an unwieldy cover term with too wide a
range of application and no central definition, others in both traditions see it as a major conceptual and explanatory tool that will
continue to be of interest and utility for understanding various
phenomena of language.
Linda Waugh
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Battistella, Edwin. 1990. Markedness: The Evaluative Superstructure of
Language. Albany: State University of New York Press. Battistella's two
books are the most accessible long treatments of the topic.
———. 1996. The Logic of Markedness. New York: Oxford University Press.
Chomsky, Noam. 1981. Markedness and core grammar. In Theory of
Markedness in Core Grammar, ed. A. Belletti, L. Brandi, and L. Rizzi,
123–46. Pisa: Scuola Normale Superiore di Pisa.
———. 1995. The Minimalist Program. Cambridge, MA: MIT Press.
Croft, William. 2003. Typology and Universals. 2d ed.
Cambridge: Cambridge University Press.
Givón, Talmy. 1990. Syntax: A Functional-Typological Introduction. Vol. 2.
Amsterdam: John Benjamins.
Jakobson, Roman. 1990. On Language. Ed., with an introduction by,
L. Waugh and M. Monville-Burston. Cambridge: Harvard University
Press.
Prince, Alan, and Paul Smolensky. 2004. Optimality Theory: Constraint
Interaction in Generative Grammar. Malden, MA: Blackwell.
Waugh, Linda, and Barbara Lafford. 2000. Markedness. In
Morphology: An International Handbook on Inflection and Word-Formation, I: 272–81. Berlin: Walter de Gruyter. An accessible treatment of the topic.

MARKET, LINGUISTIC
Pierre Bourdieu defines a linguistic market as "a system of relations of force which determine the price of linguistic products and
thus helps fashion linguistic production" (1989, 47). If linguistic
habitus is the subjective element of habitus connected with
language use, linguistic market represents the objective field
relations. As always with Bourdieu, the two are in a constant state
of dynamic interrelationship, as well as evolving dynamically as
a part of the transformation of social structures.
In positing a concept such as linguistic market, Bourdieu is
targeting traditional linguistics. His quarrel is with all linguistics going back to the work of Ferdinand de Saussure, which he
sees as treating language as an object of study rather than as a
practice. The concept thus constitutes language as logos rather
than praxis. Bourdieu's critique extends to Noam Chomsky and
Chomskyan linguistics, with its discovery of a semibiological
language acquisition device, deep syntactical structure
(see underlying structure and surface structure),
and universal grammar. Bourdieu cannot accept the
Chomskyan precepts that linguistics should be concerned with
an ideal speaker-listener, a homogeneous speech community,
and perfect grammatical competence. Bourdieu's alternative
can be summed up as follows:
In place of grammaticalness it puts the notion of acceptability,
or, to put it another way, in place of the language (langue), the
notion of legitimate language. In place of relations of communication (or symbolic interaction) it puts relations of symbolic power,
and it replaces the meaning of speech with the question of the
value and power of speech. Lastly, in place of specifically linguistic competence, it puts symbolic capital, which is inseparable
from the speaker's position in the social structure. (1977, 646;
italics in original)

In other words, Bourdieu is seeking to socialize, or at least sociologicalize, all the major principles of traditional linguistics.
The linguistic market is, therefore, essentially an expression
of linguistic relations. However, like all markets, not everyone
to be found within it is equal, and linguistic knowledge is never
perfect. In reality, some are found to have greater practical mastery (connaissance). This knowledge is itself defined not simply
in terms of use but as an expression of legitimate language. In
most social contexts, there is a dominant language form. This
is most evident at a national level where there is received pronunciation and other standard language forms. However, it can
extend to sublevels and categories and field microcosms. In each
case, there is a right way of using language. This rightness is
defined by social common assent or common acknowledgment. The particularity of language is that while orthodox language forms are maintained by this consensus and recognized
as such (reconnaissance), not all can use them. There can be a
mismatch between any one individual's connaissance and reconnaissance, resulting from upbringing and proximity to legitimate
language forms. Moreover, this mismatch is itself recognized by
individuals, albeit implicitly or unconsciously, who understand
the symbolic value of language. Language is, therefore, another
form of cultural capital in that it is symbolic in the way it both
values and is valued in terms ultimately related to the structure
of the field. For Bourdieu, the most predominant field structures
are those of social class, which also express the distribution of
power in society.
There are relations of linguistic production and authorized
language within the linguistic market. Moreover, everyone
enters the market in order to compete as a way of gaining and
sanctioning social prestige, and, consequently, status and position, through the acknowledgment of others in the market. Value
is ascribed to individuals; it is not within their own capacity to
give it to themselves. There is a kind of anticipation and actualization of profits, much in the same way as in any market (see
Bourdieu [1982] 1991, 76 ff). Bourdieu refers to many examples
where the power relation between two or more individuals is
expressed in the language they use with respect to one another.
For example, in the postcolonial context (see colonialism
and language), those in a position of dominance sometimes
abdicate their position of authority by linguistically reaching
down to the interlocutor. However, he sees this as simply a
strategy of condescension aimed at reasserting their domination. Normally, it is the opposite that applies: Those dominated
are forced to adopt the language of the dominant. Bourdieu also
contrasts the "broken English" of the black American vernacular with the "air of naturalness" of the English (1992, 143). For
Chomskyan linguistics, both are natural and unbroken since
they follow the same complex principles (e.g., binary merge
and wh-movement). The point is not only that power relations
are expressed in such linguistic exchanges but that the linguistic market also defines what is and is not linguistically valued by
rewarding and sanctioning specific forms of language. In theory, everything is available to all in the market. However, some
already hold specific forms of linguistic capital, which they have
obtained from family background, education, and professional
trajectory. Moreover, such symbolic value is not only expressed
in language forms but also structurally homologous to other
forms of cultural capital; indeed, it can be found in physical body
gestures, as well as other forms of self-presentation. For those
without this capital, it is almost impossible to catch up.
Ultimately, such relations are expressed in political relations,
where certain individuals and representatives are endowed with
the power to sanction. For Bourdieu, these are acts of quasi-magic as, through this endowment, power is literally invested
in someone by a formal acknowledgment of status, a form of social
consecration. A most obvious form of this phenomenon is when a
title is bestowed on an individual: Head of Department, for example. Some who write similarly of the linguistic variation between
individuals conclude with a deficit model of language, whereby
lack of language competence is addressed through compensatory education. Ultimately, this leads to a form of linguistic
communism where all are linguistically equal (see Bourdieu
and Boltanski 1975). However, the logic of the linguistic market
is that such compensatory measures will always give rise to disappointing results in terms of social inclusion since, ultimately,
they go against the logic of practice constituting the field (the
market) in the first place. Just as communist alternatives to capitalism eventually collapsed, leading to an embracing of liberal
economics and free-market principles, so linguistic communism
cannot work since it runs counter to the raison d'être of the linguistic market, which, in terms of substantive cause and effect,
is social differentiation. However, this should not be seen as a
form of poststructuralist nihilism; rather, Bourdieu is offering a
metanoia, a new gaze or way of looking at the world through
his epistemological thinking tool (see Grenfell 2004, Chapter 7).
Michael Grenfell
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Bourdieu, Pierre. 1977. The economics of linguistic exchanges. Social
Science Information 16.6: 645–68.
———. [1982] 1991. Language and Symbolic Power. Trans. G. Raymond
and M. Adamson. Oxford: Polity Press.
———, with Loïc Wacquant. 1989. Towards a reflexive sociology: A workshop with Pierre Bourdieu. Sociological Theory 7.1: 26–63.
———, with Loïc Wacquant. 1992. An Invitation to Reflexive Sociology.
Trans. L. Wacquant. Oxford: Polity Press.
Bourdieu, Pierre, and Luc Boltanski. 1975. Le fétichisme de la langue.
Actes de la recherche en sciences sociales 2: 95–107.
Fehlen, Fernand. 2004. Pre-eminent role of linguistic capital in
the reproduction of the social space in Luxembourg. In Pierre
Bourdieu: Language, Culture and Education, ed. M. Grenfell and M.
Kelly, 61–72. Bern: Peter Lang.
Grenfell, Michael. 1993. The linguistic market of Orléans. In France:
Nation and Regions, ed. M. Kelly and R. Bock, 72–99. Southampton, UK:
ASM & CF.
———. 2004. Agent Provocateur: Pierre Bourdieu. London: Continuum.
Snook, Ivan. 1990. Language, truth and power: Bourdieu's Ministerium.
In An Introduction to the Work of Pierre Bourdieu, ed. R. Harker,
C. Mahar, and C. Wilkes, 160–79. Basingstoke: Macmillan.

MARXISM AND LANGUAGE


The aim of Marxism is to understand history and society according to the precepts first outlined in the works of Karl Marx and
Friedrich Engels, later developed by other thinkers in this tradition, in order to effect revolutionary social change. Given the
fact that Marxism is in part a description of the determinants of
everyday life as a way of explaining the social order, it is somewhat surprising, therefore, to note that the Marxist contribution
to thinking about language has been limited. This omission has
been unfortunate both for Marxism and for those nonformalist accounts of language that stress its historicity in a general
sense and its specific and variable links to particular social
formations.
In The German Ideology, as part of their attack on philosophical idealism, Marx and Engels provide a sketch of their materialist conception of history. With regard to the nature and function
of language, they assert:
From the start the spirit [mind] is afflicted with the curse of
being burdened with matter, which here makes its appearance in the form of agitated layers of air, sounds, in short, of
language. Language is as old as consciousness, language is practical consciousness that exists also for other men, and for that
reason alone it really exists for me personally as well; language,
like consciousness, only arises from the need, the necessity, of


intercourse with other men (Marx and Engels 1964, 42).

The stress on language as central to human activity, or praxis,
indicates the important role that Marx and Engels gave it in their
account of the distinctiveness of human life. Language forms an
essential part of the evolving process by which human beings in
social relationships create historical reality through the negotiation of material needs and the requirement for self-reproduction.
It is important to note, however, that language was not viewed as
either primary or derivative; it was not the faculty that enabled
human beings to become social in the first place, nor was it the
means by which they could express themselves once they had
been socialized. Instead, it was an aspect of the social, material
activity (labor in its general, technical sense) by which human
beings were constituted qua human beings and by which they
acted upon nature and other human beings in order to create
history.
Within the Marxist tradition, the stress on the constitutive
aspect of language as a form of labor (material practice) was
almost lost, as the term labor itself became narrowly conceived
simply to mean certain types of work. As a result, more attention was paid to other statements by Marx and less to his original
focus on language as social activity. These comments included
his reference to the existence of a bourgeois form of language
(Marx and Engels 1964, 249), his assertion that ideas do not
exist separately from language (1973, 163), and his declaration
that the ideas of the ruling class are in every epoch the ruling
ideas (1964, 60). Marx's remarks, which amount essentially to
the observations that the language in use is affected by the class
relations that hold in a given social formation and that ideology
is disseminated in language, were again rather narrowly interpreted within orthodox Marxism.
In the Soviet Union, in particular, a whole set of somewhat
fruitless debates ensued as to whether language belonged to the
base or superstructure of society. For N. S. Marr, for example,
languages were stratified in such a way that between communities employing distinct languages, the speech of the same class
would be closer than the speech of different classes using the
same language. In this account, language belongs to the social
superstructure of society, which is simply determined by class;
the idea that the unity of a group not based on class (such as the
nation) could be explained by the idea of a common language
was dismissed. Marrs influence, which was widespread in the
1930s and 1940s, was ended by Stalin's equally dogmatic declaration in Marxism and the Problems of Linguistics ([1950] 1974) that languages did not have a class character but rather
1974) that languages did not have a class character but rather
a national character and were thus not part of the superstructure. Despite the title of Stalin's piece, and though it was an
important correction to the misleading effect of Marrs theories,
it did not represent any sort of breakthrough in the Marxist treatment of language.
In fact, precisely such an advance had been heralded in the
writings of a number of linguists in the Soviet Union (primarily in
Vitebsk and Leningrad) which, in effect, amounted to a school of
Marxist linguistics. Because of the terror exercised by Stalinism,
the exact membership of this group is unknown and the names
used for publishing may or may not be those of the authors of
the works. Nevertheless, the principal texts are recognized as
V. N. Vološinov's Marxism and the Philosophy of Language, published in 1929 and translated in 1973; Mikhail Bakhtin's Problems
in Dostoyevsky's Poetics, published first in 1929 and translated
from the second (1964) edition in 1984; and P. N. Medvedev's The
Formal Method in Literary Scholarship: A Critical Introduction
to Sociological Poetics, published in 1928 and translated in 1978.
Despite the fact that the work of Bakhtin is the best known to
readers in the West, the most significant contribution to a strictly
Marxist treatment of language was provided by Vološinov's pioneering text.
The radical thrust of Vološinov's work came in his opposition
to two key tendencies that he identified in thinking about language: individualistic subjectivism and abstract objectivism.
The first, traced by Vološinov to the German idealist tradition and
articulated most clearly in the work of Wilhelm von Humboldt,
takes the individual human psyche as the most important site
of linguistic production and focuses on the individual creative
act of speech. Regarding speech as a type of aesthetic creativity, this approach rejects language, understood as a
fixed system, as simply the product of the abstract methods of
linguistics. The second tendency, abstract objectivism, is the
binary opposite of the first and is typified in the model proposed
by Ferdinand de Saussure and developed by structuralism.
In this approach, the static and apparently immutable linguistic
system is divorced from history, is distinguished rigorously from
individual instances of language use, and is considered to be
composed of nothing other than the normatively identical forms
of lexis, grammar, and phonetics. If the first focuses on the
unceasing process (energeia) of individual linguistic creativity,
then the second treats language as a finished product (ergon),
open to the objective gaze of the science of linguistics.
For Vološinov, the concentration on individual consciousness as the basis of an explanation of linguistic signification
is a mistake. The individual consciousness cannot serve as the
foundation of linguistic analysis because it is itself in need of
explication from a social point of view: "[C]onsciousness takes
shape and being in the material of signs created by an organized
group in the process of its social intercourse ... nurtured on signs,
it derives its growth from them; it reflects their logic and laws"
(Vološinov [1929] 1973, 13). This does not, however, mean that
the individual consciousness is formed by and in the normatively
identical signs of the abstract objectivist system. On the contrary,
Vološinov's point is that signs themselves, as dynamic complexes of form and meaning, are not simply presented as given,
fixed elements of a system but are open products of the activity (the material practice) of language making between socially
organized individuals. Language, in this sense, is not the middle
term that unites the individual and the social, nor is it a medium
that reflects a preexistent reality. Instead, it is an aspect of the
constitutive social activity (labor, in Marx's original sense) that
allows for the very possibility of the individual, the social,
and reality itself.
Despite the importance of semantic indeterminacy to poststructuralist literary theory, and the stress on context in linguistic pragmatics, the radical challenge of Vološinov's work has not
been taken up widely in twentieth-century thinking on language.
Even in the tradition of Western Marxism, few of the major
theorists concerned themselves directly with language, and
when they did, as in the case of Walter Benjamin or Jean-Paul
Sartre, it is difficult to see how the work qualifies as Marxist in
any recognizable sense. Yet a number of Marxist theorists, such
as Ferruccio Rossi-Landi (1983), Terry Eagleton (1982), and Jean-Jacques Lecercle (2006), have produced interesting work based
on Vološinov's text. More significantly, it was the inspiration for
much of the later work of Raymond Williams, the major British
socialist critic of the twentieth century.
Williams's chapter on language in Marxism and Literature
(1977) stressed the importance of Vološinov's theory of signification, both in general and for his own original work on historical semiotics in Keywords (1976). Beginning with Vološinov's
argument that signs are neither expressive nor systematic in any
simple sense but, rather, communicative media deployed in the
social process of making history, Williams stressed that signs are
shaped by past use but are engaged at the same time in the creative making of the present (and are thus of necessity open to
the future). This idea of the historical variability of signs, which
Vološinov calls their multiaccentuality, formed the basis of
Williams's investigation of the vocabulary of a number of discursive fields, centrally those that involved discussion of culture and
society. In essence, what he provides in Keywords and Marxism
and Literature is a retrospective theoretical account of his work
in Culture and Society (1958), a text that effectively began the
debates that led to the appearance of cultural studies as an academic discipline. Though rarely acknowledged as such, it was an
historical materialist approach to language that lay at the base of
this important intellectual development.
Marx's comments on the existence of "bourgeois language"
and Vološinov's assertion that the sign becomes "an arena of
the class struggle" ([1929] 1973, 23) point to another field of
research in which Marxist thought has been significant: the
politics of language, with particular regard to the historical
construction of national languages, the class-based hierarchy of language within education, and the role of language
in imperialism and colonialism. Important work in this area
was conducted by Antonio Gramsci, the Italian Communist
Party intellectual and leader, who drew attention to the class
perspective in his discussion of the merits and demerits of the
use of dialect versus a national form of language in political
struggles in Italy. Other examples include Renée Balibar's historical research on the emergence of a standard language in
France in Les français fictifs (1974) and L'institution du français
(1985), and Tony Crowley's related work in the British context in The Politics of Discourse (1989). Writing from the postcolonial conjuncture, the Kenyan writer Ngũgĩ wa Thiong'o
used a Marxist approach to denounce the colonial linguistic
legacy in his Decolonising the Mind (1986). And in educational debates, Basil Bernstein's theory of "restricted" and
"elaborated" codes attempted to explain the differential academic achievement of children from different social classes. In
Reproduction in Education, Society and Culture, written with
Jean-Claude Passeron (1977), and Language and Symbolic
Power (1992), Pierre Bourdieu used a neo-Marxist framework
to account for the same phenomenon.
Tony Crowley

WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Bakhtin, M. M. [1929] 1984. Problems in Dostoyevsky's Poetics. Ed. and
trans. Caryl Emerson. Manchester, UK: Manchester University Press.
Balibar, Renée. 1974. Les français fictifs. Paris: Hachette.
———. 1985. L'Institution du français. Paris: Presses Universitaires de
France.
Balibar, Renée, and Dominique Laporte. 1974. Le Français national.
Paris: Hachette.
Bernstein, Basil. 1971. Class, Codes and Control. Vol. 1. London: Paladin.
Bourdieu, Pierre. 1992. Language and Symbolic Power. Ed. John
Thompson, trans. Gino Raymond and Matthew Adamson. Cambridge,
UK: Polity.
Bourdieu, Pierre, and Jean-Claude Passeron. 1977. Reproduction in
Education, Society and Culture. London: Sage.
Crowley, Tony. 1989. The Politics of Discourse: The Standard Language
Question in British Cultural Debates. Houndmills: Macmillan.
Eagleton, Terry. 1982. "Wittgenstein's Friends." New Left Review
1.135: 64–90.
Gramsci, Antonio. 1985. Selections from Cultural Writings. Ed. David
Forgacs and Geoffrey Nowell-Smith, trans. William Boelhower.
London: Lawrence and Wishart.
Lecercle, Jean-Jacques. 2006. A Marxist Philosophy of Language.
Boston: Brill.
Marx, Karl, and Friedrich Engels. 1964. The German Ideology.
Moscow: Progress Publishers.
———. 1973. Grundrisse. Trans. Martin Nicolaus. Harmondsworth:
Penguin.
Medvedev, P. N. [1928] 1978. The Formal Method in Literary Scholarship: A
Critical Introduction to Sociological Poetics. Trans. Albert J. Wehrle.
London: Johns Hopkins University Press.
Ngũgĩ wa Thiong'o. 1986. Decolonising the Mind: The Politics of Language
in African Literature. London: James Currey.
Rossi-Landi, Ferruccio. 1983. Language as Work and Trade. Trans.
Martha Adams. South Hadley, MA: Bergin and Garvey.
Stalin, Joseph. [1950] 1974. "Marxism and the problems of linguistics."
In A Primer of Linguistics, ed. Anne Fremantle, 203–18. New York:
St. Martin's.
Vološinov, V. N. [1929] 1973. Marxism and the Philosophy of Language.
Trans. L. Matejka and I. R. Titunik. London: Seminar Press.
Williams, Raymond. 1958. Culture and Society 1780–1950. London: Chatto
and Windus.
———. 1976. Keywords: A Vocabulary of Culture and Society. London:
Fontana.
———. 1977. Marxism and Literature. London: Oxford University Press.

MEANING AND BELIEF


There is a commonsense view of the relation between meaning
and belief that has been tacitly presupposed in many philosophical, linguistic, and other treatments of these topics. It runs something like this: Words refer to objects and have definitions. The
definitions represent properties of the objects, provide criteria
for using the words to refer to objects, and allow understanding
of such uses. Thus "man" means, roughly, "adult, human, male."
That meaning picks out properties of some objects, isolating men
(rather than women, children, apes, and so on). In keeping with
this, the meaning allows us to use the term to refer to certain sets
of objects (men) or particular members of that set (individual
men) and to understand what other people refer to when they use
the term. Additionally, meanings allow us to express our beliefs
about members of sets of objects (generally or individually). In
sum, meaning and belief are different (the "distinctness" component of the traditional view), and meaning allows the articulation
of belief (the "expressive" component).
In the last half-century or so, this picture of the relation
between meaning and belief has been challenged from a number of perspectives. One important challenge concerns the
interaction of meaning and belief, addressing such questions as
whether meaning is a relatively neutral vehicle for expressing
belief or something that may affect belief. This is the challenge of
linguistic relativism. Another concerns the validity of the
division between meaning and belief. This is the critique of analyticity. The two challenges point in somewhat contradictory
directions. (These are not by any means the only ways in which
meaning and belief have been discussed in recent years. For
example, Akeel Bilgrami [1992] addresses the issue of how to reconcile meaning externalism with certain subjective aspects
of belief. Unfortunately, a short entry can only point to a couple of
key issues that have arisen in connection with this broad topic.)
As to linguistic relativism, a number of writers (most famously
Edward Sapir and Benjamin Lee Whorf) have argued that meaning is not simply a means for articulating belief, but a means
of shaping belief (as well as emotion, action, even perception).
A popular version of this view is developed in George Orwell's
novel 1984, where the government seeks to control people's
ideas by changing their language. The idea of any conceptual-scheme relativism is difficult to sustain in global terms, as writers
such as Donald Davidson (1984; see Chapter 13) have pointed
out. However, it is clear that we cannot have at least some beliefs
before we have some categories, and commonly those categories go along with words and meanings. For example, small children do not have beliefs about gravitation or terrorism because
they do not have the relevant concepts, and the concepts are
presumably something they acquire by learning the words and
their meanings. More importantly, it seems likely that people's
beliefs about particular events are affected by the concepts (thus,
meanings) available to and salient for them. Thus, without the
concept of terrorism, perhaps Americans would have understood the events of September 11, 2001, as crimes. This would
have changed their beliefs about the nature of the event, proper
responses to the event (e.g., police investigation, extradition,
etc., rather than war), and so on.
This challenge to the expressive component of the commonsense view seems to preserve the distinctness component. In
other words, it seems to rely on a presumption that meaning and
belief are different. After all, if meaning and belief are not distinct, then it is difficult to tell exactly how meaning could affect
belief. This division between meaning and belief is precisely what
is challenged by the critique of analyticity.
There are two clear ways in which the meaning/belief division may be criticized. They relate to two obvious ways in which
the division itself may be formulated. One way concerns revisability. We might say that the belief component of our ideas
is revisable by reference to experience or facts. In contrast, the
meaning component is steady in the face of new experiences
or facts. However, W. V. Quine has argued famously that there
is "no way of putting some truths into empirical quarantine and
judging the remainder free of infection. Thus meaning and
sensory evidence are inextricably intertwined" (1976, 139; see
also 1981, 67, 71–2). This suggests a number of things about the
relation between meaning and belief. Perhaps most obviously,
it indicates that sentences are not true by their meanings alone.
More importantly for our purposes, it suggests that what we consider meanings are open to empirical revision, precisely in the
manner of beliefs. For instance, a hundred years ago, one might
have thought that "My father is a man" was true analytically.
However, sex-change operations have shown us that the meaning component "male" is revisable due to empirical information
about the objects to which "father" refers. If some idea about an
object or set of objects is revisable due to empirical information
about referents, then it would seem to count as a belief, not as a
meaning.
There are undoubtedly cases where we would find it difficult
to imagine such revision. For example, I have no good idea of
how I could possibly revise my view that "If Jones is currently a
man, then Jones is currently a male, adult, human." (Obviously,
there are scenarios where the meaning of "man" could change, but
that is not at issue.) However, it might be argued that this tells us
something about my imagination, not about the facts. Perhaps
it was impossible for people to imagine sex-change operations a
century ago. It may be that what we consider to be meaning is a
function of what we can imagine changing. But our imagination
could always be mistaken.
On the other hand, perhaps the obvious cases of revision are
not so obvious as they initially appear. For example, people did
imagine men changing into women and women changing into
men well before sex-change operations. If Tiresias had a child
as a man, then was transformed into a woman, his child could
truthfully say, "My father is a woman." So perhaps "male" was
always only more limitedly part of the meaning of "father," closer
to a belief than we recognize. In this way, it may be that the revisability argument is not definitive.
A more productive approach may still be Quinean in orientation, naturalizing our treatment of the topic, as Quine
often urged (1969; see Chapter 3), by turning to the natural sciences. Here, we might consider two sorts of cognitive architecture that are common in discussions of meaning today. (This,
of course, is not Quinean, as it is mentalistic.) The first is intentional/representational; the second relies on neural networks,
either artificial (see connectionism, language science,
and meaning) or natural (see semantics, neurobiology of).
A standard intentional/representational account of lexical
semantics involves headings, some sort of meaning units connected with the headings, and connections across headings.
The connections across headings establish lexical relations
of various sorts, including semantic fields. The semantic
units themselves are structured into complexes of relations
with default values and are typically hierarchized, such that
some units are more important than others. Consider, for
example, "man." This entry is linked to "woman" for one domain
(adult human), to "boy" for another domain (male human),
and so on. It includes a range of information, comprising not
only definitional components, but empirical components as
well. For example, 50 years ago, "father" included not only "male,"
"adult," "human," and "progenitor of ego" but probably "husband of mother" and "breadwinner of the family" or the like. In
other words, understanding "father" involves various schemas
that cluster information into relations. These schemas have
default values (such as father is mother's husband), perhaps
along with specified alternatives to defaults (such as father is
divorced from mother). This information is hierarchized in that
we generally consider items higher in the hierarchy to be more
criterial for application of the term than items lower in the hierarchy. Put very simply, if we find out that Peter is Sally's progenitor but is not the breadwinner, we are more likely to count him
as Sally's father than if we find out that he is the breadwinner but
not the progenitor. On the other hand, hierarchy effects are not
absolute. We may be more inclined to apply "father" to the breadwinning, affectionate, live-in husband of Sally's mother than to
an unknown progenitor. (The last point, if developed further,
would lead us to the place of prototypes in lexical semantics. However, the inclusion of prototypes, or for that matter
exemplars, would not affect the main argument as it bears on
meaning and belief.)
Insofar as this model of meaning is accurate, it suggests,
first of all, that there is no sharp meaning/belief division. There
does not seem to be any point at which the information associated with a given heading stops being semantic and starts being
empirical. On the other hand, it also suggests that the meaning/
belief division is not wholly pointless in that there does appear
to be a continuum from more definition-like information to
more observation-like information. But this, too, is not all.
The hierarchical continuum is not determinative. We may think
of the hierarchy as a series of weighted properties and/or relations. Although those higher in the hierarchy are more heavily
weighted, they may be outweighed by a large enough number of
lower-level properties/relations. Alternatively, in connectionist
terms, a large number of weak connections may reach some activation threshold that is not reached by a small number of strong
connections. This last point suggests that despite the hierarchy,
all information associated with lexical items is in some ways
more akin to belief than to meaning (though perhaps neither
term is truly adequate here).
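The weighted-hierarchy idea can be made concrete with a small computational toy. The feature names, weights, and threshold below are invented for illustration (they are not drawn from the entry); the sketch simply shows how a large enough number of weakly weighted features can cross an activation threshold just as a few strongly weighted ones do.

```python
# Toy illustration (hypothetical weights): deciding whether "father" applies
# to a candidate by summing weighted feature matches against a threshold.

FEATURES = {                  # higher weight = higher in the criterial hierarchy
    "progenitor": 0.9,
    "male": 0.8,
    "adult": 0.7,
    "husband_of_mother": 0.3,
    "breadwinner": 0.2,
    "lives_with_child": 0.2,
    "affectionate": 0.2,
}
THRESHOLD = 1.5

def activation(observed):
    """Sum the weights of the features the candidate is observed to have."""
    return sum(w for f, w in FEATURES.items() if f in observed)

def counts_as_father(observed):
    """True if the weighted evidence crosses the activation threshold."""
    return activation(observed) >= THRESHOLD

# An unknown progenitor: few, but strongly weighted, features.
stranger = {"progenitor", "male", "adult"}
# A breadwinning, affectionate, live-in stepfather: many weaker features.
stepdad = {"male", "adult", "husband_of_mother", "breadwinner",
           "lives_with_child", "affectionate"}

print(counts_as_father(stranger), counts_as_father(stepdad))
```

On these (stipulated) numbers, both candidates cross the threshold, which is the point of the passage: the hierarchy weights the evidence but does not fix a sharp meaning/belief boundary, since accumulated "belief-like" features can do the same classificatory work as "definition-like" ones.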
As the preceding reference to connectionism suggests, the
same conclusions hold for accounts of meaning that rely on neural networks. For example, neural accounts treat meaning as a
complex of circuits linking configurations of neurons in different areas of the brain, insofar as these bear on the sound of the
relevant word, the appearance of the referent, our own actions
as they might bear on the referent, and so on (see, for example,
Chapter 4 of Pulvermüller 2002). These circuits are presumably
not fully fixed and identical across all uses. Rather, the precise
configuration activated at any given moment will vary, depending on what other neural circuits are simultaneously activated.
For example, suppose I say "squeeze." That activates circuits that
include neuron populations that govern closing together the fingers of the dominant hand. Suppose I then say "ball." The, so to
speak, resting circuit for "ball" includes a range of neuron populations, some of which bear on closing together the fingers of the
dominant hand. Since some part of the latter population was just
activated by "squeeze," it should be more fully activated by "ball." The
prior activation due to "squeeze" will slightly alter the circuit activated by "ball," perhaps enough to make one think of hand-sized,
rubber balls.
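This priming effect can be sketched crudely in code. Every population name, activation value, and decay constant below is an invented stand-in (real neural dynamics are far more complex); the sketch shows only the mechanism the passage describes, namely residual activation from one word's circuit carrying over into the next word's circuit.

```python
# Toy priming model (illustrative values): residual activation from a prior
# word's circuit biases how strongly parts of the next word's circuit fire.

CIRCUITS = {  # word -> neuron populations in its "resting" circuit
    "squeeze": {"hand_close": 1.0, "grip_motor": 0.8},
    "ball":    {"round_shape": 1.0, "hand_close": 0.4,
                "kick_motor": 0.4, "bounce": 0.6},
}
DECAY = 0.5  # fraction of activation persisting into the next word

def speak(words):
    """Return the activation pattern produced by each word, with carry-over."""
    residual, patterns = {}, []
    for w in words:
        pattern = dict(CIRCUITS[w])
        for pop, act in residual.items():        # prior activation adds in
            pattern[pop] = pattern.get(pop, 0.0) + act
        patterns.append(pattern)
        residual = {p: a * DECAY for p, a in pattern.items()}
    return patterns

alone = speak(["ball"])[-1]
primed = speak(["squeeze", "ball"])[-1]
# After "squeeze," the hand-closing population within "ball"'s circuit is
# more strongly activated than when "ball" occurs on its own.
print(alone["hand_close"], primed["hand_close"])
```

The design point is simply that the circuit activated by a word is not fixed: the same lexical input yields different activation patterns depending on what was just activated, which is why the primed hearer is nudged toward hand-sized, squeezable balls.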

Of course, not everything is equally variable. There are some
connections in these networks that are stronger than others.
These differences in connection strengths should correspond
roughly with the hierarchy of properties/relations in the intentional/representational account. Here too, then, we have reason
to believe that there is some sort of continuum. Not all of our
ideas about a set of objects are equally salient, expected, and so
forth. However, none seems precisely to qualify as a meaning, to
be distinguished from a belief; and, once again, a greater degree
of activation bearing on initially weaker connections may have
greater effects than a weaker activation bearing on initially strong
connections. Here too, then, any correlates we may posit for the
neuronal circuits seem more like beliefs than like meanings.
In conclusion, we might return briefly to linguistic relativism. If the preceding discussion of meaning/belief (non)distinctness is accurate, then it seems that we cannot reasonably say
that meanings guide beliefs. We can only say that some beliefs
affect other beliefs. On the other hand, we also cannot say that
meanings simply allow us to express beliefs. Our ideas about the
world and our production and reception of language are, rather,
dynamic (neurocognitive) processes. These processes do not trap
us in a "prison house of language" (as some writers have put it).
But they also do not allow us some simple freedom to describe
and evaluate the world in abstract removal from the perception,
memory, and other circuits that are already in place when we
come to formulate our descriptions and make our evaluations.
Patrick Colm Hogan
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Bilgrami, Akeel. 1992. Belief and Meaning: The Unity and Locality of
Mental Content. Cambridge, MA: Blackwell.
Davidson, Donald. 1984. Inquiries into Truth and Interpretation.
Oxford: Clarendon Press.
Pulvermüller, Friedemann. 2002. The Neuroscience of Language: On Brain
Circuits of Words and Serial Order. Cambridge: Cambridge University
Press.
Quine, W. V. 1969. Ontological Relativity and Other Essays. New
York: Columbia University Press.
———. 1976. The Ways of Paradox and Other Essays. Rev. ed. Cambridge:
Harvard University Press.
———. 1981. Theories and Things. Cambridge: Harvard University Press.

MEANING AND STIPULATION

Meaning is commonly understood as social, mentalistic, or
abstract. A social account views meaning as existing in social
groups. A mentalistic account places meaning in individual
minds. An abstract account locates meaning in a Platonic realm.
Discussions of meaning often involve debates about which
of these gives the real meaning of a term or utterance. For
example, in legal interpretation, there have been debates
between writers who view the Constitution as an ongoing product of social developments and those who see it as fixed by the
Framers' intent.
A stipulative account of meaning argues that such debates are
pointless. They are, in effect, debates over the real meaning of
the word "meaning." Formally, the meaning of any common noun
(such as "meaning") involves a definition and an extension or set
of items to which the noun refers (see intension and extension). One can only adjudicate the definition of a term by reference to an extension. For example, consider a definition of "U.S.
state" that involves the criterion of continuous land. One can
reject this definition by pointing to Hawaii, which is part of the
extension of "U.S. state." But one can only adjudicate an extension
by reference to a definition. In other words, we rely on a definition of "U.S. state" to judge that Hawaii is a U.S. state. Thus, one
cannot adjudicate a definition and an extension simultaneously.
One of the two has to be established arbitrarily. By this argument,
there is no such thing as the real meaning of any term, including "meaning." Meaning may be social, intentional, or whatever,
as we choose in particular contexts. Thus, whenever we engage
in an interpretive task, the type of meaning at issue should be
stipulated.
This argument disposes of one problem: what meaning
really is. But it leads us to three other concerns.
The first is ontological: just what sorts of meaning actually
exist. We may, for example, stipulate Platonic meaning as our
object of hermeneutic interest (see philology and hermeneutics). But we cannot actually interpret for Platonic meanings if they do not exist.
The second concern takes up the demarcation of our stipulative categories. These need to be adequately precise. For example, we might stipulate that we are concerned with intentional
meaning. But there are numerous sorts of intentional meaning
that should often be distinguished: in the case of legal interpretation, the self-conscious intent of the author who drafted a
piece of legislation, the intents of the legislators who passed it,
the intents of the judges who gave opinions on its constitutionality, and so on.
The final concern bears on the particular purposes for which
we are interpreting. For example, for any given term in a law,
there may be variable social meanings. Ordinary people may use
a term with one meaning; scientists may use it with a slightly different meaning. In particular cases of interpretation, the meaning
associated with one or the other group may be more significant.
Note that in these cases, we are not trying to determine the real
meaning of the law. Rather, we are acknowledging that there are
many sorts of meaning and we are trying to determine which is
the most important in the case at hand.
Patrick Colm Hogan
WORKS CITED AND SUGGESTIONS FOR FURTHER READING


Bundgaard, Peer, and Frederik Stjernfelt. 2009. "Patrick Hogan
[Interview]." In Signs and Meaning: 5 Questions, 71–85. Copenhagen:
Automatic Press/VIP.
Hogan, Patrick Colm. [1996] 2008. On Interpretation: Meaning
and Inference in Law, Psychoanalysis, and Literature. 2d ed.
Athens: University of Georgia Press.
Levinson, Sanford, and Steven Mailloux. 1988. Interpreting Law and
Literature: A Hermeneutic Reader. Evanston, IL: Northwestern
University Press.

MEANING EXTERNALISM AND INTERNALISM


Hilary Putnam (1975) argued for a view now known as meaning (or semantic) externalism: the view that there are terms
whose meanings are not determined by their users' psychological states. Meaning internalism is simply the denial of meaning
externalism.
Putnam qualifies his meaning externalism by explaining that
he intends "psychological state" to be understood in the narrow sense, according to which a psychological state implies the
existence of nothing but its possessor (1975, 219–22). Another
equally significant qualification is that the argument will show
that meanings "just ain't in the head," as Putnam memorably
puts it, only if a term's meaning is taken to be (at least) an intension, that is, a function from possible circumstances (or worlds)
to its extension, or the set of objects to which the term applies
(Putnam 1975, 227; see intension and extension, reference and extension, and possible worlds semantics).
Although there is a good deal of contemporary skepticism
about the existence of narrow psychological states (see the following), Putnam assumes that at least some psychological states
are narrow; beliefs, thoughts, feelings, and interior monologue
are all given as examples (1975, 224).
Putnam argues for meaning externalism with his famous
Twin Earth thought experiment (1975, 223–7). Suppose that
somewhere in the galaxy there is a planet, Twin Earth, which is
just like Earth save one detail: The liquid that flows from Twin
Earthian faucets, falls from Twin Earthian skies, and fills Twin
Earthian oceans is not water. It is macroscopically identical to
water, but, unlike water, it is not the chemical compound H2O.
Instead, it is some other complicated chemical compound
that can be abbreviated XYZ. Twin Earthians speak a dialect of
English, and Earthians and Twin Earthians both use the term
"water," but the extension of "water" in their respective dialects is
different. In English, "water" applies to all and only samples of
H2O. In Twin Earth English, "water" applies to all and only samples of XYZ.
Now consider two subjects, Oscar1, an Earthian, and Oscar2,
a Twin Earthian, both of whom have interacted with, and have
beliefs and other psychological attitudes concerning, the waterlike liquid native to their respective planets. Suppose both to be
living in 1750, before anyone on their planets knew anything
about the underlying chemistry of the liquids found thereupon.
Putnam claims that it is possible for Oscar1 and Oscar2 to be in
the same narrow psychological state (1975, 224). Since both are
chemically unsophisticated, neither has beliefs characterizable
with "H2O" or "XYZ" that could potentially distinguish their narrow psychologies. Furthermore, given the macroscopic identity
between H2O and XYZ, it seems plausible to suppose that all of
Oscar1's attitudes, feelings, and sensations about the liquid that
is in fact H2O on his planet could be matched by exactly similar attitudes, and so on, of Oscar2's toward the liquid that is in
fact XYZ on his planet. Indeed, as Putnam suggests, Oscar1 and
Oscar2 could well be molecule-for-molecule Doppelgängers,
thus, it would seem, guaranteeing their narrow psychological
identity (1975, 227).
When Oscar1 says "Water is odorless," however, does he
mean what Oscar2 means when he says "Water is odorless"? It
seems not. For what Oscar1 says is true if and only if H2O is odorless, while what Oscar2 says is true if and only if XYZ is odorless.
If, however, Oscar1 and Oscar2 mean precisely the same thing
by their utterances, then those utterances would be true under
precisely the same conditions. So they do not mean the same
thing by their utterances, and this difference in meaning appears
traceable to the term "water." "Water" in English has a different
extension from "water" in Twin Earth English. Hence, since meanings are (at least) intensions, and intensions determine extensions, "water" means something different in Oscar1's mouth than
it does in Oscar2's. Externalism vindicated.
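The role intensions play in this argument can be sketched computationally. The sketch below models an intension, as the entry defines it, as a function from a world to an extension; the world representation and the sample names are invented for illustration. Evaluated at the same world, the two dialects' intensions for "water" pick out different extensions, so the terms differ in meaning even while the speakers' narrow states match.

```python
# Toy model (illustrative): an intension as a function from a possible world
# to an extension. A "world" here is just a dict mapping liquid samples to
# their chemical composition.

earth      = {"lake_sample": "H2O", "tap_sample": "H2O"}
twin_earth = {"lake_sample": "XYZ", "tap_sample": "XYZ"}

def water_english(world):
    """Intension of 'water' in English: all and only the H2O samples."""
    return {s for s, kind in world.items() if kind == "H2O"}

def water_twin_english(world):
    """Intension of 'water' in Twin Earth English: all and only XYZ samples."""
    return {s for s, kind in world.items() if kind == "XYZ"}

# Same world, different intensions, different extensions: since intensions
# determine extensions, the two dialects' terms differ in meaning.
print(water_english(earth))        # all Earth samples
print(water_twin_english(earth))   # empty set: no XYZ on Earth
```

Nothing in the functions refers to a speaker's psychology; the difference between them is fixed entirely by what the world contains, which is the externalist moral of the thought experiment.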
As Colin McGinn (1977) and Tyler Burge (1979) have pointed
out, the same thought experiment can be used to challenge the
view that beliefs and other propositional attitudes are
narrow psychological states. When Oscar1 says "Water is odorless," he is expressing one of his beliefs, but this belief is different from the belief Oscar2 expresses via the same sentence. (The
two beliefs are true under different conditions; that, according to
most theorists, suffices to distinguish them.) Similar points could
be made about the other propositional attitudes. If we continue
to assume that there are narrow psychological states and that the
narrow psychologies of Oscar1 and Oscar2 could be identical, it
will follow that at least some beliefs and other propositional attitudes are not narrow.
Under the sway of Putnam (1975), McGinn (1977), and Burge
(1979), however, many theorists are now skeptical that there
are any narrow psychological states. Even sensations and other
phenomenally conscious psychological states, narrow psychological states if any such there be, have recently been argued
to be examples of wide psychological states. (See Dretske 1996;
Lycan 2001; and Tye 1995.) Putnam himself (1996) avows skepticism about narrow psychological states. What becomes of the
thesis of meaning externalism if we suppose that "narrow psychological state" is an empty term? If an ordinary human subject,
S, lacks narrow psychological states, then "S's narrow psychology determines the meanings of S's terms" is true, but vacuously.
To avoid this hollow victory for internalism, it is perhaps best to
recast the distinction between meaning externalism and internalism in terms of a distinction between intrinsic and extrinsic
properties: An intrinsic property is one that an object possesses
independently of its relations to other objects, whereas an extrinsic property is one that an object possesses in virtue of its relations to other objects.
Given the intrinsic/extrinsic properties distinction, we can
reformulate meaning externalism as the view that there are
terms whose meanings are not determined by their users' intrinsic properties (regardless of whether there are any intrinsic psychological properties).
Max Deutsch
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Burge, Tyler. 1979. "Individualism and the mental." Midwest Studies in
Philosophy 4: 73–122.
Dretske, Fred. 1996. "Phenomenal externalism." In Philosophical Issues.
Vol. 7. Ed. Enrique Villanueva, 143–58. Atascadero, CA: Ridgeview.
Lycan, William. 2001. "The case for phenomenal externalism."
Philosophical Perspectives 15: 17–35.
McGinn, Colin. 1977. "Charity, interpretation, and belief." Journal of
Philosophy 74: 521–35.
Putnam, H. 1975. "The meaning of 'meaning.'" In Mind, Language,
and Reality: Philosophical Papers, II: 215–71. Cambridge: Cambridge
University Press.

———. 1996. "Introduction." In The Twin Earth Chronicles: Twenty Years
of Reflection on Hilary Putnam's "The Meaning of Meaning," ed.
Andrew Pessin and Sanford Goldberg, xv–xxii. London: M. E. Sharpe.
Tye, Michael. 1995. Ten Problems of Consciousness. Cambridge,
MA: Bradford Books, MIT Press.

MEDIA OF COMMUNICATION
This term refers to the means by which communication
takes place or the choice of substance for realizing a communicative act.
It has long been recognized that choosing between alternative means or media of communication can have consequences
for the linguistic form of the message. Thus, the choice of writing instead of speech as a medium of communication may entail
particular grammatical, syntactic, or lexical preferences over
others. Indeed some work suggests that certain kinds of linguistic
patterning may be distinctive to particular media (Biber 1988).
From Ferdinand de Saussure (1912) onward, much work
has been devoted to specifying the differences between speech
and writing. (See also oral composition; oral culture;
writing systems; writing, origin and history of).
One approach deals with speech and writing as alternative
expressions of the same underlying language system, realized
in differing ways depending on the medium adopted. Thus,
the written medium is associated with greater lexical density,
a wider range of grammatical structures, a greater degree of
embedding, and more varied forms of connectivity between sentences. Conversely, the spoken medium is associated with lexical repetition, low lexical density, vague or indefinite expressions
(thingymajig), high incidence of coordinated clauses linked by
common conjunctions (and, but), and selection of the active
rather than the passive voice. The character of these differences
has led one author to characterize speech as a process and writing as a product (Halliday 1985).
A more radical view suggests that sentence grammars generally have been implicitly biased toward the study of writing, and
grammars would be quite different if they were formulated from
the outset to take account of speech phenomena. As David Brazil
observes, "if any part of the outcome [of a grammar of speech]
looks like a sentence, this comes about as an interesting by-product of the processes we are interested in, not as the planned outcome to which these processes owe their definition" (1995, 39).
Certainly there is widespread agreement that the communicative potentialities of writing and speech are very different.
Speech typically takes place between interlocutors who are in
some way copresent to each other, and this enables them to
adjust their utterance in the light of the apparent reactions of
the other. The process of composing and planning speech goes
hand in hand with the act of speaking, and speaking, in turn,
goes hand in hand with the process of interpretation that must
keep pace with it. There is no time lag between production and
reception. Instead, speech is temporally bound, transient, and
dynamic, rooted in an unfolding context, with paralinguistic
behavior providing an important supplementary layer to communication (see paralanguage). Conversely, writing as a
semipermanent product enables a gap across time and space to
open up between participants. The process of composition may
be lengthy, involving several stages and many revisions. And
writing, especially in printed or other permanent forms, may
be received in quite different contexts from those in which it was
produced. The writer must anticipate how the effects of a displaced or unknown context might guide interpretation or lead
to misinterpretation. And readers, of course, must typically rely
on the written text alone in arriving at its sense. Writing, consequently, is forced to be less reliant on its immediate context for
its meaning.
Speech is often treated as the primary medium of communication, and this for various reasons. In human evolutionary terms, it is broadly universal and involves specific biological adaptation, unlike writing, which emerges as the product of particular historical societies and is not universal either within or across them.
Speech is acquired during a critical language-learning period
(see critical periods) very early in life. Writing is acquired
later and usually as the focus of explicit instruction. Nonetheless,
with the advent of a range of alternatives to speech and writing
as media of communication, it is difficult to insist upon a simple
dichotomy between oral, situated, face-to-face communication,
on the one hand, and visual, decontextualized, noninteractive communication, on the other, especially when technological
developments in communication media are considered.
We may distinguish broadly among three overlapping phases
in the development of alternatives to speech as media of communication: mechanical (writing, print), electrical (telegraphy and
wireless telegraphy, radio, and television) and digital (World
Wide Web and the Internet, cellular phones, and the convergence
or interaction between these and previous media of communication). Developments in communication at a distance for military
and commercial purposes using semaphore and other flag signaling systems are particularly evident in Europe in the late eighteenth and early nineteenth centuries. These were forerunners of
the electric telegraph initially designed by Samuel Morse in the
1830s. The use of electrical impulses to make possible communication at a distance then underpins the development of the
telephone in the 1870s, and forms of wireless telegraphy in the
1890s, to be followed by radio and television broadcasting in
the first and second half of the twentieth century. In most cases,
each technological development may be seen to favor particular
linguistic selections over others. The telegraph and subsequent
telegram, because of the cost of transmission and the premium
placed on time, tended to favor certain kinds of abbreviation, principally the deletion of grammatical function words, such as
articles, determiners, and verb auxiliaries.
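The kind of compression that telegram pricing encouraged can be sketched in a few lines (a toy illustration only; the function-word list and function name are our own, not a documented telegraph convention):

```python
# Hypothetical, illustrative set of articles, determiners, and auxiliaries.
FUNCTION_WORDS = {"a", "an", "the", "this", "that",
                  "is", "are", "was", "were", "be", "will", "have", "has"}

def telegraphese(message):
    """Drop grammatical function words, as telegram senders tended to do
    when charged by the word."""
    return " ".join(w for w in message.split()
                    if w.lower() not in FUNCTION_WORDS)

print(telegraphese("The ship is arriving at the port on Monday"))
# Prints: ship arriving at port on Monday
```

Content words and prepositions survive the compression; only the grammatical scaffolding is stripped away.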
The early electrical media of communication at a distance tend
to be dyadic and reciprocal, rather than one-way and noninteractive. But many of the subsequent and most far-reaching developments in communication at a distance in the twentieth century
tend to be one-to-many rather than two-way. Broadcasting is
perhaps the best term for these developments, which include,
preeminently, radio and television; and in one form or another
these have become ubiquitous forms of communication in the
modern era.
Despite the ubiquity of radio and television, it is difficult
to characterize the language of broadcasting in any distinctly
homogeneous fashion. Instead, it is best understood as a medley
of distinct genres, including news interviews and reports, comic monologue, soap opera, various kinds of reality programming,
commercials, commentary on public events including sporting
occasions, argument, drama, talk shows, and phone-ins (several
of which have begun to attract systematic study; see Hutchby
2005; Tolson 2005). Although there may be generic antecedents
to these in the world of real-time, face-to-face communication,
certain properties seem to set broadcast genres apart from everyday nonmediated communication. For one thing, the idealized
speaker and hearer of the canonical speech situation, reciprocally exchanging roles and utterances, no longer easily apply
except in grossly simplified ways.
Instead, as Erving Goffman (1981) observes, broadcast communication takes shape from complex production formats and
participation frameworks in which the discourse is sometimes
scripted, sometimes relatively spontaneous, sometimes spoken,
sometimes written, sometimes written to be spoken, sometimes
single authored, sometimes multiply authored, sometimes dialogue, and sometimes monologue. Indeed, Goffman suggests
replacing the term speaker with notions of author, animator,
and principal. The author is "the one who has selected the sentiments that are being expressed and the words in which they are encoded" (Goffman 1981, 144). The animator is the one who gives
voice to the words that have been selected, sometimes by someone else. The principal is whoever is potentially held to account
for the sentiments expressed. In many situations, the three roles
coalesce, but in broadcast communication, in news programs, for instance, the presenter who reads the news from the autocue may merely be animating a script authored elsewhere, by the editorial team, and the ultimately accountable source for the discourse, the principal, may be the organization itself. Thus, in
the case of a BBC news bulletin, it may be the director general
or members of the board of trustees who resign their positions
should an item be called into question, not necessarily the news
editor, and certainly not the news presenter.
Just as various alignments are possible in terms of the production of broadcast communication, important distinctions apply
in its reception, where the potential participation framework
is equally complex. As Goffman again observes, an utterance "does not carve up the world beyond the speaker into precisely two parts, recipients and non-recipients, but rather opens up an array of structurally differentiated possibilities, establishing the participation framework in which the speaker will be guiding his delivery" (1981, 137). Broadcast communication is quite frequently oriented to two kinds of recipient. In studio interviews,
for instance, in chat shows or news programs, there is the immediate recipient of the talk, the interviewee or the interviewer, but beyond them is the overhearing audience, numbering in size
from thousands to millions. In this way, in posing a question to
an interviewee, the discourse of the interviewer is bidirectional. It
is oriented in the first instance to the interviewee, but the design
of the question will also be shaped by the assumed concerns of
the broadcast audience beyond. Talk for an overhearing mass
audience in this way assumes characteristics distinctive to the
medium that are different from ordinary talk or conversation.
It should be noted also that in the broadcast media, the
foundational distinction in considering media of communication between speech and writing becomes confounded. Within
a continuous stretch of discourse, a language user may switch from reading a script to speaking ex tempore, from address to the absent audience to address to a copresent interlocutor, from
script written to be read aloud as if unscripted to reading aloud
an e-mail from the audience.
In the movement from one phase to another in the development of technologies of communication, there are shifts of
emphasis between one-to-one and one-to-many. The emergence
of writing and print allows communication of the one to the
many. The emergence of wireless telegraphy allows one-to-one
but over extreme distances. Broadcasting prioritizes one-to-many and further collapses both temporal and spatial distances.
The recently launched digital phase that has followed in the wake
of broadcasting has allowed the most radical innovations regarding the configurations in time and space of participants to communication: Instantaneity over distance is possible, and extreme
forms of both one-to-one and one-to-many communication can
become blended in a single message.
Text messaging (SMS) and e-mail, for instance, can be one-to-one or forwarded to a larger audience; are primarily asynchronous, but single messages can develop into an extended
dialogue; often assume a fast response, but this may be delayed
if the recipient is offline; and seem transient or ephemeral but
may be archived (sometimes in hard copy, in the case of e-mail)
for later use. The language style of such communication may
well include extreme abbreviation, slang, contractions, phonetic
spelling, erratic punctuation, and short forms, and it seems
to operate in an unstable and fluctuating zone between speech
on the one hand and writing on the other. This might only be a
matter of linguistic curiosity except that variation between styles
of communication interacts with questions of formality and the
quality of the social relationship. Many commentators have
pointed to growing informality in communication in the modern
era, using such terms to describe the shift as "informalization" (Elias 1996), the "democratisation" (or "conversationalisation") of discourse (Fairclough 1992), "intimacy at a distance" and "parasocial interaction" (Horton and Wohl 1956), "synthetic personalisation" (Fairclough 2001), and "broadcast sociability" (Scannell
1996). The emphasis in these accounts varies between attention
to forms of the message and attention to forms of the relationship afforded by the message, but what generally seems to be
at stake is a changing sense of what counts as public space and
what counts as the appropriate linguistic and social demeanor
for it. While larger processes of social change may well underpin
these shifts, the changing media of communication have clearly
contributed to them.
Martin Montgomery
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Aitchison, Jean, and D. Lewis, eds. 2003. New Media Language.
London: Routledge.
Biber, Douglas. 1988. Variation across Speech and Writing.
Cambridge: Cambridge University Press.
Brazil, David. 1995. A Grammar of Speech. Oxford: Oxford University
Press.
Crystal, David. 2001. Language and the Internet. Cambridge: Cambridge
University Press.
Elias, Norbert. 1996. The Germans. New York: Columbia University
Press.
Fairclough, Norman. 1992. Discourse and Social Change. Cambridge,
UK: Polity.
Fairclough, Norman. 1995. Media Discourse. London: Arnold.
Fairclough, Norman. 2001. Language and Power. 2d ed. London: Pearson Education.
Goffman, Erving. 1981. Forms of Talk. Oxford: Basil Blackwell.
Halliday, Michael Alexander Kirkwood. 1985. Speech and Writing.
Oxford: Oxford University Press.
Horton, Donald, and R. Richard Wohl. 1956. Mass communication
and para-social interaction: Observations on intimacy at a distance.
Psychiatry 19: 215–29.
Hutchby, Ian. 2005. Media Talk: Conversation Analysis and the Study of
Broadcasting. Buckingham: Open University Press.
Saussure, Ferdinand de. 1916. Cours de linguistique générale. Ed. C. Bally
and A. Sechehaye, with the collaboration of A. Riedlinger. Lausanne
and Paris: Payot.
Scannell, Paddy. 1996. Radio, Television and Modern Life. Oxford: Basil
Blackwell.
Tolson, Andrew. 2005. Media Talk: Spoken Discourse on TV and Radio.
Edinburgh: Edinburgh University Press.

MEMES AND LANGUAGE


Memes are information patterns that are culturally transmittable
and undergo Darwinian evolution: Variation among meme types
is created when patterns are altered, recombined, or transmitted imperfectly, and selection takes place when more stable or
more easily transmittable meme variants come to oust competitors that are less fit, that is, less stable or transmittable. As far as
their material implementation is concerned, memes are generally thought to exist in brains as constituents of human knowledge. Whether human behavior and artifacts should be regarded
as external expressions of memes or as alternative ways in which
memetic information can be implemented is still disputed,
although the former view seems to be gaining ground.
The term "meme" was coined by the evolutionary biologist
Richard Dawkins (1976, 192) to denote cultural counterparts
of genes. Dawkins introduced the concept to support the argument that Darwinian evolution is not limited to the biological
domain but represents an algorithmic process that will affect
any patterns that are sufficiently stable and copied in sufficient
numbers with sufficient fidelity. While in the evolution of species
those patterns are genes, the historical development of human
cultures might reflect the evolution of memes.
The concept of memes is linked to the idea that human culture is a Darwinian system that can be understood best on the
level of the elements on which selection operates. A memetic
view of culture regards humans as physiologically complex, yet
relatively passive meme hosts. Their instinctive inclination to
imitate one another turns them into meme vehicles with limited control over the memes they acquire, express in behavior,
or pass on to other humans. Of course, meme replication will
depend, to a considerable extent, on the physiological makeup,
the well-being, and the needs of their hosts, and memes that
inflict obvious harm on their human carriers are unlikely to
thrive. However, the ultimate reason why any Darwinian replicator exists is its capacity to get itself transmitted before it
disintegrates. Thus, a memetic approach to human behavior
challenges hermeneutic views (see philology and hermeneutics), which derive it from the irreducibly subjective perspectives of intentional agents.

In linguistics, the plausibility of meme-based approaches is supported by the increasing productivity of Darwinian thought
in language studies, which has inspired new efforts to explain
the evolution of language (see Hurford 2006) and to understand
the historical development of languages in Darwinian terms
(e.g., Croft 2001). Although the Darwinian algorithm depends on
the existence of replicating units, the potential of meme-based
approaches to language remains largely to be explored.
Memetic theories of language need to address at least three
fundamental questions. First, linguistic memes, or replicating constituents of linguistic competence, need to be plausibly conceptualized as material patterns with determinable and
empirically detectable structures. Second, the mechanics by
which memes are replicated need to be determined. Third, the
factors that determine the differential replication of meme variants need to be identified.
A study that adheres strictly to Dawkins's original proposal is that of Nikolaus Ritt (2004). Following connectionist
approaches to competence modeling, he sees language memes
as patterns acquired by neural networks during language
acquisition. Thus, a meme representing a phoneme contains
(a) links to configurations underlying articulatory gestures and (b) links to areas that are excited by specific auditory impressions, as well as (c) links to representations of morphemes for
whose distinction the phone-meme is relevant. Therefore,
phone-memes have both determinable internal structures (i.e.,
the links between auditory and articulatory configurations) and
determinable positions within the larger networks that implement linguistic competence. Memetic constituents coding for
phonotactic regularities, rhythmic configurations, morphs, or
syntactic categories and constructions are construed in similar terms.
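Ritt's characterization of a phone-meme as a bundle of links can be glossed as a simple data structure (a minimal sketch; the class name, field names, and string-valued links are our simplification of the neural patterns he describes):

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class PhoneMeme:
    """Sketch of the three kinds of links ascribed to a phone-meme:
    (a) articulatory gestures, (b) auditory impressions, and
    (c) the morphemes whose distinction the phoneme serves."""
    label: str
    articulatory_links: List[str] = field(default_factory=list)  # (a)
    auditory_links: List[str] = field(default_factory=list)      # (b)
    morpheme_links: List[str] = field(default_factory=list)      # (c)

# For example, /p/ helps keep "pat" and "pin" apart from "bat" and "bin".
p_meme = PhoneMeme(
    label="/p/",
    articulatory_links=["bilabial closure", "voiceless release"],
    auditory_links=["silence plus burst", "long voice-onset time"],
    morpheme_links=["pat", "pin"],
)
```

The point of the gloss is only that a phone-meme has both a determinable internal structure and a determinable position within a larger network.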
The replication of language memes involves communication, acquisition, and accommodation. Since a speaker's communicative behavior is constrained by his or her competence,
utterances will automatically express the memetic constituents
by whose activation they are caused. Then, the mind-brains of
recipients, and those of children in particular, will attempt to
assume organizations by which they can emulate the utterance
behavior they are exposed to. Thereby, copies of memes that are
expressed in utterances get created.
Among possible factors determining the differential replication of meme variants, three types are distinguished. First,
meme replication must be constrained by physiological properties of their hosts. Thus, meme variants that are easy to express in
articulation and whose expression is easy to perceive will be universally fitter than more costly and less easily perceivable competitors. Second, memes will be sensitive to such social factors
as power relations within and across groups. The more powerful
and prestigious individuals or groups are perceived to be,
the more often will their behavior be imitated. Third, the replication of any meme will depend on other memes in the system.
Since utterances always express many memetic constituents
simultaneously, stable languages will contain mutually coadapted memes, which co-express with minimal distortion of one
another's expressions.
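The logic of differential replication can be made concrete with a toy simulation (all values are hypothetical and chosen only for illustration): a variant that is copied with higher fidelity gradually ousts its competitor, even though hosts imitate one another blindly.

```python
import random

def simulate_selection(pop_size=1000, generations=50, seed=42):
    """Toy Darwinian selection between two meme variants.

    Variant 'A' is copied with higher fidelity than variant 'B';
    when a copy fails, the host re-acquires a meme by imitating a
    randomly chosen member of the population, so the more faithfully
    transmitted variant spreads."""
    rng = random.Random(seed)
    fidelity = {"A": 0.99, "B": 0.90}  # hypothetical copying success rates
    population = ["A"] * (pop_size // 2) + ["B"] * (pop_size // 2)
    for _ in range(generations):
        next_gen = []
        for meme in population:
            if rng.random() < fidelity[meme]:
                next_gen.append(meme)                    # faithful copy
            else:
                next_gen.append(rng.choice(population))  # imitate another host
        population = next_gen
    return population.count("A") / pop_size

share_a = simulate_selection()
```

Starting from a 50/50 split, the more stable variant comes to dominate after a few dozen generations of blind imitation.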
While the predictions derived from physiological and social
constraints on meme selection seem to mirror those of speaker-based theories that derive the properties of languages from the
needs of their users, the co-adaptive pressures among memes
promise new explanations of long-term conspiracies in language
change, or the existence of typological classes. Thus, most Old
and Middle English sound changes that altered the metrical
weight or the syllabic structure of lexemes produced outputs that
were more trochee-like than their inputs. From a memetic perspective, they can be explained as morphotactic adaptations of
lexemes to rules coding for foot isochrony.
Strictly memetic approaches to language are still a minority program. While adherents regard memes as essential to any
truly Darwinian theory of language, even some of the linguists
who pursue explicitly Darwinian agendas (e.g., Croft 2001) prefer
to think of selection as being performed on utterance constituents and to attribute more active roles to speakers as agents of
change. Skeptics (e.g., Aunger 2001) also emphasize the need to
formalize memetics, the missing evidence of neural replicators,
and the paucity of empirical studies demonstrating the explanatory potential of the approach.
Nikolaus Ritt
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Aunger, Robert. 2001. Conclusion. In Darwinizing Culture: The Status
of Memetics as a Science, ed. Robert Aunger, 205–33. Oxford: Oxford
University Press.
Blackmore, Susan. 1999. The Meme Machine. Oxford: Oxford University
Press.
Croft, William. 2001. Explaining Language Change. London: Longman.
Dawkins, Richard. 1976. The Selfish Gene. Oxford: Oxford University
Press.
Dennett, Daniel. 1995. Darwin's Dangerous Idea: Evolution and the
Meanings of Life. New York: Simon and Schuster.
Hurford, James R. 2006. Recent developments in the evolution of language. Cognitive Systems 7 (November): 23–32.
Ritt, Nikolaus. 2004. Selfish Sounds and Linguistic Evolution.
Cambridge: Cambridge University Press.

MEMORY AND LANGUAGE


The study of memory is concerned with the way that information is represented and stored over time in the mind and how
it is retrieved and influences behavior. Memory is essential for
all cognitive functions, including those that are intentional and
under conscious control and those that occur automatically
without conscious control. The dominant theoretical approach
postulates several separate memory systems that vary in the
nature of their encoding and retrieval, their duration, and their
neural substrates, as well as how they are affected by different
variables, such as the age of the person or the level of processing during encoding (e.g., Tulving and Schacter 1990). For
example, semantic memory stores long-term conceptual
knowledge including linguistic knowledge, such as words used
to express concepts; procedural memory represents learned
skills and perceptual-motor routines that can be enacted with
little attentional control, for example, reading; working
memory represents the current content of consciousness and
enables manipulation of this information as when, for example, a reader develops the meaning of a text; episodic memory represents the occurrence of specific events, for example, when and where you last wrote a letter.
Despite the pervasiveness of memory in cognitive functions
such as language, historically memory systems have been investigated and theorized as separate and distinct from other cognitive systems. This approach, however, can be difficult to sustain
because of the centrality of memory to cognitive functions,
especially language. Some current models postulate that representations and processes used for memory and language
are inseparable and part of the same system at both a behavioral
and neural level (MacKay et al. 2007; see hippocampus). Here,
we review the relation between memory and language in behavior and consider evidence relevant to whether they are distinct
systems.

Semantic Encoding in Memory and Language


A long-standing feature of models of language processing is that
perception of a word activates its lexical representation and that
activation automatically spreads to associated conceptual information, including semantic properties of the word that constitute its meaning (e.g., Rapp and Goldrick 2000; see spreading
activation). A clear demonstration of this feature is seen in the
Stroop task in which participants are instructed to ignore a word
and simply name its ink color. Despite instructions to ignore the
word, color naming latency is faster when the base word is the
same as the ink color than when it is a different color, for example, the word "blue" written in red. This difference in latency can
occur only if the meaning of the base word is accessed, despite
instructions to ignore the word. This automatic semantic encoding of a word is clearly a process that is an essential part of a primary language function, namely, language comprehension, as it
is essential to understanding the meaning of a word, sentence,
and discourse (see discourse analysis [linguistic]).
This language process is also part of encoding in episodic
memory. For example, in a variant of the Stroop task, color naming latency was slower for taboo base words (e.g., whore) than
for neutral base words (e.g., wrist). In a subsequent surprise
recall test, memory was better for taboo than neutral base words
(MacKay et al. 2004; see emotion and language and emotion words). This difference in the effects of taboo and neutral
base words demonstrates that the automatic semantic activation
that occurred during perception of the base word was the basis
not only for lexical comprehension (which slowed color naming latency for taboo base words) but also for the representation
involved in subsequent episodic memory recall. The strong influence of meaning on memory indicates an overlap of comprehension and memory representations.
The degree of semantic activation during encoding also has
a strong effect on how well verbal material is remembered.
Participants remember more words in a surprise episodic
memory test after making judgments about word meaning
compared to judgments about phonology or physical form
(Craik and Tulving 1975). The idea that semantic processing is
a deeper level of processing that improves memory has been
criticized for being a circular explanation. Nevertheless, participants had no advance knowledge that memory would be tested
and thus they did not engage in mnemonic strategies, and so
the findings demonstrate that semantic processes involved in understanding language form the basis for representing specific
occurrences of words in memory.

Semantic Activation and Memory Errors


Semantic activation during language comprehension is also the
basis for memory errors, especially constructive memory errors,
which occur when there is false memory for material that is conceptually related to presented material but was not actually presented. For example, implications of sentences are commonly
remembered as having been presented when they were not. The
target sentence, "The hungry python caught the mouse," is likely to be remembered as "The hungry python ate the mouse" (Harris and
Monaco 1978). The implication is encoded in memory as part
of the presented sentence because it is activated during comprehension. What is remembered is what is computed by comprehension processes, not what was actually presented. This
integration of language comprehension and memory makes it
extremely difficult for people to remember language verbatim
and makes memory for what people said or wrote notoriously
unreliable.
False memories based on semantic activation processes have
also been demonstrated in the Deese/Roediger-McDermott
(DRM) experimental paradigm (e.g., Roediger and McDermott
1995). In the DRM paradigm, participants are asked to remember a list of words (e.g., snooze, wake, dream, blanket, etc.) that
are associated with an unpresented critical word (e.g., sleep).
Participants falsely remember the unpresented critical word at
rates equivalent to the presented items; their confidence in their
memory accuracy is as high for critical words as for presented
words. The high rate of false memory for a critical word in this
paradigm has generally been explained by semantic activation
of the list words during their presentation that spread to and
summated at the representation for the critical word. The high
level of activation of the critical word at the test produces a feeling of familiarity that leads to the false recognition. Consistent
with this explanation, increasing the number of related words on
the studied list increases the likelihood that the critical word is
falsely remembered (Robinson and Roediger 1997). There is also
evidence that semantic activation of the critical word may have
decayed before the test, but the critical word is reactivated at the
test because of its association with the list (Meade et al. 2007).

Memory Processes and Language


Frequency and recency of occurrence have strong effects on
memory. Classic forgetting curves show that the more recent the
presentation of material, the better the memory for it. Frequency
or repetition improves both episodic and procedural memory.
Parallel effects of frequency and recency are seen in language.
Words that are repeated frequently or more recently in natural
language are easier to perceive and to produce. The effect of
recency and frequency is demonstrated in a dramatic language
production failure known as the tip-of-the-tongue state (TOT) in
which a person is temporarily unable to produce a well-known
word. In the throes of a TOT, a person can produce semantic
information about the TOT target and sometimes partial information about the phonology of the word, such as number of
syllables or first phoneme, but the complete word remains
maddeningly out of reach. Alternate words related to the TOT target word, especially in sound, sometimes persistently come to mind, but these alternate words are a consequence, not a cause,
of the gap produced by the TOT (Burke et al. 1991).
TOTs are caused by a retrieval failure at the phonological
level of the representation of the word while semantic information is available for retrieval. Low-frequency words are more vulnerable to TOTs than high-frequency words, and recent use of a
word makes it less vulnerable to TOT. Clearly, TOTs represent a
memory retrieval failure and demonstrate the interdependence
of memory and language processes. TOTs can be explained in
terms of impaired spreading activation from lexical to phonological representations because of weak connections between
these representations, caused by disuse. Consistent with this
explanation, a TOT can be resolved by pronouncing phonological segments of the word, which increases phonological activation (James and Burke 2000). Memory and language processes
are indistinct here because identical representations and processes (spreading activation) are crucial for both memory and
language.
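The transmission-deficit explanation of TOTs can be sketched as follows (a toy model; the weights, threshold, and segment labels are hypothetical): weak connections let only partial phonology through, and priming strengthens them enough for full retrieval.

```python
def retrieve(lexical_activation, connections, threshold=0.5):
    """Activation transmitted from a lexical node to each phonological
    segment is lexical_activation times the connection weight; a segment
    is retrieved only if that product reaches the threshold."""
    return {seg: lexical_activation * w >= threshold
            for seg, w in connections.items()}

# Disuse has weakened most connections: only partial phonology comes
# through, i.e., a tip-of-the-tongue state.
weak = {"seg1": 0.6, "seg2": 0.3, "seg3": 0.2}
tot_state = retrieve(1.0, weak)

# Pronouncing segments of the word (phonological priming) strengthens
# the relevant connections, and retrieval of the full form succeeds.
primed = {"seg1": 0.6, "seg2": 0.7, "seg3": 0.6}
resolved = retrieve(1.0, primed)
```

The same spreading-activation machinery thus models both the retrieval failure and its resolution by priming.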
Language processes are also closely linked to working memory. Theories of working memory include a storage component
and a controlled attention or central executive component that
maintains or computes information that is the focus of attention.
Working memory has a limited capacity that constrains the ability to perform complex mental computations, including semantic and syntactic computations necessary for constructing
linguistic representations that are the basis for language comprehension and production (Just and Carpenter 1992; Caplan and
Waters 1999). For example, limited capacity causes reading time
to increase at points in a sentence where difficult syntactic computations are required. Moreover, a person's language ability is
correlated with the capacity of their working memory: Language
comprehension and production are better for people with large
rather than small working memory spans.
The theory that working memory is a separate construct from
linguistic processes that constrains language functions has been
challenged recently by a connectionist approach to language.
This approach postulates that computational efficiency in language processing is determined by the state of the network representing linguistic knowledge, not by the capacity of a separate
working memory system (MacDonald and Christiansen 2002).
Knowledge and experience influence the state of the representational network. For example, repetition strengthens connections
among representations so that they pass activation more quickly,
increasing processing efficiency. Consistent with this idea, language that is more frequent at either a lexical or syntactic level
is easier to process. Within this approach, complex syntax slows
reading not because it requires more working memory capacity
but because such syntax is infrequent, which weakens connections among relevant representations. Similarly, because linguistic experience increases the processing efficiency of the language
network, individuals with greater language experience are predicted to have larger verbal working memory spans than individuals with less language experience. That is, the observed relation
between language ability and working memory span is attributed
to a common cause: increased efficiency of the language network
because of increased language experience. This approach eliminates the architectural and computational distinction between working memory and the language system. It views working
memory limitations as emerging from the architecture of the
language network, rather than from a fixed capacity.
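The experience-based account can be caricatured in a few lines (a sketch only; the weight and timing formulas are invented to make the qualitative pattern visible): repetition strengthens a construction's connections, and processing time falls as the weight grows.

```python
class ExperienceNetwork:
    """Minimal sketch of the experience-based account of processing
    efficiency. Each encounter with a construction strengthens its
    connection weight, and processing time shrinks as weight grows."""

    def __init__(self):
        self.weights = {}

    def encounter(self, construction, times=1):
        # Repetition strengthens connections among representations.
        self.weights[construction] = self.weights.get(construction, 0) + times

    def processing_time(self, construction):
        # Stronger connections pass activation more quickly,
        # so more frequent structures are processed faster.
        return 1.0 / (1 + self.weights.get(construction, 0))

net = ExperienceNetwork()
net.encounter("subject-relative", times=50)  # frequent structure
net.encounter("object-relative", times=5)    # infrequent structure

fast = net.processing_time("subject-relative")
slow = net.processing_time("object-relative")
```

On this view, the slowdown for rare syntax reflects weak connections born of limited exposure, not the exhaustion of a fixed working memory capacity.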
Elizabeth R. Graham and Deborah M. Burke
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Burke, Deborah M., Donald G. MacKay, Joanna S. Worthley, and
Elizabeth Wade. 1991. On the tip of the tongue: What causes word
finding failures in young and older adults? Journal of Memory and
Language 30: 542–79.
Caplan, David, and Gloria S. Waters. 1999. Verbal working memory and
sentence comprehension. Behavioral and Brain Sciences 22: 77–126.
Craik, Fergus I. M., and Endel Tulving. 1975. Depth of processing and
the retention of words in episodic memory. Journal of Experimental
Psychology: General 104: 268–94.
Harris, Richard J., and Gregory E. Monaco. 1978. Psychology of pragmatic implication: Information processing between the lines. Journal
of Experimental Psychology: General 107: 1–22.
James, Lori E., and Deborah M. Burke. 2000. Phonological priming
effects on word retrieval and tip-of-the-tongue experiences in young
and older adults. Journal of Experimental Psychology: Learning,
Memory and Cognition 26: 1378–91.
Just, Marcel A., and Patricia A. Carpenter. 1992. A capacity theory of comprehension: Individual differences in working memory. Psychological
Review 99: 122–49.
MacDonald, Maryellen C., and Morten H. Christiansen. 2002. Reassessing
working memory: Comment on Just and Carpenter (1992) and Waters
and Caplan (1996). Psychological Review 109: 35–54.
MacKay, Donald G., Lori E. James, Jennifer K. Taylor, and Diane E.
Marian. 2007. Amnesic H. M. exhibits parallel deficits and sparing
in language and memory: Systems versus binding theory accounts.
Language and Cognitive Processes 22: 377452.
MacKay, Donald G., Meredith Shafto, Jennifer K. Taylor, Diane E. Marian,
Lise Abrams, and Jennifer R. Dyer. 2004. Relations between emotion,
memory, and attention: Evidence from taboo Stroop, lexical decision,
and immediate memory tasks. Memory and Cognition 32: 47488.
Meade, Michelle L., Jason M. Watson, David A. Balota, and Henry L.
Roediger, III. 2007. The roles of spreading activation and retrieval
mode in producing false recognition in the DRM paradigm. Journal of
Memory and Language 56: 30520.
Rapp, Brenda, and Matthew Goldrick. 2000. Discreteness and interactivity in spoken word production. Psychological Review 107: 46099.
Robinson, Kerry J., and Henry L. Roediger, III. 1997. Associative processes in false recall and false recognition. Psychological Science
8: 2317.
Roediger, Henry L., III, and Kathleen B. McDermott. 1995. Creating
false memories: Remembering words not presented in lists. Journal
of Experimental Psychology: Learning, Memory, and Cognition
21: 80314.
Tulving, Endel, and Daniel L. Schacter. 1990. Priming and human memory systems. Science 247: 3016.

MENTAL MODELS AND LANGUAGE


How do we represent discourse, and how do we reason from
its contents? One answer to both questions is that we rely on
mental models of the situations that discourse describes. The
Scottish psychologist Kenneth Craik (1943, 61) wrote that if
we construct a small-scale model of the world, we can use it
to make sensible decisions about our actions. Several thinkers
anticipated him, and the nineteenth-century American logician C. S. Peirce formulated an account of verbal reasoning based on diagrams that were models of assertions (Peirce
1931–58, vol. 4). In the 1970s, cognitive scientists converged
again on the idea of mental models. They argued that when we
understand discourse, we use sentence meaning and our
general knowledge in order to construct a mental model of the
situation under description (also known as a situation model).
Such a model is as iconic as possible; that is, its structure corresponds to the structure of the situation it represents. Hence, the
model represents each referent with a single mental token, the
properties of referents with properties of the tokens, and the relations among referents with relations among the tokens (Johnson-Laird 1983). The model captures what is common to the different
ways in which a possibility might occur, and so the theory is
analogous to possible worlds semantics and to discourse
representation theory (Kamp and Reyle 1993). However, these
approaches postulate that representations are logically correct,
whereas mental models have inbuilt shortcomings as a result of
the constraints of the human mind (see the following).
As an example of a model, consider a simple spatial description (see Byrne and Johnson-Laird 1989):
The office door is on the left of the elevator. The exit door is on
the right of the elevator.

We can construct a mental model of the spatial layout, which is analogous to this diagram:
office-door        elevator        exit-door

The diagram is iconic in that its layout corresponds to a plan of the three entities, but a mental model of the layout is likely to
represent the doors rather than to use verbal labels, which occur
in the diagram for simplicity.
Suppose that the discourse continues:
A man is standing in front of the office door. A woman is
standing in front of the exit door.
We incorporate this information in our model:
office-door        elevator        exit-door
man                                woman

It follows that the man is on the left of the elevator and the
woman is on the right of the elevator. No alternative model of
the discourse is a counterexample to this conclusion, and so it
is logically valid; that is, it must be true, given the truth of the
premises.
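The model-based inference can be sketched computationally. The following is a minimal, hypothetical illustration (all names are ours, not part of the theory's formal apparatus), in which the iconic model is simply a left-to-right list whose order mirrors the described layout:

```python
# A minimal, hypothetical sketch of an iconic spatial model: a left-to-right
# list whose order mirrors the layout described by the first two premises.
layout = ["office-door", "elevator", "exit-door"]
# The second pair of premises: who stands in front of which door.
in_front = {"office-door": "man", "exit-door": "woman"}

def left_of(x, y):
    """x is on the left of y iff x precedes y in the iconic layout."""
    return layout.index(x) < layout.index(y)

# The conclusions hold in the model.
man_door = next(d for d in layout if in_front.get(d) == "man")
woman_door = next(d for d in layout if in_front.get(d) == "woman")
print(left_of(man_door, "elevator"))    # True: the man is left of the elevator
print(left_of("elevator", woman_door))  # True: the woman is right of it
```

Because every model satisfying the premises preserves this left-to-right ordering, there is no counterexample, and the inference is valid.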
Mental models govern our memory for discourse. Suppose
that this discourse continues:
The man standing in front of the office door was using a cell phone.

Later it states:
The man using the cell phone was wearing a suit.

Both assertions can be used to update our model. In an unexpected memory test, as Alan Garnham (1987) showed, we are not
likely to recall which of the following sentences occurred in the
discourse:

The man standing in front of the office door was wearing a suit.
The man using a cell phone was wearing a suit.

We forget the sentences and recall only the situation represented in our model. Hence, given an assertion that follows at once from
our model, we are also prone to suppose that it too occurred in
the discourse (Bransford, Barclay, and Franks 1972).
Models of discourse can be abstract. They can combine both
iconic elements and symbolic elements, such as negation. We
might translate negation into a set of alternative affirmative possibilities (Schroyens, Schaeken, and d'Ydewalle 2001). We represent the proposition that the man isn't in front of the exit door as a set of affirmative possibilities: He is in front of the elevator, or he is in front of the office door, or … But this representation calls
for a procedure that interprets a set of models as alternatives. As
Peirce realized, this machinery is not iconic but symbolic. And,
often, there are too many affirmative possibilities to represent
negation in this way.
If we have a dynamic model of what happens in a story, then
changes in location should affect our ease of accessing referents.
Experiments have shown that if, say, the protagonist in a story
walks through a door into another room carrying an object, then
it is easier for us to access this object and the entities in the new
room than those in the room the protagonist has left: it takes us longer to respond to questions or to a probe word about them; and these effects occur for stories (e.g., Glenberg, Meyer, and Lindem
1987; Rinck and Bower 1995), movies (e.g., Magliano, Miller,
and Zwaan 2001), and virtual reality on a computer screen
(Radvansky and Copeland 2006). We therefore maintain a model
of discourse that has perceptual and spatial features that parallel
those in models that we construct from witnessing events, and
the model may rely on many of the same brain areas underlying
perception.
The principle of truth is an interpretative assumption governing mental models (Johnson-Laird 2006). It postulates that
they represent only what is true according to the discourse. As a
consequence, an assertion, such as
The man is wearing a suit or else the woman is wearing a suit,
but not both.

is represented in separate mental models of the two possibilities, depicted here on separate lines:
man wears suit
woman wears suit

Again, these sentences stand in for mental models. What the models do not represent, at least explicitly, is the falsity of "the woman is wearing a suit" in the first possibility and the falsity of "the man is wearing a suit" in the second possibility. The principle
of truth reduces the load on our memory, and it seems innocuous. Yet, it can lead us into the illusion that we understand a
description that, in fact, is beyond us.
A striking illusion of this sort occurs with the description:
If there's a king in the hand then there's an ace in the hand or else there's an ace in the hand if there isn't a king in the hand.
There is a king in the hand.


The mental models of the first assertion are:


king ace
not(king) ace

where not represents negation. The second assertion eliminates the second of these possibilities, and so it seems that there
is an ace in the hand. However, the connective or else means that
at the very least, one of the propositions that it connects may be
false. Given, say, the falsity of "if there's a king in the hand then there's an ace in the hand," we realize that it's possible that there's a king in the hand without an ace. So, even though the second assertion tells us that there is a king in the hand, no guarantee exists that there's an ace, too. The inference is fallacious. This
analysis relies only on two well-attested facts about our understanding: 1) or else allows that one of the clauses it connects is
false, and 2) the falsity of a conditional allows that its if-clause
can be true and its then-clause false. A computer program implementing the principle of truth led to the discovery of a variety of
illusions, and subsequent studies have corroborated their occurrence (Johnson-Laird 2006).
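The fallacy can also be checked mechanically. Here is a minimal sketch (ours, not the program mentioned in the entry), which reads the conditionals truth-functionally, treats or else as exclusive disjunction, and enumerates the fully explicit possibilities that the mental models fail to represent:

```python
from itertools import product

def implies(p, q):
    """Material (truth-functional) reading of 'if p then q'."""
    return (not p) or q

# Premise 1: (king -> ace) OR-ELSE (not-king -> ace), read exclusively.
# Premise 2: there is a king in the hand.
# A counterexample to the tempting conclusion is a possibility where both
# premises hold yet there is no ace.
counterexamples = [
    (king, ace)
    for king, ace in product([True, False], repeat=2)
    if (implies(king, ace) != implies(not king, ace))  # exclusive "or else"
    and king                                           # second premise
    and not ace                                        # conclusion fails
]

print(counterexamples)  # [(True, False)]: a king with no ace is consistent
```

The single counterexample, a hand with a king and no ace, is exactly the possibility that the mental models in the text leave implicit.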
When you read the earlier description of the man and woman
standing in front of the doors, you might have formed a visual
image of the layout. Spatial relations are usually easy to visualize.
You might, therefore, assume that mental models are nothing
more than visual images. This assumption is wrong. Some relations, such as Pam is better than Viv, are impossible to visualize. You can imagine, say, Pam as further up a ladder than Viv is,
but nothing in your image or in any possible image can make
explicit the meaning of better than. Not all relations are rooted in
a sensory modality or have a spatial interpretation. Some relations, such as the cat is cleaner than the dog, are easy to visualize
but do not invoke a spatial representation. Reasoning with these
visual relations, which elicit images rather than spatial models,
takes longer than reasoning with other sorts of relation, and it
activates a region of the brain underlying vision.
The hypothesis that we represent discourse in mental models is
uncontroversial in psycholinguistics, though not all accounts
stress the iconicity of models (cf. Kintsch 1988; Gernsbacher
1990). We make a dynamic representation of entities, their properties, and the relations among them. The heart of the problem
in building a mental model is to recover the appropriate referent
for each expression. Speakers refer back to entities that they have
already introduced in the discourse, and they can use different
noun phrases, demonstratives, or pronouns to do so. The correct
interpretation of such anaphora depends on many factors. Given
a sentence like The man confessed to the priest because he wanted
absolution, we understand that he refers to the man rather than
the priest. We make this attribution because we know the purpose
of confession, because we have a preference for locating the antecedents of pronouns in the subjects of clauses, and because we also
have a preference for a parallel grammatical role of antecedent and
anaphor (Stevenson, Nelson, and Stenning 1995).
In computational linguistics, centering theory shows how the
focus on a local segment of discourse determines the antecedents of anaphora, especially pronouns (e.g., Grosz, Joshi, and
Weinstein 1995; Webber et al. 2003). Another factor is the semantic difference between antecedent and anaphor (Almor 1999).
The information load on interpretation increases when, unlike normal cases, the anaphor is more specific than its antecedent, for example, He had a beer, and the Guinness tasted good.
Within the framework of mental models, the most comprehensive account of anaphora is that of Garnham and his colleagues
(cf. Cowles and Garnham 2005). This theory takes into account
all the preceding factors, but also postulates that a crucial factor
is the number of potential antecedents for an anaphor. In looking
backward, an anaphor should have enough content to pinpoint its
antecedent among the candidates. But the choice of a particular
anaphor also signals the future direction of the discourse: subsequent content may provide the information needed to pinpoint
the antecedent. And the content in the anaphor may also signal
a shift in theme. So, the theory is Janus-faced, looking both backward for antecedents and forward for thematic shifts. No current
theory, however, has led to a computer program that constructs
models for anything more than a fragment of the language.
P. N. Johnson-Laird
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Almor, Amit. 1999. Noun-phrase anaphora and focus: The informational load hypothesis. Psychological Review 106: 748–65.
Bransford, John D., J. R. Barclay, and J. J. Franks. 1972. Sentence memory: A constructive versus an interpretive approach. Cognitive Psychology 3: 193–209.
Byrne, Ruth M. J., and P. N. Johnson-Laird. 1989. Spatial reasoning. Journal of Memory and Language 28: 564–75.
Cowles, Wind, and A. Garnham. 2005. Antecedent focus and conceptual distance effects in category noun-phrase anaphora. Language and Cognitive Processes 20: 725–50.
Craik, Kenneth. 1943. The Nature of Explanation. Cambridge: Cambridge University Press.
Garnham, Alan. 1987. Mental Models as Representations of Discourse and Text. Chichester: Ellis Horwood.
———. 2001. Mental Models and the Interpretation of Anaphora. Hove, East Sussex: Psychology Press. A major statement of the theory of mental models for discourse.
Gernsbacher, Morton A. 1990. Language Comprehension as Structure Building. Hillsdale, NJ: Erlbaum.
Glenberg, Arthur M., M. Meyer, and K. Lindem. 1987. Mental models contribute to foregrounding during text comprehension. Memory & Language 26: 69–83.
Grosz, Barbara, A. Joshi, and S. Weinstein. 1995. Centering: A framework for modelling the local coherence of discourse. Computational Linguistics 21: 203–26.
Johnson-Laird, Philip N. 1983. Mental Models. Cambridge: Harvard University Press.
———. 2006. How We Reason. Oxford: Oxford University Press.
Kamp, Hans, and U. Reyle. 1993. From Discourse to Logic. Dordrecht, the Netherlands: Kluwer.
Kintsch, Walter. 1988. The role of knowledge in discourse comprehension: A construction-integration model. Psychological Review 95: 163–82.
Magliano, Joseph P., J. Miller, and R. A. Zwaan. 2001. Indexing space and time in film understanding. Applied Cognitive Psychology 15: 533–45.
Peirce, Charles S. 1931–58. Collected Papers of Charles Sanders Peirce. 8 vols. Ed. C. Hartshorne, P. Weiss, and A. Burks. Cambridge: Harvard University Press.
Radvansky, Gabriel A., and D. E. Copeland. 2006. Walking through doorways causes forgetting: Situation models and experienced space. Memory & Cognition 34: 1150–6.
Rinck, Mike, and G. Bower. 1995. Anaphor resolution and the focus of attention in situation models. Memory & Language 34: 110–31.
Schroyens, Walter, W. Schaeken, and G. d'Ydewalle. 2001. The processing of negations in conditional reasoning: A meta-analytic case study in mental model and/or mental logic theory. Thinking & Reasoning 7: 121–72.
Stevenson, Rosemary J., A. W. R. Nelson, and K. Stenning. 1995. The role of parallelism in strategies of pronoun comprehension. Language and Speech 38: 393–418.
Webber, Bonnie, M. Stone, A. Joshi, and A. Knott. 2003. Anaphora and discourse structure. Computational Linguistics 29: 545–87.

MENTAL SPACE
What Is a Mental Space?
Mental spaces are partial assemblies constructed as we think
and talk, for purposes of local understanding and action. They
contain elements and are structured by frames and cognitive
models. Mental spaces are connected to long-term schematic
knowledge, such as the frame for walking along a path, and to
long-term specific knowledge, such as a memory of the time you
climbed Mount Rainier in 2001. The mental space that includes
you, Mount Rainier, and your climbing the mountain can be activated in many different ways and for many different purposes.
You climbed Mount Rainier in 2001 sets up the mental space in
order to report a past event. If you had climbed Mount Rainier
sets up the same mental space in order to examine a counterfactual situation and its consequences. Max believes that you
climbed Mount Rainier sets it up again, but now for the purpose
of stating what Max believes.
Mental spaces are constructed and modified as thought and
discourse unfold and are connected to each other by mappings,
such as identity and analogy. It has been hypothesized that
at the neural level, mental spaces are sets of activated neuronal
assemblies and that the connections between elements correspond to coactivation bindings. On this view, mental spaces
operate in working memory but are built up partly by activating structures available from long-term memory. Connections
link elements across spaces without implying that they have the
same features or properties. When I was six, I weighed fifty pounds
prompts us to build an identity connector between the adult and
the six-year-old despite the manifest and pervasive differences.
Mental spaces are built up dynamically in working memory but
can become entrenched in long-term memory.
An expression that names or describes an element in one
mental space can be used to access a counterpart of that element
in another mental space (access principle).

Exploring Mental Spaces


In the 1970s, it became clear that grammatical and semantic
structure provide evidence for general features of human conceptual systems and operations.
Logical phenomena, such as quantifier scope, anaphora,
opacity, and presupposition had been largely the province of
analytic philosophy. Bypassing the mind/brain, semantics was
framed in terms of an external theory of truth and reference.
cognitive linguistics embarked on a different course, placing mental constructs at the forefront of the study of language.
The initial motivation for mental space theory (Fauconnier [1985] 1994, 1997) was that it provided simple, elegant, and general solutions to problems such as referential opacity or presupposition projection that had baffled logicians and formal
linguists. Opacity results from the application of the access principle across mental spaces as discourse unfolds. What emerged
was a unified cognitively based approach to anaphora, presupposition, conditionals, and counterfactuals. Additionally, the
gestural modality of signed languages revealed other ways in
which mental spaces could be set up and operated on cognitively
and physically.
Shortly thereafter, J. Dinsmore (1991) developed a powerful
approach to tense and aspect phenomena, based on mental
space connections. The approach was pursued and extended
in fundamental ways by M. Cutrer (1994), who made it possible
to understand the role of grammatical markers as prompts to
deploy vast networks of connected mental spaces. Further generalizations were achieved in areas exemplified by the diverse
contributions to Spaces, Worlds, and Grammar (Fauconnier and
Sweetser 1996). Sophisticated research continues to be done in
all of the areas where mental space theory was first applied, in
particular on conditionals (see Dancygier 1998; Dancygier and
Sweetser 1996, 2005), scoping phenomena on locative and temporal domains (see Huumo 1996), grammar of sign languages
(see Liddell 2003), discourse (see Epstein 2001), and frame shifting (see Coulson 2001). But at the same time, there has been an
explosion of research triggered by the discovery of wide-ranging
phenomena whereby mental spaces are assembled, connected,
and constructed within networks of conceptual integration (see
conceptual blending). This area of research links linguistic and nonlinguistic phenomena in systematic ways that begin
to explain how and why there can be imaginative emergent
structure in human thought in its everyday manifestations,
as well as in its most original and singular sparks of creativity.

Mental Spaces in Discourse: A Simple Example


Suppose the current president of our country is Nick, and that
someone says:
Thirty years ago, the president was a baby.

The base mental space, B, corresponds to the time at which the statement is made and contains an element a which fills the
role president in a political frame and has the name Nick. The
space-builder thirty years ago sets up a new space M relative to
the base (30 years before now); a in B has a counterpart a′ in the new space M; the president identifies a in B, and can therefore access its counterpart a′ in M. The property baby is assigned to a′ in M. The sentence is interpreted as saying that Nick was a
baby 30 years ago.
The expression the president, however, can equally well be
construed as directly identifying an element b in M: It fills the
role president in the political frame for M. The property baby
is now assigned to b. The sentence is now interpreted as saying
that a baby was president 30 years ago.
It is an empirical fact that the example sentence does
indeed have the two interpretations, and this fact, like many
others, follows from the accessing principles of mental space
configurations.
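The two accessing routes can be caricatured in code. In this minimal, hypothetical sketch (the dictionaries and element names are ours, not part of mental space theory), a space is just a mapping from roles to the elements that fill them:

```python
# Hypothetical sketch: spaces as role-to-element mappings.
base_B = {"president": "Nick"}   # base space B, at speech time
space_M = {"president": "b"}     # space M, 30 years earlier; element b
                                 # fills the president role in M

# Reading 1: "the president" identifies a = Nick in B; by the access
# principle we reach his counterpart a' in M and predicate "baby" of it.
counterpart_in_M = {"Nick": "a-prime"}
reading_1 = (counterpart_in_M[base_B["president"]], "baby")

# Reading 2: "the president" directly identifies the role-filler b in M.
reading_2 = (space_M["president"], "baby")

print(reading_1)  # ('a-prime', 'baby'): Nick was a baby 30 years ago
print(reading_2)  # ('b', 'baby'): a baby was president 30 years ago
```

The ambiguity of the sentence falls out of which space the description is evaluated in, not of any lexical ambiguity in the president.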
Gilles Fauconnier


WORKS CITED AND SUGGESTIONS FOR FURTHER READING


Coulson, Seana. 2001. Semantic Leaps. Cambridge, UK: Cambridge University Press.
Cutrer, M. 1994. Time and tense in narratives and everyday language. Ph.D. diss., University of California, San Diego.
Dancygier, Barbara. 1998. Conditionals and Prediction. Cambridge: Cambridge University Press.
Dancygier, Barbara, and Eve Sweetser. 2005. Mental Spaces in Grammar: Conditional Constructions. Cambridge: Cambridge University Press.
Dinsmore, J. 1991. Partitioned Representations. Dordrecht, the Netherlands: Kluwer.
Epstein, Richard. 2001. The definite article, accessibility, and the construction of discourse referents. Cognitive Linguistics 12: 333–78.
Fauconnier, Gilles. [1985] 1994. Mental Spaces. Cambridge: Cambridge University Press.
———. 1997. Mappings in Thought and Language. Cambridge: Cambridge University Press.
Fauconnier, Gilles, and Eve Sweetser. 1996. Spaces, Worlds, and Grammar. Chicago: University of Chicago Press.
Fauconnier, Gilles, and Mark Turner. 2002. The Way We Think. New York: Basic Books.
Huumo, Tuomas. 1996. A scoping hierarchy of locatives. Cognitive Linguistics 7: 265–99.
Liddell, Scott K. 2003. Grammar, Gesture, and Meaning in American Sign Language. Cambridge: Cambridge University Press.
Van Hoek, Karen. 1997. Anaphora and Conceptual Structure. Chicago: University of Chicago Press.

MERGE
Merge is the primitive combinatorial operation in the most recent
version of transformational grammar known as minimalism. In its most austere variety, merge is a generalized transformation that simply turns its input elements into a set with the input
elements as members (set-merge). Unlike the earlier government
and binding model of the principles and parameters theory, Noam Chomsky's (1995) minimalist model does not assume
a deep structure (see underlying structure) representation
as a starting point of the derivation; instead, syntactic computation
starts out from individual words. Merge combines words as well as
syntactic objects it has already formed in a recursive manner (see
recursion, iteration, and metarepresentation), generating an infinite array of discrete expressions with a hierarchical
constituent structure. In principle, merge can freely apply
to elements available to it, but its application is constrained by
principles of computational efficiency and by output conditions
imposed by external systems of sound/gesture and meaning.
movement is construed as merge of a syntactic object with a
syntactic object contained in it, whence the term internal merge
(vs. external merge; see Chomsky 2004). Consider, for instance,
the derivation of the passive sentence (1a), where the underlying object is moved to the subject position. Here the expression
(1b), constructed by recursive applications of merge, undergoes
merge with its subset {a, house}, yielding (1c). (2) is a tree diagram representation of (1c).
(1) a. A house will be built
b. {will, {be, {built, {a, house}}}}
c. {{a, house}, {will, {be, {built, {a, house}}}}}


(2)  [Tree diagram of (1c): an unlabeled, strictly binary-branching tree whose leaves, from left to right, are a, house, will, be, built, a, house.]

The syntactic object {a, house} has two copies or occurrences, but
it is realized phonetically only as a member of the (largest) set in
(1c). Note that the two occurrences resulting from movement are
not distinct syntactic objects. Rather, the same syntactic object
is a member of two sets, where one set is properly contained in
the other.
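The set-theoretic construction just described can be sketched directly. In this minimal, hypothetical illustration (the function names are ours), a syntactic object is either a word or a two-member frozenset, and internal merge re-merges an object with a set it already contains:

```python
# Hypothetical sketch: set-merge turns its two inputs into a set.
def merge(x, y):
    return frozenset({x, y})

def contains(obj, target):
    """True if target is obj itself or occurs anywhere inside obj."""
    if obj == target:
        return True
    if isinstance(obj, frozenset):
        return any(contains(m, target) for m in obj)
    return False

# External merge builds (1b): {will, {be, {built, {a, house}}}}.
obj_1b = merge("will", merge("be", merge("built", merge("a", "house"))))

# Internal merge: merge obj_1b with its subset {a, house}, yielding (1c).
# No distinct copy is created: the SAME object {a, house} is a member of
# two sets, one properly contained in the other.
a_house = merge("a", "house")
obj_1c = merge(a_house, obj_1b)

print(a_house in obj_1c)          # True: the higher occurrence
print(contains(obj_1b, a_house))  # True: the lower occurrence
```

Because sets are unordered, this austere version encodes hierarchy but no linear order or labels; the labeled variant in (3) would wrap each output with its projecting head.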
While there is currently no agreement regarding restrictions
on merge, or how to deduce them, the mainstream view holds
that merge is binary, always taking exactly two input elements
(entailing strict binary branching in syntactic trees), and it cannot alter set-membership relations that it has established before.
It is unresolved whether the output of merge should be enriched
to encode any asymmetry between its operands. On Chomskys
(1995) original definition, merge forms a set with the following
two members: the set of the input elements and the word functioning as the head of the constituent (also known as label),
thereby representing the asymmetry in the choice of the input
element that projects (see x-bar theory). Following this formulation, (1c) can be rewritten as (3). The structure in (3) is represented by the labeled tree diagram in (4).
(3) {will, {{a, {a, house}}, {will, {will, {be, {be, {built, {built, {a, {a,
house}}}}}}}}}}

(4)  [Labeled tree diagram of (3): the binary-branching tree of (2), with each nonterminal node labeled by its projecting head (will at the root; a over a, house; and will, be, built, a over the lower constituents).]

Another asymmetry is that between an adjunct (e.g., an adverbial) and the host it is adjoined to (e.g., a verb phrase). Chomsky
(2004) suggests that when an adjunct and its host undergo
merge, the result is an ordered pair ⟨Adjunct, Host⟩ (also known
as pair-merge).
Balázs Surányi
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Chametzky, Robert A. 2000. Phrase Structure: From GB to Minimalism.
Oxford: Blackwell.
Chomsky, Noam. 1995. The Minimalist Program. Cambridge, MA: MIT Press.
———. 2004. Beyond explanatory adequacy. In Structures and Beyond: The Cartography of Syntactic Structures, ed. Belletti, 104–31. Oxford: Oxford University Press.

METALANGUAGE

Languages may be used to talk about languages. In the first sentence of this entry, English is used to mention (or talk about) languages. (See use and mention.) In general, a language under
discussion, on a given occasion, is called the object language,
and the language being used in the discussion (to talk about the
object language) is the metalanguage.
Alfred Tarski proved that no classical language (language with
classical logic) can express its own truth predicate. Accordingly,
the semantics of a classical language (with standard syntax)
can only be done in a richer metalanguage.
J. C. Beall
SUGGESTION FOR FURTHER READING
Tarski, Alfred. 1990. Logic, Semantics, Metamathematics: Papers from 1923–1938. 2d ed. Ed. J. Corcoran. Indianapolis, IN: Hackett.

METAPHOR
This term derives from the Greek metapherein, indicating a
transfer of meaning from one linguistic expression to a second,
semantically different expression. According to standard dictionary definitions, a metaphor is a figure of speech in which a
word or phrase that ordinarily designates one thing is used to
designate another to which it is not literally applicable. Thus,
in metaphor, there is an implicit comparison between unlike
things, as in the Shakespearean metaphor Juliet is the sun.
Employing the terminology introduced by I. A. Richards (1936),
the target concept (Juliet) is labeled the topic or tenor, the metaphoric source (sun) is the vehicle and the emergent meaning,
the ground (see also source and target). Despite this seemingly simple definition, the task of actually identifying metaphor
has proven to be a difficult enterprise because metaphors can be
implied (there is a vehicle but no specified topic), dead (a usage
so conventionalized that the metaphoric transference is no longer actively recognized), mixed (the conflation of two distinct
metaphors), submerged (in which the vehicle is implied but not
stated), or extended (suggested throughout a text). This difficulty
has been especially apparent in recent attempts with computer
applications, such as those based on the identification of metaphor
in text corpora or in computer systems capable of generating
representations of metaphorical meaning.

Metaphor as a Linguistic Phenomenon


THE INFLUENCE OF ARISTOTLE. The classic approach, as presented
in the dictionary definition, treats metaphor as a rhetorical
trope in which language is used in a way other than what may be
considered normal or literal. One can see in this approach the
influence of Aristotle. As Umberto Eco put it, [O]f the thousands
and thousands of pages written about the metaphor, few added
anything of substance to the first two or three fundamental concepts stated by Aristotle (1984, 88). Whether or not one agrees

with Eco, it is clear that interpretations of Aristotelean thoughts
have influenced much of the thinking on the topic for the past
several millennia, most notably a presumed distinction between
literal and nonliteral language and in framing the basic theoretical issues that have guided much of the subsequent scholarship: understanding the cognitive process that permits the
stretching of meaning from the literal to the metaphoric and in
determining the pragmatic reasons why metaphor would be
employed when literal counterparts could have been used.
In his Poetics and in Rhetoric, Aristotle situated metaphor
in the realm of language, a position that has been the basis for
subsequent theories but has been contested since 1980 by theorists working within cognitive linguistics (described later).
Aristotle's basic premise is that with metaphor, one word (or
expression) is substituted for another. He described several categories of substitution, though the forms most studied are what
we would call today nominal and predicative metaphor, the
former in which one noun is substituted for another and the latter in which the substitution is of verbs. Aristotle provided some
explanation of the process involved in metaphor comprehension, namely, an innate tendency to see likeness in objects and
events that are, on the face of it, dissimilar (or in which the similarity is not transparent). Moreover, he provided some reasons
why metaphor might be employed, primarily, to serve a stylistic
and aesthetic function wherein the listener, forced to decode the
message, experiences a pleasurable reaction; and secondarily,
to serve the creative cognitive function of providing a name to
things that do not have proper names of their own. There has
been considerable elaboration of the seminal ideas of Aristotle in
the twentieth century, despite what some would consider a fatal
flaw in the logic of substitution as the basis for metaphor: If, for
example, with the nominal Shakespearean metaphor Juliet is
the sun, the vehicle, sun, is a substitution for another word that falls within the same genus as the topic, Juliet, what could
that word be?
TWENTIETH-CENTURY ELABORATIONS. Two interpretations consistent with Aristotle have been most influential. According to
the substitution position, the transfer from vehicle to topic is an
ornamental means of presenting some intended literal meaning,
so that when one states George is a wolf, it is merely an aesthetic
way of saying that George is fierce. The comparison approach is
less ornamental and closer to the second function described by
Aristotle in that, here, the listener must construct a way in which
properties of the vehicle are applicable to the topic. There have
been several variants of comparison theories proposed in the
literature over the past 50 years, but all include the notion that
the comprehension process involves the identification of a relevant set of preexisting features shared by topic and vehicle. These
theories all have shortcomings, including the failure to encompass the creation (and not mere identification) of similarity, the
problem in identifying the mechanisms that would select the
features assumed to be important for interpretation, and the failure in such theories to explain the asymmetry in meaning that
occurs when the topic and vehicle are reversed, as occurs when
one contrasts my lawyer is a shark with my shark is a lawyer.
Max Black attempted to address at least some of these shortcomings by postulating an interactive theoretical perspective in


which the novel meaning of a metaphor is not based on identifying a shared set of (possibly marginal) meanings of the words
being compared. Meaning, he argues, is generated by the interaction between a principal subject (the more literal usage of the
word, similar to the topic) and the complex of associations connected with a subsidiary subject (analogous to the vehicle). The
process is interactive inasmuch as reciprocal action between the
principal and subsidiary subject selects, emphasizes, suppresses
and organizes features of the principal subject by implying statements about it that normally apply to the subsidiary subject
(Black 1962, 46). The outcome is the creation of novel meaning
formed by a parallel implicational complex in which the topic
can be viewed in a radically different light and in which novel
emergent meanings can be created between words. Despite the
popularity of this general approach, it, too, has been subject to
various criticisms, notably regarding the ambiguity in defining
the theoretical terms employed and in determining which of the
terms is the principal and which is the subsidiary subject.
Subsequent psycholinguistic theories have attempted
to describe cognitive mechanisms that are consistent with
the interactive approach. Salience imbalance theory is a variant of traditional comparison models aimed at describing why
some statements are seen as literal and others as metaphoric by
assuming that the features shared by topic and vehicle differ in
relative level of salience: Literal statements are those in which
the shared features are salient to both terms, whereas with metaphor, they are salient to the vehicle but not the topic. Domain
interaction theory is an extension of a computational model of
analogy and assumes that metaphor involves the finding of
similarity both within and between the conceptual domains
evoked by words. Thus, a metaphor such as George Bush is a
hawk would be comprehended by finding a spot in semantic
space that would be consistent with the analogy George Bush
is to world leaders as hawks are to birds. In this model, ease
of comprehension is a function of the ease of finding a shared
similarity (i.e., in determining ways in which Bush is similar to
a hawk), whereas a sense of metaphor aptness increases as
the distance between conceptual domains, such as leaders and
birds, becomes greater. Finally, structure-mapping theory, also
emerging from computational work in analogy, is based on identifying a system of shared relations between the target and source
domains and not on merely identifying a feature shared by topic
and vehicle. Although there are psycholinguistic studies that
support each theory, each is based ultimately on finding similarity between the words presented as topic and vehicle and, consequently, heir to all of the criticisms of such models (reviewed, for
instance, in Glucksberg 2001).
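The shared-feature logic these comparison theories rely on can be made concrete in a short sketch. The toy code below is purely illustrative; the feature sets and salience scores are invented, not taken from any published model. It implements the core claim of salience imbalance theory: a statement reads as literal when the shared features are salient for both terms, and as metaphoric when they are salient for the vehicle alone.

```python
# Toy sketch of salience imbalance theory. The lexical entries and
# the salience scores (0-1) are invented for illustration only.

def classify(topic, vehicle, threshold=0.5):
    """Literal if the shared features are salient for both terms;
    metaphoric if salient for the vehicle but not the topic."""
    shared = set(topic) & set(vehicle)
    if not shared:
        return "anomalous"
    if all(topic[f] >= threshold and vehicle[f] >= threshold for f in shared):
        return "literal"
    if all(vehicle[f] >= threshold and topic[f] < threshold for f in shared):
        return "metaphoric"
    return "mixed"

# 'fierce' is highly salient for wolves, marginal for George:
george = {"human": 0.9, "fierce": 0.2}
wolf = {"animal": 0.9, "fierce": 0.9}
print(classify(george, wolf))  # metaphoric

# The shared features are salient for both terms:
my_dog = {"animal": 0.9, "canine": 0.9}
collie = {"animal": 0.9, "canine": 0.9}
print(classify(my_dog, collie))  # literal
```

Note that such a sketch inherits the shortcomings listed above: it presupposes that the relevant features already exist and are selected in advance.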
Sam Glucksberg proposes a novel solution that rejects similarity as the basis for metaphor by arguing that in metaphor, one
does not look for a similarity between topic and vehicle (i.e., by
treating the comparison as an implicit simile). He avers, rather,
that metaphor should be understood as a class-inclusion statement, analogous to how we treat such statements as my dog is a
collie. He argues, and has presented convincing evidence, that
with metaphor, the vehicle has dual reference (both as the literal object and as indicative of higher-order categories) and that,
in comprehension, one classifies the topic to the category suggested by the vehicle. That is, in a metaphor such as my lawyer is
a shark, the vehicle shark stands for or exemplifies a category
to which lawyers could be assigned (such as aggressive, predatory, tenacious entities). In a more recent expansion of the theory,
he and his colleagues have indicated how the topic plays a role
in identifying the category appropriate for classification.
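Glucksberg's dual-reference idea can be sketched as a simple lookup procedure. In this toy illustration the lexicon entries and the KIND table are invented; the point is only the mechanism: if the topic fits the vehicle's literal category, the statement is literal class inclusion, and otherwise the topic is assigned to the higher-order category the vehicle exemplifies.

```python
# Toy sketch of the dual-reference, class-inclusion account.
# All entries below are invented for illustration.

LEXICON = {
    # vehicle: (literal category, higher-order category it exemplifies)
    "shark": ("fish", "aggressive, predatory, tenacious entities"),
    "collie": ("dog", None),
}

KIND = {"my lawyer": "person", "my dog": "dog"}

def interpret(topic, vehicle):
    literal_category, higher_category = LEXICON[vehicle]
    if KIND[topic] == literal_category:
        # literal class inclusion, as in "my dog is a collie"
        return f"{topic}: member of the category {literal_category}"
    if higher_category is not None:
        # metaphoric class inclusion: the topic joins the
        # higher-order category the vehicle exemplifies
        return f"{topic}: member of the category {higher_category}"
    return "no interpretable category"

print(interpret("my dog", "collie"))
print(interpret("my lawyer", "shark"))
```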
THE PROCESS OF METAPHOR COMPREHENSION. Much of the theory and research described here is based on offline methodology.
Beginning in the late 1970s, researchers started to examine the
processing of metaphor online, measuring processes that were
happening during the act of comprehension. Most of the early
studies were based on the indexing of reading time or some other
measure of response latency; lately, studies have also employed
neurocognitive imaging techniques such as EEG and fMRI (see
neuroimaging). Much of the initial theorizing has been based
on speech-act theory, especially as espoused in the work of
John Searle. Following from the distinction between literal and
nonliteral language, the assumption has concerned the processing priority of literal meaning. According to what is now called the
standard pragmatic model, the default processing of language
would be to its literal meaning, and the inferential processes that
seek an alternative, nonliteral interpretation would be triggered
only if one fails to find a context-appropriate literal interpretation. These assumptions have been
tested in psycholinguistic research that has concretized the standard model: It assumes that priority to a default literal meaning
would be demonstrated by more rapid reading (or other indices
of processing) of a metaphor in a discourse context that is consistent with its literal sense than in a context that is consistent with
the nonliteral sense; it also assumes that one should not find the
processing of metaphoric meaning in conditions in which the literal sense of an utterance is context appropriate.
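The serial assumption being tested here can be stated in a few lines. The following is a toy rendering of the standard pragmatic model, with an invented sense inventory and a stand-in context check; its one relevant property is that a nonliteral reading is sought only after the literal one fails, which is what predicts a processing cost for metaphor.

```python
# Toy serialization of the standard pragmatic model. The sense
# inventory and the context check are invented stand-ins.

SENSES = {
    "George is a wolf": {
        "literal": "George is a canine",
        "nonliteral": "George is fierce",
    },
}

def fits_context(interpretation, context):
    # Stand-in for contextual plausibility: a simple set lookup.
    return interpretation in context

def comprehend(utterance, context):
    steps = ["access literal meaning"]
    literal = SENSES[utterance]["literal"]
    if fits_context(literal, context):
        return literal, steps
    # Only on literal failure is a nonliteral reading sought,
    # predicting extra processing time for metaphor.
    steps.append("literal fails; seek nonliteral meaning")
    return SENSES[utterance]["nonliteral"], steps

context = {"George is fierce"}  # a discourse about a person
meaning, steps = comprehend("George is a wolf", context)
print(meaning)
print(len(steps))  # metaphor costs an extra step on this model
```

It is exactly this extra-step prediction that the reading-time findings reviewed below fail to support.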
More than 20 years of research have failed, in the main, to support the predictions arising from the standard pragmatic model,
instead showing that in appropriately elaborated contexts, one
can process the metaphoric sense as rapidly as the literal sense of
an utterance and that, using Stroop-like procedures and speed-accuracy analyses, the initiation of metaphoric interpretation
does not depend on a failure to find an appropriate literal interpretation. These findings, though sometimes complicated by the
level of conventionality of the metaphoric expression, have led
to a set of competing theories, all of which have some support,
including models based on the notion of resolving constraint
satisfaction and those that assume that the initial processing of
a word is at an underspecified schematic level. An increasingly
popular processing model by Rachel Giora attempts to maintain processing priority but places the emphasis not on literal
meaning (as Searle had it) but on the saliency of a word (as concretized by familiarity, conventionality, and frequency of use).
According to this theory, one is obligated to process the salient
sense of a word (or expression), regardless of context; contextual
constraints can boost the activation and meaning access of less
salient meanings but will not do so at a cost to the activation and
access of the more salient sense. The ultimate success and test of
these various theories are being contested, more often these days
with neuroimaging techniques that give a more fine-grained
analysis of online processing than was available in the past.
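Giora's graded salience hypothesis can likewise be sketched as a toy activation model (the salience scores, standing in for familiarity, conventionality, and frequency, are invented): the salient sense is always accessed, and context can boost a less salient sense without suppressing the salient one.

```python
# Toy sketch of the graded salience hypothesis.
# Salience scores are invented for illustration.

SENSES = {
    "shark": {"predatory fish": 0.9, "ruthless person": 0.4},
}

def access_order(word, context_boost=None):
    """Senses in order of activation. Context can boost a less
    salient sense, but the salient sense is still accessed."""
    activation = dict(SENSES[word])
    for sense, boost in (context_boost or {}).items():
        activation[sense] += boost
    return sorted(activation, key=activation.get, reverse=True)

# Out of context, the salient (literal) sense comes first:
print(access_order("shark"))
# A lawyer-related context boosts the figurative sense, yet the
# salient sense remains active rather than being suppressed:
print(access_order("shark", {"ruthless person": 0.6}))
```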

CONCEPTUAL METAPHORS AND THE CONTEMPORARY THEORY OF METAPHOR. The research described in the previous sections has
undercut the difference between literal and nonliteral language
(see also Gibbs 1994), a challenge extended most notably by
cognitive linguists, especially by George Lakoff, starting with the
publication in 1980 of Metaphors We Live By, co-authored with
Mark Johnson. The main thrust of this theory is that metaphors
are matters of thought and not merely of language, thus distinguishing the basic mapping of a source conceptual domain to a
target conceptual domain (conceptual metaphor) from the
linguistic expression of this mapping. The true source of metaphor is at the conceptual level. According to this theory, conceptual metaphors motivate and underlie understanding of the
world, such that most of what we call literal is, by this theory,
based on underlying metaphorical mappings. Thus, conceptual
metaphors are the basis for understanding literal and nonliteral, novel and conventional, poetic and mundane language
alike. Evidence for a conceptual metaphor, such as the mapping
between the conceptual domains of life and journeys (LIFE IS A
JOURNEY), is reflected in a set of seemingly unrelated linguistic
expressions, such as His life is at an important crossroad and
She knows where she is going. Mappings elucidate the systematic set of correspondences that exist between constituent elements of the source and the target domain. For example, with
the LIFE IS A JOURNEY mapping, the person is analogous to a
traveler, purposes are destinations, means are routes, difficulties
are obstacles, achievements are landmarks, choices are crossroads, and so on, allowing for novel extensions of elements from
the source domain to elements of target concepts.
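The correspondences just listed can be written down as a simple table. The sketch below encodes the LIFE IS A JOURNEY mapping from the entry; the lookup function is only an illustration of a fixed set of cross-domain correspondences, not a claim about on-line processing.

```python
# The LIFE IS A JOURNEY correspondences listed above, encoded as
# a table from source-domain elements to target-domain elements.

LIFE_IS_A_JOURNEY = {
    "traveler": "person",
    "destinations": "purposes",
    "routes": "means",
    "obstacles": "difficulties",
    "landmarks": "achievements",
    "crossroads": "choices",
}

def map_to_life(journey_element):
    """Carry an element of the source domain over to the target."""
    return LIFE_IS_A_JOURNEY.get(journey_element, "no conventional mapping")

# "His life is at an important crossroad" maps crossroads -> choices:
print(map_to_life("crossroads"))
# Novel expressions can extend the same mapping:
print(map_to_life("landmarks"))
```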
The theory has had widespread acceptance, and the task of
identifying the presence and force of underlying (and hence
unconscious) cognitive mappings has entered the debates of
linguistics, cognitive science, philosophy, literary theory, and
criticism, among other disciplines. Nonetheless, the claims in
the literature for an ever-increasing number of conceptual metaphors indicate looseness in the theory that may make it incapable
of being disproved and, thus, an inadequate scientific explanation. Moreover, one testable prediction made by the theory,
namely, that conceptual metaphors are activated on line during
comprehension, has not been supported consistently, with the
strongest support coming from the examination of orientational
and temporal metaphors (e.g., Boroditsky 2000).
CONCEPTUAL BLENDING. A more recent framework, proposed by
Gilles Fauconnier and by Mark Turner, seeks to explain much
of the same linguistic data discussed in the conceptual metaphor literature and shares with that approach the assumption
that metaphor is a conceptual, not a linguistic, phenomenon. In
contrast with conceptual metaphor theory, however, conceptual blending theory is not limited to entrenched conceptual
relations or to the unidirectional mapping from source to target
or the mapping between only two mental domains. Rather, the
basic units are mental spaces representing particular scenarios
recruited from the knowledge of specific domains constructed
while thinking or talking about situations. As such, the theory
emphasizes blending as an on-line process, which both instantiates entrenched metaphors and can yield short-lived and novel
conceptualizations. This theory, too, has entered the literatures
of a number of diverse disciplines, and although tests of the on-line processing implications of the theory are still ongoing, the results to
date have been encouraging, often employing brain neuroimaging techniques such as event-related potential (ERP) measurement (see Coulson 2001). Nonetheless, this theory has also been
subjected to criticisms that it, too, is incapable of being disproved
and that it is too indiscriminate, inasmuch as almost anything
that enters working memory can be considered a blend.
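The blending mechanism can be sketched with a toy example. The input spaces below, and the often-cited my surgeon is a butcher example, are supplied here for illustration and are not drawn from the entry; the sketch shows only selective projection from two inputs into a blend whose combination of elements exists in neither input alone.

```python
# Toy sketch of selective projection into a blended space.
# The spaces and the example are invented for illustration.

surgeon_space = {"role": "surgeon",
                 "elements": {"operates on patients", "precise technique"}}
butcher_space = {"role": "butcher",
                 "elements": {"cuts meat", "crude technique"}}

def blend(space_a, space_b, select_a, select_b):
    """Project only the selected elements of each input into the blend."""
    return {
        "role": f"{space_a['role']} as {space_b['role']}",
        "elements": (space_a["elements"] & select_a)
                    | (space_b["elements"] & select_b),
    }

b = blend(surgeon_space, butcher_space,
          {"operates on patients"}, {"crude technique"})
print(b["role"])
# The emergent reading ('incompetent surgeon') belongs to neither input:
print(sorted(b["elements"]))
```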

Evaluation
Treatments of metaphor as a linguistic and as a cognitive phenomenon coexist today, in much the same way that two species
of hominid have coexisted in our evolutionary history. It remains
to be seen whether the offspring of one approach will disappear. Despite the differences, there is a convergence between
approaches that should not be undervalued: This convergence
includes the undercutting of the distinction between literal and
nonliteral language; a recognition by both approaches of the
need to consider the richness of examples coming from literary or philosophical analysis, as well as the controlled rigor that
comes from experimental studies; an emphasis on the role of
cognition and pragmatics (see Carston 2002, for instance, for
an exposition from a relevance theory perspective); and
the growing sentiment (however conceptualized) that the construction of metaphoric meaning is flexible and involves more
of an active on-line interpretive process and less of a mere
arousal of entrenched meaning and that, ultimately, the battleground for theoretical supremacy (or for synthesis of the two
approaches) will depend on data generated in the
neurosciences.
Albert N. Katz
WORKS CITED AND SUGGESTED FURTHER READINGS
Black, M. 1962. Models and Metaphors. Ithaca, NY: Cornell University
Press.
Boroditsky, L. 2000. Metaphoric structuring: Understanding time
through spatial metaphors. Cognition 75: 1–28.
Carston, R. 2002. Thoughts and Utterances: The Pragmatics of Explicit
Communication. Oxford: Blackwell.
Coulson, S. 2001. Semantic Leaps. New York: Cambridge University
Press.
Eco, U. 1984. Semiotics and the Philosophy of Language.
Bloomington: Indiana University Press.
Fauconnier, G. 1997. Mappings in Thought and Language.
Cambridge: Cambridge University Press.
Gibbs, R. 1994. The Poetics of Mind. New York: Cambridge University
Press.
Giora, R. 2003. On Our Mind: Salience, Context and Figurative Language.
New York: Oxford University Press.
Giora, R., ed. 2001. Models of figurative language. Metaphor and
Symbol 16.3/4 (Special Issue): 145–333.
Glucksberg, S. 2001. Understanding Figurative Language. New
York: Oxford University Press.
Katz, A., C. Cacciari, R. Gibbs, and M. Turner. 1998. Figurative Language
and Thought. New York: Oxford University Press.
Lakoff, G., and M. Johnson. 1980. Metaphors We Live By. Chicago:
University of Chicago Press.
Ortony, A., ed. 1993. Metaphor and Thought. 2d ed. New York: Cambridge
University Press.
Richards, I. A. 1936. The Philosophy of Rhetoric. New York: Oxford
University Press.

METAPHOR, ACQUISITION OF
Metaphor is a pervasive aspect of human language (Lakoff and
Johnson 1997). Metaphor also plays a central role in abstract
thought by structuring concepts (Gibbs 1994) and leading to
conceptual change (Gentner and Wolff 2000). As such, its rudimentary manifestation at early ages and its continued growth
over developmental time have been the focus of scientific inquiry
for several decades. Research on the acquisition of metaphor
followed two main lines of inquiry, each bearing on a different
definition of the term. One approach defined metaphor as a similarity comparison between the perceptual features of objects or
actions, and explored how early children would understand and
produce these so-called perceptual metaphors (e.g., butterfly is
(like) rainbow). Another approach defined metaphor as a conceptual-linguistic mapping between the structural features of two
disparate knowledge domains: a source domain, which serves
as the source of vocabulary and conceptual inferences, and a
target domain, to which vocabulary and inferences are extended
metaphorically. It then examined the age at which children begin
to develop an integrated understanding of such structural metaphors (e.g., time is motion along a path) as an amalgam of both
source and target domain meanings.
Following is a brief summary of the developmental changes
in children's metaphorical ability, from the early onset of simple
perceptual metaphors to the later emergence of more complex
structural metaphors.

Metaphor as Similarity: Children's Early Comprehension
and Production of Simple Perceptual Metaphors
Children can spontaneously produce a variety of perceptual
metaphors that highlight similarities between objects and events
during preschool years (~ages 2.0–5.0; e.g., Billow 1981; Gardner
et al. 1978; Winner, McCarthy, and Gardner 1980; Winner 1979).
For example, they hold up a half-peeled banana and call it a
flower (Elbers 1988), place a foot in the wastebasket and call
it a boot (Winner 1979), point to a mushroom and say like ice
cream cone (Özçalışkan and Goldin-Meadow 2006), or describe
a ship sailing in the far distance as taking a bath (Chukovsky
1968). These early perceptual metaphors typically arise in emerging symbolic play contexts, in which children first engage in
imaginative object substitutions (e.g., using a banana as if it were
a phone), and later on, they express similarities between such
objects explicitly in speech (banana is like a phone; Gardner
et al. 1978; see also Sinclair and Stambak 1993 for more information on early symbolic play).
Children can use perceptual similarity to sort objects into
categories as early as 18 months (e.g., boxes vs. balls; see Oakes
and Madole 2000 for a review). By preschool age, they can understand and make comparisons between two categorically different
objects based on feature-based similarities (Billow 1975; Epstein
and Gamlin 1994; Gardner et al. 1975; Mendelsohn et al. 1984;
Vosniadou and Ortony 1983; Winner, McCarthy, and Gardner
1980) and between two events based on action-based similarities
(Dent 1984). For example, when asked to pick two objects that

Metaphor, Acquisition of
go together, children were more likely to group a cherry lollipop
with a toy stop sign, which was similar in shape and color, rather
than matching it with a dissimilar object in the same category
(Mendelsohn et al. 1984). Similarly, when presented with event
triads, children were more likely to pair two events that were
alike (ballerina spinning–top spinning) than to match two events
that were of different types (ballerina leaping–top spinning; Dent
1984).
Moreover, five-year-old children could provide similarity-based explanations when asked about metaphorical expressions
that involve comparisons between objects such as a butterfly is a
flying rainbow, or a cloud is like a sponge (Gardner et al. 1975;
Gentner 1988; Billow 1975; Malgady 1977). For example, they
would explain the statement a cloud is like a sponge by saying that both clouds and sponges are round and fluffy (Gentner
1988), or they would complete the statement he looks as gigantic as ... by selecting from among multiple-choice alternatives
an ending that draws on a feature-based comparison: he looks
as gigantic as a double-decker cone in a baby's hand (Gardner
et al. 1975).
Thus, preschool children can both understand similarity
comparisons between two objects or events that are perceptually alike and spontaneously produce perceptual metaphors and
explanations based on such comparisons in their early communications. This ability constitutes an important milestone in
children's language development. The ability to express similarities between objects and events based on shared perceptual
features is considered the earliest sign of metaphorical ability in
young children, and accordingly, children are believed to have a
rudimentary level of metaphorical ability as early as preschool
age (Billow 1981; Gardner et al. 1978; Vosniadou 1987; Winner
1979).

Metaphor as Conceptual-Linguistic Mapping: Children's
Comprehension and Production of Complex Structural Metaphors
Children's early ability to produce feature-based similarity comparisons is considered the first step in the development of more
complex metaphorical abilities, particularly those that involve
structural comparisons between disparate domains (Gardner
et al. 1978; Gentner 1988; Winner 1979). Not surprisingly, children's mastery of such structural metaphors takes several more
years, extending well into early adolescent years (Asch and
Nerlove 1960; Vosniadou 1987; Winner, Rosenstiel, and Gardner
1976), and different researchers propose different views concerning how children make this transition.
Some researchers propose a developmental progression
from mappings based on feature-based similarities to mappings based on relational structure in children's metaphorical
abilities (Billow 1975; Gentner 1988; Gentner and Rattermann
1991; Vosniadou and Ortony 1983). For example, in explaining
the metaphorical statement a cloud is like a sponge, five-year-old children typically rely on feature-based similarities between
the two objects (e.g., Both clouds and sponges are round and
fluffy), while older children and adults opt for more relational
explanations (e.g., Both clouds and sponges contain water;
Gentner 1988). In this view, what drives development is the
shift in focus from feature-based commonalities to relational

commonalities: Children, at all ages, have no difficulty understanding feature-based similarities between objects, but it is with
increasing age that they begin to understand cross-domain mappings based on relational structure and, accordingly, produce
explanations that reflect this understanding.
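The proposed shift can be illustrated with a toy matcher. The attribute and relation sets for cloud and sponge below are invented for illustration; the same comparison yields attribute matches for younger children and relational matches for older children and adults.

```python
# Toy illustration of the relational shift (Gentner-style):
# "a cloud is like a sponge" grounded in attributes vs. relations.
# The representations are invented for illustration only.

CLOUD = {"attributes": {"round", "fluffy"},
         "relations": {"holds water", "gives off water"}}
SPONGE = {"attributes": {"round", "fluffy"},
          "relations": {"holds water", "gives off water"}}

def explanation(a, b, stage):
    """Match on attributes for 'younger', on relations otherwise."""
    key = "attributes" if stage == "younger" else "relations"
    return a[key] & b[key]

print(sorted(explanation(CLOUD, SPONGE, "younger")))  # ['fluffy', 'round']
print(sorted(explanation(CLOUD, SPONGE, "older")))
```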
Others propose a developmental progression from an understanding of metaphor as involving only one domain to a conceptualization of metaphor as involving two domains (Asch
and Nerlove 1960; Cicone, Gardner, and Winner 1981; Schecter
and Broughton 1991; Winner, Rosenstiel, and Gardner 1976).
Thus, children initially focus only on the source domain of the
metaphorical mapping and gradually develop a more integrated understanding of metaphor as involving both a source
and a target domain. For example, in explaining the metaphorical statement the prison guard is a hard rock, children six
to eight years of age focused exclusively on the source domain
meaning of the mapping and provided literal interpretations for
metaphorical statements (e.g., The guard has hard muscles),
whereas older children and adults were able to consider both the
source and target domain meanings of the mapping, thus providing explanations that captured the metaphorical meaning
(e.g., The guard was mean and did not care about the feelings
of prisoners; Winner, Rosenstiel, and Gardner 1976). Similarly,
when asked to extend physical sensation terms onto psychological traits (e.g., Can a person be warm/sweet/soft?), children
three to seven years of age focused only on the source domain
of the metaphorical mapping and provided literal explanations
for metaphorical statements (e.g., Mommy is sweet because she
cooks sweet things), whereas older children focused on both
domains simultaneously and provided explanations that treated
metaphorical meaning as a different but related extension of
the literal meaning (e.g., Hard things and hard people are both
unmanageable; Asch and Nerlove 1960).
Yet another group of researchers argue that the ability to
understand structural metaphors is not determined solely by a
child's age but by a host of other factors, such as the nature of
the source or the target domain (Keil 1986), and the familiarity
of the metaphorical mapping or the source domain (Özçalışkan
2007). For example, five-year-old children can correctly map
animate terms onto cars (e.g., the car is thirsty), but have difficulty understanding metaphors that involve mappings between
taste terms and people (e.g., she is a bitter person; Keil 1986).
Similarly, preschool children can both understand and explain
metaphors that are structured by motion (e.g., Time flies by,
Ideas cross my mind; Özçalışkan 2005, 2007), a domain that
structures a wide range of abstract concepts across different languages of the world, but have difficulty deciphering the meaning of metaphors that involve extensions of object properties
(e.g., The prison guard is a hard rock; Winner, Rosenstiel, and
Gardner 1976). From this perspective, the development of metaphorical ability shows different trajectories for different conceptual domains and metaphorical mappings, based on ones
knowledge of the source and/or the target domain and the familiarity of the metaphorical mapping.
In summary, research on children's metaphor comprehension and production shows that children can both understand
and spontaneously produce perceptual metaphors that involve
similarity comparisons by preschool age. However, the ability
to understand and explain more complex metaphors, namely,
those that involve structural mappings between different knowledge domains, emerges in late childhood, somewhere between
ages 11.0 and 14.0. Nevertheless, at the same time, children's early
metaphorical ability is strongly influenced by the familiarity of
the source and target domains of the metaphor, with more familiar domains and metaphorical relations leading to earlier onset
of metaphor comprehension and production.
Şeyda Özçalışkan
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Asch, S., and H. Nerlove. 1960. The development of double function
terms in children. In Perspectives in Psychological Theory, ed. B. Kaplan
and S. Wapner, 47–60. New York: International Universities Press.
Billow, R. M. 1975. A cognitive developmental study of metaphor comprehension. Developmental Psychology 11.4: 415–23.
———. 1981. Observing spontaneous metaphor in children. Journal of
Experimental Child Psychology 31: 430–45.
Chukovsky, K. 1968. From Two to Five. Berkeley: University of California
Press.
Cicone, M., H. Gardner, and E. Winner. 1981. Understanding the psychology in psychological metaphors. Journal of Child Language
8: 213–16.
Dent, C. H. 1984. The developmental importance of motion information in perceiving and describing metaphoric similarity. Child
Development 55: 1607–13.
Elbers, L. 1988. New names from old words: Related aspects of children's metaphors and word compounds. Journal of Child Language
15: 591–617.
Epstein, R. L., and P. J. Gamlin. 1994. Young children's comprehension
of simple and complex metaphors presented in pictures and words.
Metaphor and Symbolic Activity 9.3: 179–91.
Gardner, H., M. Kircher, E. Winner, and D. Perkins. 1975. Children's
metaphoric productions and preferences. Journal of Child Language
2: 125–41.
Gardner, H., E. Winner, R. Bechhofer, and D. Wolf. 1978. The development of figurative language. In Children's Language. Vol. 1. Ed. K.
Nelson, 1–38. New York: Gardner Press.
Gentner, D. 1988. Metaphor as structure mapping: The relational shift.
Child Development 59: 47–59.
Gentner, D., and M. J. Rattermann. 1991. Language and the career of
similarity. In Perspectives on Language and Thought: Interrelations
in Development, ed. S. A. Gelman and J. P. Byrnes, 225–77. New
York: Cambridge University Press.
Gentner, D., and P. Wolff. 2000. Metaphor and knowledge change. In
Cognitive Dynamics: Conceptual Change in Humans and Machines, ed.
E. Dietrich and A. Markman, 295–342. Mahwah, NJ: Lawrence Erlbaum.
Gibbs, R. 1994. The Poetics of Mind. Cambridge: Cambridge University
Press.
Keil, F. C. 1986. Conceptual domains and the acquisition of metaphor.
Cognitive Development 1: 73–96.
Lakoff, G., and M. Johnson. 1997. Philosophy in the Flesh. New York: Basic
Books.
Malgady, R. G. 1977. Children's interpretation and appreciation of similes. Child Development 48: 1734–38.
Mendelsohn, E., S. Robinson, H. Gardner, and E. Winner. 1984. Are preschoolers' renamings intentional category violations? Developmental
Psychology 20.2: 187–92.
Oakes, L. M., and K. L. Madole. 2000. The future of infant categorization research: A process-oriented approach. Child Development
71.1: 119–26.
Özçalışkan, Ş. 2005. On learning to draw the distinction between physical and metaphorical motion: Is metaphor an early emerging cognitive
and linguistic capacity? Journal of Child Language 32.2: 1–28.
———. 2007. Metaphors we move by: Children's developing understanding of metaphorical motion in typologically distinct languages.
Metaphor and Symbol 22.2: 147–68.
Özçalışkan, Ş., and S. Goldin-Meadow. 2006. X is like Y: The emergence of similarity mappings in children's early speech and gesture.
In Cognitive Linguistics: Foundations and Fields of Application, ed. G.
Kristiansen, M. Achard, R. Dirven, and F. Ruiz de Mendoza, 229–62.
Berlin: Mouton de Gruyter.
Schecter, B., and J. Broughton. 1991. Developmental relationships
between psychological metaphors and concepts of life and consciousness. Metaphor and Symbolic Activity 6.2: 119–43.
Sinclair, M., and M. Stambak. 1993. Pretend Play Among Three-Year-Olds.
Mahwah, NJ: Lawrence Erlbaum.
Vosniadou, S. 1987. Children and metaphors. Child Development
58: 870–85.
Vosniadou, S., and A. Ortony. 1983. The emergence of the literal-metaphorical-anomalous distinction in young children. Child Development
54: 154–61.
Winner, E. 1979. New names for old things: The emergence of metaphoric language. Journal of Child Language 6: 469–91.
———. 1997. The Point of Words: Children's Understanding of Metaphor
and Irony. Cambridge: Harvard University Press.
Winner, E., M. McCarthy, and H. Gardner. 1980. The ontogenesis of metaphor. In Cognition and Figurative Language, ed. R. P. Honeck and
R. Hoffman, 341–61. Hillsdale, NJ: Lawrence Erlbaum.
Winner, E., A. K. Rosenstiel, and H. Gardner. 1976. The development of
metaphoric understanding. Developmental Psychology 12: 289–97.

METAPHOR, INFORMATION TRANSFER IN


The study of metaphor is currently dominated by conceptual metaphor theory. One alternative was put forth by Amos
Tversky, then further developed by Andrew Ortony and others.
This account begins with the idea that we understand metaphors
by scanning entries in our mental lexicon, transferring relevant
features from a source to a target (see source and target). In
some versions, the process is viewed as involving a wider range
of information and components of cognitive architecture
beyond semantic features.
Consider the following situation. Smith monopolizes discussion in a department meeting. Afterward, Doe asks Jones
what she thought of the debate. She replies, Smith is a braying
donkey. Using standard cognitive architecture, we might analyze this as follows: Jones and Doe both have lexical entries for
donkey, bray, and Smith. They also have episodic memories of
the recent department meeting. The recent events are primed
or partially activated (see priming, semantic; spreading
activation). The mention of Smith serves to further activate
the episodic memories of Smith in the meeting. The lexical entry
for bray involves such elements as produce a sound using vocal
cords. This serves to further activate the episodic memories
of vocal cord sounds in the meeting. Specifically, in conjunction with Smith, it serves to strongly activate episodic memories
of Smith speaking. Following principles of conversational
implicature, Doe assumes that Jones is making some positive
contribution to the conversation. Thus, Doe looks for new information in Jones's statement. There is no new information in what
we have isolated thus far, namely, that Smith used his vocal cords to
make a sound. The new information comes with distinctive features of the metaphorical source. Specifically, braying does not
apply to every use of the vocal cords. It applies only to a particular sort of nonlinguistic (thus, meaningless and inarticulate)
sound. Doe synthesizes this information in working memory.
He understands, roughly, that (in Jones's view) Smith's speech
was meaningless and inarticulate.
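The walkthrough above can be condensed into a toy computation. The lexical entry for bray and the episodic record for Smith below are invented stand-ins; the sketch shows only the set-difference step, in which the new information is whatever the source contributes beyond what is already on record about the target.

```python
# Toy sketch of the feature-transfer step in understanding
# "Smith is a braying donkey". All entries are invented stand-ins.

LEXICON = {
    "bray": {"produce a sound using vocal cords", "nonlinguistic",
             "meaningless", "inarticulate"},
}

# What Doe already knows from the meeting:
EPISODIC = {
    "Smith": {"spoke at the meeting",
              "produce a sound using vocal cords"},
}

def new_information(target, source_word):
    """The distinctive features of the source, i.e. those not
    already in the episodic record of the target."""
    return LEXICON[source_word] - EPISODIC[target]

print(sorted(new_information("Smith", "bray")))
# ['inarticulate', 'meaningless', 'nonlinguistic']
```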
Thus far, however, the analysis does not distinguish the
understanding of metaphor from that of literal statements. In
both cases, there is a complex synthesis of lexical and episodic
information in working memory; this leads to contextually relevant inference. What, then, is the difference between a metaphorical statement and a literal one?
One account begins by making metaphor a matter of interpretation, rather than a matter of some intrinsic linguistic property. Specifically, a speaker intends an utterance metaphorically
when he or she intends the addressee to interpret the utterance metaphorically. What, then, constitutes metaphorical
interpretation?
Our mental lexicons are organized into clusters of information bearing on particular objects and types of objects (see
schema, prototype). This information is arranged hierarchically. There are certain things that we take to be more crucial or definitive features of a given type of object. For example,
being made from milk is a more important property of cheese
than being white or yellow. Moreover, a range of high-level
properties are default properties. If a default does not apply,
then we commonly have specifiable alternatives. Thus, we
assume (as a default) that an unknown person, say, Jones,
has two arms. But if we learn that she does not, we assume that
she suffered some birth defect or is an amputee, these being the
standard alternatives.
When interpreting a statement literally, we assume that all
default information applies unless it is specifically contradicted.
Moreover, if a default is contradicted, we assume that one of the
standard alternatives applies. In contrast, when interpreting a
statement metaphorically, we do not assume that default information applies. That is the definitive difference. In interpreting a
statement either metaphorically or literally, we scan lexical information to glean what is most relevant to the topic at hand. But
when interpreting literally, we assume that unselected, default
information applies as well. We do not assume this when interpreting metaphorically.
The basic difference has several consequences. One is worth
mentioning. All interpretation involves drawing on a range of
associated information, not only that included in the lexical
entries for the source and target items. In metaphorical interpretation, the loosening of hierarchical structures (e.g., through the
nonassumption of defaults) may encourage the incorporation
of more distant associations, including primed emotional associations. For example, when Jones refers to Smith as a "braying donkey," she not only characterizes Smith but also expresses and
tries to communicate a certain feeling.
This account is similar to conceptual metaphor theory in
stressing cognition. However, it suggests that the cognitive
effects of metaphors need not be profound. Writers adopting this
account commonly view metaphor as operating more locally. Indeed, they see many conceptual metaphors as lexicalized. For
example, "pass away" just has "die" as one of its literal (lexicalized) meanings. It does not operate metaphorically.
One obvious advantage of this account is that it explains the
prominence of mixed metaphors. Consider a sentence such as "I tapped into the good life on the road to acing my degree." Some
elements are lexicalized here. Others are interpreted metaphorically, but only to the extent required by context. In contrast,
conceptual metaphor theory might lead us to expect greater
consistency in the use of standard metaphorical mappings. The
present account does have more difficulty explaining consistency when it does occur, as when someone says "I followed the straight and narrow path to reach my destination, a degree."
However, it may be possible to account for such consistency by
ordinary processes of priming, both current and historical (see
Hogan 2002).
One future task is to develop this account in terms of neural
substrates. Consistent with the preceding analysis, neuroscientific research indicates that there is no sharp metaphorical/
literal division. Certain interpretive tasks demand greater activation of a broad range of meanings before selection. These tasks
often involve metaphorical interpretation, but not invariably
(see metaphor, neural substrates of). Currently, we are
not in a position to examine semantic processing in a sufficiently
fine-grained way to consider the processes posited here. We may
distinguish different categories of information (e.g., perception-related versus motor-related; see semantics, neurobiology
of), but not precise features, defaults, and so on. Possibilities for
future research may be suggested by modeling these processes
in connectionist networks, particularly the key difference
between assuming that defaults apply and assuming that they
do not.
Patrick Colm Hogan
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Hogan, Patrick Colm. 2002. A minimal, lexicalist/constituent transfer account of metaphor. Style 36.3: 484–502.
Ortony, Andrew. 1988. Are emotion metaphors conceptual or lexical? Cognition and Emotion 2: 95–104.
———. 1993. The role of similarity in similes and metaphors. In Metaphor and Thought (2d ed.), ed. Andrew Ortony, 342–56. New York: Cambridge University Press.
Tversky, Amos. 1977. Features of similarity. Psychological Review 84: 327–52.

METAPHOR, NEURAL SUBSTRATES OF


The interest in how the brain processes metaphors traces its
origins back to a tradition that regarded figurative language as
poetic and, hence, the opposite of literal language. Despite its
ubiquity (Lakoff and Johnson 1980), the underlying assumption
has been that this difference should be reflected in both behavioral (Grice 1975; Searle 1979) and brain mechanisms. In this
entry, we examine this and other long-standing assumptions,
suggesting that the interactions of linguistics with empirical,
neuropsychological, and neuroscientific research have drawn
a far more complex and, arguably, fascinating picture, not only
about metaphor but also about the brain.


Is Metaphor Really So Different?


Since the 1970s, the assumption that metaphors are processed
differently from literals has come under close scrutiny. For
example, on the basis of psycholinguistic experiments, it has
been argued that in the presence of rich and supportive context, metaphors and literals are processed along the same routes
(Gibbs 1994; Ortony et al. 1978).
Although some metaphoric and literal expressions require
similar processes (Glucksberg 2001), it has also become increasingly evident that the categories used are in themselves heterogeneous. For instance, some literals ("the ring was made of tin, with a pebble instead of a gem") require more complex (metaphor-like) conceptual mapping processes than others ("That stone we saw in the natural history museum is a gem"; Coulson and Van Petten 2002). Others ("curl up and dye") are more appealing though harder to process than metaphoric equivalents ("curl up and die"; Giora
2003). Metaphors are not all alike either; some are novel, having nonsalient metaphoric interpretations that are usually more
appealing yet harder to process than those that are conventional
and salient (Giora et al. 2004). Furthermore, some metaphoric
stimuli, though relatively conventional, may still be more open-ended than others and, when functioning as a context, give rise
to a wider range of associations (Stringaris et al. 2006).
In fact, recent findings indicate that notions such as degree
of salience, complexity, or open-endedness may be more suitable for describing the complexity of some of the phenomena
in question and span the metaphor-literal divide. Furthermore,
while these notions may, to an extent, overlap, none of them is
specific to metaphor.

Is Metaphor Processed Differently in the Brain?


Consistent with the prevailing view of the right hemisphere
(RH) as being more adept at creativity than the left hemisphere (LH), early lesion studies have been interpreted as
evidence that metaphors rely more heavily than their literal
counterparts on regions in the RH (Winner and Gardner 1977).
However, Ellen Winner and Howard Gardners study actually
reveals that patients with RH lesions were "not insensitive to metaphor" (1977, 725) when offering verbal explications to figurative
stimuli, although they tended to erroneously select literal over
metaphoric interpretations in a picture-matching task. Similarly,
the results of the earliest imaging study in the field (Bottini et al.
1994) were also seen as supporting a RH predominance for metaphor comprehension. However, alternative explanations may be
more appropriate, given that the linguistic items used also differed on categories other than sensu stricto metaphoricity.
Indeed, subsequent studies have challenged the purported
predominance of the RH by demonstrating that when conventional metaphors compared to literals are processed, the LH is
more active (Ahrens et al. 2007; Lee and Dapretto 2006; Oliveri,
Romero, and Papagno 2004), perhaps reflecting retrieval from
semantic stores. In fact, most recent research suggests that in
the absence of a rich biasing context, the hemispheres are insensitive to figurativeness. Rather, the RH is more sensitive than the
LH to novel, nonsalient interpretations and poetic associations,
to complexity, and to open-endedness (Blasko and Kazmerski
2006; Giora 2007). This is corroborated by a recent fMRI study
showing that failure to recruit RH areas when processing novel metaphors distinguishes patients with schizophrenia from healthy controls (Kircher et al. 2007).
Taken together, these findings suggest that lateralization
in the brain's hemispheres is contingent upon such factors as
novelty, semantic and conceptual mapping complexity, and
evoked range of associations, all of which seem to act independently of figurativeness, thus challenging as too simplistic the
notion of a preferential RH processing of stimuli solely by virtue of their metaphoricity. These factors, however, are in accordance with an alternative account, the fine–coarse semantic coding hypothesis (Beeman 1998; Jung-Beeman 2005), which
views the LH as adept at processing finely tuned semantic relations and the RH as specialized in processing distant semantic
relationships.
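The fine–coarse contrast can be illustrated with a toy spreading-activation sketch. The association strengths and threshold values below are invented for the example; they merely stand in for the claim that the LH maintains a narrow field of strongly related meanings while the RH weakly activates a broader one.

```python
# Toy illustration of fine vs. coarse semantic coding.
# Association strengths are invented for the example.
ASSOCIATES = {
    "stethoscope": {
        "doctor": 0.9,        # close, finely tuned relations
        "heartbeat": 0.8,
        "listening": 0.6,
        "wave": 0.2,          # distant relations, the kind needed for
        "transmission": 0.15, # a novel metaphor such as "brain waves
    },                        # are stethoscopes" (Arzouan et al. 2007)
}

def semantic_field(word, threshold):
    """Return the associates activated above a cutoff.

    A high threshold mimics fine (LH-like) coding: few, strongly
    related meanings. A low threshold mimics coarse (RH-like)
    coding: a broad field that retains distant associates.
    """
    return {a for a, s in ASSOCIATES[word].items() if s >= threshold}

lh_field = semantic_field("stethoscope", threshold=0.5)  # fine coding
rh_field = semantic_field("stethoscope", threshold=0.1)  # coarse coding
# Only the coarse field retains the distant associates a novel
# metaphoric interpretation would need:
print(rh_field - lh_field)
```

On this sketch, novelty rather than metaphoricity per se determines which "hemisphere" contributes: the distant associates survive only under coarse coding, consistent with the findings on novel versus conventional metaphors reviewed below.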
NOVELTY. Recent studies indicate that the degree of novelty of
an expression is an important determinant of neural processing. For instance, lesion studies (Giora et al. 2000; Kaplan et
al. 1990) and studies of individuals with Alzheimers disease
(Amanzio et al. 2008), as well as functional magnetic resonance
imaging (fMRI) studies involving healthy participants (Eviatar
and Just 2006), demonstrated that processing nonsalient
(ironic, metaphoric) interpretations relied more heavily
on the RH; processing conventional (metaphoric) meanings
involved the LH. Similarly, a series of fMRI, divided visual field
(DVF), and event-related potential (ERP) studies demonstrated
increased activation of RH areas during processing of nonsalient interpretations of novel metaphors (Arzouan, Goldstein,
and Faust 2007; Faust and Mashal 2007; Mashal and Faust 2008;
Mashal, Faust, and Hendler 2005; Mashal et al. 2007) and literal/compositional interpretations of idioms (Mashal et al.
2008). And while RH advantage was demonstrated in processing nonsalient interpretations of novel metaphors during first
exposure, repeated exposure benefited the LH (Mashal and
Faust 2009).
COMPLEXITY. That RH recruitment increases with complex sentences has been demonstrated by a number of studies (Jung-Beeman 2005). This has also been seen as typifying conceptual
mapping complexity (Coulson and Van Petten 2002), thus introducing another parameter that may determine processing and
operate regardless of metaphoricity. Further work is needed to
establish this view.
RANGE OF SEMANTIC ASSOCIATIONS. Range of semantic associations, also termed degree of open-endedness, can be seen
as determined by the extent to which a stimulus evokes a wide
network of semantic associations (Black 1993). In an fMRI study,
Stringaris et al. (2006) showed that deciding that a given probe
was unrelated to a previous neutral context triggered activation
of frontal RH areas following open-ended (metaphoric) contexts ("Some answers are straight") but not following more restricted (literal) contexts ("Some answers are emotional"). In the case of
the open-ended primes (see priming, semantic), both negative and positive decisions elicited the same neural responses.
Indeed, higher degree of open-endedness may lead to increased
RH activation, probably because of the evocation of remotely
related associations (Jung-Beeman 2005). As shown by Mashal et al. (in press), RH areas were uniquely involved when novel literal interpretations of familiar idioms (involving their familiar
idiomatic meanings as well) were deliberated on.
CONTEXTUAL INFORMATION. Contextual factors involved in
processing (such as biasing information, task, mood, or experience) further argue against a specific and invariant brain
locus for metaphor (Kutas 2006). They show that recruitment
of neural networks depends upon factors other than metaphoricity per se. For instance, in Coulson and Van Petten (2007),
RH advantage in processing novel metaphors disappears in
the presence of biasing information. In Kacinik and Chiarello
(2007), both hemispheres were activated by metaphors, but
only the LH response was context sensitive, thereby restricting
the range of possible alternatives. Conversely, the response in
the RH indicated retention of alternatives available for processing. Findings in Rapp et al. (2007) indicate that the type of task
is an additional determinant of processing. When participants
had to judge the emotional valence of connotations, metaphors activated LH regions, despite their novelty. In Stringaris
et al. (2006), familiar metaphors activated RH areas when a
coherence judgment was required; however, when a meaningfulness judgment was required, the same stimuli evoked LH areas
(Stringaris et al. 2007). In Blasko and Kazmerski (2006), it was
individual differences in experience that mattered: Poets and
nonpoets differed as to which brain areas were recruited when
reading poetry.
In sum, recent research, involving a wide range of methodologies, does not provide support for the long-assumed special
status of metaphor in language. Instead, it shows that the processing of metaphors in the brain depends on a great number of
factors beyond figurativeness.
Rachel Giora and Argyris K. Stringaris
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Ahrens, Kathleen, Ho-Ling Liu, Chia-Ying Lee, Shu-Ping Gong, Shin-Yi
Fang, and Yuan-Yu Hsu. 2007. Functional MRI of conventional and
anomalous metaphors in Mandarin Chinese. Brain and Language
100: 163–71.
Amanzio, Martina, Giuliano Geminiani, Daniela Leotta, and Stefano
Cappa. 2008. Metaphor comprehension in Alzheimer's disease: Novelty matters. Brain and Language 107.1: 1–10.
Arzouan, Yossi, Abraham Goldstein, and Miriam Faust. 2007. Brain
waves are stethoscopes: ERP correlates of novel metaphor comprehension. Brain Research 1160: 69–81.
Beeman, Mark. 1998. Coarse semantic coding and discourse comprehension. In Right Hemisphere Language Comprehension: Perspectives
from Cognitive Neuroscience, ed. Mark Beeman and Christine Chiarello,
255–84. Mahwah, NJ: Erlbaum.
Black, Max. 1993. More about metaphor. In Metaphor and Thought (2d
ed.), ed. Andrew Ortony, 19–41. Cambridge: Cambridge University
Press.
Blasko, G. Dawn, and Victoria A. Kazmerski. 2006. ERP correlates of
individual differences in the comprehension of nonliteral language.
Metaphor and Symbol 21.4: 267–84.
Bottini, Gabriella, Rhiannon Corcoran, Roberto Sterzi, Eraldo Paulesu, P. Schenone, P. Scarpa, et al. 1994. The role of the right hemisphere in the interpretation of figurative aspects of language: A positron emission tomography activation study. Brain 117: 1241–53.

Coulson, Seana, and Cyma Van Petten. 2002. Conceptual integration and metaphor comprehension: An ERP study. Memory & Cognition 30: 958–68.
———. 2007. A special role for the right hemisphere in metaphor comprehension? ERP evidence from hemifield presentation. Brain
Research 1146: 128–45.
Eviatar, Zohar, and Marcel Just. 2006. Brain correlates of discourse
processing: An fMRI investigation of irony and metaphor comprehension. Neuropsychologia 44: 2348–59.
Faust, Miriam, and Nira Mashal. 2007. The role of the right cerebral
hemisphere in processing novel metaphoric expressions taken from
poetry: A divided visual field study. Neuropsychologia 45: 860–70.
Gibbs, W. Raymond, Jr. 1994. The Poetics of Mind. Cambridge: Cambridge
University Press.
Giora, Rachel. 2003. On Our Mind: Salience, Context and Figurative
Language. New York: Oxford University Press.
Giora, Rachel, ed. 2007. Is Metaphor Unique? Neural Correlates of
Nonliteral Language. Special issue, Brain and Language 100.2.
Giora, Rachel, Ofer Fein, Ann Kronrod, Idit Elnatan, Noa Shuval, and
Adi Zur. 2004. Weapons of mass distraction: Optimal innovation and
pleasure ratings. Metaphor and Symbol 19: 115–41.
Giora, Rachel, Eran Zaidel, Nachum Soroker, Gila Batori, and Asa
Kasher. 2000. Differential effects of right- and left-hemisphere damage on understanding sarcasm and metaphor. Metaphor and Symbol
15: 63–83.
Glucksberg, Sam. 2001. Understanding Figurative Language: From
Metaphors to Idioms. New York: Oxford University Press.
Grice, H. Paul. 1975. Logic and conversation. In Speech Acts: Syntax
and Semantics. Vol. 3. Ed. Peter Cole and Jerry Morgan, 41–58. New
York: Academic Press.
Jung-Beeman, Mark. 2005. Bilateral brain processes for comprehending
natural language. Trends in Cognitive Sciences 9: 512–18.
Kacinik, A. Natalie, and Christine Chiarello. 2007. Understanding metaphors: Is the right hemisphere uniquely involved? Brain and Language
100: 188–207.
Kaplan, Joan A., Hiram H. Brownell, Janet R. Jacobs, and Howard
Gardner. 1990. The effects of right hemisphere damage on the pragmatic interpretation of conversational remarks. Brain and Language
38: 315–33.
Kircher, T. J. Tilo, Dirk T. Leube, Michael Erb, Wolfgang Grodd, and
Alexander M. Rapp. 2007. Neural correlates of metaphor processing
in schizophrenia. NeuroImage 34: 281–9.
Kutas, Marta. 2006. One lesson learned: Frame language processing
literal and figurative as a human brain function. Metaphor and
Symbol 21: 285–325.
Lakoff, George, and Mark Johnson. 1980. Metaphors We Live By.
Chicago: University of Chicago Press.
Lee, S. Susan, and Mirella Dapretto. 2006. Metaphorical vs. literal word
meanings: fMRI evidence against a selective role of the right hemisphere. NeuroImage 29: 536–44.
Mashal, Nira, and Miriam Faust. 2008. Right hemisphere sensitivity to
novel metaphoric relations: Application of the signal detection theory. Brain and Language 104.2: 103–12.
———. 2009. Conventionalization of novel metaphors: A shift in hemispheric asymmetry. Laterality 14.6: 573–89.
Mashal, Nira, Miriam Faust, and Talma Hendler. 2005. The role of
the right hemisphere in processing nonsalient metaphorical meanings: Application of principal components analysis to fMRI data.
Neuropsychologia 43.14: 2084–100.
Mashal, Nira, Miriam Faust, Talma Hendler, and Mark Jung-Beeman.
2007. An fMRI investigation of the neural correlates underlying the
processing of novel metaphoric expressions. Brain and Language
100: 115–26.

———. 2008. Hemispheric differences in processing the literal interpretation of idioms: Converging evidence from behavioral and fMRI studies. Cortex 44.7: 848–60.
Oliveri, Massimiliano, Leonor Romero, and Costanza Papagno. 2004.
Left but not right temporal involvement in opaque idiom comprehension: A repetitive transcranial magnetic stimulation study. Journal of
Cognitive Neuroscience 16: 848–55.
Ortony, Andrew, Diane L. Schallert, Ralph E. Reynolds, and Stephen J.
Antos. 1978. Interpreting metaphors and idioms: Some effects of
context on comprehension. Journal of Verbal Learning and Verbal
Behavior 17: 465–77.
Rapp, M. Alexander, Dirk T. Leube, Michael Erb, Wolfgang Grodd, and
Tilo T. J. Kircher. 2007. Laterality in metaphor processing: Lack of evidence from functional magnetic resonance imaging for the right hemisphere theory. Brain and Language 100: 142–9.
Searle, John. 1979. Expression and Meaning. Cambridge: Cambridge
University Press.
Schmidt, L. Gwen, Casey J. DeBuse, and Carol A. Seger. 2007. Right
hemisphere metaphor processing? Characterizing the lateralization of semantic processes. Brain and Language 100: 127–41.
Stringaris, K. Argyris, Nicholas C. Medford, Vincent C. Giampietro,
Michael J. Brammer, and Anthony S. David. 2007. Deriving meaning: Distinct neural mechanisms for metaphoric, literal, and nonmeaningful sentences. Brain and Language 100: 150–62.
Stringaris, K. Argyris, Nicholas C. Medford, Rachel Giora, Vincent C.
Giampietro, Michael J. Brammer, and Anthony S. David. 2006. How
metaphors influence semantic relatedness judgments: The role of the
right frontal cortex. NeuroImage 33: 784–93.
Winner, Ellen, and Howard Gardner. 1977. The comprehension of metaphor in brain-damaged patients. Brain 100: 717–29.

METAPHOR, UNIVERSALS OF

Universal Metaphors?

Native speakers of all languages use a large number of metaphors when they communicate about the world (Lakoff and Johnson 1980). Such metaphorically used words and expressions may vary considerably across different languages. For example, the idea that is expressed in English with the words "spending your time" is expressed in Hungarian as "filling your time." The images that different languages and cultures employ to code meanings can be extremely diverse. Given this diversity, it is natural to ask: Are there any universal metaphors at all, if by universal we mean those linguistic metaphors that occur in each and every language? This question is difficult because it goes against our everyday experiences and intuitions as regards metaphorical language in diverse cultures; it would also be extremely difficult to study, given that there are 4,000–6,000 languages spoken around the world today.

If we go beyond looking at metaphorically used linguistic expressions in different languages, however, and look at conceptual metaphors instead of linguistic metaphors, we begin to notice that many conceptual metaphors appear in a wide range of languages. For example, Hoyt Alverson (1994) found that the TIME IS SPACE conceptual metaphor can be found in such diverse languages and cultures as English, Mandarin Chinese, Hindi, and Sesotho. Many other researchers have suggested that the same conceptual metaphor is present in a large number of additional languages. Several other conceptual metaphors appear in a large number of different languages. On the basis of evidence from a number of linguists who are native speakers of the respective languages, Zoltán Kövecses (2000) points out that English, Japanese, Chinese, Hungarian, Wolof, Zulu, Polish, and others possess the metaphor AN ANGRY PERSON IS A PRESSURIZED CONTAINER, to various degrees. Ning Yu's (1995, 1998) work indicates that the metaphor HAPPINESS IS UP is also present not only in English but also in Chinese. The system of metaphors called the event structure metaphor (Lakoff 1993) includes such submetaphors as CAUSES ARE FORCES, STATES ARE CONTAINERS, PURPOSES ARE DESTINATIONS, ACTION IS MOTION, DIFFICULTIES ARE IMPEDIMENTS (TO MOTION), and so forth. Remarkably, this set of submetaphors occurs in such widely different languages and cultures as Chinese (Yu 1998) and Hungarian (Kövecses 2005), in addition to English. Eve Sweetser (1990) noticed that the KNOWING IS SEEING and the more general MIND IS THE BODY metaphors can be found in many European languages and are probably good candidates for (near-)universal metaphors. As a final example, George Lakoff and Mark Johnson (1999) describe the metaphors used for one's inner life in English. It turns out that metaphors such as SELF-CONTROL IS OBJECT POSSESSION, SUBJECT AND SELF ARE ADVERSARIES, and THE SELF IS A CHILD are shared by English, Japanese, and Hungarian. Given that one's inner life is a highly elusive phenomenon and, hence, would seem to be heavily culture and language dependent, one would expect a great deal of significant cultural variation in such a metaphor. All in all, then, we have a number of cases that constitute near-universal or potentially universal conceptual metaphors, though not universal metaphors in the strong sense.

How Can We Have (Near-)Universal Metaphors?


How is it possible that such conceptual metaphors exist in diverse
languages and cultures? After all, the languages belong to very
different language families and represent very different cultures
of the world. Several answers to this question lend themselves
for consideration. First, we can suggest that by coincidence, all
these languages developed the same conceptual metaphors for
happiness, time, purpose, and so on. Second, we can consider
the possibility that languages borrowed the metaphors from
one another. Third, we can argue that there may be some universal basis for the same metaphors to develop in the diverse
languages.
Let us take as an example the HAPPINESS IS UP conceptual metaphor, first discussed by Lakoff and Johnson (1980) in
English. This conceptual metaphor can be seen in such linguistic expressions as feeling up, being on cloud nine, being high, and
others. Yu (1995, 1998) noticed that the conceptual metaphor
can also be found in Chinese. And evidence shows that it also
exists in Hungarian. Following are some linguistic examples (Yu
used the grammatical abbreviations PRT = particle and ASP =
aspect marker):
Chinese:
happy is up
Ta hen gao-xing.
he very high-spirit
He is very high-spirited/happy.

Metaphor, Universals of
Ta xing congcong de.
he spirit rise-rise PRT
His spirits are rising and rising./He's pleased and excited.
Zhe-xia tiqi le wo-de xingzhi.
this-moment raise ASP my mood
This time it lifted my mood/interest.
Hungarian:
happiness is up
Ez a film feldobott.
this the film up-threw-me
This film gave me a high./This film made me happy.
Majd elszáll a boldogságtól.
almost away-flies-he/she the happiness-from
He/she is on cloud nine.
English, Mandarin Chinese, and Hungarian (a Finno-Ugric
language) belong to different language families, which
developed independently for much of their history. It is also
unlikely that the three languages had any significant impact on
one another in their recent history. This is not to say that such an
impact never shapes particular languages as regards their metaphors (e.g., the processes of globalization and the widespread
use of the Internet may popularize certain conceptual metaphors, such as TIME IS A COMMODITY), but only to suggest
that the particular HAPPINESS IS UP metaphor does not exist in
the three languages because, say, Hungarian borrowed it from
Chinese and English from Hungarian.
So how did the same conceptual metaphor emerge, then, in
these diverse languages? The best answer seems to be that there is
some universal bodily experience that led to its emergence. Lakoff
and Johnson argued early that English has the metaphor because
when we are happy, we tend to be physically up, move around, be
active, jump up and down, smile (i.e., turn up the corners of the
mouth), rather than down, inactive, and static, and so forth. These
are undoubtedly universal experiences associated with happiness
(or, more precisely, joy), and they are likely to produce potentially
universal (or near-universal) conceptual metaphors. The emergence of a potentially universal conceptual metaphor does not,
of course, mean that the linguistic expressions themselves will be
the same in different languages that possess a particular conceptual metaphor (Barcelona 2000; Maalej 2004).
Kövecses (1990, 2000) proposed, furthermore, that the universal bodily experiences can be captured in the conceptual
metonymies associated with particular concepts. Specifically,
in the case of emotion concepts such as happiness, anger, love,
pride, and so forth, the metonymies correspond to various kinds
of physiological, behavioral, and expressive reactions. These
reactions provide us with a profile of the bodily basis of emotion
concepts. Thus, the metonymies give us a sense of the embodied
nature of concepts, and the embodiment of concepts may be
overlapping, that is, (near-)universal, across different languages
and language families. Such universal embodiment may lead to
the emergence of shared conceptual metaphors.
Joseph Grady (1997a, 1997b) developed the Lakoff-Johnson
view further by proposing that we need to distinguish complex

metaphors from primary metaphors. His idea was that complex metaphors (e.g., THEORIES ARE BUILDINGS) are composed of primary metaphors (e.g., LOGICAL ORGANIZATION IS
PHYSICAL STRUCTURE). The primary metaphors consist of correlations of a subjective experience with a physical experience.
As a matter of fact, it turned out that many of the conceptual
metaphors discussed in the cognitive linguistic literature
are primary metaphors in this sense. For instance, HAPPY IS UP
is best viewed as a primary metaphor, wherein being happy is a
subjective experience and being physically up is a physical one
that is repeatedly associated with it. Other primary metaphors
include MORE IS UP, PURPOSES ARE DESTINATIONS, and
INTIMACY IS CLOSENESS. On this view, it is the primary metaphors that are potentially universal.
Primary metaphors function at a fairly local and specific level
of conceptualization, and, hence, in the brain. At the same time,
we can also assume the existence of much more global metaphors (see also generic- and specific-level metaphors).
For example, animals are commonly viewed as humans and
humans as animals; humans are commonly conceptualized as
objects and objects as humans, and so on. A famous example of
the objects as humans metaphor was described by Keith Basso
(1967), who showed that in the language of the Western Apache,
cars are metaphorically viewed in terms of the human body. In
addition, the work of Bernd Heine and his colleagues (Heine,
Claudi, and Hünnemeyer 1991; Heine 1995; Heine and Kuteva
2002) reveals other large-scale metaphorical processes that people seem to employ nearly universally; for example, spatial relations are commonly understood as parts of the human body (e.g.,
the head means up and the feet mean down). These conceptual
metaphors seem to be global design features of the brain/mind
of human beings.
It seems to be clear at this point that commonality in human
experience is a major force shaping the metaphors we have. It is
this force that gives us many of the metaphors that we can take
to be near-universal or potentially universal. But commonality
in human experience is not the only force that plays a role in
the process of establishing and using metaphors. There are also
countervailing forces that work against universality in metaphor
production.

Causes of Metaphor Variation


Heines work also shows that not even such global metaphors as
SPATIAL RELATIONS ARE PARTS OF THE BODY are universal in
an absolute sense. There are languages in which spatial relations
are conceptualized not as the human but as the animal body.
Heine points out that such languages function in societies where
animal husbandry is a main form of subsistence. This leads us
to the question: What causes our metaphors to vary as they do?
It is convenient to set up two large groups of causes: differential
experience and differential cognitive preferences. Differential
experience involves differences in the social-cultural context, in
social and personal history, and in what we can term social and
personal concern or interest (see Kövecses 2005).
One example of how the social-cultural context can shape
conceptual metaphors is provided by Dirk Geeraerts and Stephan
Grondelaers (1995). They note that in the Euro-American tradition, it is the classical-medieval notion of the four humors from

493

Metaphor, Universals of
which the Euro-American conceptualization of anger (as well as
that of emotion in general) derived. The humoral view maintains
that the four fluids (phlegm, black bile, yellow bile, and blood)
and the temperatures associated with them regulate the vital
processes of the human body. The humors were also believed to
determine personality types (such as sanguine, melancholy, etc.)
and account for a number of medical problems. The humoral
view exerted a major impact on the emergence of the European
conception of anger as a hot fluid in a pressurized container. By
contrast, Brian King (1989) and Yu (1995 and 1998) suggest that
the Chinese concept of nu (corresponding to anger) is bound up
with the notion of qi, that is, the energy that flows through the
body. Qi in turn is embedded in not only the psychological (i.e.,
emotional) but also the philosophical and medical discourse of
Chinese culture and civilization. When qi rises in the body, there
is anger (nu). Without the concept of qi, it would be difficult to
imagine the view of anger in Chinese culture. Thus, emotion
concepts, such as anger in English, düh in Hungarian (the two
representing European culture), and nu in Chinese, are in part
explained in the respective cultures by the culture-specific concepts of the four humors and qi, respectively. It appears that the
culture-specific key concepts that operate in particular cultures
account for many of the specific-level differences among the various anger-related concepts and the PRESSURIZED CONTAINER
metaphor.
As an example of how differences in human concern can
create new metaphors, consider some well-known conceptual metaphors for sadness: SADNESS IS DOWN, SADNESS IS
A BURDEN, and SADNESS IS DARK. The counterpart of sadness is depression in a clinical context. Linda McMullen and
John Conway (2002) studied the metaphors that people with
episodes of depression use and, with one exception, found the
same conceptual metaphors for depression that nondepressed
people use for sadness. They identified the unique metaphor as
DEPRESSION IS A CAPTOR. Why don't merely sad people talk
about sadness as being a captor? Most people do not normally
talk about being trapped by, wanting to be free of, or wanting
to break out of sadness, although these are ways of talking and
thinking about depression in a clinical context. It makes sense to
suggest that people with depression use this language and way of
thinking about their situation because it faithfully captures what
they experience and feel. Their deep concern is with their unique
experiences and feelings that set them apart from people who
do not have them. It is this concern that gives them the CAPTOR
metaphor for depression (see also emotion and language).
People can employ a variety of different cognitive operations
in their effort to make sense of experience. For example, what I
call experiential focus can have an impact on the specific details
of the conceptual metaphors used, and what is conceptualized
metaphorically in one culture can predominantly be conceptualized by means of metonymy in another (Kövecses 2005). The
universal bodily basis on which universal metaphors could be
built may not be utilized in the same way or to the same extent
in different languages. What experiential focus means is that
different peoples may be attuned to different aspects of their
bodily functioning in relation to a metaphorical target domain
(see source and target) or that they can ignore or downplay
certain aspects of their bodily functioning with respect to the
metaphorical conceptualization of a target domain. A case in
point is the conceptualization of anger in English and Chinese.
As studies of the physiology of anger across several unrelated
cultures show, increase in skin temperature and blood pressure are universal physiological correlates of anger (Levenson
et al. 1992). This accounts for the ANGER IS HEAT metaphor in
English and in many other languages. However, King's and Yu's
work mentioned earlier suggests that the conceptualization of
anger in terms of heat is much less prevalent in Chinese than it
is in English. In Chinese, the major metaphors of anger seem to
be based on pressure not heat. This indicates that speakers of
Chinese have relied on a different aspect of their physiology in
the metaphorical conceptualization of anger than speakers of
English. The major point is that in many cases, the universality
of the experiential basis does not necessarily lead to universally
equivalent conceptualization, at least not at the specific level of
hot fluids.
Are there any differences in the way the cognitive processes
of metaphor versus metonymy are used in different languages
and cultures? Jonathan Charteris-Black (2003) examined in great
detail how and for what purpose three concepts mouth, tongue,
and lip are figuratively utilized in English and Malay. He found
similarities in metaphorical conceptualization. For example,
in both languages, the same underlying conceptual metaphor
(e.g., MANNER IS TASTE) accounts for expressions like honey-tongued and lidah manis (tongue sweet), and in both languages
such expressions are used for the discourse function of evaluating (especially negatively) what a person says. However, he also
found that the figurative expressions involving the three concepts
tended to be metonymic in English and metaphoric in Malay. In
English, more than half of the expressions were metonyms, while
in Malay the vast majority of them showed evidence of metaphor
(often in combination with metonymy). For example, while metonymic expressions like tight-lipped abound in English, such
expressions are much less frequent in Malay. It seems that, at
least in the domain of speech organs, the employment of these
concepts by means of figurative processes is partially culture
specific.
In sum, metaphorical linguistic expressions may vary widely
cross-culturally, but many conceptual metaphors appear to be
potentially universal or near-universal. This happens because
people across the world share certain bodily experiences.
However, even such potentially universal metaphors may display variation in their specific details because people do not
use their cognitive capacities in the same way from culture to
culture. Moreover, shared conceptual metaphors may vary
cross-culturally in the frequency of their use. Finally, many conceptual metaphors are unique to particular (sub)cultures or sets
of cultures because of differences in such factors as the social-cultural context, history, or human concern that characterize
these cultures.
Zoltán Kövecses
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Alverson, Hoyt. 1994. Semantics and Experience: Universal Metaphors
of Time in English, Mandarin, Hindi, and Sesotho. Baltimore: Johns
Hopkins University Press.

Barcelona, Antonio. 2000. On the plausibility of claiming a metonymic
motivation for conceptual metaphor. In Metaphor and Metonymy at
the Crossroads, ed. A. Barcelona, 31–58. Berlin: Mouton de Gruyter.
Basso, Keith H. 1967. Semantic aspects of linguistic acculturation.
American Anthropologist, n.s., 69.5: 471–7.
Charteris-Black, Jonathan. 2003. Speaking with forked tongue: A comparative study of metaphor and metonymy in English and Malay
phraseology. Metaphor and Symbol 18.4: 289–310.
Geeraerts, Dirk, and Stephan Grondelaers. 1995. Looking back at
anger: Cultural traditions and metaphorical patterns. In Language
and the Cognitive Construal of the World, ed. J. Taylor and R. MacLaury,
153–79. Berlin: Mouton de Gruyter.
Grady, Joseph. 1997a. Foundations of meaning: Primary metaphors and
primary scenes. Ph.D. diss., University of California at Berkeley.
. 1997b. Theories are buildings revisited. Cognitive Linguistics
8: 267–90.
Haspelmath, Martin. 1997. From Space to Time: Temporal Adverbials in
the World's Languages. Munich and Newcastle: Lincom Europa.
Heine, Bernd. 1995. Conceptual grammaticalization and prediction. In
Language and the Cognitive Construal of the World, ed. J. Taylor and R.
MacLaury, 119–35. Berlin: Mouton de Gruyter.
Heine, Bernd, Ulrike Claudi, and Friederike Hünnemeyer. 1991.
Grammaticalization: A Conceptual Framework. Chicago: University of
Chicago Press.
Heine, Bernd, and Tania Kuteva. 2002. World Lexicon of
Grammaticalization. Cambridge: Cambridge University Press.
King, Brian. 1989. The conceptual structure of emotional experience in
Chinese. Ph.D. diss., Ohio State University.
Kövecses, Zoltán. 1990. Emotion Concepts. Berlin and New York: Springer-Verlag.
. 2000. Metaphor and Emotion. New York and Cambridge:
Cambridge University Press.
. 2005. Metaphor in Culture: Universality and Variation. Cambridge
and New York: Cambridge University Press.
Lakoff, George. 1993. The contemporary theory of metaphor. In
Metaphor and Thought, ed. A. Ortony, 202–51. Cambridge: Cambridge
University Press.
Lakoff, George, and Mark Johnson. 1980. Metaphors We Live By.
Chicago: University of Chicago Press.
. 1999. Philosophy in the Flesh. New York: Basic Books.
Levenson, R. W., P. Ekman, K. Heider, and W. V. Friesen. 1992. Emotion
and autonomic nervous system activity in the Minangkabau of West
Sumatra. Journal of Personality and Social Psychology 62: 972–88.
Maalej, Zouhair. 2004. Figurative language in anger expressions in
Tunisian Arabic: An extended view of embodiment. Metaphor and
Symbol 19.1: 51–75.
McMullen, Linda, and John Conway. 2002. Conventional metaphors for
depression. In Verbal Communication of Emotion: Interdisciplinary
Perspectives, ed. S. Fussell, 167–81. Mahwah, NJ: Lawrence
Erlbaum.
Mithen, Steven. 1996. The Prehistory of the Mind: A Search for the Origin
of Art, Science and Religion. London and New York: Thames and
Hudson.
. 1998. A creative explosion? Theory of mind, language, and the
disembodied mind of the Upper Paleolithic. In Creativity in Human
Evolution and Prehistory, ed. S. Mithen, 165–91. London and New
York: Routledge.
Sweetser, Eve. 1990. From Etymology to Pragmatics. Cambridge and New
York: Cambridge University Press.
Yu, Ning. 1995. Metaphorical expressions of anger and happiness in
English and Chinese. Metaphor and Symbolic Activity 10: 59–92.
. 1998. The Contemporary Theory of Metaphor in Chinese: A
Perspective from Chinese. Amsterdam: John Benjamins.


METER
Verse is text that is divided into lines (verse lines). One of the
subtypes of verse is metrical verse. In metrical verse, the length
of the lines is controlled by a set of rules (indirectly, all metrical
rules count syllables). The lines in metrical verse are usually
subject to other restrictions as well, most commonly restrictions
on the rhythm of the line (based on stress, syllable weight,
or lexical tone), and/or on a requirement that a syllable in a
specific line-internal location be word-initial or word-final (a
caesura rule). Some meters also include rules about rhyme or
alliteration.
Although verse is probably a universal (see poetic form,
universals of), found in all oral or literary traditions, there are
poetic traditions without metrical verse, of which perhaps the
best known is the Hebrew poetry of the Old Testament, which
is based on syntactic parallelism rather than on counted syllables. Metrical verse is found in European literatures (Greek,
English, the various Celtic, Germanic, Romance, and Slavic literatures, also Finnish), in Arabic and Islamic literatures (e.g.,
Persian, Urdu, Turkish, Hausa), and in literatures less clearly
influenced by Arabic (such as Berber and Somali), in the literatures of South Asia (e.g., Sanskrit, Pali, Hindi, Malayalam, Tamil),
of Southeast Asia (e.g., Thai, Burmese, Vietnamese), and of East
Asia (e.g., Chinese, Korean, Japanese). Metrical verse is reported
to be largely or completely absent in the poetry of ancient Semitic
literatures and of Australia, non-Islamic Africa, the Americas,
and New Guinea, but this may just be because researchers have
not been looking for it (fieldworkers far too rarely ask questions
about the poetics or poetic practice of a culture).
The variety of meters can be illustrated by some examples.
English iambic pentameter requires a 10- or 11-syllable line,
with even-numbered syllables tending to have stress. The French
alexandrin requires a line of 12 or 13 syllables, with the sixth syllable both stressed and word-final. Swahili shairi requires a line
of 16 syllables, with the eighth syllable word-final (and no control over rhythm). Greek iambic trimeter requires a line of 12 syllables, with even-numbered syllables heavy (containing a long
vowel or ending in two consonants), and the third, seventh and
eleventh syllables light (containing a short vowel and ending in at
most one consonant). Arabic kamil requires a line of between 12
and 15 syllables with a complex rhythmic control (in the shortest line the third, seventh and eleventh syllables are light and the
others heavy). Sanskrit sardulavikridita requires a line of 19 syllables with the aperiodic rhythm heavy heavy heavy light light
heavy light heavy light light light heavy (word boundary) heavy
heavy light heavy heavy light heavy. Japanese meters require
lines of five or seven light syllables (but permit a heavy syllable
to substitute for two light syllables). A genre of Vietnamese pairs
a six-syllable line with an eight-syllable line, in which the second and sixth (and eighth) syllables belong to one tonal class
and the fourth to another. Germanic alliterative meter requires
between two and four stressed syllables, at least two of which
must alliterate.
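The rules just surveyed lend themselves to explicit statement. As a minimal illustration (my own sketch, not drawn from any of the theories discussed; the function names and the flag/weight encodings are assumptions), two of the meters above can be written as machine-checkable constraints on a line of syllables:

```python
# Illustrative sketch: a syllable is a set of property flags such as
# "stressed" or "word-final"; a meter is a predicate on the whole line.

def check_alexandrin(syllables):
    """French alexandrin: a line of 12 or 13 syllables, with the sixth
    syllable both stressed and word-final."""
    return (len(syllables) in (12, 13)
            and {"stressed", "word-final"} <= syllables[5])

def check_greek_iambic_trimeter(weights):
    """Greek iambic trimeter: 12 syllables, with even-numbered syllables
    heavy ('H') and the third, seventh, and eleventh light ('L').
    `weights` is a sequence of 'H'/'L'; unconstrained positions are free."""
    return (len(weights) == 12
            and all(weights[i] == "H" for i in (1, 3, 5, 7, 9, 11))
            and all(weights[i] == "L" for i in (2, 6, 10)))

# A well-formed alexandrin skeleton: only the sixth syllable is constrained.
line = [set() for _ in range(12)]
line[5] = {"stressed", "word-final"}
print(check_alexandrin(line))                       # True
print(check_greek_iambic_trimeter("LHLHLHLHLHLH"))  # True
```

The point of the encoding is only that each meter reduces to a length condition plus positional conditions on a two-way syllable classification, as the article goes on to discuss.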
In literary studies, meter is usually discussed as an aid to
interpretation, and less attention has been paid to the theory that underlies the meter than is desirable. The approach to
meter most common in such studies is the foot combination and
substitution approach. In this approach, a meter such as iambic
pentameter is a template made by combining five iambic feet
each of which is composed of a sequence of an unmarked syllable followed by a syllable that is marked. The resulting template
is matched to a line whose syllables are unstressed or stressed,
so that stressed syllables occupy marked positions in the template and unstressed syllables the unmarked positions. For
lines that are not fully periodic (e.g., in an iambic pentameter
line where the rhythm does not involve a uniform repetition of
unstressed-stressed throughout), the template itself is changed
by substituting a foot of a different kind (e.g., a spondee for an
iamb) to match the stress pattern of the variant part of the line.
This approach only describes the actual rhythm of the line, and
though it offers a convenient vocabulary for the literary critic, it
tells us nothing about the organization of the meter of the line
or why some variations are possible in this meter and some not.
Most recent theoretical accounts express strong reservations or
total rejection of this approach.
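The template machinery that the foot-combination-and-substitution approach assumes can be made concrete in a few lines. The sketch below is my own illustration of the idea, not a formalization endorsed by any theorist named here; names and the 0/1 stress encoding are assumptions:

```python
# Feet as (position, position) pairs; 'u' = unmarked, 's' = marked.
IAMB = ("u", "s")
TROCHEE = ("s", "u")
SPONDEE = ("s", "s")

def pentameter_template(substitutions=None):
    """Build an iambic pentameter template from five iambs, optionally
    substituting another foot at given (0-based) foot positions."""
    feet = [IAMB] * 5
    for index, foot in (substitutions or {}).items():
        feet[index] = foot
    return [position for foot in feet for position in foot]

def matches(template, stresses):
    """Match a line (sequence of 0/1 stress values) against the template:
    marked positions take stressed syllables, unmarked take unstressed."""
    return (len(stresses) == len(template)
            and all((pos == "s") == bool(st)
                    for pos, st in zip(template, stresses)))

regular = [0, 1] * 5               # fully periodic iambic line
inverted = [1, 0] + [0, 1] * 4     # trochaic rhythm in the first foot
print(matches(pentameter_template(), regular))               # True
print(matches(pentameter_template(), inverted))              # False
print(matches(pentameter_template({0: TROCHEE}), inverted))  # True
```

The sketch also makes the criticism in the paragraph above tangible: the "substitution" step simply rewrites the template to fit whatever rhythm the line happens to have, so the machinery describes the line without constraining it.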
Recent theoretical approaches to meter are based primarily
on linguistic theory, particularly on the theory of phonology,
following the foundational work of Morris Halle and Samuel Jay
Keyser (1971). For metrical purposes, most such theories adopt
mechanisms that are used in the theory of phonology, particularly the theory of word stress. Following Mark Libermans (1975)
insight that stress is a matter of the relation between syllables, not
a feature of syllables, different approaches explored the use of
trees and grids as representations in accounts both of word stress
and of metrical poetry (Kiparsky 1977; Hayes 1983). The phonological theory of optimality theory has also been adapted for
use in metrical verse (e.g., Golston and Riad 2005). Nigel Fabb
and Halle (2008) develop their account of poetic meter from a
formalism proposed for word stress by William Idsardi (1992); it
groups the syllables with the help of unpaired parentheses both
in phonology (word stress) and in lines (metrical verse). While in
most approaches the metrical representation is a template built
by special rules and then matched to the line, for Fabb and Halle
the metrical representation is generated from the line (much as
in generative syntax the syntactic representation is generated
from the terminal elements, such as words or morphemes).
In metrical verse, as noted, the length of the line is controlled.
In most cases, the basic unit of measurement is the syllable.
However, in many metrical traditions, some syllables are part of
the line but uncounted. In a common convention, a syllable ending in a vowel precedes a syllable beginning in a vowel, but only
one of the two syllables is counted for metrical purposes (though
both are usually pronounced). The latter fact shows something
important; it shows that the grouping and counting of syllables
for metrical purposes is not directly dependent on the phonology
of the lines. It also poses evident problems for other approaches,
such as that of Kristin Hanson and Paul Kiparsky (1996), which
attempt to account for variation in number of syllables by referring to the specific phonology of the language.
In Japanese, some Indian meters, and some other metrical
traditions, morae are counted, a heavy syllable counting as two
morae and a light syllable as one mora. It is often argued that
the heavy syllable actually consists phonologically of two morae,
but this is not necessary for an explanation of the meter, which
can refer just to the syllable as projecting one or two metrical
In Sanskrit and later Indian meters, morae are thus
counted, but some meters count morae while also controlling
syllables: In the gana-counting meters, syllables form (typically)
four-mora groups that are respected in composing the line.
Some song meters (e.g., Tongan, Ugandan) also use mora counting as an organizing principle, but this may be a secondary effect
in song traditions where heavy syllables match two beats, light
syllables match one, and the number of beats is musically controlled. Here, a better understanding of the independent metrical status of text and tune is required.
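Mora counting as described here is easy to state explicitly. The following sketch is my own encoding (with 'H' for a heavy and 'L' for a light syllable), showing a heavy syllable projecting two metrical elements and a light syllable one:

```python
def mora_count(syllables):
    """Each syllable projects one or two metrical elements:
    a heavy syllable ('H') counts as two morae, a light ('L') as one."""
    return sum(2 if syllable == "H" else 1 for syllable in syllables)

def fits_line(syllables, target):
    """A line of `target` morae (e.g., 5 or 7 in the Japanese case):
    a heavy syllable may stand in for two lights, so lines of
    different syllable counts can fill the same measure."""
    return mora_count(syllables) == target

print(fits_line(["L", "L", "L", "L", "L"], 5))  # True
print(fits_line(["H", "L", "L", "L"], 5))       # True (2+1+1+1)
print(fits_line(["H", "H", "L"], 5))            # True (2+2+1)
print(fits_line(["L", "L", "L", "L"], 5))       # False
```

As the text notes, nothing here requires that the heavy syllable phonologically consist of two morae; the meter need only let each syllable project one or two metrical elements.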
In addition to controlling the length of the line, metrical rules
also often control a pattern based on putting the syllables into
two classes, one marked and the other unmarked. It is of particular interest that metrical rules differentiate two kinds of syllable but apparently never three or more kinds, even though this
greater differentiation is phonetically possible in many languages. For example, in English metrical verse, the only strictly
controlled syllables are those that carry main stress in a polysyllable; other syllables, whether stressed or not, are not strictly
controlled, and this is why English metrical verse is rhythmically
fairly variable (see the discussion in music, language and).
This means that as regards the strict regulation of syllable types
in English meters, the syllable carrying the main stress in a polysyllabic word is in one class, and all other syllables, whether
stressed or unstressed, are in the other class. Yet English distinguishes several degrees of stress in longer words, such as autobiographical or onomatopoeic, and there is no question that in the
perception of the rhythm of the line, we perceive more than two
degrees of stress. In the quantitative meters of Greek, Sanskrit, or
Arabic, syllable placement depends on whether a syllable is light
or is heavy. In Vietnamese, there are phonologically six distinct
kinds of tone, but the six types of syllable are grouped into just
two tonal classes for metrical purposes. It is interesting in this
connection to consider alliterative meters, such as the meter of
Beowulf; in the normative line with four stressed syllables, the
third must alliterate with the first and/or second but not with the
fourth. Here, stressed syllables are partitioned into two types:
alliterating and not alliterating.
A patterned distribution based on two metrical classes of syllable, such as the heavy and light syllables in Greek verse, is often
thought of as the basis of the rhythm of the line. A major way in
which theories of meter diverge is in their account of the relation
between rhythm and meter. For example, in English iambic pentameter, there is a general tendency for odd-numbered syllables
to be unstressed and even-numbered syllables to be stressed,
but the actual pattern of stressed and unstressed syllables varies constantly from line to line; thus, lines in the same meter can
vary in their rhythm. Some accounts of English meters attempt to
explain the full range of rhythmic variation by building statistical
tendencies into the metrical rules. In a different approach, Derek
Attridge (1982) incorporates rhythm fully into his account of
metrical verse, so that meter and rhythm are accounted for by a
single theory. In his account, the metrical template also includes
elements that match silences in the text (offbeats), thus building temporal notions into the metrical theory. This and similar
accounts must cope with the fact that lines with the same metrical pattern can be realized with different rhythmic patterns,
and vice versa. If rhythm is not explained by the metrical rules,
several types of explanation are possible (and can be combined).
For example, Fabb (2002) argues that the perception of rhythmic
regularity involves pragmatic processes of pattern matching that
are distinct from metrical rules (which govern those aspects of
the line that are strictly controlled and, like other kinds of implicit
linguistic rules, are not directly perceived). It is also possible that
rhythmic patterns might be independently represented, perhaps by grids similar to those found in metrical verse. The link
between the metrical form and the rhythmic form of the verse
then may fall under a theory of text-to-tune matching.
Nigel Fabb and Morris Halle
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Attridge, Derek. 1982. The Rhythms of English Poetry. Harlow,
UK: Longman.
Fabb, Nigel. 2002. Language and Literary Structure: The Linguistic
Analysis of Form in Verse and Narrative. Cambridge: Cambridge
University Press.
Fabb, Nigel, and Morris Halle. 2008. Meter in Poetry: A New Theory.
Cambridge: Cambridge University Press.
Golston, Chris, and Tomas Riad. 2005. The phonology of Greek lyric
meter. Journal of Linguistics 41: 77–115.
Halle, Morris, and Samuel Jay Keyser. 1971. English Stress: Its Form, Its
Growth and Its Role in Verse. New York: Harper and Row.
Hanson, Kristin, and Paul Kiparsky. 1996. A parametric theory of poetic
meter. Language 72: 287–335.
Hayes, Bruce. 1983. A grid-based theory of English meter. Linguistic
Inquiry 14: 357–94.
Idsardi, William. 1992. The computation of stress. Ph.D. diss.,
Massachusetts Institute of Technology.
Kiparsky, Paul. 1977. The rhythmic structure of English verse. Linguistic
Inquiry 8: 189–247.
Liberman, Mark. 1975. The intonational system of English. Ph.D. diss.,
Massachusetts Institute of Technology.

METHODOLOGICAL SOLIPSISM
Methodological solipsism (MS) is the thesis that mental (or
psychological) states are to be individuated solely by referring
to their relationships with other mental states and the physical
state of someone's brain, but not by referring to the physical
world outside the individual to whom those states are ascribed.
This phrase was coined by Hilary Putnam (1975a) in an essay
about meaning externalism and internalism, but its
main advocate (Putnam opposes the thesis) is Jerry Fodor, who
defends it as part of the representational and computational theory of mind (Fodor 1980).
The main issue that MS is concerned with is the relationship between mental states and the outside world or, rather, the
absence of such relationships so far as explanation in scientific
psychology is concerned. In Putnam's (1975a) Twin Earth
thought experiment, the question is whether Putnam and his
Twin are in the same mental state when thinking about water,
given that on Earth water is H2O, whereas on Twin Earth water
has the chemical formula XYZ (though it behaves otherwise identically to water on Earth). If mental states are to explain behavior,
Fodor argues, it needs to be the case that Putnam and his Twin
are in the same mental state when they are thinking "I would like
to take a dive into the deep waters." In an externalist account of

mental states, this is impossible because Putnam's thought refers
to H2O whereas his Twin's thought refers to XYZ. Fodor's conclusion from this argument is that mental states are to be construed
narrowly, without reference to the external state of the world.
Only the so-called narrow content and structure of a belief determines behavior, not whether the belief is about H2O or XYZ or
even whether it's true or not.
In addition to mental states being internal, Fodor argues,
mental processes have to be computational, that is, work on the
formal, syntactical properties of mental states, rather than on
their semantical properties, which are forbidden in an internalist account. This is called the formality condition.
Fodors main argument for MS is a negative one: Adhering
to its counterpart renders psychology practically impossible
because it assumes the availability of a full description of the
relevant aspects of the world in physical terms, such being
necessary to individuate mental states. That is, we would need
to have the physical description of water available to tell us
what a mental state containing water is about. However, such
physical descriptions are often unavailable. MS is different
from functionalism (as proposed by Putnam) in that it adds the
requirement that mental states are formal, symbolic entities on
which computational processes can work. Functionalism defines
mental states to be determined by their functional role, that is,
their place in a causal network of other mental states, sensory
inputs, and behavior resulting from them. Functionalism sets
apart mental states from their physical substratum, whereas MS
divorces mental states from their causal antecedents in the world
and proposes that mental states are to be treated as syntactical
rather than semantical entities.
Ingmar Visser
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Fodor, Jerry A. 1980. Methodological solipsism considered as a research
strategy in cognitive psychology. Behavioral and Brain Sciences
3: 63–109.
Putnam, Hilary. 1975a. The meaning of meaning. In Mind, Language
and Reality: Philosophical Papers, II: 215–71. Cambridge: Cambridge
University Press.
. 1975b. The nature of mental states. In Mind, Language and
Reality: Philosophical Papers, II: 429–40. Cambridge: Cambridge
University Press.
Tuomela, Raimo. 1989. Methodological solipsism and explanation in
psychology. Philosophy of Science 56.1: 23–47.

METHODOLOGY
The topic of methodology most generally involves exploring the
range of responses to the following questions that any researcher
in the language sciences must answer: What sort of empirical
data are you collecting, how are you collecting it, and how do
you hope it will bear on the research question(s) you are trying
to answer? Addressing these questions for a particular specialization within the language sciences falls under the purview
of its experts. My goal here is a more general examination of
issues that arise when language scientists assess alternative data
sources and means of data collection and seek to interpret their
data. These high-level choices require more finesse in language
sciences than in others, perhaps, because (for the purposes of
most researchers) humans are the only creatures that display the
target phenomenon, namely, language. This creates ubiquitous
challenges owing to the great complexity of the human organism (as compared, say, to a fruit fly) and to ethical considerations
that prevent us from carrying out potentially informative procedures that are used with other species.
I structure the discussion around a taxonomy of the ways
in which data can be collected by language scientists. Practical
exigencies limit the discussion to data about (human) language,
though many language scientists need to gather other sorts of
data as well (e.g., computational linguists collect simulation
data, anthropological linguists collect cultural data, dialectologists collect geographical data, etc.). I further narrow the focus
to data collected for research purposes, leaving aside issues particular to clinical (see speech-language pathology), forensic (see forensic linguistics), and other applications. I also
omit discussion of instrumental, statistical, or other formal treatment of data: This is important, but ultimately futile if ones data
do not properly address the questions to be answered.
My taxonomic framework divides empirical methods along
two dimensions. One dimension along which data collection
can be characterized is by the population of speakers/hearers
(hereafter simply "speakers," by which I mean also users of sign
languages) from whom one is collecting data: adult native
speakers of a particular language, children growing up in bilingual households, a creole community, the last surviving
speaker of a dying language (see extinction of languages),
the unknown author(s) of an ancient text, the editorial board of a
dictionary, and so on. A second dimension, more or less orthogonal to the first in principle, is how the language data get from
the speaker(s) to the researcher(s). Researchers may observe the
speaker while he or she is doing something involving language,
or they may gather data produced as the result of a prior event
involving language, via an artifact or another person.
As illustrations, I use hypothetical findings that should not
be taken as statements of fact. Because their importance here
lies in clarifying conceptual points, I have not restricted myself
to attested uncontroversial results, though I believe these hypothetical results not to be wildly implausible.

Different Populations
INTRINSIC INTEREST IN SUBGROUPS. Obviously, if ones research
questions are about a particular population (e.g., the language
of autistic children [see autism and language], speakers of
tone languages), then this is an excellent reason for collecting
data from that population, but it is not the only possible reason,
as will be discussed presently. Research on any group other than
the default (healthy adult native speakers) is virtually always
comparative, if only implicitly: In order to know whether one is
really discovering properties of population X, rather than simply
heretofore unknown properties of human language in general,
X must be compared with population Y with regard to the same
properties. For example, finding that some group of bilinguals
has an average vocabulary size of n in their dominant language
would be most interesting in the context of knowing that an
otherwise-comparable group of monolinguals has an average
vocabulary size of, say, 1.75n. Determining what constitutes
an otherwise comparable population is one of the major challenges of research. For example, if you want to study specific
language impairment (SLI), you presumably want to compare children with SLI to unaffected children, but which ones?
If you use children of the same chronological age, you will surely
find many differences in their speech, but this will mainly confirm that a speech pathologist was correct in diagnosing the first
group with SLI. More interesting would be to find that younger
unaffected children whose language is similar to that of the
children with SLI in some respects (e.g., mean length of utterance) are nonetheless more advanced in others (e.g., correct use of
inflectional morphology).
There are numerous ways of classifying speakers into groups
that have seemed fruitful: age (see aging and language),
gender (see gender and language), handedness, education,
native versus non-native speaker, mono- versus bi-/multilingual
(see bilingualism and multilingualism), socioeconomic
status (see sociolinguistics), and many more. Interpreting
any correlations one finds between one of these variables and
some language phenomenon is rarely straightforward, however. For example, if we find that increasing age correlates with
increasing frequency of tip-of-the-tongue states, does that implicate a general decline in memory retrieval with age, or rather an
ability to partially retrieve words that younger people could not
retrieve at all, thanks to greater exposure to these words over the
course of a lifetime (Gollan and Brown 2006)?
RELEVANCE OF ATYPICAL SPEAKERS TO THE STUDY OF TYPICAL
SPEAKERS. A second reason for studying a particular population
is to allow us to learn things about typical language that typical
speakers do not. For example, it has been suggested that certain
language disorders represent an otherwise intact language system from which one circumscribed grammatical mechanism has
been removed or rendered inoperative, as in Yosef Grodzinsky's
1986 account of agrammatic aphasia, according to which
traces of movement are missing from otherwise normal syntactic representations. No unaffected speakers would provide us
with the opportunity to pose the question "What does syntax
look like if you take out just the traces?" Similarly, SLI in later
life could, on certain views, allow us to ask how an incompletely
developed morphosyntax behaves when coupled with adult-sized open-class vocabulary and general cognitive capacities,
such as working memory (see working memory and language processing), permitting us to test, for example, the
behavior of complex sentences in such circumstances. The
cognitive immaturity of (typically developing) children precludes this kind of test.
There is a major caveat when dealing with atypical populations, however, particularly when analyzing them using theories based on typical populations or drawing conclusions about
such populations. We do not know how circumscribed their
deviation from the norm really is. For example, in the case of
focal brain damage, it is not an innocent assumption to posit that
the speaker's subsequent use of language is simply the output of
an otherwise normal brain (as it was before the lesion occurred)
minus whatever function(s) used to be performed by the lost
neural structures. Rather, the speaker's language use is the product of a damaged brain that has recovered from injury in ways
that we do not yet know how to ascertain. Some functions previously performed in the damaged area may have been taken over
by intact areas, which may in turn have lost some of their original
functionality; areas that were inhibited by the damaged area may
now be free to come into play; and so forth.
Another kind of atypical population includes the expert users
of language: authors, comedians, songwriters, journalists, poets,
playwrights, preachers, politicians, and so on. They can be taken
as proof by example of what it is possible for humans to do with
language, but beyond that we know very little about how they
come by their expertise, and so it is hard to say how they might
inform the study of language in nonexpert speakers.
A special reason for choosing particular speakers to study is their genetic relationship to other speakers. This is
most obvious for language disorders suspected to have a heritable component, such as SLI. But genetic relationships, in particular between twins, can be used in language sciences (as in many
sciences) to approach issues concerning the possible contributions of the genotype to aspects of language in the phenotype.
The standard methodology is to compare monozygotic to dizygotic twin pairs, whereby the former share (on average) twice as
much genetic material. Any phenomenon of language where the
monozygotic pairs are more similar is taken to be shaped more
heavily by prewired brain structures (cf. Ganger 1998).
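The MZ/DZ comparison just described is classically quantified by the Falconer estimate, which the entry does not spell out but which follows directly from the "twice as much genetic material" logic. A minimal sketch in Python, with hypothetical correlation values:

```python
# Sketch of the classic Falconer estimate that quantifies the
# MZ/DZ comparison described above; the correlation values are
# hypothetical, for illustration only.

def pearson_r(xs, ys):
    """Pearson correlation between paired twin scores."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

def falconer_h2(r_mz, r_dz):
    """Heritability estimate: MZ pairs share (on average) twice
    the segregating genetic material of DZ pairs, so doubling
    the correlation gap isolates the genetic contribution."""
    return 2 * (r_mz - r_dz)

# E.g., MZ pairs correlating at .80 and DZ pairs at .55 on some
# language measure yield h2 of about 0.5: roughly half the
# variance attributed to the genotype.
print(falconer_h2(0.80, 0.55))
```

Actual twin studies fit more elaborate models than this, but the doubling step is exactly the comparison of monozygotic to dizygotic similarity described in the text.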

Types of Data Collection: Immediate Versus Delayed


By immediate collection of language-relevant data I mean that
researchers obtain data from a speaker while he or she is engaged
in some language-related behavior (though this may not involve
any action or even any awareness of language on the speaker's
part). By delayed data collection I mean that researchers
obtain language-related data after the fact, including by studying
artifactual records of previous language-related behaviors (e.g.
transcriptions, recordings, corpora [see corpus linguistics],
grammars, etc.). This distinction is crucial because it bears on
how much researchers can know about the original event. In
what follows I exemplify numerous approaches in each category
and suggest advantages and disadvantages.
KINDS OF IMMEDIATE DATA COLLECTION. There are really only two
sorts of data one can collect from speakers who are doing a linguistic task (meant broadly, as shorthand for doing something
that involves language). One is to collect data on what they are
(deliberately) doing: for example, if they are talking, what they are
saying; if they are listening, when they are nodding, when they are
smiling, and so on. The other is to collect some other physical measure that will (one hopes) provide evidence about language inside
them. This can take the form of voluntary behavioral measures or
involuntary physiological or brain measures. (The data may be preserved for later analysis, e.g., on videotape. What is crucial in counting it as immediate data is that it captures the speaker's immediate
response. Even a questionnaire can fall into this category if speakers
report their immediate reactions, e.g., Yes/No or numeric ratings.)
In the category of involuntary responses, we find such techniques as measuring galvanic skin response, pupil diameter,
and eye movements, plus indicators of brain activity from neuroimaging: positron emission tomography (PET), functional magnetic resonance imaging (fMRI), event-related potentials (ERP), and magnetoencephalography (MEG). Useful results have been obtained on some measures without giving subjects any
task at all, simply by exposing them to language auditorily, but
depending on the technique, mental tasks or even ones involving
responses such as button pressing are possible. A serious methodological issue arises when we want to interpret the resulting
data, however. For example, consider eye-tracking data from a
sentence reading task. The assumption has usually been that
the longer a reader spends looking at a particular word or group
of words, the harder they found those words to process or understand. While that is true in many cases, one situation in which
people may spend a very short time looking at some word is
when it signals the need to reanalyze an earlier part of the sentence and triggers an immediate regressive eye movement. This
clearly should not be taken to indicate ease of processing. Due
to challenges of this sort, there are now a half dozen or more
measures of fixation times commonly reported in eye-tracking
studies, but their interpretation is not agreed upon and may
well depend on the particulars of what is being read. Here, and
especially for brain measures, as data become richer they do not
necessarily become more informative until foundational results
establish how the basic response features are to be interpreted.
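The multiplicity of fixation-time measures just mentioned can be made concrete with a toy computation of two commonly reported ones, first-pass reading time and total reading time; the record format below is an assumption for this sketch, not any eye-tracker's actual output:

```python
# Toy illustration of two commonly reported fixation-time
# measures. The (word_index, duration_ms) record format is an
# assumption for this sketch, not any eye-tracker's real output.

def first_pass_time(fixations, word):
    """Durations on `word` before the eyes first leave it."""
    total, entered = 0, False
    for w, dur in fixations:
        if w == word:
            total += dur
            entered = True
        elif entered:
            break  # gaze has left the word: first pass over
    return total

def total_time(fixations, word):
    """All fixation durations on `word`, including re-reading
    after a regressive eye movement back to it."""
    return sum(dur for w, dur in fixations if w == word)

# Reader fixates word 2, moves on, then regresses back to it:
record = [(1, 200), (2, 250), (3, 180), (2, 300)]
print(first_pass_time(record, 2), total_time(record, 2))  # 250 550
```

A word triggering reanalysis can thus show a short first-pass time but a long total time, which is why no single measure settles whether it was "hard to process."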
It is sometimes thought that we do not actually need to
understand these detailed properties of brain activity in order to
make productive use of these measures: So long as we can show
that stimulus Y patterns like stimulus X while stimulus Z patterns differently, then we have evidence that whatever manipulation we used in creating the stimuli classifies X with Y to the
exclusion of Z. For example, someone might try to ask whether
binding theory (see also anaphora) is really part of syntax or
part of semantics by creating a sentence that violates a clearly
syntactic principle (X), one that violates a clearly semantic principle (Z), and a binding violation (Y), and then seeing whether
Y patterns like X or like Z in ERPs (or neither, in which case
no conclusion can be drawn). But it is impossible to construct
sentences that are identical in all respects (phonology, morphology, sequence of word classes, etc.) except for these violations, and so if we know nothing about what the observed brain
patterns actually mean, all that this kind of experiment can tell
us is that some property shared by X and Y is lighting up, and Z
does not share that property. (Although ERP researchers speak
of components sensitive to syntactic violations versus semantic anomaly, the basis for this is a very small range of sentence types, and "semantic" really refers to real-world implausibility, not violations of principles of formal semantics.)
Turning now to conscious reaction tasks, the most common
of course involve psychology's favorite technique, measuring
reaction time to press a button, say a word, and so on. Within
certain schools of linguistics, grammaticality judgments
are the most favored (increasingly encompassed by the broader
term well-formedness ratings, as they are also applied to individual words and are elicited on multipoint or open-ended scales).
In mentioning these two types of data collection side by side, my
intent is to emphasize their similarities. They both involve collecting behavioral measures in immediate response to some linguistic stimulus. Grammaticality judgments can be recorded and
timed by computer. Contrariwise, experiments normally carried
out by computer can be done interview style, for example, with
people who cannot read. Interviews lose fine-grained timing
information (experimenters should still note gross differences
in response times) but gain elsewhere, including the allowance
for open-ended narrative responses and the possibility of asking
follow-up questions contingent thereon. It is important to note
that the presence/absence of a laboratory setting, electronic
equipment, statistical analysis, and so on has no bearing on the
conceptual/epistemological nature of the data collected.
Finally in this category are data from speakers who are actually using language with no extra task imposed on them. It is
surprising prima facie how little time most language scientists
spend actually observing just these events. The reason is largely
practical. Most research necessarily concentrates on a quite specific aspect of some linguistic phenomenon; waiting for it to arise
by chance is too resource intensive. Nonetheless, it is important
to keep in mind that every step away from the real situations we
are interested in will introduce both random errors and systematic distortions: for example, in the case of transcripts, due to imperfect recording quality and the transcriber's subconscious
assumptions about what is being said, respectively.
KINDS OF DELAYED DATA COLLECTION. I classify as a delayed data situation one in which the object of the researcher's measurements, observations, and so on is not a person at a time when he or she is engaging with language, but rather some indication of what may have happened at such a time: an artifact produced by that person, or a behavior observed by someone other than the researcher. There are two major subclasses of such data.

One subtype comprises any instance of written language, whether created by an original act of writing or representing an attempt to transcribe or otherwise keep a record of language that was originally spoken. (Although this distinction is important, all written material, including phonetic transcription, loses much information found in spoken language.) This includes documents from now-dead languages, dictionaries and grammars, dialect atlases, poetry, song lyrics, scripts, and so on (in some of which the writer's intent may be to sound unlike his own or anyone else's natural speech or prose writing), as well as corpora amassed specifically for academic purposes. Any text found on the World Wide Web falls into this category as well. One can, of course, treat textual material as an object of study unto itself, ignoring how it was created, but if one wants to use it as evidence bearing on human language in general, then considering the many differences between writing and talking becomes paramount. Most significant is the ability to edit written material after initially producing it (in most situations). The Web, increasingly used as a corpus because of its size, comes with many special problems: It can be hard to ascertain who actually wrote any given passage; it is usually impossible to establish the native language(s), gender, age, and so on of the author; the intended meaning and discourse function is often unclear; and so forth.

The second subtype of delayed data is hearsay, that is, reports given by someone about language phenomena witnessed, told about, or engaged in personally. For example, elderly speakers might report that their parents used to use some expression but that they themselves never used it. This is information that researchers have no way of independently verifying. Likewise, a member of an isolated society that does not welcome outsiders may report that its leader uses special vocabulary; if researchers are not members of this society, they must take the word of someone else.

Evidently there are some situations in which the use of delayed language data is unavoidable, for example, when studying dead languages or when speakers are not accessible. Also, quantitative measures such as word frequency could not practically be calculated entirely from immediate interactions with individual speakers. More generally, the use of delayed data affords us a much larger sample of language material, hence, potential exposure to rare phenomena that we might otherwise never become aware of. However, it is misguided to think that the availability of billions of words of computer-searchable text has eliminated the need for explicit data-gathering tasks: For the vast majority of the world's languages, the quantity of existing written materials (if there is a writing system at all) is many orders of magnitude smaller than for the languages that dominate the information age, and much of it is not on computer.

Conclusion

Part of what makes the study of language both fascinating and frustrating is that language can never truly be studied in isolation: It inexorably traces back to the bodies and brains of human beings, both of which are always doing myriad things. Rather than trying to ignore this as an inconvenience, researchers would do well to keep it in mind whenever they have methodological decisions to make. Sometimes, as with twin studies and certain language disorders, it can even be turned into an advantage.
Carson T. Schütze
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Botha, Rudolph P. 1981. The Conduct of Linguistic Inquiry: A Systematic
Introduction to the Methodology of Generative Grammar. The
Hague: Mouton.
Cowart, Wayne. 1997. Experimental Syntax: Applying Objective Methods
to Sentence Judgments. Thousand Oaks, CA: Sage.
Ganger, Jennifer B. 1998. Genes and environment in language acquisition: A study of early vocabulary and syntactic development in twins.
Ph.D. diss., Massachusetts Institute of Technology.
Gollan, Tamar H., and Alan S. Brown. 2006. From tip-of-the-tongue
(TOT) data to theoretical implications in two steps: When more TOTs
means better retrieval. Journal of Experimental Psychology: General
135: 462–83.
Grodzinsky, Yosef. 1986. Language deficits and the theory of syntax.
Brain and Language 27: 135–59.
Labov, William. 1972. Some principles of linguistic methodology.
Language in Society 1: 97–120.
Matthewson, Lisa. 2004. On the methodology of semantic fieldwork.
International Journal of American Linguistics 70: 369–415.
Newmeyer, Frederick J. 1983. Grammatical Theory, Its Limits and Its
Possibilities. Chicago: University of Chicago Press.
Resnik, Philip, Aaron Elkiss, Ellen Lau, and Heather Taylor. 2006. The Web in theoretical linguistics research: Two case studies using the Linguist's Search Engine. In Proceedings of the 31st Annual Meeting of the Berkeley Linguistics Society, 265–76.
Schütze, Carson T. 1996. The Empirical Base of Linguistics: Grammaticality
Judgments and Linguistic Methodology. Chicago: University of Chicago
Press.

———. 2005. Thinking about what we are asking speakers to do. In Linguistic Evidence: Empirical, Theoretical, and Computational Perspectives, ed. Stephan Kepser and Marga Reis, 457–84. Berlin: Mouton de Gruyter.

METONYMY
Metonymy (Greek "change of name") is one of the
major figures of speech recognized in classical rhetoric. The
Roman treatise Rhetorica ad Herennium defines metonymy as "a trope that takes its expression from near and close things by which we can comprehend a word that is not denominated by its proper word." This ancient characterization already points to
two criterial notions of metonymy, contiguity and substitution,
which still occur in most present-day definitions of metonymy
as the substitution of one word for another with which it is
associated.
Recent studies in cognitive linguistics have shown that
metonymy is not just a matter of words and their substitution
but is part of human thinking and reasoning. The conceptual
nature of metonymy has been demonstrated by George Lakoff
(1987). For example, the term mother makes many people think
of a housewife mother. The relationship between mothers and
housewives is metonymic and operates only on the conceptual
level: The category mother is metonymically associated with the
subcategory housewife mother as one of its members.
Various cognitive linguists have described the conceptual
basis of metonymy using the notion conceptual frame. Frames
are packages of knowledge about coherent segments of experience. The elements of a frame are conceptually contiguous: Any
element evokes the frame as a whole and, concomitantly, other
elements within the frame network. For example, the concept
author establishes a frame that includes literary works, a publisher, biographical information, etc. Since these elements are
conceptually contiguous, they may be exploited by metonymy.
Thus, we may metonymically refer to a book by naming its author,
as in We are reading Shakespeare. Typically, a metonymic interpretation is coerced when there is a conceptual conflict between
expressions belonging to the same frame. In the previous example, the verb read requires an object that denotes a linguistically
coded content, such as a book or a letter. The conceptual conflict
is resolved by understanding Shakespeare as a reference point
that provides mental access to Shakespeare's literary work
(Langacker 1993; Radden and Kövecses 1999).
Studies in metonymy have traditionally focused on words.
Standard examples on the synchronic level include The kettle is boiling (container for content) and Jonathan is in the
phone book (person for name). Metonymic processes on the
diachronic level have long been noted by historical linguists and amply demonstrated since the nineteenth century.
Metonymic shifts have been observed cross-linguistically in a
number of conceptual frames (Koch 1999). For example, in the
marriage frame, a preparatory status of being engaged may
stand for the state of being married. Thus, the Latin word sponsus/
sponsa with the meaning 'fiancé/fiancée' shifted its meaning to 'bride/bridegroom' and ended up with the meaning 'husband/wife', as in Spanish esposo/esposa, French époux/épouse, and
English spouse.

Like lexical metonymies, grammatical metonymies operate


both on the synchronic and diachronic levels. The coercive process in metonymy is particularly striking in cases where grammatical meaning conflicts with lexical meaning. For example,
stative predicates, such as the verb be, may be used in constructions that normally require action predicates, such as imperatives. Thus, the slogan of the American news network CNN, "Be the first to know," is interpreted as the effect of an intentional act to be carried out by the hearer: "Do something [viz. watch CNN] so that, as a result, you are the first to know." The conceptual shift
at work here is based on the result for action metonymy. On
the diachronic level, metonymy plays a crucial role in grammaticalization processes. For example, the lexical item go (in
conjunction with the present progressive) in the phrase be going
to has grammaticalized into a future marker. Human motion is
typically directed toward a goal and, hence, is strongly associated with the intention of reaching the goal. Since the goal can
only be reached in the future, the intention to reach the goal may
stand for the future itself.
Looked at from a pragmatic point of view, metonymy can
be regarded as a matter of inferencing. We can distinguish the
following three types of metonymic inference: inferences about
a referential item (referential metonymy), inferences about a
predicate (predicational metonymy), and inferences about the
speech-act meaning (illocutionary metonymy) (Panther
and Thornburg 1998). Referential metonymy is a means of indirect reference. For example, the use of subway in The subway is on
strike invites the inference that the subway personnel is meant.
Predicational metonymy is exemplified by utterances such as
The saxophone player had to leave early, which in many contexts
induces the metonymic inference that the saxophone player left
early. In this case, an obligation to leave is interpreted as an actually occurring action. Illocutionary metonymy is illustrated by
utterances such as Can you lend me ten dollars? The speaker literally poses a question about the hearer's ability to lend the speaker
$10, but this question gives rise to the metonymic inference that
the hearer is being asked to lend $10 to the speaker; it is understood as a request. Conventional indirect requests like these are
not just random substitute forms for the direct request Lend me
ten dollars. The literal meaning of the metonymic expression has
an important communicative function in this indirect request.
It addresses a potential obstacle: The hearer might be unable
to carry out the requested action because he or she needs the
money, too (Gibbs 1994). In fact, the example illustrates an
important general point: The literal meaning of a metonymy is
always relevant to the interpretation of metonymic expressions.
It thus provides strong evidence against the view that metonymy
is merely the substitution of one word for another.
Günter Radden and Klaus-Uwe Panther
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Gibbs, Raymond W., Jr. 1994. The Poetics of Mind: Figurative Thought,
Language, and Understanding. Cambridge: Cambridge University
Press.
Koch, Peter. 1999. Frame and contiguity: On the cognitive bases of
metonymy and certain types of word formation. In Panther and
Radden 1999, 139–67.
Lakoff, George. 1987. Women, Fire, and Dangerous Things: What Categories
Reveal about the Mind. Chicago: University of Chicago Press.
Lakoff, George, and Mark Johnson. 1980. Metaphors We Live By. Chicago
and London: University of Chicago Press.
Langacker, Ronald. 1993. Reference-point constructions. Cognitive
Linguistics 4: 1–38.
Panther, Klaus-Uwe, and Günter Radden, eds. 1999. Metonymy in
Language and Thought. Amsterdam and Philadelphia: Benjamins.
Panther, Klaus-Uwe, and Linda L. Thornburg. 1998. A cognitive approach
to inferencing in conversation. Journal of Pragmatics 30: 755–69.
Radden, Günter, and Zoltán Kövecses. 1999. Towards a theory of metonymy. In Panther and Radden 1999, 17–59.

MINIMALISM
Minimalism, extending earlier work in transformational
grammar and generative grammar, conjectures that the
computational system central to human language is a perfect solution to the task of relating sound and meaning. Recent
research has investigated the complexities evident in earlier
models and attempted to eliminate them, or to show how they
are only apparent, following from deeper and simpler properties. Major examples of this work include the reduction of the
number of linguistic levels of representation in the model and
the deduction of certain constraints on syntactic derivations
from general considerations of economy and computational
simplicity.
Like earlier versions of generative grammar, the minimalist
program (MP) (Chomsky 1995b, 2000, 2004, 2005) maintains that
linguistic competence is a computational system creating and
manipulating structural representations. MP further proposes
that the derivations and representations conform to economy
criteria, demanding that they be minimal in a sense determined
by the language faculty (perhaps ultimately by general properties of organic systems): no extra steps in derivations, no extra
symbols in representations, and no representations beyond
those that are conceptually necessary.

Reduction of Levels
Minimalism developed out of the government and binding
(GB) or principles and parameters model (Chomsky 1981,
1982; Chomsky and Lasnik 1993). In that model, there are four
significant levels of representation, related by derivation:
(1)       D(eep)-Structure
                 |
          S(urface)-Structure
              /       \
            PF         LF
     (Phonetic Form) (Logical Form)

Given that a human language is a way of relating sound (or, more generally, gesture, as in sign languages) and meaning,
the interface levels PF and LF were assumed to be ineliminable.
Minimalism begins with the hypothesis that there are no other
levels.

Structure Building
Minimalism, in a partial return to the apparatus of pre-1965
transformational theory (Chomsky 1955), has lexical items
inserted throughout the course of the syntactic derivation, via
generalized transformations, rather than all in one initial block.
The derivation proceeds bottom up with the most deeply
embedded structural unit created, then combined, via merge,
with the head of which it is the complement to create a larger
unit, and so on. Consider the derivation of The woman will see
the man: The noun (N) man is combined with the determiner (D)
the to form the determiner phrase (DP) the man. This DP then
combines with the verb see to produce an intermediate projection (in the sense of x-bar theory), V-bar. The DP the woman
is created in the same fashion as the man, and is combined with
the V-bar to produce the VP. Next, this VP merges with the Infl
will producing I-bar. The DP the woman finally moves (leaving
a TRACE t) to the specifier position of I, yielding the full clausal
projection IP, schematically illustrated in (2) (by labeled bracketing, a notational variant of tree representation):
(2) [IP The woman [I' will [VP t [V' see [DP the man]]]]]
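The sequence of merge steps just described can be mimicked as a toy data structure; this is an illustrative sketch, not the formal theory, and the tuple representation is an assumption of the sketch:

```python
# A toy data-structure sketch of the bottom-up derivation above
# (an illustration, not the formal theory): merge pairs two
# syntactic objects under a projected label.

def merge(label, left, right):
    """Combine two syntactic objects into a labeled constituent."""
    return (label, left, right)

dp_obj  = merge("DP", "the", "man")       # [DP the man]
v_bar   = merge("V'", "see", dp_obj)      # intermediate projection
vp      = merge("VP", "t", v_bar)         # trace of the raised subject
i_bar   = merge("I'", "will", vp)
dp_subj = merge("DP", "the", "woman")
ip      = merge("IP", dp_subj, i_bar)     # subject in specifier of I

def bracket(node):
    """Render a constituent as labeled bracketing."""
    if isinstance(node, str):
        return node
    label, left, right = node
    return f"[{label} {bracket(left)} {bracket(right)}]"

print(bracket(ip))
# [IP [DP the woman] [I' will [VP t [V' see [DP the man]]]]]
```

Note that each larger unit exists only once its parts have been built, which is precisely why no single representation corresponds to D-structure.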

In this model, there is no one representation following all lexical insertion and preceding all singulary transformations. That
is, there is no D-structure.

Some Minimalist Goals


So far, S-structure persists: If there is a point where the derivation divides, branching toward LF on one path and toward PF on
the other, that point is S-structure. The more significant question
is whether there are any crucial conditions defined on it as in
the GB framework, for example, with respect to binding theory
(Chomsky 1981). One goal of the minimalist research program is
to establish that these further properties are actually properties
of LF, as suggested in the mid-1980s (Chomsky 1986), contrary to
previous arguments (Chomsky 1981).
Another goal is to reduce all constraints on representation to
bare output conditions, determined by the properties of the mental systems that LF and PF must interface with. For instance, the
motor system determines that a phonetic representation must
be linearly ordered.
Internal to the computational system, the desideratum is
that constraints on transformational derivations be reduced to
general principles of economy. Derivations beginning from the
same choice of lexical items are compared in terms of number
of steps, length of movements, and so on, with the less economical ones being rejected. An example is the minimalist deduction
of the superiority condition, which demands that when multiple
items are available for wh-movement in a language, like English,
allowing only one to move, it is the highest one (one closest to
the root of the phrase structure tree) that will move:
(3) Who t will read what

(4) *What will who read t [* indicates ungrammaticality]

Economy, in the form of shortest move, selects (3) over (4) since
the subject is higher than the object, hence, closer to the sentence-initial target of wh-movement than the object is.
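The economy comparison can be caricatured as a selection among candidates: given the wh-phrases available for movement and some measure of their distance from the clause-initial target, shortest move picks the minimum. The depth numbers below are purely illustrative assumptions:

```python
# Caricature of the shortest-move comparison: among competing
# wh-phrases, economy selects the one nearest the clause-initial
# target. The depth numbers are purely illustrative assumptions.

def shortest_move(candidates):
    """candidates: (wh-phrase, distance-to-target) pairs;
    return the phrase with the shortest movement path."""
    return min(candidates, key=lambda c: c[1])[0]

# The subject 'who' sits higher in the tree than the object
# 'what', so it is closer to the target and wins, as in (3):
print(shortest_move([("who", 1), ("what", 3)]))  # who
```

The global-economy worry discussed next is visible even here: a real comparison would have to enumerate and rank whole derivations, not just two candidates in one clause.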
The simplifying developments in the theory leading toward
the minimalist approach generally led to greater breadth and
depth of understanding of both how human languages are organized (descriptive adequacy) and how they develop in children's
minds (explanatory adequacy) (see descriptive, observational, and explanatory adequacy). This success
led Noam Chomsky to put forward the audaciously minimalist conjecture that we are now in a position to go even beyond
explanatory adequacy: The human language faculty might be
a computationally perfect solution to the problem of relating
sound and meaning, the minimal computational system given
the boundary conditions provided by other modules of the mind.
This conjecture leads to a general minimalist critique of syntactic
theorizing, including Chomsky's own earlier minimalist theorizing. Consider first the leading idea that multiple derivations
from the same initial set of lexical choices are compared. This
introduces considerable complexity into the computation, especially as the number of alternative derivations multiplies. It thus
becomes desirable to develop a model whereby all relevant derivational decisions can be made in strictly Markovian fashion: At
each step, the very next successful step can be determined, and
determined easily. This arguably more tractable local economy
model was suggested by Chomsky (1995a) and developed in
detail by Chris Collins (1997).

The Last Resort Nature of Syntactic Movement

From its inception in the early 1990s, minimalism has insisted on the last resort nature of movement: Movement must happen for a formal reason. The case filter (see filters), which was a central component of the GB system, was thought to provide one such driving force. A standard example involves subject raising:

(5) John is certain [t to fail the exam]

(6) It is certain [that John will fail the exam]

In (5), as in (6), John is the understood subject of fail the exam. This fact is captured by deriving (5) from an underlying structure much like that of (6), except with an infinitival embedded sentence instead of a finite one:

(7) __ is certain [John to fail the exam]

John in (7) is not in a position appropriate to any case. By raising to the higher subject position (specifier of the higher Infl), it can avoid a violation of the case filter, since the raised position is one where nominative case is licensed. But if the case requirement of John provides the driving force for movement, the requirement will not be satisfied immediately upon the introduction of that nominal expression into the structure, under the assumed bottom-up derivation. Rather, satisfaction must wait until the next cycle, when a higher layer of structure is built, or, in fact, until an unlimited number of cycles later, as raising configurations can iterate:

(8) John seems [t to be certain [t to fail the exam]]

A minimalist perspective favors an alternative whereby the driving force for movement can be satisfied immediately. Suppose that Infl has a feature that must be checked against the NP. Then as soon as that head has been introduced into the structure, it attracts the NP or DP that will check its feature. Movement is then seen from the point of view of the target, rather than the moving item itself. The case of the NP does get checked as a result of the movement, but that is simply a beneficial side effect of the satisfaction of the requirement of the attractor. The earlier minimalist approach to the driving force of movement was called Greed by Chomsky. This later one developed out of what Howard Lasnik (1995) called Enlightened Self Interest.

The Syntactic Similarity of Languages

One recurrent theme in GB and minimalist theorizing, motivated by the quest for explanatory adequacy, is that human languages are syntactically very similar. The standard GB and early minimalist instantiation of this claim was the proposal that superficial differences result from potential derivational timing differences among languages, with the same transformation applying in overt or covert syntax. Under both circumstances, LF reflects the results of the transformation. For example, the wh-movement operative in English interrogative sentences is overt movement to specifier of C(omplementizer). In many other languages, including Chinese and Japanese, interrogative expressions seem to remain in situ, unmoved, as seen in the contrast between (9) and its English translation in (10):

(9)  ni xihuan shei [Chinese]
     you like who

(10) Who do you like

C.-T. Huang (1981/1982) argued that even in such languages there is movement, by showing that well-established locality
constraints on wh-movement, such as those of John Robert Ross
(1967), also constrain the distribution and interpretation of certain seemingly unmoved wh-expressions in Chinese. This argument was widely influential and laid the groundwork for much
GB and minimalist research.
Along related lines, Chomsky argued that V-raising, overt in
virtually all of the Romance languages, among others, operates
covertly in English, as in the following examples from English
and their translations into French:
(11) a. John often kisses Mary
b. *John kisses often Mary
(12) a. *Jean souvent embrasse Marie
b. Jean embrasse souvent Marie

The assumption is that the position of the verb vis-à-vis the adverb
indicates whether the verb has raised overtly. For V-raising, the
feature driving the movement is claimed to be one that resides
in Infl. The feature might be strong, forcing overt movement (as
in French), or weak. Similarly, the feature demanding overt wh-movement in English is a strong feature of C. The principle Procrastinate disallows overt movement except when it is necessary
(i.e., for the satisfaction of a strong feature as in Chomsky 1993;
Lasnik 1999a).
Procrastinate invited a question. Why is delaying an operation
until LF more economical than performing it earlier? Further,
many of the hypothesized instances of covert movement do
not have the semantic effects (with respect to quantifier scope,
anaphora, etc.) that corresponding overt movements have, as
discussed by Lasnik (1999b, Chapters 6 and 8). To address these
questions, Chomsky (2000; 2001) argues for a process of agreement (potentially at a substantial distance) that relates the two items that need to be checked against each other. Many of the
phenomena that had been analyzed as involving covert movement are reanalyzed as involving no movement at all, just the
operation Agree (though Huang's argument indicates that there
are at least some instances of covert movement). Overt phrasal
movement (such as subject raising) is then seen in a different
light: It is not driven by the need for case or agreement features
to be checked (since that could take place via Agree). Instead, it
takes place to satisfy the requirement of certain heads (including Infl) that they have a specifier (in the X-bar theoretic sense).
Such a requirement was already formulated by Chomsky (1981),
and dubbed the extended projection principle (EPP) in Chomsky
(1982). To the extent that long distance A-movement (basically,
movement to a higher subject position) as in (8) proceeds successive-cyclically through each intermediate subject position,
the EPP is motivated, since, as observed earlier, these intermediate positions are not case-checking positions.
An important question at this point is why language has the
seeming imperfection of movement processes at all. We can
distinguish two major types of movement, phrasal movement
and head movement. Chomsky conjectures that phrasal movement is largely to convey topic-comment information (and possibly to make scope relations more transparent), and that the
EPP is the way the computational system formally implements
this. V-movement, on the other hand, is conjectured to have
PF motivation (guaranteeing that the Infl affix will ultimately
be attached to a proper host, V), and may even be a PF process.
Another possibility is that movement is simply a generalization
of the merge operation combining smaller structures into larger
ones. Given that merge is ineliminable, perhaps move is not an
imperfection after all.

Syntactic Interfaces
The connection between syntactic derivation and semantic and
phonological interfaces has long been a central research area. In
minimalism, interpretation could be distributed over many structures in the course of transformational cycles. Already decades
ago, Joan W. Bresnan (1971) argued that the rule responsible for
assigning English sentences their intonation contour applies
following each cycle of transformations, rather than at the end
of the syntactic derivation. Ray Jackendoff (1972) put forward
similar proposals for semantic phenomena involving scope and
anaphora. Chomsky (2000, 2001) argues for a general instantiation of this distributed approach, sometimes called Multiple
Spell-Out, based on Epstein (1999) and Uriagereka (1999).
At the end of each cycle (or phase in Chomsky's more recent
work), the syntactic structure thus far created can be encapsulated and sent off to the interface components for phonological
and semantic interpretation. Thus, even the levels of PF and LF
fade away. Samuel D. Epstein argues that such a move represents
a conceptual simplification (in the same way that elimination of
D-structure and S-structure did), and both Juan Uriagereka and
Chomsky provide empirical justification. The role of syntactic
derivation, always important in Chomskian theorizing, becomes
even more central on this view. Epstein reasons that the centrality of (asymmetric) c-command (as opposed to one of a whole
range of other conceivable geometric relations) in syntax is
predicted on this strongly derivational view, but not in a more

504

representational theory. As the derivation proceeds, always merging together pairs of items, sisterhood and domination are the only immediately available primitives. And X (asymmetrically) c-commands Y if and only if Y is dominated by the sister of X. These notions are illustrated in (13), where B and C are sisters, as are D and E; A dominates B, C, D, and E; and C dominates D and E. B asymmetrically c-commands D and E.

(13)
        A
       / \
      B   C
         / \
        D   E

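The derivational definition of asymmetric c-command can be made concrete in a short sketch (Python is used purely for illustration; the dictionary encoding of tree (13) and the function names are our own, not from the source):

```python
# A minimal sketch of the text's definition of asymmetric c-command over
# tree (13), using only the primitives mentioned: sisterhood and domination.
# The dict encoding and function names are illustrative, not from the source.

tree = {"A": ["B", "C"], "C": ["D", "E"]}  # parent -> children

def dominates(x, y, tree):
    """True if node x properly dominates node y."""
    return any(c == y or dominates(c, y, tree) for c in tree.get(x, []))

def sister(x, tree):
    """Return x's sister node, if any (binary branching assumed)."""
    for children in tree.values():
        if x in children and len(children) == 2:
            return [c for c in children if c != x][0]
    return None

def asym_c_commands(x, y, tree):
    """X asymmetrically c-commands Y iff Y is dominated by the sister of X."""
    s = sister(x, tree)
    return s is not None and dominates(s, y, tree)

# Matches the text: B asymmetrically c-commands D and E, but not vice versa.
print(asym_c_commands("B", "D", tree))  # True
print(asym_c_commands("D", "B", tree))  # False
```

Note how the asymmetry falls out: B's sister C dominates D and E, while D's sister E dominates nothing, so nothing c-commands back up into the higher structure.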
Multiple Spell-Out effectively deals with a range of reconstruction phenomena. For example, an anaphor normally requires an
antecedent that c-commands it:
(14)

John criticized himself

(15)

*Himself criticized John

But when the anaphor is fronted from a position c-commanded by an antecedent to a position not in that structural relation, the
anaphoric connection is nonetheless possible:
(16)

Himself, John criticized

This follows straightforwardly if anaphora can be interpreted prior to movement.
Chomsky has also explored another kind of approach to
reconstruction, based on a condition that he calls Inclusiveness (Chomsky 1995a). This condition demands that a syntactic derivation merely combine elements of the numeration. No new
entities can be created. Traces, as traditionally conceived, violate this condition. Chomsky therefore concludes that a trace of
movement is actually a copy of the item that moved, rather than
a new sort of entity. This is yet another return to earlier generative approaches (wherein movement was seen as a compound of
copying and deletion). The copy left behind is normally deleted
in the phonological component (though Bošković 2001 presents arguments that under certain circumstances, lower copies
are pronounced in order to rescue what would otherwise be
PF violations) but could persist for semantic purposes, such as
the licensing of anaphoric connection. Danny Fox (2000) presents an analysis of scope and anaphora reconstruction effects in
terms of the copy theory.
An influential research line, initiated by Richard S. Kayne
(1994), extends the impact of c-command to PF as well. Kayne
hypothesizes that the linear order that is manifest in PF (as it must
be, given properties of the phonetic system) comes about via his
linear correspondence axiom (LCA), which states that asymmetric c-command is mapped onto PF linear order. This has the
far-reaching consequence that structures must always be right-branching. Subject-verb-object (SVO) languages like English are
broadly consistent with this requirement, but subject-object-verb
(SOV) languages like Japanese are not. Kayne's antisymmetry
approach reanalyzes SOV languages as underlyingly SVO (as all
languages must be by this hypothesis), with the SOV order derived
by (leftward) movement. One crucial unanswered question is the
source of the driving force for the required movements.

Conclusion
Chomsky constantly emphasizes that minimalism is as yet still just an approach, a set of questions and a conjecture about how human language works (perfectly), and a general program for exploring the questions and developing the conjecture. The descriptive and explanatory success attained thus far gives some reason for optimism that the approach can be developed into an articulated theory of human linguistic ability and of why it has the exact properties it does.
Howard Lasnik

WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Boeckx, Cedric. 2006. Linguistic Minimalism: Origins, Methods, Concepts, and Aims. Oxford: Oxford University Press.
Bošković, Željko. 2001. On the Nature of the Syntax-Phonology Interface: Cliticization and Related Phenomena. Amsterdam: Elsevier Science.
Bresnan, Joan W. 1971. Sentence stress and syntactic transformations. Language 47: 257–81.
Chomsky, Noam. 1955. The logical structure of linguistic theory. Manuscript, Harvard University and Massachusetts Institute of Technology. Revised 1956 version published in part by Plenum, New York, 1975, and by University of Chicago Press, Chicago, 1985.
Chomsky, Noam. 1981. Lectures on Government and Binding. Dordrecht, the Netherlands: Foris.
Chomsky, Noam. 1982. Some Concepts and Consequences of the Theory of Government and Binding. Cambridge, MA: MIT Press.
Chomsky, Noam. 1986. Knowledge of Language. New York: Praeger.
Chomsky, Noam. 1993. A minimalist program for linguistic theory. In The View from Building 20: Essays in Linguistics in Honor of Sylvain Bromberger, ed. Kenneth Hale and Samuel J. Keyser, 1–52. Cambridge, MA: MIT Press. Repr. in Chomsky 1995b, 167–217.
Chomsky, Noam. 1995a. Categories and transformations. In The Minimalist Program, 219–394. Cambridge, MA: MIT Press.
Chomsky, Noam. 1995b. The Minimalist Program. Cambridge, MA: MIT Press.
Chomsky, Noam. 2000. Minimalist inquiries: The framework. In Step by Step: Essays on Minimalist Syntax in Honor of Howard Lasnik, ed. Roger Martin, David Michaels, and Juan Uriagereka, 89–155. Cambridge, MA: MIT Press.
Chomsky, Noam. 2001. Derivation by phase. In Ken Hale: A Life in Language, ed. Michael Kenstowicz, 1–52. Cambridge, MA: MIT Press.
Chomsky, Noam. 2004. Beyond explanatory adequacy. In Structures and Beyond: The Cartography of Syntactic Structures. Vol. 3. Ed. Adriana Belletti, 104–31. Oxford: Oxford University Press.
Chomsky, Noam. 2005. Three factors in language design. Linguistic Inquiry 36: 1–22.
Chomsky, Noam, and Howard Lasnik. 1993. The theory of principles and parameters. In Syntax: An International Handbook of Contemporary Research. Vol. 1. Ed. Joachim Jacobs, Arnim von Stechow, Wolfgang Sternefeld, and Theo Vennemann, 506–69. Berlin: Walter de Gruyter. Repr. in Chomsky 1995b, 13–127.
Collins, Chris. 1997. Local Economy. Cambridge, MA: MIT Press.
Epstein, Samuel D. 1999. Un-principled syntax: The derivation of syntactic relations. In Working Minimalism, ed. Samuel D. Epstein and Norbert Hornstein, 317–45. Cambridge, MA: MIT Press.
Fox, Danny. 2000. Economy and Semantic Interpretation. Cambridge, MA: MIT Press.
Huang, C.-T. James. 1981/1982. Move wh in a language without wh-movement. Linguistic Review 1: 369–416.
Jackendoff, Ray. 1972. Semantic Interpretation in Generative Grammar. Cambridge, MA: MIT Press.
Kayne, Richard S. 1994. The Antisymmetry of Syntax. Cambridge, MA: MIT Press.
Lasnik, Howard. 1995. Case and expletives revisited: On Greed and other human failings. Linguistic Inquiry 26: 615–33. Repr. in Lasnik 1999b, 74–96.
Lasnik, Howard. 1999a. On feature strength: Three minimalist approaches to overt movement. Linguistic Inquiry 30: 197–217. Repr. in Howard Lasnik, Minimalist Investigations in Linguistic Theory (London: Routledge, 2003), 83–102.
Lasnik, Howard. 1999b. Minimalist Analysis. Oxford: Blackwell.
Lasnik, Howard, and Juan Uriagereka, with Cedric Boeckx. 2005. A Course in Minimalist Syntax: Foundations and Prospects. Oxford: Blackwell.
Ross, John Robert. 1967. Constraints on Variables in Syntax. Ph.D. diss., Massachusetts Institute of Technology. Published as Infinite Syntax! (Norwood, NJ: Ablex, 1986).
Uriagereka, Juan. 1998. Rhyme and Reason: An Introduction to Minimalist Syntax. Cambridge, MA: MIT Press.
Uriagereka, Juan. 1999. Multiple spell-out. In Working Minimalism, ed. Samuel David Epstein and Norbert Hornstein, 251–82. Cambridge, MA: MIT Press.

MIRROR SYSTEMS, IMITATION, AND LANGUAGE


Any normal child reared in human society will acquire language. Some argue that basic structures of grammar are innate,
so that the child need simply hear a few sentences to set the
parameter for each key principle of the grammar of his or her
first language (Baker 2001; Chomsky and Lasnik 1993). Others
have argued that the modern child receives rich language stimuli
within social interactions in learning these key principles. In any
case, the child must acquire the particular sounds (phonology) of the language, an ever-increasing stock of words, and
constructions for arranging words to compound their meanings.
The infant acquiring maternal phonology does not imitate the
caregiver (Y. Yoshikawa and colleagues [2003] model how the
process may use associative learning), but learning how to put
sounds together to form a word that achieves the child's communicative goal seems to involve imitation. Imitation also lies at the
heart of the acquisition of syntax and semantics (see syntax,
acquisition of; semantics, acquisition of). Even within
the principles and parameters approach, the child must imitate
words and combinations, as well as set parameters, to come to
speak the language (see principles and parameters theory and language acquisition).
Monkeys have little or no capacity for imitation, and apes
(chimpanzees, gorillas, bonobos, orangutans) have a capacity for simple imitation, whereas humans are the only primates
capable of complex imitation. We describe these forms of imitation, then argue that increasing imitative skills, and the relation
of mirror neurons to these imitative skills, were at the heart of the
evolution of the language-ready brain.

Simple and Complex Imitation


M. Myowa-Yamakoshi and T. Matsuzawa (1999) observed that
chimpanzees took 12 or so trials to learn to imitate a behavior in
a laboratory setting, focusing on bringing an object into relationship with another object or the body, rather than the actual movements involved. R. W. Byrne and J. M. E. Byrne (1993) found that
gorillas learn complex feeding strategies but may take months to
do so. Consider eating nettle leaves. Skilled gorillas grasp the stem firmly, strip off leaves, remove petioles bimanually, fold leaves over the thumb, pop the bundle into the mouth, and eat.

Figure 1. A comparative side view of the monkey brain (left) and human brain (right), not to scale. The left view emphasizes premotor area F5; the right view emphasizes Broca's area and Wernicke's area, considered crucial for language processing. F5 and Broca's area are considered homologous.

Teaching
is virtually never observed in apes (Caro and Hauser 1992), and
the young seem to look at the food, not at the methods of acquisition (Corp and Byrne 2002). Moreover, chimpanzee mothers seldom, if ever, correct and instruct their young (Tomasello 1999).
The challenge for acquiring such skills is compounded because
the sequence of atomic actions varies greatly from trial to trial.
Byrne (2003) implicates imitation by behavior parsing, a protracted form of statistical learning whereby certain subgoals (e.g.,
nettles folded over the thumb) become evident from repeated
observation as being common to most performances. Apparently,
the young ape, over many months, may acquire the skill by coming to recognize the relevant subgoals and derive action strategies
for achieving subgoals by trial and error.
The ability to learn the overall structure of a specific feeding
behavior over many, many observations, however, is very different
from the human ability to understand any sentence of an open-ended set as it is heard and to generate another novel sentence
as an appropriate reply. In many cases, humans need just a few
trials to make sense of a relatively complex behavior and can then
repeat it under changing circumstances, if the constituent actions
are familiar and the subgoals these actions must achieve are readily discernible. (The next section places this facility for complex
imitation in an evolutionary and neurological perspective.) It is
interesting to note that even newborn infants can perform certain
acts of imitation, but this capacity for neonatal imitation, such as poking out the tongue on seeing an adult poke out a tongue (Meltzoff and Moore 1977), is quantitatively different from that
for complex imitation (see communication, prelinguistic).

The Mirror System Hypothesis


The system of the macaque brain for visuomotor control of grasping has its premotor outpost in an area called F5 (Figure 1 left),
which contains a set of neurons, mirror neurons, such that each
one is active not only when the monkey executes a specific grasp
but also when the monkey observes a human or other monkey
execute a more-or-less similar grasp (Rizzolatti et al. 1996). Thus,
macaque F5 contains a mirror system for grasping that employs
a similar neural code for executed and observed manual actions.
The homologous region of the human brain is in or near Broca's area, traditionally thought of as a speech area but which has been shown by brain imaging studies (see neuroimaging) to be active
when humans both execute and observe grasps. It is posited that
the mirror system for grasping was also present in the common
ancestor of humans and monkeys (perhaps 20 million years ago)
and that of humans and chimpanzees (perhaps 5 million years
ago). Moreover, the mirror neuron property accords well with the
parity requirement for language: that what counts for the speaker
must count approximately the same for the hearer. In addition,
normal face-to-face speech involves manual and facial as well as
vocal gestures and, moreover, signed languages are fully developed human languages (see sign language). These findings
ground the Mirror System Hypothesis (Arbib and Rizzolatti 1997;
Rizzolatti and Arbib 1998):
The parity requirement for language in humans is met because
Broca's area evolved atop the mirror system for grasping, which
provides the capacity to generate and recognize a set of actions.

Recent work (see Arbib 2005 for a review and commentaries on current controversies) has elaborated the hypothesis, defining an evolutionary progression of seven stages, S1 through S7:
(S1) Cortical control of hand movements.
(S2) A mirror system for grasping, shared with the common
ancestor of human and monkey.
A mirror system does not provide imitation in itself. A monkey
with an action in its repertoire may have mirror neurons active
both when executing and observing that action. The monkey
does not repeat the observed action nor, crucially, does it use
observation of a novel action to add that action to its repertoire.
Nonetheless, the mirror system may serve the monkey well both
in providing feedback during close observation of hand-object relations during dextrous actions and in allowing its recognition of others' actions to influence social behavior. In any case, the data on primate imitation support the hypothesis that a monkey-like mirror system becomes embedded in more powerful systems in the next two stages of evolution.
(S3) A simple imitation system for grasping, shared with the common ancestor of humans and apes.
(S4) A complex imitation system for grasping.



Each of these changes can be of evolutionary advantage in
supporting the transfer of novel skills among the members
of a community, though involving praxis rather than explicit
communication.
M. A. Arbib, K. Liebal, and S. Pika (2008) summarize data
suggesting that manual gestures have greater openness than
vocalizations in nonhuman primates. Monkey vocalizations are
innately specified (though occasions for using a call may change
with experience), whereas a group of apes may communicate
with novel gestures. M. Tomasello and colleagues (1997) argue
that novel gestures may develop through ontogenetic ritualization, wherein repeated interaction between two individuals
establishes a conventionalized form of an action as a signal for the action; for example, a beckoning movement may become recognized as short for the physical action of pulling the other
toward oneself. These gestures may then be propagated by social
learning. This supports the hypothesis that it was gesture rather
than primate vocalizations that created the opening for
greatly expanded gestural communication once complex imitation had evolved for practical manual skills. R. M. Seyfarth,
D. L. Cheney, and T. J. Bergman (2005) advance the opposing
view, but the Mirror System Hypothesis postulates that evolution
proceeded via the next two stages:
(S5) Protosign, a manual-based communication system
breaking through the fixed repertoire of primate vocalizations
to yield an open repertoire.
(S6) Protolanguage as Protosign and Protospeech: an expanding spiral of conventionalized manual, facial, and vocal communicative gestures.
The transition from complex imitation and the small repertoires of ape gestures (perhaps 10 or so novel gestures shared
by a group) to protosign involves pantomime, first of grasping and manual praxic actions, then of nonmanual actions
(e.g., flapping the arms to mime the wings of a flying bird).
Pantomime transcends the slow accretion of manual gestures
by ontogenetic ritualization, providing an open semantics
for a large set of novel meanings (Stokoe 2001). However, such
pantomime is inefficient both in the time taken to produce it
and in the likelihood of misunderstanding. Conventionalized
signs extend and exploit more efficiently the semantic richness opened up by pantomime. Processes like ontogenetic
ritualization can convert elaborate pantomimes into a conventionalized shorthand, just as they do for praxic actions.
This capability for protosign, rather than elaborations intrinsic to the core vocalization systems, may then have provided
the essential scaffolding for protospeech and evolution of the
human language-ready brain.
(S7) Language: the development of syntax and compositional semantics.
The final stage, the transition from protolanguage to language, may have rested primarily on biological evolution (Pinker
and Bloom 1990), but may instead result from cultural evolution
(historical change) alone (Arbib 2005; Kemmerer 2005). On the
former view, the brain might have innate biological mechanisms
for processing nouns and verbs, as well as principles and parameters for combining them with words and morphemes of other

categories. This is supported by the observation that nouns are often


marked for case, number, gender (see gender marking), size,
shape, definiteness, and possession, while verbs are often marked
for tense, aspect, mood, modality, transitivity, and agreement. On the latter view, once protolanguage was established, different peoples developed (and later shared) different strategies for
talking about things and actions and then developed these strategies in diverse ways to talk about more and more of their world.
This view is based on the fact that there are further aspects of language diversity hard to reconcile with natural selection of brain
mechanisms. Some languages, like Vietnamese, lack all inflection,
precluding the use of inflectional criteria for identifying grammatical categories; other languages employ inflection in unusual ways.
For example, the language of the Makah of the northwestern coast
of the United States applies aspect and mood markers not only to
words for actions that are translated into English as verbs but also
to words for things and properties.
Complex imitation has two parts: i) the ability to perceive that
a novel action may be approximated by a composite of known
actions associated with appropriate subgoals, and ii) the ability to employ this perception to perform an approximation of
the observed action, which may then be refined through practice. Both parts come into play when the child is learning a language; the former predominates in adult use of language as the
emphasis shifts from mastering novel words and constructions
to finding the appropriate way to continue a dialogue.
Michael A. Arbib

WORKS CITED AND SUGGESTIONS FOR FURTHER READING


Arbib, M. A. 2005. From monkey-like action recognition to human language: An evolutionary framework for neurolinguistics (with commentaries and author's response). Behavioral and Brain Sciences 28: 105–67.
Arbib, M. A., K. Liebal, and S. Pika. 2008. Primate vocalization, gesture, and the evolution of human language. Current Anthropology 59.6: 1053–76.
Arbib, M. A., and G. Rizzolatti. 1997. Neural expectations: A possible evolutionary path from manual skills to language. Communication and Cognition 29: 393–424.
Baker, M. 2001. The Atoms of Language: The Mind's Hidden Rules of Grammar. New York: Basic Books.
Byrne, R. W. 2003. Imitation as behavior parsing. Philosophical Transactions of the Royal Society of London (B) 358: 529–36.
Byrne, R. W., and J. M. E. Byrne. 1993. Complex leaf-gathering skills of mountain gorillas (Gorilla g. beringei): Variability and standardization. American Journal of Primatology 31: 241–61.
Caro, T. M., and M. D. Hauser. 1992. Is there teaching in nonhuman animals? Quarterly Review of Biology 67: 151–74.
Chomsky, N., and H. Lasnik. 1993. The theory of principles and parameters. In Syntax: An International Handbook of Contemporary Research, I: 506–69. Berlin: de Gruyter.
Corp, N., and R. W. Byrne. 2002. Ontogeny of manual skill in wild chimpanzees: Evidence from feeding on the fruit of Saba florida. Behaviour 139: 137–68.
Kemmerer, D. 2005. Against innate grammatical categories. Behavioral and Brain Sciences 28. Available online at: http://www.bbsonline.org/Preprints/Arbib-0501 2002/Supplemental/.
Meltzoff, A. N., and M. K. Moore. 1977. Imitation of facial and manual gestures by human neonates. Science 198: 75–8.

Myowa-Yamakoshi, M., and T. Matsuzawa. 1999. Factors influencing imitation of manipulatory actions in chimpanzees (Pan troglodytes). Journal of Comparative Psychology 113: 128–36.
Pinker, S., and P. Bloom. 1990. Natural language and natural selection. Behavioral and Brain Sciences 13: 707–84.
Rizzolatti, G., and M. A. Arbib. 1998. Language within our grasp. Trends in Neuroscience 21.5: 188–94.
Rizzolatti, G., L. Fadiga, V. Gallese, and L. Fogassi. 1996. Premotor cortex and the recognition of motor actions. Cognitive Brain Research 3: 131–41.
Seyfarth, R. M., D. L. Cheney, and T. J. Bergman. 2005. Primate social cognition and the origins of language. Trends in Cognitive Sciences 9.6: 264–6.
Stokoe, W. C. 2001. Language in Hand: Why Sign Came Before Speech. Washington, DC: Gallaudet University Press.
Tomasello, M. 1999. The human adaptation for culture. Annual Review of Anthropology 28: 509–29.
Tomasello, M., J. Call, J. Warren, T. Frost, M. Carpenter, and K. Nagell. 1997. The ontogeny of chimpanzee gestural signals. In Evolution of Communication, ed. S. Wilcox, B. King, and L. Steels, 224–59. Amsterdam and Philadelphia: John Benjamins.
Yoshikawa, Y., M. Asada, K. Hosoda, and J. Koga. 2003. A constructivist approach to infants' vowel acquisition through mother-infant interaction. Connection Science 15: 245–58.

MODALITY

Definition
In its broadest sense, this term encompasses all means by which we can talk about hypothetical situations. The conception of modality includes the following, plus more:
(1) Expressions of necessity and possibility, in any sense of these terms (English examples: necessary, possible, must, may).
(2) Expressions of knowledge, belief, desire, and so on (know, believe, want, must).
(3) Expressions used to indicate how strongly the speaker is committed to what he or she is saying (perhaps, might).
(4) Expressions used to say that some action is obligatory or permissible (have [to], must, may, allowed, permit).
(5) Conditional sentences (If ... then ...).
There is a range of narrower senses of the term, each used to describe a grammatical category or set of related categories. For example, English can express necessity, possibility, obligation, permissibility, and ability (and other concepts) by means of a grammatically special set of auxiliary verbs (must, may, should, can, etc.). So, when studying English, it is reasonable to define modality as the range of meanings expressed by these verbs. But other languages do not have this grammatical category, and so it's also reasonable to define modality differently when studying these languages.

Semantic Theories
MODAL LOGIC. Much research on modality in natural language has been inspired by modal logic (see Blackburn, de Rijke, and Venema 2001 for a brief history). Modal logic typically has two modal operators, □ ("necessarily," "must," or "obligatory") and ◇ ("possibly," "may," "permissible"), which attach to sentences. The modern approach to the semantics of modal logic, developed by Saul Kripke and others, is a form of truth conditional semantics based on possible worlds. For example, a sentence of the form □S is true, at a given world w, iff S is true at every accessible world v:

(1) □(the rich give money to the poor) is true at w iff the rich give money to the poor in every accessible world v.

Different meanings of □ and ◇ are represented by establishing different sets of worlds as accessible. For example, suppose we use □ to represent "it is morally required that"; then, at any world w, the accessible worlds are those that are ideal, from the point of view of morality in w:

(2) A world v is accessible if and only if it is a perfect world, from the point of view of the moral principles holding in w.

The semantics of ◇ is given by replacing "every" with "some" in (1).
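The possible-worlds truth conditions in (1) and (2) can be sketched in a few lines of code (a toy model; the world names, the accessibility relation, and the valuation are invented for illustration):

```python
# Toy Kripke model for the deontic reading discussed above. World names,
# the accessibility relation, and the valuation are invented for illustration.

worlds = {"w1", "w2", "w3"}
# accessible[w] = worlds that are morally ideal from w's standpoint
accessible = {"w1": {"w2", "w3"}, "w2": {"w2"}, "w3": {"w3"}}
# valuation: worlds where "the rich give money to the poor" holds
V = {"rich-give": {"w2", "w3"}}

def box(p, w):
    """[]p is true at w iff p holds at every accessible world."""
    return all(v in V[p] for v in accessible[w])

def diamond(p, w):
    """<>p is true at w iff p holds at some accessible world."""
    return any(v in V[p] for v in accessible[w])

# At w1 the rich do not in fact give (w1 is not in V["rich-give"]),
# yet []p holds there, because every morally ideal world satisfies p.
print(box("rich-give", "w1"))      # True
print(diamond("rich-give", "w1"))  # True
```

The last two lines make the deontic point concrete: the obligation can be true at a world where the proposition itself is false, since only the ideal worlds matter.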


LINGUISTIC THEORIES BASED ON POSSIBLE WORLDS. Most linguistic theories of modal semantics are based on possible worlds.
For example, Angelika Kratzer (1981) refines the approach by
defining the set of accessible worlds in terms of two conversational backgrounds. According to Kratzer, the conversational
backgrounds for (2) are i) relevant facts and ii) moral principles.
Simplifying somewhat:
"The rich must give money to the poor" is true at w iff the rich
give money to the poor in every world v which is i) consistent with the relevant facts in w and ii) as good as possible
from the point of view of relevant moral principles in w.
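Kratzer's doubly relativized truth conditions can be sketched as well (a toy model: the world names, facts, and ideals are invented, and ranking worlds by counting satisfied ideals is a simplification of her ordering semantics):

```python
# Toy sketch of Kratzer's two conversational backgrounds. All names and
# facts are invented; counting satisfied ideals is a simplification of
# her ordering semantics.

worlds = ["w1", "w2", "w3", "w4"]
facts = {"w1": {"taxes-exist"}, "w2": {"taxes-exist"},
         "w3": {"taxes-exist"}, "w4": set()}
modal_base = {"taxes-exist"}                    # (i) relevant facts in w
ideals = {"w1": set(), "w2": {"rich-give"},
          "w3": {"rich-give", "no-poverty"}, "w4": set()}
ordering_source = {"rich-give", "no-poverty"}   # (ii) moral principles in w

def best_worlds():
    # (i) keep worlds consistent with the relevant facts ...
    consistent = [w for w in worlds if modal_base <= facts[w]]
    # (ii) ... then keep those that come closest to the moral ideals
    score = lambda w: len(ideals[w] & ordering_source)
    top = max(score(w) for w in consistent)
    return {w for w in consistent if score(w) == top}

def must(p_worlds):
    """'must p' is true iff p holds throughout the best accessible worlds."""
    return best_worlds() <= p_worlds

print(best_worlds())       # {'w3'}
print(must({"w2", "w3"}))  # True
```

The two-step filter mirrors clauses (i) and (ii) of the paraphrase above: the modal base narrows the accessible worlds, and the ordering source selects the best among them.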

NON-TRUTH CONDITIONAL THEORIES. Many philosophers and linguists have argued that epistemic modals (see the following) lack truth conditions. Instead, they are said to indicate the speaker's level of commitment to what he or she is saying (e.g., Palmer 2001). Dynamic modal logic (Groenendijk, Stokhof, and
Veltman 1996) combines ideas from possible worlds semantics
with a non-truth conditional analysis of epistemic modality. The
fundamental semantic concept of dynamic logic is update potential, the capacity of a sentence to affect an information state (for
example, someones knowledge state or the information shared
in a conversation). Although the update potential of some sentences can be defined in terms of truth conditions, that of an epistemic sentence cannot be.
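The update-potential idea can be sketched in the style of dynamic semantics (a toy model; the encoding of worlds and states is invented). An information state is a set of worlds; a plain sentence shrinks the state, while epistemic "might p" tests it rather than filtering worlds:

```python
# Toy update semantics in the style of dynamic modal logic (encoding
# invented for illustration). A state is a set of worlds; "might p" is
# a test: it leaves the state intact if p is still possible, else empties it.

def update_atomic(state, p):
    """Learning p: keep only the worlds where p holds."""
    return {w for w in state if p in w}

def update_might(state, p):
    """'might p': a consistency test on the state, not a world filter."""
    return state if any(p in w for w in state) else set()

# Worlds modeled as frozensets of the atomic facts true in them.
w_rain, w_dry = frozenset({"rain"}), frozenset()
state = {w_rain, w_dry}

state = update_might(state, "rain")   # passes: state unchanged
state = update_atomic(state, "rain")  # learn that it is raining
print(update_might(state, "snow"))    # set(): "might snow" now fails
```

Because "might p" is defined as a test on the whole state rather than a condition on individual worlds, its effect cannot be restated as a truth condition, which is the point of the non-truth-conditional analysis.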
FUNCTIONAL THEORIES. functional linguistics has
made important contributions to our understanding of the
history (e.g., Traugott and Dasher 2002) and typology (e.g.,
Bybee, Perkins, and Pagliuca 1994) of modality. cognitive
linguistics offers a theory of modality based on metaphor
(e.g., Talmy 1988).

Varieties of Modality
SENTENTIAL MODALITY. Most linguists take as the central cases of
modality examples in which some expression combines with a
nonmodal sentence, making it modal. For example, English must
can be analyzed as: must + (the rich give money to the poor). There
are many distinct subtypes of sentential modality, including:
(1) Deontic modality: having to do with rules, including
morality and law (example: Criminals must be punished).


(2) Epistemic modality: having to do with knowledge (It must be raining).
(3) Subjective modality: having to do with the speakers
point of view (overlapping with 1 and 2).
(4) Dynamic modality: having to do with ability or the laws of
the natural world (Ducks can swim).
Much work in syntax has studied the representation of sentential modality. Two important issues are the extent to which
modality is represented in the same ways across languages and
whether different subtypes are realized in different grammatical
positions (e.g., Cinque 1999).
SUBSENTENTIAL MODALITY. Broad definitions of modality will
include verbs and adjectives such as know and likely. They will
also include mood, the category of expressions that reflect the
presence of modal meaning in the sentence but that do not introduce modal meaning themselves (for example, indicative and
subjunctive verb forms; e.g., Farkas 1985).
DISCOURSE MODALITY. Some varieties of modality operate at the
discourse level. evidentials are forms indicating the speaker's
source or quality of information (e.g., Willett 1988). The concept
of illocutionary force is connected to modality as well
(e.g., imperative sentences direct the addressee to perform a
hypothetical action). Discourse modality overlaps with sentential and subsentential modality. For example, in some languages
the subjunctive mood can operate as an imperative.
Paul Portner
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Blackburn, Patrick, Maarten de Rijke, and Yde Venema. 2001. Modal
Logic. Cambridge: Cambridge University Press.
Bybee, Joan, Revere Perkins, and William Pagliuca. 1994. The Evolution of
Grammar: Tense, Aspect, and Modality in the Languages of the World.
Chicago: University of Chicago Press.
Cinque, Guglielmo. 1999. Adverbs and Functional Heads: A Cross-Linguistic Perspective. Oxford: Oxford University Press.
Farkas, Donka. 1985. Intensional Descriptions and the Romance
Subjunctive Mood. New York: Garland.
Garson, James. 2007. Modal logic. In The Stanford Encyclopedia
of Philosophy (summer ed.), ed. Edward N. Zalta. Available
online at: http://plato.stanford.edu/archives/sum2007/entries/
logic-modal/.
Groenendijk, Jeroen, Martin Stokhof, and Frank Veltman. 1996.
Coreference and modality. In The Handbook of Contemporary
Semantic Theory, ed. S. Lappin, 179–213. Oxford: Blackwell.
Kratzer, Angelika. 1981. The notional category of modality. In Words,
Worlds, and Contexts, ed. H.-J. Eikmeyer and H. Rieser, 38–74.
Berlin: de Gruyter.
Palmer, F. 2001. Mood and Modality. Cambridge: Cambridge University
Press.
Portner, P. 2009. Modality. Oxford: Oxford University Press.
Talmy, Leonard. 1988. Force dynamics in language and cognition.
Cognitive Science 12: 49–100.
Traugott, Elizabeth, and Richard Dasher. 2002. Regularity in Semantic
Change. New York: Cambridge University Press.
Willett, Thomas. 1988. A cross-linguistic survey of the grammaticalization of evidentiality. Studies in Language 12.1: 51–97.

MODERN WORLD-SYSTEM, LANGUAGE AND THE


Ever since the modern world-system came into existence in the
long sixteenth century, language has been a primary political
concern and a locus of major political struggle. In particular,
the issue of the language or languages that one will require to be
learned and used has been a subject of decisions by states in
their constitutions, their legislation, and/or their executive policies (see language policy).
In the modern world-system, all included territory falls within
the jurisdiction of individual states. All states are linguistically
heterogeneous in the languages used within households, though
some much more so than others. Seeking to be a strong state,
most states have proclaimed an official language, meaning that
laws are written, governmental processes conducted, and education offered in the official language. Sometimes, but rarely,
it means that no other language may be used in public locales,
including in signage.
Some states have had more than one official language, and
some distinguish between the official language and one or more
national languages (which have some more restricted legal
rights). As an overall rule, almost every state has been pressed to
adopt a single official language. The usual argument, aside from
the convenience, is that a single language favors national integration, part of a process of turning a state into a nation-state.
Integration is particularly an issue when there are large immigrant groups who speak a different language.
In many states, speakers of so-called minority languages, in
the name of cultural rights, resist efforts to impose a single official language. In particular, they demand the right to use other
languages in governmental business and schools. Whether
states yield to such demands is largely a question of the internal balance of power and demographic strength of the dominant
linguistic group, as well as the degree of support a minority language may have from powerful neighboring states in which this
state's minority language is the neighbor's majority language. In
multilingual states, there is often social resistance by the users of
the language second in strength to learning well and using the
primary language.
The problem is compounded beyond the boundaries of a single state. Strong regional powers favor the learning of their language by states that they consider to fall within their orbit. They
use direct political pressure, the benefits of economic ties, or
cultural liaison. Adoption of particular alphabetic or ideographic
systems also favors the dominant regional power. If a small state
breaks politically with a regional power and allies itself with
another world power, it often seeks to demonstrate and cement
the new ties by adopting a new secondary language or (if relevant) changing the alphabetic system.
It is at the world or continental level that the issue becomes
most contentious. There are practical benefits in using as few languages as possible: financial costs, ease of communication, and
savings in the time and effort required for either translation or
interpretation. However, the political implications of eliminating
a particular language as a legitimate option in interstate communication are very large. The United Nations now has six official
languages. Two are working languages: English and French.
The inclusion of French has been the result of continuing and very strong political pressure from France. The European Union
has decided that any official language of a member state may be
used. Since this number is very large, and the costs of translating,
for example, Maltese into Finnish are enormous, the result has
been a creeping usage of English as the de facto but not de jure
official language.
In the history of the modern world-system, as Latin fell out
of diplomatic usage, French took its place. Since 1945, given the
hegemony of the United States in the world-system, English has
displaced French. The story in international scientific discourse
is different. In the nineteenth century, German was the favored
lingua franca. After 1918 and especially after 1945, because of
defeats on the battlefield, it lost this status. Before 1939, at an
international scholarly congress, participants felt free to deliver
their papers in English, French, German, and usually Italian as
well. There was normally no translation, and it was assumed that
scholars could understand at least three of the four languages.
After 1945, international scholarly organizations dropped
German and Italian entirely. In the 50 years since then, the use of
French has declined but is still permitted, and Spanish has joined
French as a permitted but seldom-used language. The inclusion
of Spanish is the direct result of the fact that there are 19 states in
which it is an official language.
In commerce, there have always been lingua francas. Anyone
going to a local market in a major center of a country in the global
South will see merchants capable of conducting their business in
multiple relevant languages. If one looks at discussions among
personnel of large corporations, there has been an increasing
tendency to use English. Nonetheless, it is probably still the
case that the ability to use a widely spoken (official) language
other than English is an advantage to persons doing business in
countries outside the English linguistic zone. In commerce, the
decision on linguistic use is less a matter of coercion than of optimizing the ability to engage in profitable business.
Finally, we should notice the consequences that the existence of dominant languages has for the geoculture of the world-system.
The widespread use of English in the twenty-first century is very
advantageous for native English speakers. It is not merely convenient but tends to turn English linguistic eyes into world linguistic eyes. It also, however, has its negative side for native English
speakers. They are often the only ones cut off from the internal
communications of other linguistic zones, as well as from the
possibilities of seeing the world through other linguistic eyes.
It is quite possible that the increasing role of the Internet
in communications of all kinds, along with the declining
power of the United States in the world-system, will lead to the
reemergence of a multipolar linguistic situation, with five to
seven world languages that diplomats, scholars, and business
executives will feel the need to master and use.
Immanuel Wallerstein

MODULARITY
Modularity is the claim that human cognition is compartmentalized into a number of discrete components or modules, potentially including vision, audition, moral judgment, theory of
mind, and language. These specialized modules contrast with the central system responsible for problem solving and abstract thought. Modularity is in most striking contrast with theories
such as connectionism that treat cognition as the emergent
product of an unstructured neural network.
Theories of modularity come in a variety of flavors with sometimes incompatible properties, making evaluation of the general
hypothesis difficult. I begin with the best-known example, Jerry Fodor's (1983) Modularity of Mind, and then contrast it with alternative views.
For Fodor, modules, or input systems, convert sensory
inputs into representations on which the central system of
the mind can operate. Incoming stimuli of a visual, auditory, tactual, or other kind are converted into a form that, in conjunction
with knowledge drawn from memory, is adequate for problem
solving or the fixation of belief. Such beliefs are typically neither complex nor profound: Hearing a whining noise and seeing
a wagging tail may activate enough encyclopedic knowledge to
make you fix the belief that the dog wants to go out.
Fodor argued that input systems (corresponding to the
senses) all share a number of properties. Each has a specific
domain of operation (vision, audition, and so on); they act fast
and mandatorily (you have no choice but to see a dog as a dog);
they are subserved by dedicated neural architecture and, hence,
are subject to idiosyncratic pathological breakdown (you can be
blind without being deaf, and vice versa); they are innately determined (hence, universal and uniform across the species); and,
most importantly, they are informationally encapsulated (that
is, the operation of the input systems proceeds independently
of information stored in memory). You may know that railway
lines don't really converge in the distance, but your visual system
still makes them look as if they do. Fodor then suggested that any
system that shared the properties of the sensory input systems
was by definition a module, with the result that language was
included as a module just like vision.
This claim highlights a radical distinction between Fodor's version of modularity and Noam Chomsky's earlier one (1975)
that treats modules as knowledge structures, rather than as processing systems. The language faculty is a system of knowledge
that can be accessed by both input and output systems: We produce as well as understand language. A further difference is that
Fodor is pessimistic about the possibility of saying anything interesting about the structure of the inscrutable central system,
whereas Chomsky is more optimistic, suggesting that the central
system too is modular, with moral judgment, music, and other
faculties all having specific (if not localized) areas of the mind
dedicated to them. On a point of terminology, it is also important to note that Chomsky (and linguists more generally) use the
term module for the various subparts of the language faculty (the
lexicon and the computational system with components such as
binding, control, movement, etc.).
It is clear that even if they share some of the Fodorian properties such as innate specification and domain specificity, moral
judgment and the sense of smell are radically different. This
has led to the suggestion that we need a distinction between
(Fodorian) modules and quasi-modules, or modules of the central system (Smith and Tsimpli 1995), where these are defined
in terms of the properties (such as informational encapsulation) that they possess and the kind of vocabulary, perceptual or conceptual, over which they are defined. An extreme version of this position is the claim (cf. Sperber 2002) that the mind is massively modular, with everything from individual concepts like
dog, to Fodorian modules like vision, to our general pragmatic
ability to interpret utterances being modules. It is unclear what
the identity criteria for a module are in such theories, and Fodor
himself is vehemently opposed to the claim (cf. Fodor 2000).
A rival view (e.g., Karmiloff-Smith 1992) accepts that the
mind's structure is modular but denies that it is innately determined, suggesting instead that the (adult) modular structure
arises as a result of a process of modularization on the basis
of interaction with the environment during development.
Connectionists (e.g., Elman et al. 1996) are more radical and
deny the validity of modularity and its use of rules and representations entirely, relying instead on the ability of neural networks
to simulate the properties of rule-based systems.
The major evidence for modularity in all its guises is (double) dissociation. Although it is typically the case that abilities
and disabilities cut across domains (if you're good at one subject, you're likely to be good at others, hence the possibility of
assigning people an intelligence quotient), the existence of dissociations demonstrates the intrinsic separability and autonomy
of the various components of the mind. For instance, intelligence
and language may doubly dissociate. It is possible to combine
high intelligence and good language (you), low intelligence and
good language (linguistic savants like Christopher [Smith and
Tsimpli 1995]), high intelligence and poor (or nonexistent) language, as in some kinds of aphasia, and low intelligence and poor
language (as in typical Down syndrome subjects).
Evidence for some version of innate modularity versus modularization due to interaction with the environment comes from
the developmental trajectory of normal children. R. Plomin and
P. Dale (2000) demonstrate that when tested over time, children
typically start with different abilities in the verbal and nonverbal domains and then gradually converge so that their abilities
are similar across domains. This is exactly the opposite of what
one would expect on a modularization story. Similarly, connectionist claims that modularity is unnecessary are undermined
by the implications of connectionism's uniform reliance on statistics. The mind exploits statistical regularities in the input differently in different domains, and Neil Smith and I.-M. Tsimpli
(1995) demonstrate that connectionist models are undesirably
powerful in that they can infer statistical regularities that normal
humans cannot.
Modularity, of some kind, is still the most successful theory
of cognition there is. It has rivals and it has problems, but it is
indispensable.
Neil Smith
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Chomsky, N. 1975. Reflections on Language. New York: Pantheon. A seminal source for modularity.
Elman, J., E. Bates, M. Johnson, A. Karmiloff-Smith, D. Parisi, and
K. Plunkett. 1996. Rethinking Innateness: A Connectionist Perspective
on Development. Cambridge, MA: MIT Press. A sustained alternative
to modularity.
Fodor, J. 1983. The Modularity of Mind. Cambridge, MA: MIT Press. The
classic and best-known statement of the modularity thesis.

. 2000. The Mind Doesn't Work That Way: The Scope and Limits of
Computational Psychology. Cambridge, MA: MIT Press.
Karmiloff-Smith, A. 1992. Beyond Modularity. Cambridge, MA: MIT
Press.
Plomin, R., and P. Dale. 2000. Genetics and early language development: A UK study of twins. In Speech and Language Impairments
in Children: Causes, Characteristics, Intervention and Outcome, ed.
D. Bishop and L. Leonard, 35–51. Philadelphia: Psychology Press.
Smith, N. 2003. Dissociation and modularity: Reflections on language
and mind. In Mind, Brain and Language, ed. M. Banich and M. Mack,
87–111. Mahwah, NJ: Lawrence Erlbaum. This article treats in greater
depth many of the issues discussed here.
Smith, N., and I.-M. Tsimpli. 1995. The Mind of a Savant: Language Learning and Modularity. Oxford: Blackwell.
Sperber, D. 2002. In defense of massive modularity. In Language, Brain
and Cognitive Development: Essays in Honor of Jacques Mehler, ed.
E. Dupoux, 47–57. Cambridge, MA: MIT Press.

MONTAGUE GRAMMAR
Montague grammar is a theory of semantics and the syntaxsemantics interface developed by the logician Richard Montague
(1930–71) and subsequently modified and extended by linguists,
philosophers, and logicians. Classical Montague grammar
had its roots in logic and the philosophy of language; it quickly
became influential in linguistics, and linguists played a large
role in its evolution into contemporary formal semantics.
The most constant features of the theory over time have been
the focus on truth conditional aspects of meaning (see truth
conditional semantics), a model-theoretic conception of
semantics, and the methodological centrality of the principle of
compositionality.

History
Montague was a student of Alfred Tarski (1902–83), a pioneer in
the model-theoretic semantics of logic. Montague developed an
intensional logic with a rich type theory and a model-theoretic
possible worlds semantics, incorporating certain aspects
of (formal) pragmatics, including the treatment of indexical
words and morphemes like I, you and the present tense. In the
late 1960s, Montague turned to the project of universal grammar,
which for him meant a theory of syntax and semantics encompassing both formal and natural languages.
Montague's idea that a natural language could be formally
described using logicians techniques was radical. Most logicians considered natural languages too unruly for precise
formalization, while most linguists either had no awareness
of model-theoretic techniques in logic or doubted the applicability of logicians methods to natural languages (Chomsky
1955).
At the time of Montagues work, generative grammar was
established, linguists were developing approaches to semantics,
and the relation of semantics to syntax had become central. The
linguistic wars between generative semantics and interpretive semantics were in full swing (Harris 1993). In introducing Montague's work to linguists, Barbara Partee (1973, 1975) and Richmond Thomason (1974) argued that Montague's work
offered some of the best aspects of both warring approaches,
with added advantages of its own.


The Theory and Substance of Montague Grammar


It was the short but densely packed PTQ (The proper treatment
of quantification in ordinary English, Montague 1973) that had
the most impact on linguists and on the development of formal
semantics. Montague grammar has often meant PTQ and its
extensions by linguists and philosophers in the 1970s and 1980s.
But it is the broader algebraic framework of UG (Universal
Grammar, Montague 1970) that constitutes Montague's theory
of grammar. Crucial features of that theory include the truth conditional foundations of semantics, the algebraic interpretation of
the principle of compositionality, and the power of a higher-order typed intensional logic.
Before Montague, semanticists focused on the explication of
ambiguity, anomaly, and semantic relatedness; data were often
subjective and controversial. The introduction of truth conditions and entailment relations as core data profoundly affected
the adequacy criteria for semantics and led to a great expansion
of semantic research. While some cognitively oriented linguists
reject the relevance of truth conditions and entailment relations
to natural language semantics, many today seek a resolution of
meaning externalism and internalism by studying mind-internal intuitions of mind-external relations, such as reference and truth conditions.
In UG, Montague formalized the Fregean principle of compositionality as the requirement of a homomorphism between a
syntactic algebra and a semantic algebra. The nature of the elements of both the syntactic and the semantic algebras is open to
variation; what is constrained by compositionality is the relation
of the semantics to the syntax, making compositionality as relevant to representational and conceptual theories of meaning as
it is to model-theoretic semantics.
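The homomorphism requirement can be illustrated with a deliberately tiny grammar (the lexicon and the single syntactic operation below are invented; only the shape of the mapping, one semantic operation per syntactic operation, reflects Montague's proposal):

```python
# Compositionality as a homomorphism, in miniature. The grammar and
# the one-entailment model are invented for illustration.

# Syntactic algebra: expressions are either lexical items (strings) or
# trees built by a single syntactic operation, application:
# ("app", function_expression, argument_expression).
lexicon = {
    "John": "john",
    "walks": lambda x: x == "john",   # in this model, only John walks
}

def interpret(tree):
    """The homomorphism h: h(word) = its lexical meaning, and
    h(app(a, b)) = h(a)(h(b)). The semantic value of a complex
    expression depends only on the values of its parts and the
    rule that combined them."""
    if isinstance(tree, str):
        return lexicon[tree]
    op, fn, arg = tree
    assert op == "app"
    return interpret(fn)(interpret(arg))

# "John walks": the verb meaning (a function) applies to the subject.
sentence = ("app", "walks", "John")
print(interpret(sentence))  # True
```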
The richness of Montague's logic made possible a compositional semantic interpretation of independently motivated syntactic structure (see autonomy of syntax), which was key in
overcoming the problems that underlay the linguistic wars. This
was well illustrated in PTQ, where a typed higher-order logic with
lambda-abstraction made it possible to interpret noun phrases
(NPs) like every man, the man, a man uniformly as semantic
constituents (generalized quantifiers), an idea simultaneously
advocated by David Lewis (1970). PTQ also contained innovative
treatments of quantification and binding, intensional transitive verbs, phrasal conjunction, adverbial modification, and
more. Montague's type theory introduced to linguists Frege's
strategy of taking function-argument application as the basic
semantic glue for combining meanings, giving renewed significance to categorial grammar.
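The generalized-quantifier treatment can be sketched over a small finite domain (the domain and the predicates are invented for illustration): every noun phrase denotes a function from predicates to truth values, so every man, a man, and the man are interpreted uniformly as semantic constituents.

```python
# Generalized quantifiers over a toy finite domain. The individuals and
# predicates are invented; the uniform treatment of NPs as functions
# from predicates to truth values is the idea from PTQ and Lewis (1970).

domain = {"al", "bo", "cy"}
man = {"al", "bo"}          # the men in this model
walks = {"al", "bo", "cy"}  # everyone in the model walks

# A determiner maps a noun denotation to an NP denotation; the NP
# denotation maps a verb-phrase denotation to a truth value.
def every(noun):
    return lambda vp: noun <= vp          # every noun-individual is in vp

def a(noun):
    return lambda vp: bool(noun & vp)     # some noun-individual is in vp

def the(noun):
    return lambda vp: len(noun) == 1 and noun <= vp  # uniqueness + truth

print(every(man)(walks))  # True: every man walks
print(a(man)(walks))      # True: a man walks
print(the(man)(walks))    # False: there is no unique man in the model
```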
Montagues logic was an intensional logic, developing
Gottlob Frege's distinction between sense and reference and Rudolf Carnap's distinction between intension and
extension, using possible world semantics to treat the phenomenon of referential opacity, pervasive in propositional
attitude sentences and many other constructions (see
intentionality ).
Details of Montague's analyses have been superseded, but in overall impact, PTQ was as profound for semantics as Noam Chomsky's Syntactic Structures was for syntax. Emmon
Bach (1989, 8) summed up their cumulative innovations
thus: "Chomsky's thesis was that English can be described as a formal system; Montague's thesis was that English can be described as an interpreted formal system."
Barbara H. Partee
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Bach, Emmon. 1989. Informal Lectures on Formal Semantics. New
York: State University of New York Press.
Chomsky, Noam. 1955. Logical syntax and semantics: Their linguistic
relevance. Language 31: 36–45.
Dowty, David, Robert E. Wall, and Stanley Peters, Jr. 1981. Introduction to
Montague Semantics. Dordrecht, the Netherlands: Reidel. The classic
textbook on Montague grammar.
Gamut, L. T. F. 1991. Logic, Language, and Meaning. Vol. 2. Intensional
Logic and Logical Grammar. Chicago: University of Chicago Press. A
good, rigorous introduction to Montague grammar and its logic.
Harris, Randy Allen. 1993. The Linguistics Wars. Oxford: Oxford University
Press.
Lewis, David. 1970. General semantics. Synthese 22: 18–67.
Montague, Richard. 1970. Universal Grammar. Theoria 36: 373–98.
Repr. in Montague 1974, 222–46.
. 1973. The proper treatment of quantification in ordinary English.
In Approaches to Natural Language, ed. K. J. J. Hintikka et al., 221–42.
Dordrecht, the Netherlands: Reidel. Repr. in Montague 1974, 247–70.
. 1974. Formal Philosophy: Selected Papers of Richard Montague.
Ed. R. Thomason. New Haven, CT: Yale University Press.
Partee, Barbara. 1973. Some transformational extensions of Montague
grammar. Journal of Philosophical Logic 2: 509–34.
. 1975. Montague grammar and transformational grammar.
Linguistic Inquiry 6: 203–300.
Partee, Barbara H., with Herman L. W. Hendriks. 1997. Montague grammar. In Handbook of Logic and Language, ed. J. van Benthem and A.
ter Meulen, 5–91. Amsterdam and Cambridge, MA: Elsevier and MIT
Press. A fuller history and explication of Montague grammar and its
impact.
Thomason, Richmond. 1974. Introduction. In Formal Philosophy: Selected Papers of Richard Montague, ed. R. Thomason,
1–69. New Haven, CT: Yale University Press.

MOOD
Mood forms part of the nonspatial setting of an event, alongside modality, reality status, tense, aspect, and evidentiality.
Mood refers to a type of speech-act, with three basic choices.
Many languages have a special verb form marking commands,
which is known as imperative mood. In Latin, the second person imperative dic means '(you) say!' and is different from the statement dicis, 'you say'. Declarative mood (sometimes called
indicative) is used in statements. Many more categories tend to
be expressed in declarative clauses than in either interrogative or
imperative. Interrogative mood occurs in questions as in West
Greenlandic where every question is marked with a special suffix
on verbs (Fortescue 1984, 49, 287–98).
In traditional uses, the notion of mood applied to sets of
inflectional verb forms. The Western classical tradition, based on
Greek and Latin, identified three moods: indicative, subjunctive,
and imperative, which only partially correspond to the aforementioned three speech-acts. Further meanings associated with
mood involve optative and dubitative (see Lyons 1977, 725–848;
Sadock and Zwicky 1985). Some scholars consider conditional modality (which marks a clause in a conditional sentence) and subjunctive modality (typically, a form expressing desire or uncertainty) on a par with moods. This is problematic since
the distinction between moods as speech-acts and clause types
(which include division between main and subordinate clauses,
where conditional forms would be used) is blurred. The introduction of interrogative mood into the system is largely due to
the existence of languages that have an overtly marked verbal
paradigm used for the interrogative speech-act, as in a number
of languages of Amazonia. Further formal distinctions between
moods as clause types involve prosody and constituent order.
Both imperative and interrogative are characterized by a typical intonation contour. Imperatives often have fewer categories than corresponding declaratives. The English imperative is
perhaps the simplest form in the language: It consists of the base
form of the verb without any tense inflection, whose subject (typically, the addressee) can be and often is omitted. In contrast, many languages of North and South America and Siberia
distinguish delayed versus immediate imperatives and proximal
versus distal imperatives. The universal property of imperatives
is having the second person as subject, of a transitive or intransitive verb (Dixon 1994, 131–42). A prototypical imperative is
agentive, and this is why in numerous languages imperatives cannot be formed on passive and stative verbs. Other moods do not
have such restrictions. Imperatives directed at the first person
(e.g., Lets go!), also known as hortatives, are often expressed differently from second person imperatives. Imperatives directed at
the third person (e.g., Long live the king!), also known as jussives,
may share similarities with first person imperatives, or have
properties different from all other imperative forms. Further,
minor moods include exclamative (as in That's so tacky!) and
expressive types, such as imprecatives (or curses, often cast as
commands but without a command meaning).
Mood interacts with modality, understood as a means used
by the speaker to express his or her opinion or attitude towards
the proposition that the sentence expresses or the situation
that the proposition describes (Lyons 1977, 452). Expressions
of probability, possibility, and belief are epistemic modalities,
and expressions of obligation are deontic modalities. In English,
these meanings are conveyed by modal verbs, e.g., he might
come or he must have come (epistemic), he must come (deontic)
(see Palmer 1986, 51–125; Jespersen 1924, 320–1). Further modal
distinctions include desiderative (unachievable desire), optative
(achievable desire), conditional, hypothetical, potential, purposive, and apprehensive (lest). Languages with a rich verbal
morphology may have special marking for each distinction.
An alternative (rare) cover term for both mood and modality is
mode (Chung and Timberlake 1985).
Some languages have an affix with a general meaning of
irrealis covering possibility, future, negative statements, and
commands. These languages have the category of reality status,
the grammaticalized expression of an events location either in
the real world or in some hypothetical world (see Elliott 2000, for
its cross-linguistic validity). In Maung, an Australian language,
statements in the present, past, and future are marked with realis suffixes. Potential meanings (I can do X) are expressed with
irrealis, as are commands. In Manam, an Oceanic language, irrealis covers future, probable, and counterfactual statements, positive commands, and habitual actions. But in Yuman languages and in Caddo, from North America, realis marks statements and
commands, while irrealis expresses future, possibility, and condition. This shows that the realis–irrealis distinction is language specific and that it is distinct from mood (see Mithun 1999, 178–80). Mood, modality, and reality status are distinct from evidentiality (q.v.), whose primary meaning is information source.
Mood is often an obligatory inflectional category of the verb,
marked with affixes (suffixes or prefixes, rarely infixes); it is never
expressed derivationally. In languages of an isolating profile,
mood can be expressed through particles. Modalities are not
obligatory and, thus, do not constitute part of an inflectional paradigm. Modal verbs express modalities rather than moods (this
is the case in English and many other familiar Indo-European
languages).
Forms of mood marking can develop additional meanings
overlapping with modalities. Imperative forms can be used to
express optative and conditional, while indicative forms may
develop overtones of certainty (associated with epistemic modality). Indicative forms (for instance, future) can be used as command strategies, with differences in illocutionary force.
Alexandra Y. Aikhenvald
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Aikhenvald, Alexandra Y. 2004. Evidentiality. Oxford: Oxford University
Press.
Chung, Sandra, and Alan Timberlake. 1985. Tense, aspect and mood.
In Language Typology and Syntactic Description. Vol. 3: Grammatical
Categories and the Lexicon. Ed. Timothy Shopen, 202–58.
Cambridge: Cambridge University Press.
Dixon, R. M. W. 1994. Ergativity. Cambridge: Cambridge University
Press.
Elliott, Jennifer R. 2000. Realis and irrealis: Forms and concepts of the
grammaticalisation of reality. Linguistic Typology 4: 55–90.
Fortescue, Michael. 1984. West-Greenlandic. Beckenham, UK: Croom
Helm.
Jespersen, Otto. 1924. The Philosophy of Grammar. London: George Allen
and Unwin.
Lyons, John. 1977. Semantics. Vol. 2. Cambridge: Cambridge University
Press.
Mithun, Marianne. 1999. The Languages of Native North America.
Cambridge: Cambridge University Press.
Palmer, F. R. 1986. Mood and Modality. Cambridge: Cambridge University
Press.
Sadock, Jerrold M., and Arnold M. Zwicky. 1985. Speech act distinctions in syntax. In Language Typology and Syntactic Description. Vol. 1: Clause Structure. Ed. Timothy Shopen, 155–96. Cambridge: Cambridge University Press.

MORPHEME
This term has been used in two ways: In Leonard Bloomfield's sense, a morpheme is a minimal meaningful form; in Zellig Harris's and Charles F. Hockett's later usage, a morpheme is an abstract unit of analysis realized by a morph (= minimal meaningful form) or by a set of synonymous morphs in complementary distribution. In Bloomfield's sense, the plural suffixes -s and -en are distinct morphemes; in the latter sense, they are distinct morphs realizing the same morpheme. The term is not
always used consistently in morpheme-based approaches to morphology. In paradigm-based approaches, no linguistic
principle is assumed to make essential reference to morphemes
as a unified class of elements.
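The post-Bloomfieldian sense of the term can be illustrated with a toy sketch in which one abstract morpheme (plural) is realized by several morphs in complementary distribution; the conditioning by lexical listing below is a deliberate simplification of my own, not an analysis from the literature:

```python
# One abstract morpheme, PLURAL, realized by distinct morphs ("allomorphs")
# depending on the stem; unlisted stems take the default morph -s.
# The stem list is illustrative only.
PLURAL_MORPHS = {"ox": "en", "child": "ren"}

def pluralize(noun):
    """Realize the PLURAL morpheme on a noun stem."""
    return noun + PLURAL_MORPHS.get(noun, "s")

print(pluralize("ox"))     # oxen
print(pluralize("child"))  # children
print(pluralize("dog"))    # dogs
```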
Gregory Stump
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Bloomfield, Leonard. 1933. Language. New York: Henry Holt and Co.
Harris, Zellig S. 1942. Morpheme alternants in linguistic analysis. Language 18: 169–80.
Hockett, Charles F. 1947. Problems of morphemic analysis. Language 23: 321–43.

MORPHOLOGICAL CHANGE
Morphological change involves alterations made by speakers over
time to the analysis of complex words, or to the relations between
a lexical base and its compounds and derivatives, or to the set
of inflected words that share a common lexical base. The main
mechanisms of morphological change are reanalysis (the reinterpretation of forms) and the extension of patterns to create new
forms. The impetus for reanalysis often comes from changes to
the semantics, phonology, or syntax of the affected forms.
Semantic shift may affect the function of a grammatical element. Thus, in various Australian languages, a subordinating purposive suffix ('in order to VERB'), typically -ku, came to be used in independent clauses as a marker of intentional mood ('may VERB'), then further shifted to express future tense ('will VERB'). Functional shift in a grammatical affix also took place in some Karnic (Australian) languages, in which a locative ('at') case suffix -nga came to mark the dative ('for') function. Meaning changes may lead to the reinterpretation of compound words as simple lexemes (e.g., hlāford 'loaf' + 'ward' > lord, shep-herd 'sheep' + 'herder' > shepherd), or as a lexical stem plus a derivational affix (king-dom), with downgrading of a bound lexical form -dom meaning 'condition' to a derivational suffix.
Sound changes often create new allomorphs: For example, the earlier English plural suffix -əz split into three variants -əz, -z, -s, and the stem long developed a variant leng- in length, with the e conditioned by the following i in the former derivational suffix -ith. As the relationship between words becomes obscured by the accumulation of sound changes, some phonological differences become morphologized, that is, reinterpreted as partial or even sole signals of a morphological property. Thus, the vowel e in slep-t (vs. ee in sleep) helps to mark past tense, and ee in feet (vs. oo in foot) alone marks plural. Even sound changes that once operated between words can give rise to alternations with morphological value; thus, the consonant mutations of Irish Gaelic, such as b/v/m in ba:d, va:d, ma:d 'her, his, their boat', respectively, result ultimately from the differential effects of former word-final consonants of the possessors *as, *a, *an on word-initial b. The effects of sound changes, such as the erosion of final syllables whereby English singular and plural forms oxe, oxene became ox, oxen, can lead to the reanalysis of internal word structure, so that -en is interpreted as a (new) plural suffix. Loss of final t in the pronunciation of French argent 'silver' caused argent-ier 'silversmith' to be reinterpreted as argen-tier, and allowed the new pattern to be extended to create derivatives such as bijou-tier 'jeweler' from bijou.
Syntax may supply the source of new morphology, as phrases are reinterpreted as single words (by a process called univerbation). Former clitics may be reanalyzed as affixes in a process often described as a kind of grammaticalization, with accompanying functional changes. Thus, in Tocharian, new case suffixes (with meanings such as 'toward', 'through', 'with', 'from') were created by fusing former postpositions with nouns in the oblique case. Complex new inflectional markers can be created, such as French aim-er-as 'you will love', where the suffix includes part of the Latin infinitive suffix, the auxiliary verb 'have', and a second-singular (2sg) subject marker, as the Romance future was grammaticalized from a construction 'have to VERB'. The univerbation of phrases can even lead to word-internal inflection, with grammatical markers becoming trapped between erstwhile lexical elements, for example, in the slightly archaic English whom-ever and whose-ever, where ever was once a separate word, or in Old Irish atot-chí 'sees you' (vs. at-chí 'sees'), where ot continues an earlier pronoun that was positioned between the two words that together meant 'see'.
Much morphological change involves only rearrangements within the morphology itself, within and across paradigms, and involving either stems or affixes. In leveling, one variant of a stem is extended to all cells in an inflectional paradigm; thus, in Ancient Greek, the prehistoric kʷ of *leikʷ- 'leave' developed by regular sound change into t or p before different vowels, but the leip- variant was later generalized to the whole paradigm. Stem variants may be redistributed according to a pattern prevalent in other paradigms by a process called analogical change (see analogy; synchronic and diachronic). Thus, in the early modern German verb 'give', the variant gib-, which arose by sound change in all the singular forms, was later confined to the second and third persons singular because many other verbs, for example, 'sleep', had a pattern where only these two forms had a different stem vowel. (See Table.)

        Pre-Greek    (Doric) Greek   EMGerman   ModGerman   German
        'leave'      'leave'         'give'     'give'      'sleep'
1Sg     leipō        leipō           gib        geb-e       schlafe
2Sg     leiteis      leipeis         gib-st     gib-st      schläfst
3Sg     leitei       leipei          gib-t      gib-t       schläft
1Pl     leipomen     leipomen        geb-en     geb-en      schlafen
2Pl     leitete      leipete         geb-t      geb-t       schlaft
3Pl     leiponti     leiponti        geb-en     geb-en      schlafen


Where there are different inflectional classes, one class is usually dominant and its inflectional pattern tends to influence the others. Thus, in early Italic languages, noun stems in ā-, i-, u- remodeled their former ablative singular forms on the pattern of -ōd in the dominant o-stem class, creating new endings in -ād, -īd, -ūd, respectively. Words are often transferred from an irregular inflectional class to the dominant one: Thus, English drag, a former strong verb with past drug, changed to the weak class with regular past inflection dragged.
Iconicity has been emphasized by Natural Morphologists
as one of the principles motivating morphological change.
Iconically organized paradigms code more complex grammatical meanings by means of phonologically larger markers and
simpler meanings with smaller markers, and the most basic
meanings (singular number, nominative case, present tense,
third person agreement, etc.) by no marker at all. In some
Slavic languages, after sound changes created zero case-number
suffixes in both the nominative singular and genitive plural of
o-stem nouns, the paradigm was repaired only in the genitive
plural (by substituting an overt suffix -ov from another inflectional class), whereas the iconic zero marking was retained in
the nominative singular. Iconicity in the syntagmatic dimension is increased by changes that reorder a form like (Australian) Arrernte me-k(e)-atye 'mother-to-my', where atye was originally an enclitic pronoun, to the sequence m(e)-atye-ke 'mother-my-to', which better mirrors the semantic scopal relations between the elements.
Harold Koch
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Anderson, Stephen R. 1992. Morphological change. In A-morphous Morphology, 336–72. Cambridge: Cambridge University Press.
Joseph, Brian D. 1998. Diachronic morphology. In The Handbook of Morphology, ed. Andrew Spencer and Arnold M. Zwicky, 351–73. Oxford: Blackwell.
Koch, Harold. 1996. Reconstruction in morphology. In The Comparative Method Reviewed: Regularity and Irregularity in Language Change, ed. Mark Durie and Malcolm Ross, 218–63. New York: Oxford University Press.

MORPHOLOGICAL TYPOLOGY

Linguistic typology has its origins in nineteenth-century morphological typology, a method of grouping languages not according to genetic relatedness but to structural similarity, where the structure was specifically word structure (see morphology). Traditionally, there are three possibilities for phonologically expressing morphosyntactic (inflectional) and lexicosemantic (derivational) properties at the level of the word. In an isolating or analytical language, complex words are built from existing words, free forms. Mandarin Chinese could be viewed as an isolating language. Productive coining of new terms is through compounding. The word for Internet is hù-lián-wǎng, with hù 'inter' + lián 'related' + wǎng 'net'. In agglutinating languages, the pieces of a complex word map onto specific meaning elements biuniquely, both at the lexical and grammatical level. Turkish evlerimizden 'from our houses' is glossed as

ev     -ler  -im      -iz   -den
house  -PL   -POSS.1  -PL   -ABL

The third type of language expresses differences in morphosyntactic and lexicosemantic properties through contrasting modifications, or inflections, of a word's stem. These are inflectional or fusional languages. The classical languages, Greek, Latin, and Sanskrit, belong to this type. In Latin, 'you (sg [singular]) loved' is expressed by various modifications of the root am- 'love' to yield amāvistī: stem formative -āv to express perfect, and -istī to express perfect (again) + 2d person + singular. Typically, properties are fused in one exponent: Here, aspect, person, and number agreement are expressed together. Equally, a property can be expressed by more than one exponent: Here, perfect is being expressed twice.
There has been general unease among modern linguists with the classical typology. One reason is that languages rarely fall cleanly into one of these types. For example, Mandarin Chinese productively uses what looks like a derivational suffix to build agentive nouns, the word qì 'mechanism': sàn-rè qì 'cooler', jiān-cè qì 'monitor', yáng-shēng qì 'speaker'; compare the English -er/-or agentive suffix (Hippisley, Cheng, and Ahmad 2005). More importantly, there is some doubt that the typology offers any theoretical insight, a point argued as far back as Sapir (1921). Part of the reason is that morphological type is really a function of other grammatical structures worthy of typological investigation, and is, therefore, epiphenomenal (Anderson 1990).
A more promising approach is to focus on much more narrowly defined word structures and to investigate how they cross-cut languages that may or may not be genetically or typologically related. The result is then a typology of narrowly defined structures of words that answer the question "What is a possible word?" This is the approach taken by Greville G. Corbett and colleagues, who look at unusual morphology such as suppletion, deponency, and defectiveness, recording such structures in a large number of individual languages and inducing diachronic and synchronic models of their appearance and use in syntax (e.g., Corbett 2007; Baerman and Corbett 2007).
Andrew Hippisley

WORKS CITED AND SUGGESTIONS FOR FURTHER READING


Anderson, Stephen R. 1990. Sapir's approach to typology. In Contemporary Morphology, ed. W. Dressler et al., 277–95. Berlin: Mouton de Gruyter.
Baerman, Matthew, and Greville G. Corbett. 2007. Linguistic typology: Morphology. Linguistic Typology 11: 115–17.
Corbett, Greville G. 2007. Canonical typology, suppletion and possible words. Language 83: 8–42.
Hippisley, Andrew, David Cheng, and Khurshid Ahmad. 2005. The head modifier principle and multilingual term extraction. Natural Language Engineering 11.2: 129–57.
Sapir, Edward. 1921. Language. New York: Harcourt, Brace and World.

MORPHOLOGY
Morphology and Words
While the lexicon of a language lists basic forms and their content (meanings and grammatical properties), a language's complex words needn't be invariably listed, since their form and content are often partially or wholly deducible from those of their parts by means of regular principles. This system of principles is the language's morphology.
In morphological theory, it is useful to distinguish three senses of word. In one sense, the form come is the same word in (1) and (2); in another sense, come in (1) is a different word from come in (2). This apparent paradox arises because word can be used to refer to either a phonological or a grammatical unit: In (1) and (2), come represents the same phonological unit (phonetically [kʌm]) but two distinct grammatical units: the unmarked infinitive form of the verb come in (1) and the past participial form of this verb in (2).
(1) Sandy should come home.
(2) Sandy has already come home.

Thus, it is useful to distinguish phonological words such as [kʌm] from grammatical words such as the past participle of come. Moreover, there is a third theoretically relevant interpretation of word, according to which go and gone are different forms of the same word. Here, word refers neither to a phonological word nor to a grammatical word but to the abstract lexical element of which go and gone are both realizations; abstract elements of this sort are referred to as lexemes.
These three senses of word are related in the following way: The pairing of a lexeme with an appropriate set of morphosyntactic properties defines a grammatical word, as in (3), and the phonological realization of a grammatical word is a phonological word, as in (4).
(3) Grammatical words
a. ⟨talk, {3sg present indicative}⟩
b. ⟨dog, {plural}⟩
c. ⟨good, {comparative}⟩

(4) Phonological words
a. (3a)'s realization: [tɔks]
b. (3b)'s realization: [dɔgz]
c. (3c)'s realization: [bɛtər]


Correspondingly, a lexeme's paradigm is the full set of grammatical words associated with it; a morphosyntactic property set's exponence is its phonological realization; and where G is a grammatical word in the paradigm of lexeme L, L's root is the phonological form (if such can be identified) with which the exponence of G's property set combines in the phonological word realizing G. Thus, in the realization of the grammatical word (3b) in the paradigm of the nominal lexeme dog, the exponence [z] of (3b)'s property set {plural} combines with dog's root [dɔg]. The distinction between a word's exponence and its root is, of course, sometimes difficult to draw, as in the realization of ⟨be, {1sg [first-singular] present indicative}⟩ as the portmanteau [æm].
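The three senses of word can be made concrete in a small sketch; the class and field names below are illustrative assumptions of mine, not notation from the literature:

```python
# A lexeme is an abstract lexical element; its paradigm pairs morphosyntactic
# property sets (grammatical words) with their phonological realizations
# (phonological words). Orthography stands in for phonology here.
from dataclasses import dataclass, field

@dataclass
class Lexeme:
    name: str                                   # the abstract lexeme, e.g. COME
    paradigm: dict = field(default_factory=dict)  # property set -> realization

come = Lexeme("COME", {
    "unmarked infinitive": "come",   # one grammatical word ...
    "past participle": "come",       # ... and a second, distinct one
    "finite past": "came",
})

# Two distinct grammatical words realized by one phonological word:
print(come.paradigm["unmarked infinitive"] == come.paradigm["past participle"])  # True
```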

Branches of Morphology
A language's morphology comprises two systems. The inflectional system defines the phonological realization of the grammatical words in a lexeme's paradigm; for instance, the inflectional system of English specifies that the third-singular (3sg) present indicative form of the lexeme talk is talks. By contrast, the system of word formation (better: lexeme formation) defines complex lexemes in terms of simpler lexemes. The latter system itself

comprises two subsystems. The derivational subsystem derives one lexeme from another; for instance, the derivational subsystem of English derives the verbal lexeme crystallize from the
nominal lexeme crystal. The compounding subsystem defines
complex lexemes through the combination of other lexemes;
thus, the compounding subsystem of English creates the compound lexeme mountain lion from the lexemes mountain and
lion. A number of criteria have been adduced to distinguish
inflection from derivation and to distinguish compounds from
syntactic combinations; see Booij (2000) and Matthews (1991),
respectively.

Incremental and Realizational Approaches to Morphology


Structuralist approaches to morphology gave primacy to
morphemes (minimal pairings of form with meaning) as the basic
units of morphological analysis. They were incremental in orientation, in that they presumed that the content of a word is the sum
of the content of its component morphemes. These structuralist
assumptions have been very persistent in modern linguistic theory.
Their widespread acceptance has led many linguists to assume that
all morphological phenomena can be accounted for by independently needed principles of syntax and phonology; thus, since the
advent of generative linguistics, morphological issues have often
been addressed as a part of syntax (Selkirk 1982; Lieber 1992) or as
a part of phonology (Kiparsky 1982).
Incrementalist theories of morphology are problematic, however (Stump 2001, 3 ff.). First, there are words whose content cannot be factored into that of their component morphemes; that is, the content of a word's individual morphemes may underdetermine that of the word itself. The aorist verb form krad-o-x 'I stole' in Bulgarian is unambiguous despite the fact that none of its three component morphemes expresses first-singular (1sg) subject agreement; compare 2sg/3sg krad-e, 1pl krad-o-x-me, 2pl krad-o-x-te, and 3pl krad-o-x-a. To account for such forms, proponents of incrementalist theories must postulate zero morphemes, which lack overt phonological realization but purportedly supply the missing content.
Second, syncretism (the use of the same morphology to express distinct content) is problematic for incrementalist theories. In Sanskrit, the accusative singular suffix -m is also used as a nominative singular suffix in the paradigms of neuter a-stem nouns: Compare the masculine noun 'horse' (nom. sg. aśva-s, acc. sg. aśva-m) with the neuter noun 'gift' (nom./acc. sg. dāna-m). Incrementalist theories must attribute syncretism to homonymy (e.g., to the existence of two distinct -m suffixes in Sanskrit), but in doing so miss important generalizations (e.g., the fact that the nominative and accusative are always syncretized in the paradigms of Sanskrit neuter nouns, regardless of what the exponence of these cases might be).
Finally, incrementalist assumptions give no explanation for the incidence of extended exponence (the appearance, within a single word, of more than one morpheme expressing the same content). In Nyanja (Niger-Congo; Malawi), adjectives exhibit noun-class agreement with the noun they modify, and members of one group of adjectives exhibit two agreement prefixes, as in the case of ci-pewa ca-ci-kulu 'large hat', where kulu 'large' agrees with the class 7 noun -pewa 'hat' by means of two distinct prefixes. On incrementalist assumptions, the prefix ci- in ca-ci-kulu should alone suffice to mark this form for class 7 agreement; the appearance of the additional prefix ca- not only seems unnecessary but actually violates the anti-redundancy principle (Kiparsky 1982, 136 f.) purported to prevent the suffixation of plural -s to English men.
The alternative to an incrementalist theory is a realizational theory, according to which a word's content determines its morphological form (Matthews 1972; Zwicky 1985; Anderson 1992). In a realizational theory, the paradigm of the verbal lexeme talk includes the grammatical word in (3a), and the phonological word that realizes this grammatical word arises from talk's root through the application of any rules associated with the morphosyntactic property set in (3a); there is one such rule, which realizes the property set {3sg present indicative} through the suffixation of -s.
In a realizational theory, the fact that a word's form may underdetermine its content is unproblematic, since content is not deduced from form in any event. Thus, the fact that the first-singular aorist form krad-o-x 'I stole' in Bulgarian has no exponent of first-singular subject agreement is simply the effect of a kind of poverty in the language's verb morphology: It happens not to have any means of expressing the property 1sg in the realization of the grammatical word ⟨steal, {1sg aorist}⟩. Syncretism is likewise unproblematic: One need only assume that the realization of one word in a lexeme's paradigm may pattern after the realization of a different word in that paradigm. Rules of referral (Zwicky 1985; Stump 1993) express this kind of relation between cells in a paradigm; thus, in Sanskrit, a rule of referral specifies that the realization of a neuter noun's nominative singular cell is the same as that of its accusative singular cell. Finally, extended exponence is unproblematic in a realizational theory; in the case of Nyanja ca-ci-kulu 'large [class 7]', one need only assume that more than one rule of prefixation participates in the realization of the grammatical word ⟨large, {class 7}⟩.

Current Theories of Morphology


Two approaches to morphology dominate the theoretical landscape: the morpheme-based approach and the paradigm-based approach. Distributed morphology (DM) is the main embodiment of the morpheme-based approach (Halle and Marantz 1993). DM maintains the structuralist focus on morphemes as the central unit of morphological analysis, but differs from earlier morpheme-based approaches in its assumption that morphemes are inserted into abstract grammatical structures in a realizational fashion. (Here and throughout, I use morpheme in the Bloomfieldian sense of minimal form-meaning pairing.) The verb in They talked instantiates the abstract grammatical structure V-past-pl through the insertion of the verbal morpheme talk and the past tense morpheme -ed; the property of third-plural agreement goes unrealized because there is no nonzero morpheme available to realize it. (DM therefore accommodates cases of underdetermination such as that of krad-o-x 'I stole'.) From earlier incrementalist approaches, DM inherits the assumption that morphological structures such as V-past-pl are defined by rules of syntax. This assumption presents problems that have never been convincingly resolved in the somewhat hermetic DM literature: In rejecting rules that (like rules of referral) are defined over paradigms, DM is left without any

general account of such essentially paradigmatic phenomena as syncretism (the relation among paradigm cells that are identical in their realization), deponency (the realization of one cell by
means of morphology appropriate to a different cell), heteroclisis (the realization of distinct cells within a paradigm according
to contrasting conjugational/declensional patterns), and defectiveness (the existence of unrealized cells within a paradigm). For
discussion of these phenomena in paradigm-based frameworks,
see Baerman (2004), Baerman, Brown, and Corbett (2005), and
Stump (2001, 2006).
The alternative, paradigm-based approach is instantiated by such realizational theories as A-morphous morphology (Anderson 1992), network morphology (Corbett and Fraser 1993; Brown and Hippisley, in press), and paradigm function morphology (Stump 2001). These theories take paradigms rather than morphemes as the primary object of morphological inquiry and formulate morphology as an autonomous grammatical component. Despite differences of detail, they are alike in assuming that a lexeme has a paradigm of grammatical words, a set of pairings such as those in (3), whose phonological realization is determined by a system of deductive rules, for example, the rule of exponence in (5a) and the rule of referral in (5b).
(5) a. Where lexeme L has root R, ⟨L, {finite past}⟩ is realized as R-ed.
    b. ⟨L, {past participle}⟩ has the same realization as ⟨L, {finite past}⟩.

By (5a), the lexeme walk has the past tense form walked; by (5b), this lexeme also has walked as its past participle. A central assumption in paradigm-based theories is that rules act as defaults and are therefore subject to override; in the inflection of verbal lexemes such as sing, the rules in (5) are overridden by those in (6). An important concern in such theories is that of establishing general principles regulating the default/override relations among rules of morphology. In network morphology, these relations are regulated by their position in default-inheritance hierarchies; in paradigm function morphology, they are regulated by the Pāṇinian determinism hypothesis (Stump 2001, 23), according to which Rule A overrides Rule B if and only if A is narrower in application than B.
(6) Where L belongs to the sing class and has root R,
    a. ⟨L, {finite past}⟩ is realized through the substitution of [æ] for [ɪ] in R.
    b. ⟨L, {past participle}⟩ is realized through the substitution of [ʌ] for [ɪ] in R.

Because they define complex words by means of deductive rules such as those in (5)/(6), paradigm-based theories afford a parsimonious account of interactions between concatenative and nonconcatenative morphology: The fact that (5a) fails to apply in the definition of sing's past tense form can be directly attributed to the override relation between (6a) and (5a).
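The default/override logic of rules like (5) and (6) can be sketched procedurally. The sketch below is illustrative only: the rule encoding, the class membership list, and the orthographic stand-in for the [ɪ]/[æ]/[ʌ] ablaut are my own simplifications, not the formalism of any of these theories.

```python
# Rules of exponence apply as defaults; a rule narrower in application
# (restricted to the sing class) overrides a broader one, in the spirit of
# the Paninian determinism hypothesis. Spelling stands in for phonology.

SING_CLASS = {"sing", "ring"}  # hypothetical membership listing

def realize(lexeme, root, props):
    """Return the phonological word realizing the pair <lexeme, props>."""
    if props == "finite past":
        if lexeme in SING_CLASS:           # narrower rule, cf. (6a): ablaut
            return root.replace("i", "a")
        return root + "ed"                 # default rule, cf. (5a)
    if props == "past participle":
        if lexeme in SING_CLASS:           # narrower rule, cf. (6b): ablaut
            return root.replace("i", "u")
        # default rule of referral, cf. (5b): same form as the finite past
        return realize(lexeme, root, "finite past")
    return root

print(realize("walk", "walk", "finite past"))      # walked
print(realize("walk", "walk", "past participle"))  # walked, by referral
print(realize("sing", "sing", "finite past"))      # sang
print(realize("sing", "sing", "past participle"))  # sung
```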
In DM, by contrast, the absence of -ed in sang must instead be attributed to an overriding, phonologically empty suffix whose presence triggers a rule of [ɪ] → [æ] ablaut. This account of the sing/sang (*singed) alternation implies a parallel account of sing/sung (*have singed), mouse/mice (*mouses), thief/thieve (*thiefize, cf. burglarize), and other alternations: In each case, a default affix must be seen as being overridden by a null affix, which by stipulation triggers a rule of internal modification. What emerges is a widely recurrent coincidence that is never explained: Again and again, a zero affix is stipulated as the unoverridden override among a set of competing morphemes; over and over, this unoverridden override by stipulation triggers a rule of internal modification. There is, of course, no overt class of phonologically identical affixes in any language that ever shows the kind of syntagmatic and paradigmatic distribution that DM must stipulate for the artifactual class of phonologically null affixes upon which this approach depends.

Current Issues in Morphology


The differences between morpheme-based theories and paradigm-based theories have been most clearly articulated with reference to inflectional phenomena. But the question naturally arises whether the principles of lexeme formation are morpheme based or instead favor a paradigm-based approach. Significantly, derivational morphology exhibits the same sort of default/override relations as inflectional morphology: In much the same way as the lexeme sing possesses an inflectional paradigm in which the appearance of the past tense form sang blocks that of *singed, the lexeme strong seems to possess a derivational paradigm in which the appearance of the nominal derivative strength blocks that of *strongness. For discussion of the evidence for derivational paradigms, see Bauer (1997) and Booij (1997).
The nature of the interface between morphology and syntax also requires further scrutiny. One question subsumed by this broad issue is whether a language's periphrases are defined by its morphology or by its syntax. A periphrase is a multiword realization of a grammatical word; thus, while the grammatical word ⟨smart, {comparative}⟩ has the synthetic realization smarter, the grammatical word ⟨intelligent, {comparative}⟩ has the periphrastic realization more intelligent (*intelligenter). For discussion of the evidence in favor of a morphological approach to periphrasis, see Börjars, Vincent, and Chapman (1997) and Ackerman and Stump (2004); the grammatical consequences of this conclusion remain to be worked out in detail.
Another controversial aspect of the morphology/syntax interface is the phenomenon of clisis. Because their morphology resembles that of affixes while their syntax is wordlike, clitics raise very specific questions about the division of labor between the components of morphology and syntax. Although recent years have seen a vast amount of research into the properties of clitics, there is, as yet, little consensus as regards their precise theoretical status. Particularly urgent are the need to understand the differences between clitics and affixes (Zwicky and Pullum 1983) and the need to reconcile these differences with the complex interactions between clisis and affixation (Spencer and Luís 2005).
The principles of the morphology/semantics interface also urgently require clarification. Phenomena such as syncretism, deponency, extended exponence, and morphological underdetermination are apparently incompatible with the assumption (characteristic of morpheme-based theories) that a word's morphology is isomorphic to its semantic structure. Research in paradigm-based theories has tended to assume (often tacitly) that the morphosyntactic properties associated with grammatical words are invariant in their semantic interpretation; yet there are instances in which the semantics associated with a particular morphosyntactic property is sensitive to its paradigmatic context (Stump 2007).
A final area of current interest is that of implicative theories of morphology. Like realizational theories, implicative theories depend on the postulation of paradigms, but unlike them, they assume that the forms realizing a paradigm's cells are defined by implicative relations among these cells (Blevins 2005, 2006). Thus, in an implicative theory, certain words in a paradigm have a privileged status because they serve as the basis for deducing the paradigm's other words. If a small number of such privileged forms uniquely determine the entire paradigm (as the forms laudō, laudāre, laudāvī, and laudātum suffice to determine the paradigm of 'praise' in Latin), they may be characterized as principal parts; but even words that aren't principal parts may carry specific implications for the formation of certain other members of their paradigm. Reference to these relations among a paradigm's cells seems central for an account of the processes of morphological deduction upon which the acquisition and use of language depend; moreover, implicative relations among the cells in lexemes' paradigms are a significant domain of typological contrast among languages (Finkel and Stump 2009).
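The way a few principal parts determine many cells can be illustrated with a small sketch. The stem-extraction steps and the handful of cells shown are simplifying assumptions of mine (a toy treatment of the Latin first conjugation, with macrons omitted), not a full implicative analysis:

```python
# Deduce representative paradigm cells of Latin laudare 'praise' from its
# four principal parts. Each cell's form is implied by one extracted stem
# plus an ending; the cell inventory is deliberately small.
def cells_from_principal_parts(pres1sg, infinitive, perf1sg, supine):
    pres_stem = infinitive[:-2]  # laudare  -> lauda-
    perf_stem = perf1sg[:-1]     # laudavi  -> laudav-
    sup_stem = supine[:-2]       # laudatum -> laudat-
    return {
        "pres.3sg":    pres_stem + "t",     # laudat
        "impf.3sg":    pres_stem + "bat",   # laudabat
        "fut.3sg":     pres_stem + "bit",   # laudabit
        "perf.3sg":    perf_stem + "it",    # laudavit
        "pluperf.3sg": perf_stem + "erat",  # laudaverat
        "perf.pple":   sup_stem + "us",     # laudatus
    }

print(cells_from_principal_parts("laudo", "laudare", "laudavi", "laudatum"))
```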
Gregory Stump
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Ackerman, Farrell, and Gregory Stump. 2004. Paradigms and periphrastic expression: A study in realization-based lexicalism. In Projecting Morphology, ed. Louisa Sadler and Andrew Spencer, 111–57. Stanford, CA: CSLI Publications.
Anderson, Stephen R. 1992. A-morphous Morphology. Cambridge:
Cambridge University Press.
Aronoff, Mark. 1994. Morphology by Itself: Stems and Inflectional Classes.
Cambridge, MA: MIT Press.
Baerman, Matthew. 2004. Directionality and (un)natural classes in syncretism. Language 80: 807–27.
Baerman, Matthew, Dunstan Brown, and Greville G. Corbett.
2005. The Syntax-Morphology Interface: A Study of Syncretism.
Cambridge: Cambridge University Press.
Bauer, Laurie. 1997. Derivational paradigms. In Yearbook of Morphology 1996, ed. Geert Booij and J. van Marle, 243–56. Dordrecht, the Netherlands: Kluwer.
Blevins, James P. 2005. Word-based declensions in Estonian. In Yearbook of Morphology 2005, ed. Geert Booij and J. van Marle, 1–25. Dordrecht, the Netherlands: Springer.
Blevins, James P. 2006. Word-based morphology. Journal of Linguistics 42: 531–73.
Booij, Geert. 1997. Autonomous morphology and paradigmatic relations. In Yearbook of Morphology 1996, ed. Geert Booij and J. van Marle, 35–53. Dordrecht, the Netherlands: Kluwer.
Booij, Geert. 2000. Inflection and derivation. In Morphology: An International Handbook on Inflection and Word-Formation, ed. Geert Booij, C. Lehmann, and J. Mugdan, 360–9. Berlin: Walter de Gruyter.
Börjars, Kersti, Nigel Vincent, and Carol Chapman. 1997. Paradigms, periphrases and pronominal inflection: A feature-based account. In Yearbook of Morphology 1996, ed. Geert Booij and J. van Marle, 155–80. Dordrecht, the Netherlands: Kluwer.
Brown, Dunstan, and Andrew Hippisley. In press. Network Morphology. Cambridge: Cambridge University Press.
Corbett, Greville G., and Norman M. Fraser. 1993. Network morphology: A DATR account of Russian nominal inflection. Journal of Linguistics 29: 113–42.
Embick, David, and Rolf Noyer. 2007. Distributed morphology and the syntax-morphology interface. In The Oxford Handbook of Linguistic Interfaces, ed. G. Ramchand and C. Reiss, 289–324. Oxford: Oxford University Press.
Finkel, Raphael, and Gregory Stump. 2009. Principal parts and degrees of paradigmatic transparency. In Analogy in Grammar, ed. J. Blevins and J. Blevins, 13–53. Oxford: Oxford University Press.
Halle, Morris, and Alec Marantz. 1993. Distributed morphology and the pieces of inflection. In The View from Building 20, ed. K. Hale and S. Keyser, 111–76. Cambridge, MA: MIT Press.
Kiparsky, Paul. 1982. From cyclic phonology to lexical phonology. In The Structure of Phonological Representations (Part I), ed. H. van der Hulst and N. Smith, 131–75. Dordrecht, the Netherlands: Foris.
Lieber, Rochelle. 1992. Deconstructing Morphology. Chicago: University
of Chicago Press.
Matthews, P. H. 1972. Inflectional Morphology. Cambridge: Cambridge
University Press.
———. 1991. Morphology. 2d ed. Cambridge: Cambridge University
Press.
Selkirk, Elisabeth O. 1982. The Syntax of Words. Cambridge, MA: MIT
Press.
Spencer, Andrew, and Ana Luís. 2005. A paradigm function account of mesoclisis in European Portuguese (EP). In Yearbook of Morphology 2004, ed. Geert Booij and J. van Marle, 177–228. Dordrecht, the Netherlands: Springer.
Stump, Gregory T. 1993. On rules of referral. Language 69: 449–79.
———. 2001. Inflectional Morphology. Cambridge: Cambridge University Press.
———. 2006. Heteroclisis and paradigm linkage. Language 82: 279–322.
———. 2007. A non-canonical pattern of deponency and its implications. In Deponency and Morphological Mismatches, ed. Matthew Baerman, Greville G. Corbett, Dunstan Brown, and Andrew Hippisley, 71–96. Oxford: Oxford University Press.
Zwicky, Arnold M. 1985. How to describe inflection. In Proceedings of the Eleventh Annual Meeting of the Berkeley Linguistics Society, ed. M. Niepokuj, M. VanClay, V. Nikiforidou, and D. Feder, 372–86. Berkeley, CA: Berkeley Linguistics Society.
Zwicky, Arnold M., and Geoffrey K. Pullum. 1983. Cliticization vs. inflection: English n't. Language 59: 502–13.
MORPHOLOGY, ACQUISITION OF
The acquisition of morphology has played a central role in
exploring both the acquisition of syntax (see syntax, acquisition of) and lexical acquisition. The study of acquisition
of morphology can be distinguished by research in first language
acquisition (FLA) and second language acquisition (SLA)
and by research into inflectional or derivational morphology.
Little research has attempted to link acquisition of inflectional
and derivational morphology. There has been some influence
of work in FLA on that in SLA, in particular in debates about
the order of acquisition of morphemes for second language
learners.
The acquisition of inflection is often examined as the acquisition of morphosyntax, that is, structures that are governed by
both morphological and syntactic rules such as subject-verb
agreement. The majority of literature focuses on children under
the age of seven, as inflectional morphology is acquired during early childhood. The acquisition of derivational morphology, on the other hand, is concerned with the formation of new words and is thus related to school-age language learning and reading.
Jean Berko's classic 1958 study found that children five to
seven years of age are able to apply both inflectional and derivational suffixes to novel stems (e.g., the plural wugs from wug,
or the adjective quirky from the noun quirks). These results were
interpreted at the time as evidence against the predictions from
the prevailing theories of Behaviorism (Skinner 1957) and supported the cognitive revolution in psychology and linguistics.
This study has been replicated with children as young as two
in English and other languages (e.g., Köpcke 1998; Akhtar and
Tomasello 1997). Together with work examining children's natural productions of morphology, this experimental work has been
taken as evidence that young children use morphological rules
to inflect and to form parts of speech.

Inflectional Morphology
FIRST LANGUAGE. One major set of works investigating the
acquisition of morphemes examined the order of acquisition of
inflectional morphemes (Cazden 1968; Brown 1973; de Villiers
and de Villiers 1973), focusing on English. This work found a
consistent (though not identical) order of acquisition among
children. Cross-linguistic work demonstrated that there was no
universal order of morpheme acquisition between languages,
that the order and speed of acquisition depends on the target
language and the morphemes themselves (Slobin 1985; Clark
1998). Work on other languages has also found consistent, but
not identical, orders within a language. Several factors appear to
influence the order of acquisition, including perceptual salience,
complexity of the morpheme either semantically (how many
concepts it encodes) or formally (how variable the affix is, how
many parts it contains), and frequency in the input. Eve V. Clark
(1993) suggested that several principles govern the order of acquisition of both inflectional and derivational morphemes: transparency (how easily the meaning is derived from the parts), simplicity (how variable the forms are), and productivity.
Morphemes that are consistent in form (have few allomorphs)
and semantically encode a single feature, such as plural or progressive -ing, tend to be acquired earlier than morphemes that
show more allophonic variation, such as with regular past tense
-ed, and/or encode multiple features, such as third person
singular -s (Brown 1973; Clark 1993). This preference to encode
single features/forms with a single morpheme holds across languages and even for children acquiring more than one language.
For example, Melanija Mikes (1967) discussed the acquisition of
locatives for bilingual Hungarian-Serbo-Croatian children. The
Hungarian locative suffix, which is relatively transparent, was
acquired earlier than the semantically equivalent structure in
Serbo-Croatian, which required locative prepositions plus agreement that varied by gender.
Inflected forms appear early, from the earliest word use, especially in highly or consistently inflected languages (Slobin 1985).
Languages such as English, with fewer inflected forms, have bare
forms appearing first and inflected forms appearing later, often
concurrently with first word combinations. In English, children's
production of inflectional morphology begins with an initial
period of some variability, gradually becoming more consistent
over time. Across languages, some morphemes (such as plurals)
are consistently produced accurately by the age of three or four,
whereas others, such as conditional marking of verbs, are often
not mastered until ages seven or eight. While little research has
been done on inflectional prefixes, it has been suggested that
inflectional prefixes are more difficult and are acquired later
than suffixes (Slobin 1982; Clark 1998).
In all languages, children often go through a period during
which they overregularize irregular forms in their grammar,
for example, breaked or mans. Irregular forms are often acquired
early (broke/men) and then when regular forms are acquired,
overregularized forms coexist with irregular forms (breaked/
broke) for a time until the irregular forms and exceptions are
mastered (Marcus et al. 1992).
The majority of research on inflectional morphology has
focused on the verbal and nominal domains, with relatively
little work on adjectives or adverbs. According to Clark (1998),
the earliest verb forms across languages tend to be imperatives, infinitives, and third person singular forms; singular forms
tend to be acquired before plural forms. The characteristics of
individual languages determine which agreement markers
are learned earliest in any given language. In terms of tense
and aspect, present/nonpresent is the first distinction children
make, and distinctions between past, present and future appear
to be in place by age three. In languages that distinguish aspect,
aspect also appears to be acquired around age three. However,
early aspect marking is also associated with the semantic characteristics of the verb. For example, Li and Shirai (2000) argue
that early use of perfective aspect is more likely to occur with
telic (e.g., walked) or resultative verbs (e.g., smashed) than other
types of verbs. Compound or periphrastic tenses such as present
perfect are acquired later, and may not be in place until age five
or older.
Within the nominal domain, number marking occurs early
in nouns and is one of the earliest nominal morphemes to be
seen. Gender marking also appears early, just after the first
nouns. Early gender/noun class marking appears to be based
primarily on the phonological shape of the noun and later
becomes associated with individual lexical items. Noun class
marking appears by age three, but adultlike acquisition of gender or noun class marking, which requires attention to both
phonological form and semantics, does not appear until age
four or five in many languages (Demuth 2003). In languages
with classifier systems, such as Chinese, Japanese, or Thai, general
classifier patterns appear early, and more fine-grained semantic distinctions appear gradually. Case marking occurs early as
well, just after first nouns are learned. In nominative-accusative languages, the first distinction to be acquired appears to
be between nominative and accusative. Dative case is next,
followed by other oblique cases. There is also evidence from
ergative languages that the major case distinctions in these languages are acquired early as well. Languages in which case varies by gender (e.g., Russian) take longer for contrasts to appear
than in languages that simply mark for case (e.g., Hungarian,
or Turkish).
Inflectional morphology is generally mastered by the early
school years for all but the most infrequent or irregular constructions across languages.
SECOND LANGUAGE. Early research in second language acquisition of inflectional morphology investigated whether there was
a consistent order of acquisition of morphemes. Initial studies
reported consistent orders of acquisition across both adult and
child second language learners of English (e.g., Dulay and Burt
1974). Later research, however, criticized these early studies for
their methodology, and found variability among learners of different backgrounds that seemed to belie a single order of acquisition for second language learners (e.g., Hakuta 1974; Rosansky
1976). The present consensus seems to be that morpheme language studies "provide strong evidence that ILs [interlanguages, developing grammars] exhibit common accuracy/acquisition orders" (Larsen-Freeman and Long 1991, 92). After this intense
period of debate about the acquisition order of morphemes, later
research in SLA has focused on one of several areas: research on
specific structures such as tense or aspect, the role of context in
the acquisition of morphology, and the implications of missing
morphology for the developing grammars of second language
learners. Research has begun to look again at order of acquisition
in SLA, considering how models of acquisition and/or the functor morphemes themselves can explain the order (Goldschneider
and DeKeyser 2001). Here again, the discussion parallels that of
first language acquisition, with perceptual salience, morphophonological regularity, semantic complexity, and frequency being
posited as factors contributing to the order of acquisition.
Derivational Morphology
FIRST LANGUAGE. Because derived forms are often used to fill
semantic gaps within a lexicon, the acquisition of derivational
morphology is studied within the domain of vocabulary learning. Derivational morphology is acquired somewhat later, and
with a less clear order of development, than inflectional morphology. Highly inflected languages such as Turkish or Finnish
show evidence of early derivational morphology. Studies indicate that the earliest derivations are zero-stem alternations (e.g.,
the noun knife used as a verb) and compounding (e.g., dog-book)
for languages where these processes are productive (Clark 1993).
Agentive suffixes (-er in particular) and diminutives (dogg-y) are
also acquired between ages two and three across many languages.
However, the bulk of derivational morphology is acquired during
middle childhood and adolescence (Tyler and Nagy 1989).
As with inflectional morphology, the course of development for derivational morphology varies from language to language and depends on both the patterns within the language
and the properties of the derivations themselves. For example,
Hebrew-speaking children derive verbs from nouns as young as
age three, while derived nominals are acquired later, after age
eight and continuing into adolescence (Berman 2003; Ravid
and Avidor 1998). Awareness of and ability to decode derivational morphology continues to develop throughout the school
years. Comprehension of derivational morphology is correlated
with reading skills in a number of languages, including English,
Hebrew and Chinese (Tyler and Nagy 1990; Levin, Ravid, and
Rapaport 2001; Kuo and Anderson 2006). As with inflectional
morphology, acquisition appears to be best explained by the
transparency of the morpheme, complexity (semantic and formal), and productivity (Clark 1998). Later-learned morphemes
such as -tion have more allomorphs, make more changes to the
stem they attach to, encode more concepts, and/or are less productive than early learned morphemes such as -er.
SECOND LANGUAGE. As with first language (L1) acquisition, second language acquisition of derivational morphology has been
studied together with general vocabulary growth and knowledge of the lexical semantics of the language (Redouane 2004;
Montrul 2001; Lardiere 1997). Little research has examined this
area of second language acquisition in detail. The research that
exists suggests that beginning learners tend to fill lexical gaps
with word formation strategies from their L1 or by extending
existing second language vocabulary. More advanced learners are more likely to use derivational morphology found in the
target language, although they are still likely to use non-target
forms. Evidence also suggests that derivational morphology patterns that are substantially different from the L1 present considerable problems to second language learners and are learned
only gradually through time.
Current Debates
Major areas of debate in the acquisition of morphology include
the status of missing inflectional morphemes in learners' grammars, whether productive use of morphemes reflects rules and
an innate capacity for rule formation, and whether regular and
irregular forms are acquired and represented similarly.
For first language acquisition, one area of strong debate has
been the status of morphology missing or omitted by children.
Within a general nativist framework, explanations range from
maturation of grammatical structures, to prosodic and/or phonological learning, to language-specific lexical learning (e.g.,
Peters and Strömqvist 1996; Santelmann, Berk, and Lust 2000).
More empiricist approaches point to such issues as frequency
of morphemes, vocabulary size, and transparency of the morphemes themselves to explain the course of acquisition (e.g.,
Rowland et al. 2003).
Parallel debates have taken place within the literature on
second language acquisition. Within the search for a consistent
order of morpheme acquisition, debates have focused on the
possible reasons for such an order: Is it due to an underlying universal grammar (Dulay and Burt 1974), characteristics of the
input, or general learning and characteristics of the morphemes
themselves (Goldschneider and DeKeyser 2001)? In addition, the
issue concerning why adult learners, unlike child learners, continue to make mistakes with inflectional morphology even after
years of exposure to a language has been a source of significant
debate. Explanations for this phenomenon range from a lack of
access to universal grammar for adult learners to issues with prosodic differences between languages, differences in proficiency,
or other factors.
Within the domains of both inflectional and derivational
morphology, another major theoretical debate concerns the
nature of rules. Do children and adults learn rules, or do they
simply extract regularities from the speech stream? This debate
has been particularly strong in the area of regular versus irregular verbs. One side of the debate argues that children use rules for
regular verbs, and thus have distinct processes for forming and
representing past tense with regular versus irregular forms (e.g., Marcus 1996). This view argues that irregular forms are stored as words, while regular forms are concatenated via rules (Pinker and Ullman 2002). The other approach, from those mainly working within connectionism, claims that learners are extracting
statistical regularities and patterns, without positing a mental
rule (e.g., Plunkett and Marchman 1993). These researchers suggest that there is a single, associationist mechanism for forming
past tense for all verbs, regular or irregular, and thus the representations of regular and irregular forms do not differ. While
rule-based accounts appear to have underestimated the human
ability to track statistical information about morphology, connectionist models may not be able to generalize regular forms
based on the typical frequencies of those forms in the input
(Marcus 1995). Thus, the extent to which these networks can
model human language acquisition is unclear.
Summary
The acquisition of morphology is central to both morphosyntactic development and lexical development for first and second
language learners. While the acquisition of inflectional morphology is largely complete for children by the time they enter
school, it is often problematic for adult learners of a language.
Derivational morphology, on the other hand, is an ongoing, lifelong process for both first and second language learners. Little
research has attempted to link the two types of morphological
acquisition. More research on both derivational and inflectional
morphology is needed on a wider variety of languages of varying
typologies, in particular on non-Indo-European languages. This
is particularly the case for SLA, where the bulk of the research
has focused on European languages.
Lynn Santelmann
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Akhtar, N., and M. Tomasello. 1997. Young children's productivity with word order and verb morphology. Developmental Psychology 33.6: 952–65.
Berko, Jean. 1958. The child's learning of English morphology. Word 14: 150–77.
Berman, Ruth A. 2003. Children's lexical innovations: Developmental perspectives on Hebrew verb structure. In Language Processing and Acquisition in Languages of Semitic, Root-Based, Morphology, ed. J. Shimron. Amsterdam: John Benjamins.
Brown, Roger W. 1973. A First Language: The Early Stages. Cambridge, MA: Harvard University Press.
Cazden, Courtney B. 1968. The acquisition of noun and verb inflections. Child Development 39: 433–48.
Clark, Eve V. 1993. The Lexicon in Acquisition. Cambridge: Cambridge University Press.
———. 1998. Morphology in language acquisition. In The Handbook of Morphology, ed. A. Spencer and A. Zwicky. Oxford: Blackwell.
Demuth, K. 2003. The acquisition of the Bantu languages. In The Bantu Languages, ed. D. Nurse and G. Phillipson. Surrey: Curzon.
de Villiers, Jill G., and Peter A. de Villiers. 1973. A cross-sectional study of the acquisition of grammatical morphemes. Journal of Psycholinguistic Research 2: 267–78.
Dulay, Heidi, and Marina Burt. 1974. Natural sequences in child second language acquisition. Language Learning 24: 37–53.
Goldschneider, Julie M., and Robert M. DeKeyser. 2001. Explaining the natural order of L2 morpheme acquisition. Language Learning 51: 1–50.
Hakuta, Kenji. 1974. A preliminary report on the development of grammatical morphemes in a Japanese girl learning English as a second language. Working Papers on Bilingualism 3: 18–38.
Köpcke, K. M. 1998. The acquisition of plural marking in English and German revisited: Schemata versus rules. Journal of Child Language 25.2: 293–319.
Kuo, Li-jen, and Richard C. Anderson. 2006. Morphological awareness and learning to read: A cross-language perspective. Educational Psychologist 41.3: 161–80.
Lardiere, Donna. 1997. On the transfer of morphological parameter values in L2 acquisition. Proceedings of the Annual Boston University Conference on Language Development 21.2: 366–77.
Larsen-Freeman, Diane, and Michael H. Long. 1991. An Introduction to
Second Language Acquisition Research. New York: Longman.
Levin, Iris, Dorit Ravid, and Sharon Rapaport. 2001. Morphology and spelling among Hebrew-speaking children: From kindergarten to first grade. Journal of Child Language 28.3: 741–72.
Li, P., and Y. Shirai. 2000. The Acquisition of Lexical and Grammatical Aspect. Berlin: Mouton de Gruyter.
Marcus, Gary F. 1995. The acquisition of the English past tense in children and multilayered connectionist networks. Cognition 56.3: 271–9.
———. 1996. Why do children say breaked? Current Directions in Psychological Science 5.3: 81–5.
Marcus, Gary F., Steven Pinker, Michael Ullman, Michelle Hollander, T. John Rosen, and Fei Xu. 1992. Overregularization in language acquisition. Monographs of the Society for Research in Child Development 57.4: v–164.
Mikes, Melanija. 1967. Acquisition des catégories grammaticales dans le langage de l'enfant [Acquisition of grammatical categories in the language of the child]. Enfance 3/4: 289–98.
Montrul, Silvina. 2001. The acquisition of causative/inchoative verbs in L2 Turkish. Language Acquisition 9.1: 1–58.
Peters, Ann M., and Sven Strömqvist. 1996. The role of prosody in the acquisition of grammatical morphemes. In Signal to Syntax: Bootstrapping from Speech to Grammar in Early Acquisition, ed. J. Morgan and K. Demuth. Mahwah, NJ: Lawrence Erlbaum.
Pinker, Steven, and M. T. Ullman. 2002. The past and future of the past tense. Trends in Cognitive Sciences 6: 456–63.
Plunkett, Kim, and Virginia A. Marchman. 1993. From rote learning to system building: Acquiring verb morphology in children and connectionist nets. Cognition 48.1: 21–69.
Ravid, Dorit, and Avraham Avidor. 1998. Acquisition of derived nominals in Hebrew: Developmental and linguistic principles. Journal of Child Language 25.2: 229–66.
Redouane, Rabia. 2004. The acquisition of MSA word formation processes: A case study of English-speaking L2 learners and native speakers. ITL, Review of Applied Linguistics 145/146: 181–217.
Rosansky, E. 1976. Methods and morphemes in second language acquisition. Language Learning 26: 409–25.
Rowland, Caroline F., Julian M. Pine, Elena V. Lieven, and Anna L. Theakston. 2003. Determinants of acquisition order in wh-questions: Re-evaluating the role of caregiver speech. Journal of Child Language 30.3: 609–35.
Santelmann, Lynn, Stephanie Berk, and Barbara Lust. 2000. Assessing the strong continuity hypothesis in the development of English inflection: Arguments for the grammatical mapping paradigm. Proceedings of the XIX West Coast Conference on Formal Linguistics 19: 439–52.
Skinner, B. F. 1957. Verbal Behavior. New York: Appleton-Century-Crofts.
Slobin, Dan I. 1982. Universal and particular in the acquisition of language. In Language Acquisition: The State of the Art, ed. E. Wanner and L. Gleitman, 128–72. Cambridge: Cambridge University Press.
Slobin, Dan I., ed. 1985. The Cross-Linguistic Study of Language Acquisition. Hillsdale, NJ: Erlbaum.
Tyler, Andrea, and William Nagy. 1989. The acquisition of English derivational morphology. Journal of Memory and Language 28.6: 649–67.
MORPHOLOGY, EVOLUTION AND
For some aspects of language, possible evolutionary explanations are not hard to imagine, even if establishing the truth of
any of them may be difficult or impossible. For example, the
fact that utterances can be segmented into individual meaningful elements (words or morphemes) has a clear functional
advantage in that combinations of these elements can be used
to express a huge range of complex meanings an advantage
exploitable in natural selection. Likewise, it is advantageous
to have a syntax, that is, a set of traffic rules for combining
words or morphemes into larger units that can be interpreted
reliably. Less immediately obvious but nevertheless vigorously
defended by some scholars in recent years is the possibility
that certain aspects of language are as they are for physical or
mathematical, rather than biological, reasons. The hexagonal
shape of honeycomb cells is not due to a hexagonal-cell gene
inherited by bees but is a self-organizing outcome of cell construction under particular spatial constraints. Conceivably,
nonbiological self-organizing factors influence language,
too.
For the existence of morphology, however, no such explanations seem immediately plausible. What functional advantage is there in the availability of not one but two patterns of
grammatical organization for complex expressions: syntactic,
as in the French phrase tasse à thé (literally 'cup to tea') and
the English sentence They were being bitten, and nonsyntactic, as in the English compound word teacup and the Latin
one-word sentence Mordebantur? More puzzling still, what
functional advantage is there in the fact that the plural of pan is
pans while the plural of man is men or that the past tense forms
of bring, sting, and sing are brought, stung, and sang, respectively? The first question concerns the relationship between
morphology and syntax, while the second concerns allomorphic variation. Wouldn't languages, in general, function better
if there were just one set of traffic rules, not two, to guide the
interpretation of complex expressions? And wouldn't English
function better if plurality and pastness were expressed in a
uniform fashion, as in the Newspeak of George Orwell's 1984
or as in an artificial language such as Esperanto? Should
we then look for a physical or quasi-mathematical explanation
instead? Yet these phenomena do not display the sort of honeycomb-like elegance that renders them obvious candidates
for that explanation.
For possible solutions to these puzzles, it is natural to consult hypotheses specifically concerning the evolutionary origin
of morphology as a component of grammar. Hypotheses of that
kind so far published are sketchy. Nevertheless, four trends can
be distinguished:
(a) an appeal to uninterpretable features in Noam Chomsky's minimalist syntax (one type of appeal to self-organization);
(b) the projection into prehistory of grammaticalization
processes such as are observed in historical linguistic
change;
(c) an appeal to phonological consequences of the fact
that the speech signal is continuous, not segmentable into discrete chunks with clear boundaries (and gesture likewise, if
we suppose that language originated in that medium);
(d) a variant of (c) that also invokes the special circumstances
of our hunter-gatherer ancestors.
Chomskyan Minimalism explores rationales for apparent
imperfections in language. One such apparent imperfection
is syntactic displacement: for example, the kind of noun phrase
fronting exhibited in Beans I like and Who did you see? and perhaps even in a simple clause such as John kissed Mary, if one
assumes that syntactic subjects originate internally to the verb
phrase. Such displacement may serve communicative purposes
(topicalization, for example). It still counts as a grammatical imperfection, however, if there is nothing within grammar
itself to drive it. This is where morphology may come in (it is
claimed). Let us suppose that some constituents have features that are "uninterpretable" and thus need to be "erased"
by moving those constituents to a location where these features
can be matched (Chomsky 2000, 2004). So far as grammar is
concerned, this matching helps to ensure that all of the syntactic
requirements of the vocabulary items in the sentence are met,
while so far as purposes of language that lie outside grammar
are concerned (such as communication), it may aid their
fulfillment by (for example) moving shared information to the
start of the utterance. The apparent imperfection thus disappears.
If, incidentally, some of the features that drive displacement
manifest themselves in overt morphology (for example, as case
inflections), that is hardly surprising; it may facilitate the acquisition of grammar, for example. Thus, the existence of morphology helps to resolve tensions between the way that the grammar
is ideally structured and the extragrammatical uses to which
language is put.
This line of argument has at least three weaknesses, however.
Firstly, it says nothing about the allomorphy exhibited in pans
and men or in brought, stung, and sang. Secondly, it says nothing about why derivational morphology exists (for example, why
we say writer and artist rather than, say, person write or person
art). Thirdly, it relies too much on the intellectual appeal that
paradoxes can exert. Consider the orbits of the planets around
the sun. These orbits are not circular, which may be seen as an
imperfection, but the imperfection disappears in a paradoxical
yet satisfying way (one may think) in that, even though an orbit
is elliptical, the planet's position on its orbit is correlated with
its speed, as Johannes Kepler demonstrated. But the enjoyment
of paradox can go too far. I carry a puncture repair kit when I go
cycling, which is an apparent imperfection because it adds to
the weight of my equipment. Am I then entitled to claim that this
imperfection disappears whenever I get a puncture because my
repair kit enables me to get on the road again? And can I even
argue that getting a puncture is paradoxically a positive event
because it justifies my carrying the repair kit? Flat tires thus contribute to perfect cycling! This style of argument is strange, to
say the least. Yet it is uncomfortably close to a style of argument used by some minimalist theorists to explain the existence of morphology.
Grammaticalization theory concerns itself with the process
whereby in language change, what were once free forms
with concrete, lexical meanings can change in three ways: grammatically, so as to become bound rather than free (as the free
form full has developed into a suffix in helpful); semantically,
so as to contribute grammatical rather than lexical information
(as the verb will in English has shifted from 'desire' to future
tense); and phonologically, so as to merge with a neighboring
phonological word (as in I'll come, derived from I will come)
(Heine and Kuteva 2002). All three changes can be observed in
the history of Swedish, where what was once a free pronoun sik
meaning himself/herself has developed into a suffix -s with a
habitual passive meaning. Bernard Comrie (1992) has suggested
that not only individual affixes but also morphology overall originated this way. At an earlier, simpler stage of language, there was
syntax but no morphology. Subsequently, phonological reduction and meaning change in frequently occurring collocations
brought into being a new kind of structure, with bound items
alongside free ones, and phonologically reduced items alongside
phonologically full ones.
A drawback with this approach is that it privileges syntax over
morphology in an arbitrary way. Granted, all languages have
syntax while some languages today make little or no use of morphology. That does not, however, constitute evidence that syntax
evolved earlier than morphology did. Implicitly, this approach
posits a sort of prehistoric linguistic Golden Age when forms
and meanings were neatly paired one-to-one, and when language did indeed have only one set of grammatical traffic rules.
However, such a Golden Age would have no parallel elsewhere
in evolutionary biology; there is no reason to think that functionality as an outcome of natural selection was more pervasive
at some time in the past than it is now.
Is there any reason to think, then, that the anomalies of morphology have been around for as long as modern-style syntax
has, or even longer? Andrew Carstairs-McCarthy (2010) argues
just this. Individual meaningful units (morphemes), whether
spoken or signed, are not diamond-hard discrete entities, unaffected in their shape by neighboring units. This would have been
just as true before fully modern syntax had evolved as subsequently. Thus, there would already then have been in existence
phonological processes that would give rise to allomorphy, and,
just as now, historical changes would sometimes have deprived
this allomorphy of its phonological motivation (just as the voicing of the [v] in wives, the plural of wife, now lacks the phonological
motivation it had in Old English and has thus acquired a grammatical function, as an exponent of plural). Let us assume
that this allomorphy was coupled with an expectation that formal differences should always be accompanied by differences
in information content. One has already the seeds of a kind of
grammar in which the same item can be viewed as having more
than one form, provided that these forms are differentiated
somehow. The differentiation could be semantic (e.g., rise versus
raise), grammatical (e.g., sang versus sing), or in terms of phonological environment (e.g., in Italian udire to hear, the stem is
od- when stressed and ud- when unstressed). (Grammatical differentiation in this hypothetical stage of development would ex hypothesi not involve syntax, but could conceivably involve systematic expression of categories such as number, tense, or definiteness.) And, where formal differences involved extra segments
at the beginning or the end of an item, the seeds were sown for
what we now call affixes, arising by a process quite distinct from
grammaticalization.
A variant of Carstairs-McCarthy's approach has been developed by Dieter Wunderlich (2006a, 2006b), linking grammatical
evolution with cultural and economic change. The sort of syntax
that many modern languages have, with lavish opportunities for
long-distance syntactic movement, would not have had significant evolutionary advantages (Wunderlich suggests) until after
the emergence of large speech communities whose members
did not all know one another, that is, until the Neolithic period.
Until then, he suggests, that is, as long as all humans were hunter-gatherers living in small groups, elaborate morphology would
have preponderated over syntax.
It will be seen that, as regards morphological evolution,
widely divergent suggestions have been made about the balance
between cultural and noncultural factors. Carstairs-McCarthy's
and Chomsky's approaches, though different in many ways,
agree in emphasizing noncultural reasons for the existence of
morphology as a component of grammar. For Comrie, on the
other hand, cultural change is at least as important as biological
or self-organizational factors, while Wunderlich revives the view
that fully modern syntax came late as a cultural by-product of
population expansion and the transition to agriculture, with the
added twist that an elaborate morphological component was
already in existence early. Time will tell which viewpoint prevails or which combination of viewpoints.
Andrew Carstairs-McCarthy
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Carstairs-McCarthy, Andrew. 2010. The Evolution of Morphology. Oxford:
Oxford University Press.
Chomsky, Noam. 2000. New Horizons in the Study of Language and
Mind. Cambridge: Cambridge University Press. Chapter 1 is especially
relevant.
———. 2004. Language and mind: Current thoughts on ancient problems. In Variations and Universals in Biolinguistics, ed. L. Jenkins, 379–405. Amsterdam: Elsevier.
Comrie, Bernard. 1992. Before complexity. In The Evolution of Human
Languages, ed. J. Hawkins and M. Gell-Mann, 193–211. Reading,
MA: Addison-Wesley.
Heine, Bernd, and Tania Kuteva. 2002. World Lexicon of
Grammaticalization. Cambridge: Cambridge University Press.
Hinzen, Wolfram. 2006. Mind Design and Minimal Syntax. Oxford: Oxford
University Press. This book confronts frankly, from a Chomskyan perspective, some of the difficulties that morphology poses for the minimalist program.
Wunderlich, Dieter. 2006a. What forced syntax to emerge? In Between
40 and 60 Puzzles for Krifka, ed. Hans-Martin Gärtner, Sigrid Beck,
Regine Eckardt, Renate Musan, and Barbara Stiebels. Berlin: Zentrum
für Allgemeine Sprachwissenschaft. Available online at: http://www.
zas.gwz-berlin.de/fileadmin/material/40-60-puzzles-for-krifka/
index.html.
———. 2006b. Why is there morphology? Abstract of paper presented at
Workshop on Theoretical Morphology, Leipzig, June. Available online
at: http://www.uni-leipzig.de/~jungslav/rmag/Wunderlich.pdf.


MORPHOLOGY, NEUROBIOLOGY OF
The nature of word formation and word storage has long been
prominent in the disparate fields of linguistics, psychology, and
neurobiology, and recently the neurobiology of morphology has
emerged as its own distinct subfield of study.
Two prevalent issues in formal linguistic and psycholinguistic
approaches to morphology are the distinctions drawn between
inflectional versus derivational morphology and between regular
versus irregular morphology. Another line of inquiry has sought
to identify the formal structure of basic morphological representations (i.e., words, stems, affixes) and to determine the extent
to which complex words are either composed by a grammatical process or stored as unanalyzed wholes. The issue of compositionality interacts with inflectional/derivational status and
with regularity; for example, various theorists have proposed
that irregular inflection is not compositional (1a), that familiar
inflected forms are not compositional (2a), or that derivational
morphology is not compositional (3a) in the way that they are
stored or processed.
        Noncompositional            Compositional
(1)     a. ran = [ran]              b. ran = [run]+[past]
(2)     a. walked = [walked]        b. walked = [walk]+[past]
(3)     a. hopeless = [hopeless]    b. hopeless = [hope]+[less]
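The storage options in (1)–(3) can be made concrete as data. The following sketch is a hypothetical encoding, not any particular theorist's notation: each word is represented either as an unanalyzed whole or as a stem plus a separate grammatical element.

```python
# Sketch of the (non)compositional storage options in (1)-(3).
# The representations are illustrative, not a theory's actual notation.

# Noncompositional: the surface form is stored as an unanalyzed whole.
noncompositional = {
    "ran": ["ran"],
    "walked": ["walked"],
    "hopeless": ["hopeless"],
}

# Compositional: the form is stored as a stem plus a separate morpheme.
compositional = {
    "ran": ["run", "PAST"],
    "walked": ["walk", "PAST"],
    "hopeless": ["hope", "-less"],
}

def is_compositional(entry):
    """A form counts as compositional if its stored representation
    has more than one part."""
    return len(entry) > 1

assert not is_compositional(noncompositional["ran"])
assert is_compositional(compositional["walked"])
```

The debates sketched in the text concern which of these encodings the mental lexicon actually uses for which classes of words.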

For these issues, there is an immense and contradictory body of linguistic evidence. As a consequence, some linguists have
looked to new methods, such as neuropsychology and neuroimaging, as alternative sources of evidence.
Psychological research on the lexicon and morphology has
largely focused on the roles of frequency, familiarity, and similarity in word storage and identification. A critical finding in
this tradition concerns the effect of lexical frequency. Words
that occur more frequently are more quickly and successfully
recalled in a wide range of experimental settings. This finding
has enabled psychologists to pose deeper questions about the
nature of lexical representations in terms of which aspects of
those representations are crucial to the frequency effect. The
fact that processing morphologically complex words may be
affected by the frequency of the entire form has led some to
argue that familiar affixed words, including regularly inflected
forms like walked, are stored as wholes in the mental lexicon
(e.g., Baayen, Dijkstra and Schreuder 1997). In essence, this
whole-word approach to representation and processing treats
regular forms like walked and irregular forms like ran as equivalent: Both are past tense forms, and, by hypothesis, access to
that inflectional information is mediated in both cases by
whole-word recognition.
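The contrast between whole-word access and decompositional access can be sketched as a toy dual-route lookup, loosely in the spirit of the parallel dual-route idea cited above; the lexicon, suffix list, and frequencies below are all invented for illustration:

```python
# Toy dual-route recognizer: a whole-word lexicon is consulted alongside
# a decompositional parse into stem + suffix. All entries and frequency
# counts are invented for illustration.
whole_word = {"walked": 850, "ran": 2300}   # surface form -> frequency
stems = {"walk", "run", "hope"}
suffixes = {"ed": "PAST", "s": "PLURAL"}

def recognize(form):
    routes = []
    if form in whole_word:                      # direct whole-word route
        routes.append(("whole", form))
    for suffix, feature in suffixes.items():    # decompositional route
        if form.endswith(suffix) and form[: -len(suffix)] in stems:
            routes.append(("parsed", (form[: -len(suffix)], feature)))
    return routes

# Regular "walked" is reachable on both routes; irregular "ran" only as a whole.
assert ("whole", "walked") in recognize("walked")
assert ("parsed", ("walk", "PAST")) in recognize("walked")
assert recognize("ran") == [("whole", "ran")]
```

On a whole-word account, frequency effects attach to the surface form itself; on a decompositional account, they attach to the stem, which is what the experiments discussed below attempt to tease apart.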
Neurobiological methods first made an impact on our understanding of morphology through studies of individuals with
acquired impairments to their morphological systems. These
studies typically tried to establish the loci of morphological functions in the brain (in either anatomical structures or functional
architectures). The progenitors of these studies are the works of
French neurologist Paul Pierre Broca and German neurologist
Karl Wernicke during the late nineteenth century. Broca reported
on a patient whose production abilities were severely impaired due to a lesion in the inferior frontal gyrus of the frontal lobe, a lesion/deficit pattern now known as Broca's aphasia. Wernicke
reported that patients with lesions in the left posterior section
of the superior temporal gyrus suffered from a severe comprehension deficit; their speech was fluid and natural sounding but
their word selection seemed divorced from meaning. Numerous
researchers have since argued that lexical storage and retrieval
functions are located in Wernicke's area, and that morphological grammar functions are housed in Broca's area.
A quite different approach to the study of morphology, but one
that is also grounded in neurobiology, is connectionist modeling. Work in connectionist modeling has called into question
some of the most fundamental tenets of morphological theory.
D. E. Rumelhart and J. L. McClelland (1986) presented arguments from modeling against the distinction between regular and
irregular morphology. M. S. Seidenberg and L. M. Gonnerman
(2000) challenged the very existence of morphological representations, arguing instead that what linguists call morphemes are
merely the points of convergence between sound and meaning
codes, and are not a distinct type of entity of their own. These
studies have drawn fierce criticism, stimulated vigorous debate,
and played a major role in shifting the standards of mainstream
morphology to a consideration of brains and simulated brains as
viable data sources.
In each of the disciplines concerned with the neurobiology
of morphology, many central issues remain unresolved and
hotly debated. Furthermore, there had been little occasion until
recently for these methodologically distinct disciplines to communicate, despite their concern with fundamentally the same
topic. However, recent technological advances have provided
each of these disciplinary perspectives with new methods of
study, and thus new insights into perennial questions. These
advances, combined with shifting disciplinary boundaries, have
enabled the neurobiology of morphology to become a largely
coherent line of inquiry into the neural underpinnings of word
storage and processing. We may divide recent approaches into
neuropsychological, hemodynamic, and neurophysiological
methods.

Neuropsychological Methods
Patients with brain damage and resulting impairments have
long been a valuable source of evidence about the neurobiology
of morphology. Modern neuropsychological studies of morphology evaluate patients who suffer impairments that selectively
affect (or selectively spare) morphological functions as the
result of brain damage. Typically, the rationale of these studies
is to identify dissociations between patterns of impaired and
preserved capacities in order to establish whether certain morphological functions are distinct from other components of the
lexical system.
One such dissociation is a morphological deficit that disrupts
the processing of inflectional morphology in the context of a
relatively spared ability to process derivational morphology. The
Italian-speaking patient FS reported in Miceli and Caramazza
(1988) presented with one of the clearest instances of this performance pattern. In spontaneous speech, FS made frequent
errors of agreement between nouns and attributive modifiers
(e.g., the target phrase il mio studio [det.Ms my.Ms office.Ms] was produced as *la mia studia [det.Fs my.Fs office.Fs]) and between subjects and verbs (io vivo solo [I live.1s alone.Ms] was produced
as *io vive solo [I live.3s alone]). In repetition tasks, FS was 98 percent correct in repeating derived words with their derivational
morphology intact, but only 40 percent correct in repeating the
inflection.
One issue not expressly examined in the case of this patient
was whether the difficulty that FS had in repeating inflected
forms was modulated by the regularity of their morphology.
The relevance of this point relates to the importance of identifying the particular level of morphological representation that is
implicated in the deficit. In some cases of acquired morphological deficit, performance on regular and irregular morphology
dissociates. For example, patient SJD presented with a deficit
that disrupted the production (but not the comprehension) of
regularly inflected forms like walked (which was read as walk
and as walking on different occasions), as well as morphologically derived words like publisher (which she read as publishing)
(Badecker and Caramazza 1991). In comparison, that patient's
performance on irregularly inflected words was equal to her relatively intact production of uninflected forms. A complementary
dissociation of regular versus irregular inflection has also been
reported. For example, patient AW exhibited poor performance
for irregularly inflected forms in comparison to nearly intact performance with regularly inflected words (Miozzo 2003; see also
cases reported in Laiacona and Caramazza 2004; Shapiro and
Caramazza 2003).
Some single-case studies have reported patients who present
with impaired comprehension and production for both regular
and irregular inflection in both spoken and written modalities
though not always in equal proportion. This deficit has been
construed as resulting from an abstract, morphosyntactic level
of deficit (i.e., one where walked and ran are both represented as
morphologically complex) (Badecker 1997).
Most often, the method of neuropsychological studies is to
establish dissociations between distinct morphological subsystems, but sometimes the content of patients errors themselves
provides insight into the nature of morphological grammar. For
example, patient SJD's affix selection errors were not always grammatically licensed (e.g., she read poorest as poorless, along with an elaborative comment that indicated comprehension of the superlative affix: "the most poorless Indians have very little money"). These performance features suggest that the
mechanisms for producing productively affixed words exploit
compositional procedures (Badecker and Caramazza 1991).

Hemodynamic Methods
In contrast to neuropsychological methods, hemodynamic neuroimaging methods have made it possible to directly observe areas of the brain involved in normal (intact) morphological
processing. These methods include positron emission tomography
(PET) and functional magnetic resonance imaging (fMRI). These
methods compare levels of blood flow and blood oxygenation in
different areas of the brain as subjects perform cognitive tasks.
One of the few PET studies focused narrowly on morphology
sought to identify a neural correlate of the use of overt inflectional verbal morphology in German (Günther et al. 2001).
This study contrasted German verbs with and without overt inflectional suffixes, revealing a difference in activation in and
around Brocas area. The researchers interpret these results as
evidence that Brocas area subserves morphological and/or
morphosyntactic functions.
Increasingly, fMRI is the preferred method for hemodynamic
investigations of morphology. Numerous fMRI studies have
pursued a functional localization for morphology by contrasting conditions with and without affixes, but they have often
returned inconclusive results (see, e.g., Davis, Meunier, and
Marslen-Wilson 2004).
A few fMRI studies of morphology and the lexicon, however,
have taken on more refined questions, producing more robust
results. One such study is the investigation by A. Beretta and colleagues (2003) of German regular and irregular inflection. They
find significant overall differences in neural activation between
the processing of regularly inflected and irregularly inflected
nouns and verbs. They interpret this as evidence that regular and
irregular morphological functions are subserved by distinct neural systems, though their findings do not address where irregulars
are processed or where regulars are processed, if such distinct
locations were to exist.

Neurophysiological Methods
In order to examine how specific types of morphologically complex words are processed (e.g., regularly vs. irregularly inflected
words) over the time course of lexical processing, researchers
have increasingly turned to neurophysiological recording techniques
whose temporal resolution is well suited to the rapid changes in
brain response to linguistic materials. These imaging methods
include electroencephalography (EEG), also known as event-related brain potentials (ERPs), which measures electrical currents caused by neural activity, and magnetoencephalography
(MEG), which measures the magnetic fields that result from this
neuroelectrical activity.
For the most part, neurophysiological studies of morphology
exploit well-studied event-related response components under
a variety of stimulus conditions (including lexical priming,
manipulations of lexical properties such as surface or stem frequency, and contextual fit). In EEG/ERP studies, there are two
response components that have been exploited to some advantage: the N400, a negative deflection peaking around 400 ms that increases in amplitude after a novel or unexpected lexical stimulus, and the P600, a positive current shift following syntactic anomalies (Kutas and Hillyard 1980; Osterhout and
Holcomb 1992; Osterhout and Nicol 1999). In MEG, most morphology studies have focused on the M350 response component, believed to reflect some of the currents underlying the N400 ERP, which peaks approximately 350 ms after the presentation of a word stimulus and is sensitive to stimulus factors such
as lexical frequency (Embick et al. 2001).
Several MEG studies have engaged the connectionist literature on morphology, addressing the issue of whether there exists
a distinctly morphological level of representation, one that differs from the representation of meaning and spoken/written
form. In a MEG priming study, L. Stockall and A. Marantz (2006)
found that genuine morphological relatives (e.g., give–gave, teach–taught) pattern differently at the M350 response than morphologically nonrelated word pairs that are merely similar at the orthographical and semantic levels (e.g., boil–broil, screech–scream). The former pairs exhibit a facilitatory priming effect on
the M350 latency, while the latter pairs do not. However, as in
many MEG studies, effects that were visible in the M350 peak
latency were obscured in the behavioral response latency, presumably by a different and opposite effect that arose later in the
time course of processing.
Many other neurophysiological studies have found convergent evidence that morphological constituents are actively recognized
in the early stages of lexical processing. Repetition priming has
been found to attenuate the N400 response component to isolated words in lexical decision tasks (Rugg 1985). This effect on
ERP has also been observed with priming by morphological relatives: Regularly inflected primes elicit a weaker N400 response
for uninflected verb targets in comparison to unrelated primes,
whereas irregularly inflected primes do not produce a comparable reduction (Münte et al. 1999). These contrasting priming
effects have been taken as evidence for morphological parsing
of regularly inflected forms. Other studies have found evidence
that the recognition of regularly inflected words is supported by
morphological decomposition in brain responses to morphologically illegal combinations like bringed (Morris and Holcomb
2005; Lück, Hahne, and Clahsen 2006; see also McKinnon, Allen,
and Osterhout 2003 for evidence of bound-stem parsing).
Evidence for decomposition is also observed in the effects
of regularity and lexical frequency on the P600 response to
inflected words that are ungrammatical for their context. In
a study that manipulated lexical frequency, morphological
regularity, and grammatical fit, high-frequency irregularly
inflected verbs in ungrammatical contexts (e.g., the boy couldn't
*ran / *walked fast enough) showed an earlier onset of the P600
response than did low-frequency irregularly inflected verbs,
in comparison to their grammatical counterparts (e.g., the boy
couldn't run / walk fast enough); but the onset of this response
was unaffected by lexical frequency for regularly inflected verbs
(Allen, Badecker, and Osterhout 2003). The pattern suggests
that for irregular verbs, surface frequency affects the speed with
which lexical recognition mechanisms can gain access to (and
exploit) inflectional content, but that this is not so for regularly
inflected forms.
Recent studies have used MEG to explore how morphology
shapes the recognition and interpretation of compounds and
morphologically derived words (Fiorentino and Poeppel 2007;
Pylkkänen et al. 2004). These studies provide further support for
the view that the detection and exploitation of morphological
structure play a major part in the early and subsequent stages of
lexical recognition.
Neural methods provide a distinctive and potentially compelling source of data for the study of morphology. Still, the set of
compelling neural studies of morphology remains quite small,
relative to other research methods, and the coherence among
these studies remains low. This is due in part to the limited availability of costly neuroimaging equipment and to the limited number of scholars who have expertise in both the technical issues of
morphology and the technical methods of cognitive neuroscience. However, as the equipment proliferates and the methods
gain a stronger foothold in the field, we can expect that more rigorous methodological conventions will develop, and that studies in this emerging area will have an even greater impact on our
understanding of language and morphology.
Ehren Reilly and William Badecker
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Allen, M., and W. Badecker. 2000. Morphology: The internal structure of words. In What Deficits Reveal about the Human Mind/
Brain: Handbook of Cognitive Neuropsychology, ed. B. Rapp, 211–32.
London: Psychology Press.
Allen, M., W. Badecker, and L. Osterhout. 2003. Morphological analysis in sentence processing: An ERP study. Language and Cognitive
Processes 18: 405–30.
Baayen, R. H., T. Dijkstra, and R. Schreuder. 1997. Singulars and plurals
in Dutch: Evidence for a parallel dual-route model. Journal of Memory
and Language 37: 94–117.
Badecker, W. 1997. Levels of morphological deficit: Indications from
inflectional regularity. Brain and Language 60: 360–80.
Badecker, W., and A. Caramazza. 1991. Morphological composition in
the lexical output system. Cognitive Neuropsychology 8: 335–67.
Beretta, A., C. Campbell, T. H. Carr, J. Huang, L. M. Schmitt,
K. Christianson, and Y. Cao. 2003. An ER-fMRI investigation of morphological inflection in German reveals that the brain makes a distinction
between regular and irregular forms. Brain and Language 85: 67–92.
Davis, M., F. Meunier, and W. Marslen-Wilson. 2004. Neural responses to
morphological, syntactic, and semantic properties of single words: An
fMRI study. Brain and Language 89: 439–49.
Embick, D., M. Hackl, J. Schaeffer, M. Kelepir, and A. Marantz. 2001. A
magnetoencephalographic component whose latency reflects lexical
frequency. Cognitive Brain Research 10: 345–48.
Fiorentino, R., and D. Poeppel. 2007. Compound words and structure in
the lexicon. Language and Cognitive Processes 22.7: 953–1000.
Günther, T., F. Longoni, O. Sabri, L. Sturz, K. Setani, and W. Huber.
2001. PET study of basic syntax and verb morphology. NeuroImage
13: 538.
Kutas, M., and S. A. Hillyard. 1980. Reading senseless sentences: Brain
potentials reflect semantic incongruity. Science 207: 203–5.
Laiacona, M., and A. Caramazza. 2004. The noun/verb dissociation in
language production: Varieties of causes. Cognitive Neuropsychology
21: 103–23.
Lück, M., A. Hahne, and H. Clahsen. 2006. Brain potentials to morphologically complex words during listening. Brain Research 1077: 144–52.
McKinnon, R., M. Allen, and L. Osterhout. 2003. Morphological decomposition involving non-productive morphemes: ERP evidence.
NeuroReport 14: 883–86.
Miceli, G., and A. Caramazza. 1988. Dissociation of inflectional and derivational morphology. Brain and Language 35: 24–65.
Miozzo, M. 2003. On the processing of regular and irregular forms of verbs
and nouns: Evidence from neuropsychology. Cognition 87: 101–27.
Morris, J., and P. Holcomb. 2005. Event-related potentials to violations
of inflectional verb morphology in English. Cognitive Brain Research
25: 963–81.
Münte, T., T. Say, H. Clahsen, K. Schiltz, and M. Kutas. 1999. Decomposition of morphologically complex words in English: Evidence from event-related brain potentials. Cognitive Brain Research 7: 241–53.
Osterhout, L., and P. J. Holcomb. 1992. Event-related brain potentials
elicited by syntactic anomaly. Journal of Memory and Language
31: 785–806.
Osterhout, L., and J. Nicol. 1999. On the distinctiveness, independence,
and time course of brain responses to syntactic and semantic anomalies. Language and Cognitive Processes 14: 283–317.
Pinker, S., and M. Ullman. 2002. The past-tense debate: The past and
future of the past tense. Trends in Cognitive Sciences 6: 456–63.

Pylkkänen, L., S. Feintuch, E. Hopkins, and A. Marantz. 2004. Neural correlates of the effects of morphological family frequency and family size: An MEG study. Cognition 91: B35–45.
Pylkkänen, L., and A. Marantz. 2003. Tracking the time course of word recognition with MEG. Trends in Cognitive Sciences 7: 187–89.
Pylkkänen, L., A. Stringfellow, and A. Marantz. 2002. Neuromagnetic
evidence for the timing of lexical activation: An MEG component sensitive to phonotactic probability but not to neighborhood density.
Brain and Language 81: 666–78.
Rugg, M. D. 1985. The effects of semantic priming and word repetition
on event-related potentials. Psychophysiology 22: 642–47.
Rumelhart, D. E., and J. L. McClelland. 1986. On learning the past tenses
of English verbs. In Parallel Distributed Processing: Explorations in the
Microstructure of Cognition, ed. D. E. Rumelhart, J. L. McClelland, and
the PDP Research Group, 216–71. Cambridge, MA: MIT Press.
Seidenberg, M. S., and L. M. Gonnerman. 2000. Explaining derivational
morphology as the convergence of codes. Trends in Cognitive Sciences
4: 353–61.
Shapiro, K., and A. Caramazza. 2003. Grammatical processing of nouns
and verbs in the left frontal cortex? Neuropsychologia 41: 1189–98.
Stockall, L., and A. Marantz. 2006. A single route, full decomposition
model of morphological complexity. Mental Lexicon 1: 85–123.

MORPHOLOGY, UNIVERSALS OF
The topic of universals has not been as prominent in the area of
morphology as it has been in some other areas of linguistics.
Many linguists share the impression that morphology is predominantly a domain of the language particular, rather than the
general and universal. Whereas all languages compose words to
make sentences in one way or another (syntax), it is not certain that all languages compose morphemes to make words.
The Chinese languages, for example, have almost no morphology apart from the possibility of compounding two roots to make
a word.
While it is probably too strong to say that some languages have
no morphological system at all, at least it seems clear that there
are no universal morphological categories notions that are
expressed by morphological means in all languages. For example, English requires that past tense be expressed as a suffix on
verbs (stun vs. stunned) and that plural number be expressed as
a suffix on nouns (box vs. boxes), but there are many languages
in which these notions are not expressed morphologically (e.g.,
Yoruba); they are either expressed by syntactic constructions or
not expressed at all. Moreover, morphology is a notorious repository of many historical relics, irregularities, exceptions, and idiosyncrasies. For example, the plural of box in English is boxes, but
the plural of ox is oxen; similarly, the past tense of stun is stunned
but the past tense of run is ran. Irregularities of this sort are tolerated in morphology in a way that they may not be (or not as
much) in other linguistic domains. What universals we can hope
to find in morphology, then, are statistical and implicational
universals, rather than absolute universals (see absolute and
statistical universals).
Despite the inherent noisiness of morphology, some universals of these sorts are discernible. Perhaps the best known
and best understood are universals of markedness. These universals have the form of statements saying that no language will
have an affix that expresses a marked category Y unless it also
has an affix that expresses a less marked category X within the same semantic domain. For example, many languages (including English) have a plural form for nouns but no dual form that
means exactly two, whereas there are few or no languages that
have morphological marking for the category dual without also
marking the category plural. Similarly, verbs in a given language
do not express the distinction between inclusive first person
plural (we including you) and exclusive first person plural (we
not including you) without also expressing the more basic distinction between first person and second person (we versus
you). Along the same lines, languages do not have special affixes
for remote past and remote future without also having affixes
for simple past and simple future, and if a language makes any
aspect distinctions in its verbal morphology, it will make a distinction between imperfective aspect and perfective aspect.
A plausible reason why universals of this sort hold has to do
with the logic of features. It is assumed that more marked, semantically complex categories are built up out of simpler categories.
For example, the category dual shares a semantic feature [Group]
with the category plural but adds a feature such as [Minimal],
which it shares with singular (Harley and Ritter 2002). It stands
to reason, then, that a language will not have morphemes that
realize a more complex feature bundle like [Group, Minimal]
(dual) without also having morphemes that realize the simpler
feature bundles [Group] (plural) and [Minimal] (singular) that
this bundle properly contains. It seems likely that this vision can
be extended to the full range of nominal and verbal inflectional
categories, although many details remain to be worked out.
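The containment reasoning behind these markedness universals can be sketched as a simple check over feature bundles. The sketch below is a minimal illustration with invented language inventories; the feature names follow the [Group]/[Minimal] example attributed to Harley and Ritter:

```python
# Sketch of the markedness implication: no language realizes a complex
# feature bundle without also realizing the simpler bundles it properly
# contains. Feature names follow the [Group]/[Minimal] illustration;
# the inventories below are invented.
SINGULAR = frozenset({"Minimal"})
PLURAL = frozenset({"Group"})
DUAL = frozenset({"Group", "Minimal"})   # properly contains both of the above

def respects_markedness(inventory):
    """Every single-feature sub-bundle of a realized bundle must also
    be realized somewhere in the inventory."""
    return all(
        sub in inventory
        for bundle in inventory
        for sub in (frozenset({f}) for f in bundle)
        if sub != bundle
    )

assert respects_markedness({SINGULAR, PLURAL, DUAL})  # dual implies plural
assert not respects_markedness({SINGULAR, DUAL})      # dual without plural
```

The check encodes only the implicational direction stated in the text: realizing dual without plural violates the universal, while realizing plural without dual does not.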
One of the most general universals of morphology is that the
order of morphemes in a complex word is almost always rigidly
fixed. Almost all languages allow the words of a sentence to be
rearranged to some degree for stylistic or pragmatic effect.
This is particularly true of a language such as Mohawk, in which
any word order is usually possible. In contrast, no language
allows the morphemes in a complex word to be freely rearranged
in this way. For example, the following Mohawk word consists
of 11 distinct morphemes, but any other ordering of these morphemes is ungrammatical:
(1) Wa-sha-ko-t-yat-awi-tsher-ahetkv-t-v
    FACT-he-her-self-body-put.on-NOML-be.ugly-make-for-PUNC
    'He made the thing you put on the torso [i.e., a shirt or dress] ugly for her.'

There are occasional examples that might seem like exceptions to this rule, when two affixes can come in different orders to express
different semantic scopes (as will be seen), or when a morpheme
is displaced from its expected position in order to respect conditions of phonological well-formedness. But even such deviations, which are strongly motivated by semantic or phonological
concerns, stand out as being rather unusual. A striking universal of morphology, then, is that morpheme order is fixed for a
language, and is not permitted to vary for pragmatic or stylistic
reasons. structuralist linguists and descriptive linguists
commonly capitalize on this property of morphology by using
the device of a template to describe the morphological structure
of words in a given language: A set number of morphological slots
(position classes) are identified for each word class, and every

528

affix of the language is indexed as being able to appear (ideally)


in one and only one of these slots.
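A template of this kind is, in effect, a small data structure, and the well-formedness check it supports can be sketched as follows. This Python fragment is illustrative only: the slot labels and the toy affix lexicon are invented for the example (only the root hninu 'buy' echoes the Mohawk data in this article), and no analysis of any actual language is intended.

```python
# Illustrative morphological template (position classes): each affix is
# indexed to exactly one slot, and a word is well formed only if its
# morphemes occupy strictly ascending slots, at most one morpheme per slot.
# Slot labels and affixes are invented for the sketch.

TEMPLATE = ["tense", "agreement", "ROOT", "causative", "aspect"]
SLOT_OF = {
    "wa-": "tense",
    "ke-": "agreement",
    "hninu": "ROOT",
    "-ht": "causative",
    "-s": "aspect",
}

def fits_template(morphemes):
    slots = [TEMPLATE.index(SLOT_OF[m]) for m in morphemes]
    return slots == sorted(slots) and len(slots) == len(set(slots))

print(fits_template(["wa-", "ke-", "hninu", "-s"]))  # True: slots 0, 1, 2, 4
print(fits_template(["ke-", "wa-", "hninu", "-s"]))  # False: slots out of order
```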
Another putative universal of morphology is that inflectional
morphology can only appear outside of (further from the root
than) derivational morphology. Roughly speaking, derivational
morphology creates new words by adding affixes to existing
stems, whereas inflectional morphology creates the forms of a
word that a specific syntactic context might require. Examples
of derivational affixes in English include -ize (crystal → crystallize), -ship (friend → friendship), -less (care → careless), -able (lift → liftable), and -ing (clip → clipping). Examples of inflectional affixes in English include the plural affix that attaches to nouns (crystal → crystals, friend → friends) and the past tense affix that attaches to verbs (lift → lifted, clip → clipped). Now, there is no problem adding inflectional morphology to the output of derivational morphology: Words like fossil-ize-d, friend-ship-s, and clip-ping-s are perfectly possible. But the reverse order is not allowed: The process of having a solution turn into more than one crystal is not to *crystal-s-ize, and the state of having many friends is not *friend-s-ship, nor is a *clip-ped-ing something that was clipped out in the past. Also bad are words like *lift-ed-able and *care-s-ful (having many cares). A similar constraint says that inflectional morphology cannot be found inside a compound word: One can have doghouse but not *dogshouse (a large house intended for more than one dog); one can have a pickpocket but not a *pick-ed-pocket (a thief who has already done his dirty work). One can calculate what these words should mean, and in some cases one can
imagine uses for the word; nevertheless, the examples are at best
highly marked and unlikely to be used. Similar restrictions can
be observed in many other languages, although a limited range
of counterexamples has occasionally been pointed out. There are
also some unresolved questions about how exactly to define the
difference between derivational and inflectional morphology,
which need to be clarified to make this generalization meaningful and applicable in all cases (see, for example, Anderson 1982).
Nevertheless, there is little doubt that an important universal
characteristic of morphological systems lurks here.
Some finer-grained universal restrictions have been discovered. One idea, supported in many studies, is that the order of
morphemes in a complex word reflects the scope of those morphemes, that is, the order in which they were composed for syntax and
semantics (Baker 1985, 1988; Bybee 1985; Cinque 1999; Rice
2000). An example from derivational morphology is the following pair from Quechua (Baker 1988):
(2) a. Mikhu-naya-chi-wa-n.
       eat-want-make-1sO-3sS
       'It makes me feel like eating.'
    b. Mikhu-chi-naya-wa-n.
       eat-make-want-1sO-3sS
       'I feel like making someone eat.'

The suffixes chi 'to make' and naya 'to want' can attach to a verb stem in either order, but there is a systematic difference in meaning. If 'make' attaches before 'want', the combination means 'to want to make someone eat', whereas if 'want' attaches before 'make', the combination means 'to make someone want to eat'. The order in which the affixes attach in Quechua matches
the order in which the words are combined syntactically in the corresponding English paraphrases, and this in turn reflects how the meanings are composed semantically in both languages. Not
all languages that have similar affixes allow both of the orders
shown in (2), but it is generally true that the orders that are used
correspond systematically to the order of interpretation for purposes of syntax and semantics.
Compositional ordering effects of this kind are rather widespread and apply to different types of morphology. One can see
something similar in English in the domain of compounding. For
example, ethics committee proposal refers to a proposal by or for
the ethics committee, whereas ethics proposal committee refers
to a committee in charge of formulating an ethics proposal. As in
Quechua, the different morpheme orders reflect different orders
of semantic composition. The only difference between the two
cases is that chi and naya are bound affixes that must attach to
verb roots, whereas proposal and committee are roots that can be
used as words in their own right in English. The observation that
morpheme orders must directly reflect the order of syntactic/
semantic composition is sometimes called the mirror principle
(Baker 1985); Bybee (1985) refers to a similar idea as the principle
of relevance.
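The scope claim behind the mirror principle can be rendered as function composition. In the sketch below (an illustration, not a formal semantics), each affix is modeled as a function over paraphrases, applied in its order of attachment; the two Quechua orders in (2) then compose to the two attested meanings.

```python
# Illustrative rendering of the mirror principle: affixes apply to the
# stem's meaning in their order of attachment, so different morpheme orders
# yield different scopes. The paraphrase strings are informal stand-ins
# for semantic composition.

def make(p):   # causative -chi
    return f"make someone [{p}]"

def want(p):   # desiderative -naya
    return f"want to [{p}]"

def attach(stem, affixes):
    meaning = stem
    for affix in affixes:  # innermost affix applies first
        meaning = affix(meaning)
    return meaning

# mikhu-naya-chi: -naya attaches first, then -chi (cf. (2a))
print(attach("eat", [want, make]))  # make someone [want to [eat]]
# mikhu-chi-naya: -chi attaches first, then -naya (cf. (2b))
print(attach("eat", [make, want]))  # want to [make someone [eat]]
```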
Much the same constraint seems to hold of inflectional morphology as well, except that in this domain, there are very few
cases in which the order of morphemes can be reversed to give a
semantic effect. For example, Joseph Greenberg (1963) showed
that when both number marking and case marking are attached
to a noun root, the number marking almost always attaches
before the case marker does:
(3) a. adam-lar-a (man-PL-DAT) 'to the men' (Turkish)
    b. *adam-a-lar

The reason is presumably that the plural operator is defined semantically over the meaning of the noun itself, whereas the
function of the case marker is to relate the noun phrase as a
whole to the rest of the sentence in which it appears. Thus, in
(3a) the order of the affixes reflects the natural order of the
semantic composition, just as in the examples in (2). In contrast,
(3b) is bad because the relevant semantic operators do not
combine that way. The morphological universal in (3) can thus
be related to the fact that in languages like English, the plural
marker must attach directly to the noun, not to the prepositional
phrase that contains the noun (to the boy+s, not *to+s the boy).
Joan Bybee (1985) applies the same kind of reasoning to verbs
(see also Bybee, Perkins, and Pagliuca 1994). When verbs bear
multiple inflectional affixes, they almost always come in a fixed
order: An aspect marker attaches first, a tense marker attaches
outside an aspect marker, mood markers attach outside both
tense and aspect, and subject agreement markers attach last of
all (though the position of agreement is a bit more variable than
the others). Example (4) thus shows a typical morpheme order;
other orders are rare or nonexistent.
(4) aku-wye-a-y-mi. (Mapudungun)
    arrive-PERF(aspect)-FUT(tense)-IND(mood)-2sS(agreement)
    'You will have arrived.'

Notice that the English auxiliary verbs appear in essentially the same relative order, suggesting that this, too, can be attributed to domain-general facts about semantic composition. Guglielmo Cinque (1999) presents a more fine-grained approach of this kind, in which some 30 distinct inflectional categories are identified, each of which is shown to attach to a verb in a set order relative to all the others. (It is possible that the ban on inflectional morphology coming inside of derivational morphology is a special case of this mirror principle/relevance principle, though that is not obvious in all cases.)
In some of the more recent literature, however, there have
been hints that the fixedness of morpheme order might be even
more restricted than one would expect, given considerations of
semantic compositionality alone. For example, Larry Hyman
(2003) discusses examples like (5) from Chichewa:
(5) Alenje a-ku-takas-its-il-a mkazi mthiko. (*a-ku-takas-il-its-a)
    hunters 3.PL-PROG-stir-make-APPL-FV woman spoon
    'The hunters are making the woman stir with a spoon.'

At issue here are the suffixes -its 'causative' and -il 'applicative', which (in this case) adds the meaning of doing an action with
a particular instrument. From the semantic point of view, one
would think that these affixes should be able to attach to the verb
in either order, giving different compositional meanings. One
could start with the base verb 'stir', add the applicative affix -il to get 'stir with a spoon', and then add the causative to get 'make someone [stir with a spoon]'. Alternatively, one could add the causative affix to the base verb first to get a causative action 'make someone stir' and then add the applicative affix to create '[make someone stir] with a spoon'. In the first case, it would be the woman who is using the spoon to stir; in the second case, it would be the hunters who are using the spoon to impose their will on the woman. Yet only the second affix order is possible, and this form is ambiguous (or perhaps vague) concerning the two imaginable meanings.
Hyman himself points to a historical explanation of the
absence of a second form in (5). He shows that the same restricted
ordering holds true for a wide range of Bantu languages spoken
in sub-Saharan Africa and that it is a special case of a more far-reaching template, which stipulates not only that applicative
must follow causative but also that the reciprocal suffix can only
follow both of these, and the passive suffix can only follow all of
the others. He claims that this particular template was inherited
from Proto-Bantu by most of the descendant languages. But this
sort of historical explanation might not be general enough. First,
it begs the question of why the relevant affixes had to attach in
this particular order in the ancestor language. Second, it turns
out that most non-Bantu languages also allow the causative affix to attach before the applicative but not vice versa. Example
(6) shows that this is true for classical Nahuatl (spoken in Mexico
[Launey 1981, 197]); it also holds for Mohawk (northeastern
United States), Hiaki (southwestern United States), Shipibo
(Peru), Mapudungun (Chile), and many others.
(6) Ti-ne:ch-in-tlacua-l-ti:-li-a in no-pil-hua:ntoto:n.
    2sS-1sO-PL-eat-CAUS-APPL the my-children
    'You made my children eat for me.' (*Ti-ne:ch-in-tlacua-li-tia)

Something more general seems to be at work here.


In a similar vein, Gabriella Caballero and her colleagues (2006) have recently argued that the morpheme order noun-verb is universally preferred to the order verb-noun whenever a noun and a verb combine to form a single verb. This holds true
regardless of the specific mode of combination, whether it is
the result of syntactic noun incorporation, productive morphological compounding, or idiosyncratic lexical combination. The
order of morphemes in the Mohawk example in (7) is thus typical
in this respect:
(7) Wa-ke-nakt-a-hninu-
    FACT-I-bed-Ø-buy-PUNC
    'I bought a bed.' (not: hninu-nakt 'buy-bed')

This universal order is occasionally overridden by syntactic ordering principles in particular languages (like Mapudungun). But noun-verb order is always preferred by the morphology, and it emerges more strongly in the more lexicalized, purely morphological constructions, where contamination from syntactic factors is minimal.
Taken together, studies like these hint that there might be a
universal morphological template, roughly of the form noun-verb-causative-applicative-passive. This template appears to be
a force (though not an irresistible one) that is at work influencing
the morpheme orders of all languages. Moreover, this order does
not seem to reduce to historical factors or semantic composition.
Why this should be is unknown at this point. However, it seems
likely that more morphological universals of this sort will be discovered in the future as linguists recover from their impression
that morphology is primarily the domain of the idiosyncratic and
the language particular.
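The putative template can also be stated as a partial order on morpheme categories and checked mechanically. The sketch below is illustrative: the category labels come from the article, but the numeric ranking and the consistency test are this sketch's own simplification.

```python
# Illustrative check of the putative universal template
# noun-verb-causative-applicative-passive: a morpheme sequence is
# template-consistent if its categories appear in nondecreasing rank.

RANK = {"noun": 0, "verb": 1, "causative": 2, "applicative": 3, "passive": 4}

def template_consistent(categories):
    ranks = [RANK[c] for c in categories]
    return all(a <= b for a, b in zip(ranks, ranks[1:]))

# Chichewa-style stir-CAUS-APPL is consistent; *stir-APPL-CAUS is not.
print(template_consistent(["verb", "causative", "applicative"]))  # True
print(template_consistent(["verb", "applicative", "causative"]))  # False
# Noun incorporation: bed-buy (noun-verb) is preferred over buy-bed.
print(template_consistent(["noun", "verb"]))  # True
```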
Mark C. Baker
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Anderson, Stephen. 1982. "Where's morphology?" Linguistic Inquiry 13: 571–612.
Baker, Mark. 1985. "The mirror principle and morphosyntactic explanation." Linguistic Inquiry 16: 373–415.
Baker, Mark. 1988. Incorporation: A Theory of Grammatical Function Changing. Chicago: University of Chicago Press.
Bybee, Joan. 1985. Morphology: A Study of the Relation Between Meaning and Form. Amsterdam: John Benjamins.
Bybee, Joan, R. Perkins, and W. Pagliuca. 1994. The Evolution of Grammar: Tense, Aspect, and Modality in the Languages of the World. Chicago: University of Chicago Press.
Caballero, Gabriella, Michael Houser, Nicole Marcus, Teresa McFarland, Anne Pycha, Maziar Toosarvandani, Suzanne Wilhite, and Johanna Nichols. 2006. "Nonsyntactic ordering effects in syntactic noun incorporation." Manuscript, University of California at Berkeley.
Cinque, Guglielmo. 1999. Adverbs and Functional Heads: A Cross-Linguistic Perspective. New York: Oxford University Press.
Greenberg, Joseph. 1963. Universals of Language. Cambridge, MA: MIT Press.
Harley, Heidi, and Elizabeth Ritter. 2002. "A feature-geometric analysis of person and number." Language 78: 482–526.
Hyman, Larry. 2003. "Suffix ordering in Bantu: A morphocentric approach." In Yearbook of Morphology 2002, ed. Geert Booij and Jaap van Marle, 245–82. Dordrecht, the Netherlands: Kluwer Academic Publishers.
Launey, Michel. 1981. Introduction à la Langue et à la Littérature Aztèques. Vol. 1. Paris: L'Harmattan.
Rice, Keren. 2000. Morpheme Order and Semantic Scope. Cambridge: Cambridge University Press.


MOTIF
Motif is a unit of measurement and content analysis. It is applied
to expressive culture, especially literatures and certain branches
of the arts, such as painting, sculpture, and music. The systematic application of the term was established in the early 1900s
through the work of the Finnish School and its attempts to use an international folktale's history (time) and dispersal in social and physical space (geography) to reconstruct the tale's original form (urform, archetype), place of birth, and other related matters of diffusion. This approach is also known as the historic-geographic method for its reliance on objective verifiable criteria, rather than speculative hypotheses. For the Finnish School, two key concepts became indispensable research instruments: tale-type, a problematic term signifying a full folktale known cross-culturally, and motif, a smaller unit designating a detail contributing to the formation of the plot. Motif complex/cluster/sequence and episode are other related measurement units. Though these terms arose from the works of the Finnish School, are perceived by many as useful only in comparative studies, and are shackled by problems of name interpretation and linkage to the currently unfashionable quest for origins, their usefulness as tools of data identification and objective analysis transcends these limitations (El-Shamy 1997, 235).
The most salient attributes of a motif are its endurance (continuity in time) and recurrence within a community (continuity
in space). Continuity in time and space are basic requirements
for traditionality. In 1925, Arthur Christensen argued that a motif persists in tradition according to "a psychological law which is not easily explicable." This empirical characteristic was left unexplained (Bødker 1965, 201–3). From the perspective of the bearer of lore and the author or composer of elite art, certain traditional themes possess cognitive salience (impressiveness) that makes them stand out and grab a person's attention. This salience (logical or affective) may be due to frequency of occurrence (repetition), meaningfulness, structure, uniqueness, ego involvement, and so on, properties that make such themes easily perceived, learned, retained, and recalled (El-Shamy 1997).
The concept of motif is a close parallel to that of theme. The
two terms are often used interchangeably. However, theme has
dominated in the study of elite literature, whereas motif has
been more common in the study of folklore. Research in literary
themes has commonly been pursued because of its interpretive potentialities and its intrinsic congruency with the history
of ideas (Jost 1988, xv). Additionally, literary authorities have
considered it an efficient counteragent against primarily aesthetic movements, such as progressive Universalpoesie. It is
also argued that the motif is intellectual by nature. It expresses
a process of reasoning about men's conduct of life and, as a consequence, does not concern itself with the analysis of individual
characters or extraordinary happenings (Jost 1988, xvii).
The viewpoint outlined here addressing elite literature stands
at some variance with the folkloristic usage of the term motif and
the perceived scope of its applicability. Introducing his Motif-Index, Stith Thompson explained that his system is built around
the interest of students of the traditional narrative and would
address a certain type of character, action, as well as attendant circumstances of the action (1955, I.11). (See, respectively, for examples: W10, Kindness; T72.2.1, Prince marries scornful girl and punishes her; R216, Escape from ship while captors quarrel).
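Thompson's alphanumeric codes form, in effect, a hierarchical index, and the retrieval logic they support can be sketched with a toy dictionary. The three entries below are the motifs cited above; the helper functions are illustrative inventions, not part of Thompson's apparatus.

```python
# Toy sketch of Motif-Index retrieval: motifs are keyed by codes whose
# initial letter names a cardinal division (chapter) and whose dotted
# numerals mark successive subdivisions.

MOTIFS = {
    "W10": "Kindness",
    "T72.2.1": "Prince marries scornful girl and punishes her",
    "R216": "Escape from ship while captors quarrel",
}

def division(code):
    """The cardinal division (chapter letter) of a motif code."""
    return code[0]

def falls_under(code, prefix):
    """True if `code` sits below `prefix` in the hierarchy, e.g.,
    T72.2.1 falls under T72.2, T72, and the division T."""
    if code == prefix or code.startswith(prefix + "."):
        return True
    return prefix.isalpha() and code.startswith(prefix)

print(division("T72.2.1"))              # T
print(falls_under("T72.2.1", "T72"))    # True
print(falls_under("T72.2.1", "T72.1"))  # False
```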
Influenced by anthropology, Thompson's system may be compared to the analytical-classificatory devices of culture element, culture complex, and culture institution, with culture element being the smallest identifiable component of culture. In
congruence with his companions in the Historic-Geographical
School, Thompson saw folk literature (especially narratives) as
analyzable in terms of motifs, episodes (or motif complexes/
sequences), and full narrative plots constituting tale-types. A
motif, though considerably more intricate, is comparable to
culture element; culture complex is comparable to episode; and
culture institution is comparable to tale-type. For Thompson,
motifs are those details out of which full-fledged narratives are
composed (1955, I.10).
Explaining the rationale for his motif system and its main objective, Thompson declares that it emulates what "the scientists have done with the worldwide phenomena of biology" (1955, I.10–11). In this respect, the underlying principle for motif identification and indexing is comparable to that devised by anthropologists at Yale for categorizing culture materials in terms of 78 macrounits (10–88) and 629 subdivisions thereof used to establish The Human Relations Area Files (HRAF); these files, begun almost contemporaneously with the first publication in the 1930s of Thompson's Motif-Index, may be viewed as an unprinted index. Twenty-three divisions make up the spectrum of sociocultural materials covered in Thompson's Motif-Index, each treated in an independent chapter (e.g., B. ANIMALS; C. TABU; F. MARVELS; X. HUMOR). These cardinal themes are divided into 1,730 subdivisions (El-Shamy 1995).
Because Thompson's Motif-Index seeks global coverage, numerous geographic regions and national entities did not receive adequate attention. Consequently, significant fields of human experience are missing or sketchily presented. Major expansions are offered in ensuing works, for example (note: the sign indicates addition to Thompson's system): F70, Ascent to other planets (worlds) by space ship (flying saucer); J70, Teaching (training) by cruel example; P610, Homosociality []; P770, Markets: buying, selling, trading; X580, Humor concerning misers and miserliness (El-Shamy 1995; cf. Birkhan, Lichtblau, and Tuczay 2005).
An offshoot of the motif system is the concept of motifeme,
a hybrid of folklore and linguistics modeled after morphological
analyses of folktales. In this system, an abstract unit of action or
state is labeled motifeme, and its manifestations are motifs. The
variety in which a given motifeme is manifested is termed allomotif (Dundes 1964).
Hasan El-Shamy
WORKS CITED AND SUGGESTED FURTHER READING
Birkhan, Helmut, Karin Lichtblau, and Christa Tuczay. 2005. Motif-Index of German Secular Narratives from the Beginning to 1400. 6 vols. Berlin and New York: W. de Gruyter.
Bødker, Laurits. 1965. Folk Literature (Germanic). Vol. 2 of International Dictionary of Regional European Ethnology and Folklore. Copenhagen: Rosenkilde and Bagger.
Dundes, Alan. 1964. The Morphology of North American Indian Folktales. Folklore Fellows Communications, no. 195. Helsinki: Academia Scientiarum Fennica.
El-Shamy, Hasan. 1995. Folk Traditions of the Arab World: A Guide to Motif Classification. 2 vols. Bloomington: Indiana University Press.
El-Shamy, Hasan. 1997. "Psychologically-based Criteria for Classification by Motif and Tale-Type." Journal of Folklore Research 34.3: 233–43.
Garry, Jane, and Hasan El-Shamy, eds. 2005. Archetypes and Motifs in Folklore and Literature: A Handbook. Armonk, NY: M. E. Sharpe.
Jost, François. 1988. "Introduction." In Dictionary of Literary Themes and Motifs, ed. Jean-Charles Seigneuret, xv–xxiii. Westport, CT: Greenwood.
Murdock, G. P., Clelland S. Ford, and Alfred E. Hudson. 1938. Outline of Cultural Materials. New Haven, CT: Institute of Human Relations, Yale University.
Sperber, Hans, and Leo Spitzer. 1918. Motiv und Wort: Studien zur Literatur- und Sprachpsychologie. Leipzig: O. R. Reisland.
Thompson, Stith. 1955–8. Motif-Index of Folk Literature. 6 vols. Revised ed. Bloomington: Indiana University Press.

MOVEMENT
Movement is an operation posited in theoretical syntax, in which
words, phrases, and perhaps also morphemes are relocated from
one part of a sentence to another. Movement is generally invoked
in cases in which a phrase has a combination of properties associated with distinct positions in the sentence. Consider (1):
(1) Whom did you see?

Approaches to syntax that posit movement typically claim that the word whom in (1) has moved from the end of the sentence, where the direct object would ordinarily be, to the beginning. This explains why, for instance, whom bears the accusative case; whom has properties associated with direct objects, because it has occupied the direct object's position.
One type of argument for movement tries to establish that
an apparently empty position must be filled, and that the logical
filler is a moved phrase. For example, a verb like fix must take a
direct object, which is typically in immediately postverbal position in English:
(2) a. You fixed the car.
    b. *You fixed.
    c. *You fixed yesterday the car.
However, a question like (3a) is well formed (unlike (3b)):

(3) a. What did you fix?
    b. *When did you fix?

The ill-formedness of (3b) shows that the required transitivity of fix is not suspended in questions; fix must have a direct object in
(3a). The contrast in (3) suggests that the direct object is what,
since what is present just in the well-formed (3a). Example (2c)
shows that the direct object in English must typically be immediately postverbal; we can preserve this generalization in (3a)
by positing movement of what from postverbal position to the
beginning of the sentence.
Other arguments for movement try to show that a moved
phrase has occupied positions in the sentence that it no longer
occupies. For instance, James McCloskey (2000) discusses a

West Ulster dialect of English in which the questions in (4a) and
(4b) are synonymous:
(4) a. What all did you buy?
    b. What did you buy all?
    c. *What did all you buy?

The position of all is not completely free in this dialect, as the ill-formed (4c) shows. In fact, all can only appear in positions that what has occupied. In (4b), all appears in the direct
object position; this is the position that what occupies before
moving to the beginning of the sentence. Theories that posit
movement can account for the distribution of all in this dialect; the phrase what all can leave the word all behind when
it moves.
Another argument for movement is based on the phenomenon of reconstruction, in which moved phrases are treated by
certain semantic dependencies as though they had not moved.
Space constraints prevent further discussion here (cf. Romero
1997; Fox 2000).
Syntacticians distinguish several subtypes of movement. The examples thus far have all involved wh-movement, which forms questions by moving certain phrases to the beginning of the clause. Another type, head-movement, derives (5b) from (5a), via movement of is:

(5) a. He is leaving.
    b. Is he ___ leaving?

A third type is sometimes called NP-movement; examples include movement of John in (6a) (compare the roughly synonymous (6b)), and movement of Mary from object to subject position in (7):

(6) a. John seems __ to be happy.
    b. It seems that John is happy.

(7) Mary was promoted __.

Other types of movement are more difficult to detect. For instance, much work argues that wh-movement in (8) is not a single move from the end of the sentence to the beginning, but stops in at least one intermediate landing site (successive-cyclic movement):

(8) What did he say that he wanted __?

One argument for this type of movement comes from West Ulster English. We saw that this dialect allows all to be stranded in positions formerly occupied by a wh-moved phrase. The all-stranding facts for this dialect in (9) show that wh-movement has an intermediate landing site:

(9) a. What did he say that he wanted all?
    b. What did he say all that he wanted?
    c. What all did he say that he wanted?

The examples in (9) differ with respect to how far all travels with what: not at all (9a), only to the intermediate site (9b), or to the beginning of the clause (9c).

Another controversial type of movement is involved in the Chinese and Japanese wh-questions in (10a) and (10b), both of which have the same meaning as the English (10c):

(10) a. John mai-le sheme?
        John buy-Perf what
     b. John-wa nani-o kaimasita ka?
        John-TOP what-ACC bought Q
     c. What did John buy?

In English, what is moved to the beginning of the clause, but in Japanese and Chinese, the corresponding words can appear in noninitial positions, just where they would be if they were not wh-phrases (wh-in-situ). On one approach to these data, Chinese sheme and Japanese nani, like their English translation what, undergo wh-movement to the beginning of the clause; unlike what, however, these words move in a way that does not affect where they are pronounced (covert movement).

Some arguments for covert movement center on the interaction of wh-in-situ with established constraints on movement (cf. Huang 1982; Richards 2009). For instance, wh-movement in many languages is unable to pass out of an embedded interrogative clause; we say that interrogative clauses are islands for wh-movement. Thus, (11a) is well-formed, but (11b), with wh-movement out of an interrogative clause, is not:

(11) a. What does Mary think that John bought __ ?
     b. *What does Mary wonder [who bought __ ]?

Japanese wh-in-situ exhibits a similar constraint; the sentences in (11) have the Japanese translations in (12), with a similar contrast in grammaticality:

(12) a. Mary-wa John-ga nani-o katta to omoimasu ka?
        Mary-TOP John-NOM what-ACC bought that thinks Q
     b. *Mary-wa [dare-ga nani-o katta ka] siritagatteimasu ka?
        Mary-TOP who-NOM what-ACC bought Q wonders Q

Norvin Richards

WORKS CITED AND SUGGESTIONS FOR FURTHER READING

Fox, Danny. 2000. Economy and Semantic Interpretation. Cambridge, MA: MIT Press.
Huang, James. 1982. "Logical relations in Chinese and the theory of grammar." Ph.D. diss., Massachusetts Institute of Technology.
McCloskey, James. 2000. "Quantifier float and wh-movement in an Irish English." Linguistic Inquiry 31: 57–84.
Richards, Norvin. 2009. "Wh-questions." In The Oxford Handbook of Japanese Linguistics, ed. Shigeru Miyagawa and Mamoru Saito. Oxford: Oxford University Press.
Romero, Maribel. 1997. "The correlation between scope reconstruction and connectivity effects." In Proceedings of WCCFL 16, 351–66. Stanford, CA: CSLI.

MUSIC, LANGUAGE AND


Connections between music and language have been a perennial concern of scholars, poets, music and literary theorists, and
musicians going back to antiquity. The basis of this interest lies
in certain commonalities that are intuitively understood to lie at
the heart of the two capacities, but which become complex and
problematic when one attempts to elucidate their precise nature.
Confusion on the question of the interaction of music and language is not surprising since the underlying basis of music and
language as independent objects has been poorly understood
until recently. Nor is it surprising, given traditional disciplinary divisions, that scholars have tended to focus on developing
descriptive and explanatory frameworks (see descriptive,
observational, and explanatory adequacy) for each,
which take for granted their status as independent rather
than common faculties. With the exception of Jean-Jacques
Rousseau's ([1763] 1997) Essai sur l'origine des langues, which
claims an ancestral proto-language from which language and
music both derive, most attempts at engaging the question have
stressed that music and language not only serve distinct ends,
one mainly aesthetic and the other mainly communicative, but
also access distinct underlying psychological means.
The umbrella of cognitive science has provided the context
for a renewed discussion of some of these points of comparison.
Perhaps most striking is the reemergence of arguments for an
evolutionary precursor in the form of what Stephen Brown (2000)
refers to as "musilanguage," whose essential characteristics are identified in Stephen Mithen (2005) by the acronym "hmmmm"
(holistic, manipulative, multimodal, musical, and mimetic).
These approaches are somewhat controversial not only in their
endorsement of what might be called a neo-Rousseauvian perspective but also in their assumption of a significant overlap in
some of the cognitive structures that underlie both music and
language. The best-known explorations of the common ground,
Leonard Bernstein's The Unanswered Question, Deryck Cooke's The Language of Music, and Joseph Swain's Musical Languages, have not suggested any specific shared mechanisms. Rather, they and others have applied certain aspects of the descriptive apparatus and general methodologies of linguistic theory to yield, as Brown notes, analogies between music and language, helpful and suggestive analogies, to be sure, but which do not constitute arguments for a shared cognitive basis.

Musicalist Representation of Linguistic Structure


The studies just alluded to are representative of recent scholarship in that they attempt to ground subjective judgments with
respect to musical structure on the hard foundation provided by
linguistic science. While this has been the prevailing direction
of influence, it has at times extended in the opposite direction.
Most notably, musical notation for several centuries constituted
the only effective means for visually representing the structure
of audible sound. Among the acoustical phenomena rendered
visible and thereby amenable to a structural analysis were speech
sounds of English carefully transcribed into a modified form of
musical notation by Joshua Steele in his 1775 Essay Towards
Establishing the Melody and Measure of Speech.
In a recent review of the work, Jamie Kassler (2005) credits
Steele as being among the first to identify linguistic suprasegmentals, the tier of linguistic structure computed independently of and mapped onto phonemic segments. tone, the
hierarchically related sequence of pitch locations assigned to
voiced segments, is one such suprasegmental and is relatively
naturally represented in musical notation. Steele also recognized that unlike musical pitch, which tends to be discrete, the
target pitches of speech are consistently connected by continuous glissandi or slides, represented in his scores by diagonal
line segments of various types attached to note stems, shown in
Figure 1.
The other linguistic suprasegmental identified by Steele, stress, emerges somewhat obliquely from his transcriptions. One of Steele's important insights was to have recognized that a particular type of musical accent, the metrical accent, is associated with linguistic stress. Thus, for example, the initial beat of the musical measure is metrically strong, and the most stressed syllables of a text assigned to a tune (generally the stressed syllables of polysyllabic words: Peter, going, mistake, and coming in Figure 1) are assigned to what he refers to as the ictus position. Finally, and most significantly, Steele recognized that metrical accent is not an objective feature of the musical event but is a psychological attribute inherited from its temporal location. A strong position will be perceived as such regardless of whether the event occupying the position is objectively accented in the form of higher pitch, amplitude, or length. Indeed, it may be heard as strong even when it is vacant, occupied by a rest. Metrical accent

Figure 1.

Music, Language and

Figure 2.

Figure 3. [Stress grid for "Ticonderoga" at levels 0-3.]
is therefore, in Steele's words, a "subjective mental sensation" deriving from "a sense of pulsation giv[ing] the mind an idea of emphasis independent of any actual increment of sound or even of any sound at all" (1775, 117). In recognizing the abstract character of meter, he anticipated twentieth-century cognitivist approaches that view linguistic stress, along with most other salient characteristics of language, as mental constructs, phonological rather than phonetic: psychologically real but only obliquely related to the acoustical or physiological surface form.

The Grid Representation


The "measure of speech" referred to in Steele's title, the patterned occurrence of strong and weak metrical positions, is represented in his transcriptions by a three-level hierarchy appearing below the staff in example 1: heavy, light, and lightest locations within each measure are assigned a triangle, three dots, and two dots, respectively. This would be the first, and for many years one of the few, attempts to make explicit the underlying form corresponding to the way that meter is mentally constructed by listeners. When this objective would be reinitiated in the 1970s, most notably within the generative theory of Ray Jackendoff and Fred Lerdahl (1983), the representation would take the form of the metrical grid shown in example 2, from Mozart's Symphony 40 (Figure 2). It will be noticed that this
example omits the conventional notational means for indicating the metrical hierarchy: barlines, the beaming of eighth notes, and the time signature. It can do so since these are indicated with greater precision by the grid, which identifies the relative prominence of particular locations by their inclusion at successive horizontal tiers, referred to as higher levels of the grid.
Relatively strong positions at the measure, half-note, and quarter-note levels are represented by columns appearing above these locations, while weak positions at the eighth-note level appear only at the lowest level of the grid.
That metrical structure is a fundamental component of
music or, to put it informally, that music frequently has a
beat is, of course, self-evident to most listeners. That normal
linguistic utterances are rhythmic in anything like the same
sense is less apparent and remains a subject of some controversy
within linguistics. For this reason, it might appear surprising
that as phonologists confronted a range of data provided by a cross section of the world's languages, it became apparent that the same representation, namely, the grid, would emerge as the optimal means for representing linguistic stress. Indeed, metrical stress theory, the dominant explanatory framework within the generativist paradigm, would be defined by the grid representation, one variant of which is shown in example 3, from Halle and Vergnaud (1987) (Figure 3).
A comparison of the grids in examples 2 and 3 reveals two
essential differences between linguistic and musical structure.
First, while the stress grid projects syllables onto higher metrical levels, the bottom level of the musical grid indicates not
actual musical events (i.e., notes) but, rather, temporal locations. As a consequence, empty metrical locations, such as those
in example 2, which are a necessary component of any reasonable description of musical meter, are excluded from the stress
grid. Secondly, as Jackendoff and Lerdahl (1983) show, musical
structure imposes strict requirements on the geometric form
that grids may assume, limited to what they refer to as a small
class of well-formed structures. In contrast, there are no a priori
constraints on the form taken by the stress grid. The successive
positions projected onto line 1 of example 3 would be ruled out
as a potential metrical structure in music, where strong positions
need to be separated by at least one position on level 0. This violation of musicalist well-formedness does not, however, prevent
example 3 from accurately characterizing the pattern of secondary and primary stress for the word in question.
The asymmetries in the two forms of representation are, it
would seem, necessary for a description of the output of each
system: As mentioned, the projection of musical meter requires
an underlying temporal periodicity that is neither intuitively
obvious nor empirically demonstrable in language except as a
statistical regularity (see Patel and Daniele 2003 for discussion).
In addition, the asymmetry reflects essential differences in the
character of the basic elements of musical versus linguistic structure. The assignment of stress is a formal computation effected
on syllables from the rich phonemic inventory of particular
languages. In contrast, the computation of musical meter, what is known in the music perception literature as "beat induction," can be effected on a highly impoverished musical input. As has been shown repeatedly, a listener will unproblematically assign a metrical structure even when the events to which it is assigned appear as a series of pitchless claps, clicks, or drumbeats. The sorts
of subtle variation in timbre and pitch characteristic of the phonemic repertoire may tip the balance between competing metrical interpretations when these appear in a musical context;
however, they are in themselves insufficient for the inference of
meter.

Rhythmic Structure in Language and Music


It is worth noting that the uncoupling of a musicalist interpretation from metrical grids in their application in most variants
of metrical stress theory is, in some respects, inconsistent with
the grid notation as it was proposed in work by Mark Liberman
([1975] 1979), undertaken concurrently with Jackendoff and
Lerdahl (1983). Here, the intention was explicitly musicalist,
namely, to relate the metrical structure of simple children's
songs and chants to the syntactic and phonological structure
of the words and phrases assigned to them. In contrast to most
approaches, cognitivist and traditional, which view musical and

Figure 4. [Stress grids at levels L(0)-L(2) for a) *Thirteen men and b) Thirteen men.]
linguistic computations as independent and self-contained, Liberman's objective was to establish an equivalence between the underlying representation of song and speech, a connection that was understood by Liberman to be "in some ways a very deep one" (1975, 81). This hypothesis, whatever its ultimate heuristic or conceptual value, has not been influential within the field of linguistics or in music theory.
Two partial exceptions should, however, be mentioned.
While word-level stress provides evidence for the disassociation
of metrical structure in language and music, higher levels of linguistic structure provide some evidence for a musicalist interpretation of linguistic performance. In particular, phrasal stress,
unlike word stress, is not phonologically austere and requires for its computation, in addition to morphological, syntactic, and pragmatic factors, the quasi-musical considerations of what is referred to as phrasal euphony. Most conspicuous among these is the stress clash, which results when the stressed syllables of two words appear adjacent to each other within the same phrase. The unacceptable form in a) triggers the application of the rhythm rule, which achieves euphony by retracting the first of the two clashing stresses leftward to produce the acceptable form in b). While not a validation of the musicalist view, the terminology adopted by linguists, as well as the mechanisms by which this particular phenomenon is explained, is suggestive of a shared basis underlying the computation of phrasal stress and the assignment of musical meter.
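The retraction step of the rhythm rule can be sketched computationally. The encoding below (one 0/1 stress value per syllable) and the retract-to-the-preceding-syllable policy are simplifying assumptions; the actual rule is sensitive to the morphological, syntactic, and pragmatic factors mentioned above.

```python
def apply_rhythm_rule(stresses):
    """Toy version of the rhythm rule: when two stressed syllables are
    adjacent (a stress clash), retract the first stress leftward to the
    preceding syllable if one is available."""
    out = list(stresses)
    for i in range(len(out) - 1):
        if out[i] == 1 and out[i + 1] == 1 and i > 0:
            out[i - 1], out[i] = 1, 0  # retract the first clashing stress
    return out

# "thirTEEN MEN" (clash) becomes "THIRteen MEN"
print(apply_rhythm_rule([0, 1, 1]))  # [1, 0, 1]
```

A clash-free input such as [1, 0, 1] passes through unchanged, mirroring the fact that the rule applies only when euphony is violated.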
A second point of contact, as noted by Jamie Kassler (2005),
is the musicalist interpretation of metrical structure incorporated into certain approaches to formal prosody and in its
transformational variant, generative metrics. The stated goal
of these approaches is to define the abstract structure of lines
of texts composed in a poetic meter. But even here, the adoption of a musicalist view as representing a significant aspect of
the relevant empirical domain remains controversial. Although
poets routinely invoke "the music of poetry" and "the rhythms of verse" and appeal to explicitly musical terms such as phrasing, staccato, harmony, and so on, it remains an open question whether poetic rhythm has any relationship to musical rhythm as musicians understand the term. The reaction to Derek
Attridges Rhythms of English Poetry (1982) sheds some light on
these questions. Attridge posits a scansion that assigns syllables
to alternating beat and offbeat positions (indicated by b and o,
respectively), some of which can remain vacant, most notably at
the end of lines (see verse line). The general approach and the
representation of empty positions, in particular, is criticized by
Marina Tarlinskaya for failing to recognize that "though music and language used to be intertwined, they parted ways long ago. Musical meter and meter of verse texts cannot be equated; musical theories of meter need no resurrection" (2002, 39).
Tarlinskaya probably represents a consensus position among
scholars of metrics in doubting that there is significant evidence
for the influence of musicalist rhythm on the structure of literary
verse. An offshoot of generative metrics has been able to avoid
this problem by taking as its primary empirical domain texts
that are unambiguously intended as functioning within a musical
context, namely, lyrics of familiar strophic song forms. The basis
of this work is the recognition that average listeners encountering unfamiliar texts for a familiar strophic song will sometimes
effect considerable modifications in the structure of the original
to accommodate the text. Thus, as noted in Halle and Lerdahl (1993), those minimally competent in the relevant linguistic and musical idiom will delete 3 of the original 10 notes of the song "The Drunken Sailor" when they encounter the 7-syllable text "Keel haul him 'til he's sober," while augmenting the original melody with 3 additional notes when confronted with the 13 syllables "Scrape the hair off his chest with a hoop iron razor." These
strikingly uniform intuitions constitute the core data of what
Bruce Hayes (in press) designates as the "textsetting problem," for which he proposes an optimality-theoretic solution. It remains to be seen whether this work will validate Liberman's initial insight of the deep connection between linguistic and
musical structure, or whether the relevant intuitions will apply
solely to the narrow artistic domain with which these analyses
are concerned.
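The note-counting arithmetic behind these textsetting examples is simple to state, and the sketch below records only that arithmetic; it is not Hayes's optimality-theoretic solution, which instead evaluates candidate alignments against ranked violable constraints.

```python
def textsetting_adjustment(n_notes, n_syllables):
    """Return (notes_deleted, notes_added) when fitting a text of
    n_syllables to a tune of n_notes, assuming one syllable per note."""
    if n_syllables < n_notes:
        return n_notes - n_syllables, 0
    return 0, n_syllables - n_notes

# The 7-syllable line against the 10-note tune: delete 3 notes.
print(textsetting_adjustment(10, 7))   # (3, 0)
# The 13-syllable line: add 3 notes.
print(textsetting_adjustment(10, 13))  # (0, 3)
```

What the counting omits, and what makes the problem nontrivial, is deciding which notes to delete or split, which is where listeners' strikingly uniform intuitions come into play.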

Conclusion: An Internalist Perspective on Language and Music
One possible explanation for the discrepancies in the forms
taken by linguistic and musical representations, even when they
are superficially similar, is that this discontinuity simply reflects
the fact of the matter. That is, no significant overlap in the empirical domains of music and language exists beyond the fact that
both are, in an important sense, products of our minds, which
make use of our psychological capacities for structuring the
external world. But it should be recognized that however close or distant the ultimate relationship, granting a significant psychological basis to musical structure is itself testimony to the influence of linguistics, namely, the recognition by modern linguists of the status of language as grounded not in the external reality of speech, its acoustic and physiological structure (E-language), but in the underlying psychological mechanisms that give rise to linguistic behavior (I-language).
In contrast, musical scholarship has remained largely a
structuralist enterprise, devoted primarily to describing
the external tokens of music, most commonly musical scores.
Approaches that take as primary the unconscious knowledge
that listeners (and composers in their capacity as listeners)
access in making sense of what they hear and compose are decidedly peripheral within the field. Consequently, confusions as to
what a theory of music is a theory of arise more or less routinely.
Insofar as traditional structuralist theories are seen as offering
the only empirically sound and intellectually satisfying accounts
of musical form, then linguistic and musical scholarship, aside
from occasional points of convergence, are likely to continue on their separate paths. If, on the other hand, the posing of interesting questions and a viable theoretical framework relies crucially
on viewing musical works as psychologically based natural
objects in the Chomskyan sense, then in this respect, what we
understand about language has a great deal to offer our understanding of music.
John Halle
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Attridge, Derek. 1982. The Rhythms of English Poetry. New York:
Longman.
Bernstein, Leonard. 1976. The Unanswered Question: Six Talks at
Harvard. Cambridge: Harvard University Press.
Brown, Steven. 2000. The musilanguage model of music evolution.
In The Origins of Music, ed. N. L. Wallin, B. Merker, and S. Brown,
271–300. Cambridge, MA: MIT Press.
Cooke, Deryck. 1959. The Language of Music. Oxford: Clarendon.
Halle, John, and Fred Lerdahl. 1993. A generative textsetting model.
Current Musicology 55: 3–26.
Halle, Morris, and Jean-Roger Vergnaud. 1987. An Essay on Stress.
Cambridge, MA: MIT Press.
Hayes, Bruce. Textsetting as constraint conflict. In Toward a
Typology of Poetic Forms, ed. Jean-Louis Aroui and Andy Arleo.
Amsterdam: Elsevier. In press.
Jackendoff, Ray, and Fred Lerdahl. 1983. A Generative Theory of Tonal
Music. Cambridge, MA: MIT Press.
Kassler, Jamie. 2005. Representing speech through musical notation.
Journal of Musicological Research 24: 227–39.
Liberman, Mark. [1975] 1979. The Intonational System of English.
New York and London: Garland.
Mithen, Stephen. 2005. The Singing Neanderthals: The Origins of Music,
Language, Mind and Body. London: Weidenfeld and Nicolson.
Patel, A. D., and J. R. Daniele. 2003. An empirical comparison of rhythm in language and music. Cognition 87: B35–B45.
Rousseau, Jean-Jacques. [1763] 1997. Essai sur l'origine des langues. Paris: H. Champion. Reproduction of the Neuchâtel manuscript.
Steele, Joshua. 1775. An Essay Towards Establishing the Melody and
Measure of Speech to Be Expressed and Perpetuated by Peculiar Symbols.
London: J. Almon.
Swain, Joseph. 1997. Musical Languages. New York: Norton.
Tarlinskaya, Marina. 2002. Verse text: Its meter and its oral rendition.
In Meter, Rhythm, and Performance, ed. C. Kueper, 39–55. Frankfurt
am Main: Peter Lang.

N
NARRATIVE, GRAMMAR AND
Analysts of stories have developed two broad strategies for
studying the relations between narrative and grammar. The first
emphasizes grammar of narrative, and the second grammar in
narrative.
Drawing on definitions of grammar as a model of the categories and processes underlying intuitive knowledge of (or competence in) a language, theorists working in fields such as
cognitive psychology, artificial intelligence research, and narratology have proposed story grammars, that is, grammars of narrative. (A key question is whether this work involves a principled extension or, rather, a more or less metaphorical extrapolation of concepts of grammar developed within the
language sciences.) Such higher-order grammars take the form
of rule systems designed to capture the basic units of narrative
and specify their distributional patterns in a more or less clearly
defined corpus of narrative texts (see corpus linguistics).
This tradition of research can be traced back to the early precedent set by Vladimir Propp ([1928] 1968), who analyzed a corpus
of 100 folktales into a finite number of structural constituents
that he termed functions (or character actions defined in terms
of their sequential position within an unfolding plot) and identified rules for their patterning in the corpus he studied.
In the 1970s and 1980s, spurred by the (re)discovery of Propp
by structuralist narratologists, as well as by the attempt to develop
automated systems for story understanding and story generation,
theorists tried to create story grammars with the widest possible
scope. Gerald Prince (1973) drew on Chomskyean transformational generative grammar in an effort to develop a grammar of
stories with greater descriptive adequacy than the one presented by Tzvetan Todorov in a 1969 book entitled Grammaire
du Décaméron. Another narratologist, Thomas G. Pavel (1985),
proposed a move grammar that drew not on Chomskyean theory but rather on Propp's foundational work, analyzing narratives into problems and moves performed by characters seeking
to bring about their solution. Meanwhile, in a contribution to the
cognitive-psychological strand of story grammar research, J. M.
Mandler (1984, 22) argued that stories have an underlying, or
base, structure that remains relatively invariant in spite of gross
differences in content from story to story. This structure consists
of a number of ordered constituents that include a setting and
an episode, which is in turn decomposable into a beginning that
causes a development that causes an ending.
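Mandler's base structure lends itself to expression as rewrite rules. The sketch below is an illustrative rendering of that idea; the rule set and symbol names are assumptions for the example, not Mandler's own formalism.

```python
# A toy story grammar in the spirit of Mandler's ordered constituents:
# a story is a setting plus an episode, and an episode decomposes into
# a beginning, a development, and an ending.
GRAMMAR = {
    "STORY": ["SETTING", "EPISODE"],
    "EPISODE": ["BEGINNING", "DEVELOPMENT", "ENDING"],
}

def expand(symbol):
    """Expand a category into its terminal constituents, left to right."""
    if symbol not in GRAMMAR:
        return [symbol]  # terminal constituent
    out = []
    for part in GRAMMAR[symbol]:
        out.extend(expand(part))
    return out

print(expand("STORY"))  # ['SETTING', 'BEGINNING', 'DEVELOPMENT', 'ENDING']
```

The causal relations Mandler posits between beginning, development, and ending are not captured by the rewrite rules themselves, which is one form of the objection, noted below, that story-grammar units are underspecified.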
Given that another entry in this encyclopedia covers story
grammars in greater detail, the remainder of the present discussion will focus on the second broad approach to studying
the relations between narrative and grammar. The aim of this
approach is to provide not a grammar of stories but, rather, a
principled account of how stories exploit grammatical resources
in narrative-pertinent (or even narrative-specific) ways. Here,
the emphasis shifts from a grammar of narrative to the functions
of grammar in narrative or, put another way, to narrative uses
of grammatical structures. This second approach is arguably
the predominant one in current linguistically oriented research
on narrative, in part because of critiques of the story grammar
enterprise by theorists such as P. N. Johnson-Laird (1983), who
contends that settings, reactions, moves, and other basic units
posited by story grammarians are not specified clearly enough
to be construed as elements of a grammar, strictly defined. They
are, rather, heuristic constructs based on a prior, unstated gloss
or interpretation of the narrative.
Gérard Genette's ([1972] 1980) tripartite model of narrative
structure, which encompasses 1) the story (= the basic sequence
of states, actions, and events recounted), 2) the text on the basis
of which interpreters reconstruct that story, and 3) the act of narration that produces the text, provides a framework for studying grammar and narrative. Specifically, analysts can focus on
the role of grammar in narrative viewed under profiles 2 and 3 (story grammars being a perhaps quixotic attempt to model profile 1 via grammatical paradigms). Studied as both textual
structure and narrational process, narrative can be analyzed as a
discourse genre that draws in distinctive ways on the same stock
of grammatical resources used differently in other forms of discourse, such as lists, scientific descriptions, lyric poems, and so
on. Although a range of grammatical resources, from lexical relations and phrase structure to discourse anaphora, gapping (see ellipsis), and topicalization, are all potentially relevant for the study of grammar in narrative, in this brief discussion I focus on just two elements of morphosyntax, verbs and deictic expressions (see deixis), and map out some of
their narrative functions at both the level of text and the level of
narration.

Functions of Verbs in Narrative


Verbs perform crucial functions in narratively organized discourse. At the level of narration, the selection of a particular
tense for the primary or matrix narrative can be used to indicate the relation between event time and narration time, as
when past tense verbs are used to signal retrospective narration. Further, shifts in verb tense can be used to mark especially
salient episodes, as when (in English-language narratives) storytellers engage in shifts between the simple past tense and the
conversational historical present as a strategy for underlining the
significance of the events being recounted (Wolfson 1982). The
selection of particular verbal moods can also be used to position a tellers account on the continuum stretching between the
realis and irrealis modalities. At issue is a scale that ranges
from expressions indicating a speakers full commitment to the
truth of a proposition about the narrated world to more or less
hedged expressions, which indicate varying degrees of noncommitment. In a foundational contribution to the study of stories
told in face-to-face communicative interaction, William Labov
(1972) argued that one of the identifying features of properly narrative clauses (i.e., clauses that cannot be reordered in discourse
without changing the original semantic interpretation of the
narrative that they convey) is their reliance on past tense indicative verbs. By contrast, subjunctive and other nonindicative verbal moods are used by storytellers to evaluate (signal the point
of) the narrated events. In a series of studies, however, David
Herman found that tellers of supernatural tales in face-to-face
interaction regularly use nonindicative verbal moods (not went,
but would go or used to go) to accomplish fuzzy or strategically
inexact reference to the spatiotemporal positions and behavior
of ghosts (cf. Herman 2002, 335).
Meanwhile, at the level of text, verbs play a key role in the process that can be characterized as storyworld (re)construction, that is, the use of textual cues to encode or interpret the special
class of mental models that support narrative understanding
(Herman 2002). For one thing, both conversational narratives
and more complex literary narratives rely on verbs like come
and go (and cognate forms) to map characters paths of motion
through space and time in a process sometimes correlated with
patterns of alliance or conflict, as well as with internal or psychological growth, as in the classical Bildungsroman, or novel of
development. Verbs and verb phrases also express Aktionsarten,
or aspectual values, including states (I was a hiker), activities
(I hiked), and accomplishments (I hiked up the mountain).

Different distributions of verb types in texts cue interpreters to


reconstruct storyworlds in which these aspectual values may play
a more or less prominent role. Contrast the emphasis on accomplishments in sports broadcasts and hard-boiled detective fictions
with the emphasis on mental states in the modernist novel of consciousness or narratives of the self told in therapeutic settings.
What is more, in ways that M. A. K. Halliday's (1994) functional grammar helps illuminate, patterns of verb selection
assign more or less static or dynamic roles to participants in storyworlds. From a functionalist perspective, and in parallel with
cognitive linguistic research on how grammar reflects
underlying perceptual and conceptual processes used to make
sense of the world (see also cognitive grammar ), verbs
encode construals of experience in terms of processes of various types; in turn, each such process type specifies preferences
for assigning roles to the participants involved. For example, the
material process type, encoded in verbs like put or get, assigns
the roles of actor and goal to participants: e.g., She [actor] kicked
the ball [goal]. By contrast, the mental process type, encoded
in verbs like saw, felt, and thought, assigns to participants the
roles of senser and phenomenon: e.g., He [senser] saw the sunrise
[phenomenon]. Although the functionalist approach originally
focused on process types and participant roles at the level of
the clause, aspects of the approach can be scaled up to account
for discourse-level patterning in narrative. Thus, Herman
(2002) suggests that storytelling genres can be characterized
as preference rule systems in which variable preference
rankings obtain for different kinds of process types, yielding,
in turn, preferred and dispreferred role assignments. Whereas
epics preferentially rely on material processes, with participants
slotted in the roles of actor and goal, psychological novels prefer
mental over material processes, and with them the participant
roles of senser and experiencer.
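One way to picture the discourse-level scaling of Halliday's process types is to tally the types across a text's verbs. The mini-lexicon below pairs verbs with process types following the examples given above; the tallying function is an illustrative sketch of genre profiling, not Herman's preference rule system itself.

```python
# Verb-to-process-type pairings drawn from the examples in the text;
# any verb outside this hypothetical mini-lexicon is counted as "other".
PROCESS_TYPE = {
    "put": "material", "get": "material", "kicked": "material",
    "saw": "mental", "felt": "mental", "thought": "mental",
}

def process_profile(verbs):
    """Tally process types across the verbs of a text."""
    counts = {}
    for verb in verbs:
        kind = PROCESS_TYPE.get(verb, "other")
        counts[kind] = counts.get(kind, 0) + 1
    return counts

# A text dominated by mental processes, as in a psychological novel.
print(process_profile(["kicked", "put", "saw", "thought", "felt"]))
# {'material': 2, 'mental': 3}
```

A fuller model would need lemmatization and sense disambiguation, since many verbs encode different process types in different contexts.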

Deixis in Narrative
Deictic expressions such as I, here, and now, that is, expressions with interpretations that depend on who utters them to whom in what communicative circumstances (see pragmatics), are another part of the grammatical system exploited by narratives in distinctive ways. At the text level, deictic expressions serve to locate narrators and characters in time and space vis-à-vis objects, events, and situations in the storyworld, whose
space-time coordinates often do not match those of the current
moment of narration.
Consider, for example, the first two sentences of Ernest Hemingway's 1927 story "Hills Like White Elephants": "The hills across the valley of the Ebro were long and white. On this side there was no shade and no trees and the station was between two lines of rails in the sun." The preposition across in the first
sentence and the demonstrative pronoun this in the second sentence must both be interpreted in light of the assumed position
of the vantage point from which events are being narrated, a vantage point that here overlaps with that of the characters.
David A. Zubin and Lynne E. Hewitt (1995) propose the notion
of deictic shift to account for such displaced or transposed
modes of deictic reference, which must be anchored in the storyworld evoked by the text, rather than in the world(s) that the
text producer or the text interpreter occupies when producing or decoding these textual signals. This model builds on a number of prior theoretical frameworks, including Karl Bühler's account of Deixis am Phantasma (= imaginary relocation to the alternative sets of space-time coordinates implied by utterances about fictional or imaginary situations) and Käte Hamburger's argument that only fictional narrative can provide direct access to the consciousness or I-originarity of another, to felt, experiential knowledge of the world as presented via someone else's vantage point on events.
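The deictic-shift idea can be sketched as follows: the same expression resolves against whichever deictic center is currently active. All names and coordinates in the sketch are hypothetical illustrations, not an analysis of the Hemingway passage.

```python
# Two deictic centers: the situation of narration and the storyworld.
# The coordinate values are invented for illustration.
speech_center = {"speaker": "narrator", "place": "the writer's study",
                 "time": "narration-now"}
story_center = {"speaker": "narrator", "place": "the station in the Ebro valley",
                "time": "story-now"}

def resolve(expression, center):
    """Resolve a deictic expression against the active deictic center."""
    return {"I": center["speaker"],
            "here": center["place"],
            "now": center["time"]}[expression]

# Before a deictic shift, "here" points to the speech situation...
print(resolve("here", speech_center))  # the writer's study
# ...after the shift, it is anchored in the storyworld.
print(resolve("here", story_center))   # the station in the Ebro valley
```

The design choice, making the center an explicit argument, mirrors the theoretical claim: deictic expressions are not anchored once and for all but relative to a shiftable origo.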
In natural-language narratives told in contexts of face-to-face communication, deictic expressions can serve other functions as well. In particular, when stories are told on-site, that is, where the events being recounted are purported to have taken place, deictic references can cue recipients to map features of the here-and-now circumstances of narration onto the space-time environment evoked by the narrative text. Thus, in the narrative discussed in Herman (2007), Monica's use of person deixis, her references to I and we, creates a referential link between Monica as the teller in the here and now and Monica as the coexperiencer (with her friend Renee) of the supernatural encounter she tells about. More than this, however, Monica refers deictically to spatial features of the current communicative context, as indicated by the items in bold in the following partial transcript:
Monica:
we walkin up the hill,
this way, coming up through here.
[...]
And I'm like on this side and Renee's right here.

Because she is telling her story on-site, Monica can use deictic expressions to recruit features of the current environment and thereby orient her interlocutors vis-à-vis the storyworld; those features provide spatiotemporal coordinates for the situations and events of which Monica is giving an account.
More generally, on the basis of fundamental cognitive abilities studied in research on conceptual blending, storytellers like Monica exploit aspects of grammar to build complex mapping relationships between two mentally projected worlds: the world evoked by the narrative text and the world in which the act of narration unfolds.
David Herman
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Genette, Grard. [1972] 1980. Narrative Discourse: An Essay in Method.
Trans. Jane E. Lewin. Ithaca, NY: Cornell University Press.
Halliday, M. A. K. 1994. An Introduction to Functional Grammar. 2d ed.
London: Edward Arnold.
Herman, David. 2002. Story Logic: Problems and Possibilities of Narrative.
Lincoln: University of Nebraska Press.
———. 2007. Storytelling and the sciences of the mind: Cognitive narratology, discursive psychology, and narratives in face-to-face interaction. Narrative 15: 306–34.
Johnson-Laird, P. N. 1983. Mental Models. Cambridge: Harvard University
Press.
Labov, William. 1972. The transformation of experience in narrative syntax. In Language in the Inner City, 354–96. Philadelphia: University of
Pennsylvania Press.


Mandler, J. M. 1984. Stories, Scripts, and Scenes. Hillsdale, NJ: Lawrence Erlbaum.
Pavel, Thomas G. 1985. The Poetics of Plot. Minneapolis: University of
Minnesota Press.
Prince, Gerald. 1973. A Grammar of Stories. The Hague: Mouton.
Propp, Vladimir. [1928] 1968. Morphology of the Folktale. Trans. Laurence
Scott, rev. Louis A. Wagner. Austin: University of Texas Press.
Wolfson, Nessa. 1982. The Conversational Historical Present in American
English Narrative. Dordrecht, the Netherlands: Foris.
Zubin, David A., and Lynne E. Hewitt. 1995. The deictic center: A theory of deixis in narrative. In Deixis in Narrative: A Cognitive Science
Perspective, ed. Judith F. Duchan, Gail A. Bruder, and Lynne E. Hewitt,
129–55. Hillsdale, NJ: Lawrence Erlbaum.

NARRATIVE, NEUROBIOLOGY OF
Despite cognitive neuroscience's long-standing interest in the processing of individual words and sentences, most neurobiological enquiry into the comprehension and production of more holistic and contextual forms of narrative discourse has come only recently. Chief among the obstacles to such studies has been a tension between scientific control and ecological validity: Experimental control demands repeatable and rather predictable stimuli and constrained responses, whereas free-ranging narrative discourse eschews these properties. In addition, an analytical focus on mapping individual brain regions
to individual phonological, morphological, and local
contextual properties has upstaged examination of the ways in
which the construction of meaning may be subserved by network
interactions among many brain regions. With newly broadened
experimental paradigms, though, and with novel analytical methods addressing functional interactions, neuroscience promises
to add a biological dimension to cognitive psychological theories
of narrative discourse.
Central to cognitive neuropsychological explorations of narrative has been the realization of shared structure between the
explicit narratives of literature and the implicit narratives of
everyday mental representations: Information processing
in both these spheres depends on the representational power
conferred by narrative abstraction. In the everyday world no
less than the literary, disconnected percepts gain meaning and
separability from time, place, and action insofar as they become
transformed into representative mental texts, stories whose distinct scenes contain recognizable characters that act in coherent
plots and evince meaningful themes. A neurobiological perspective only reinforces these observations from human thought and
behavior: Neurophysiological study of brain dynamics reveals
that human cognitive architecture may be engineered to
represent its processing in series of discrete frames somewhat
analogous to those of cinema (Freeman 2006), and cognitive
neuroimaging has begun to reinforce a neuropsychological
view of thought as an activity of constructing blended spaces
between narrative schemata. The central role of narrative
scripting in cognition was explored in connection with early
efforts at constructing symbolic systems capable of artificial
intelligence, and was latent in literary criticism even before the
cognitive revolution of the late twentieth century. What is
new about this connection between thought and narrative is its
explicit elaboration in light of cognitive neuroscience, connecting literature and philosophy with psychobiological information
and constraint.
Neuropsychologically, comprehension has been studied more
completely than production, and this entry focuses on comprehension. Narrative organization is implemented by interacting
and not entirely separable processes of perceptual organization,
attention, and memory. Perceptual organization is the process
that binds separate physical stimuli into coherent higher-order
objects within a scene, replacing, for example, a horizontal plane
and four perpendicular posts with the single entity of a table.
Attention, a group of many subprocesses, focuses processing
on those parts of a scene deemed relevant to the current script
or story schema. Memory encoding, maintenance, and
retrieval, by holding in mind the higher-order representations
of what one has seen before and what one expects to see next,
inform attention and perceptual organization with the context of
this story (Gerrig and McKoon 2001). Although much is known
about the neural substrates of these processes individually, their
significance in narrative processing lies in their interactions. An
understanding of these interactions subserving the comprehension or production of narrative discourse might best begin at the
beginning, with a discussion of contextual integration at the level
of individual words.
The principal physiological index of a word's integration into
its context is the N400 (Kutas and Federmeier 2000), a negative
voltage produced by the brain in response to a word, and maximal about 400 milliseconds after the word is presented. The N400
reflects a truly textual process largely independent of the particular sensory mode of representation: Although there are some
more subtle effects of sensory modality, the N400 arises no matter whether the word is read from a page or heard from a speaker.
The N400 is thought to reflect a process or processes of contextual integration since its amplitude varies parametrically with a
word's predictability; the canonical method of evoking a large
N400 is to present a word whose semantics violate contextual
expectation, for instance, "At breakfast we ate toast with sand."
The initial words in this sentence, and the syntax into which
they're arranged, prime activations for appropriate breakfast
foods. The ongoing construction of meaning is then challenged
by the non sequitur "sand," eliciting a large N400 response. The
N400 arises no matter whether the conflicting context is established by a surrounding sentence, as in this brief example, or
simply by a single adjacent word, or by an entire discourse. For
instance, the following discursive context reduces the N400 in
the example above: "We camped on the beach. A stiff wind blew
off the dunes into all our supplies. At breakfast we ate toast with
sand." Thus, words that are semantically related to their surroundings or are otherwise contextually expected evoke small
N400s, whereas words that cannot be predicted from context
and which, by inference, supply new information with which the
context must be updated evoke large N400s.
Anatomically, electromagnetic source localization and functional neuroimaging place the generators of the N400 primarily
in the superior temporal lobe and temporo-parietal junction (Van Petten and Luka 2006). These sources are mainly in the
left hemisphere but have some contribution from the right.
Across the period of the N400 response, they proceed posteriorly to anteriorly, progressing toward the anterior medial temporal lobe and its memory-related structures. These physiological
results in normal volunteers agree very well with the locations of
lesions that impair comprehension in aphasia patients.
An important question is whether the N400 reflects the comparison of new information against the context maintained in
working memory, or the encoding and integration of this
information into the context, or both these processes in combination. In addition to these associations with working memory,
N400 amplitude seems correlated with the difficulty of retrieving related information from long-term memory: Words that
are used rarely, for instance, evoke greater N400s than do common words, and semantic incongruities that introduce unrelated categories (butter–sand) evoke greater N400s than
within-category violations of congruity (butter–oil). N400
amplitude may thus reflect the complexity of constructing
blended spaces between the semantic space signified or evoked
by a term and the space established by its context. It remains an
open question as to what extent similar processes may underlie the construction of more complex and temporally extended
blends during the comprehension of complex discourses. It is
interesting to note that negative voltages with timing similar
to the N400 are evoked by all manner of nonlinguistic stimuli,
such as pictures and drawings, suggesting that all forms of
semantic evaluation may involve processes akin to those active
during narrative comprehension or, more abstractly, that the
computations involved in all forms of cognition may have narrative character.
The stronger N400 in the left hemisphere seems more driven
by category structure and affected by the retrieval of information from long-term semantic memory, whereas the right
hemisphere may be more driven by broader contextual integration and affected by the retrieval and/or updating of working
memory. The left hemisphere is, therefore, most affected by the
sense of a word considered individually or in relation to its local
context, and the right hemisphere by the broader context of the
narrative (Gernsbacher and Kaschak 2003). This computational
distinction of more local semantic evaluation by the left hemisphere and more extended contextual evaluation by the right
hemisphere may map fairly directly onto an anatomical distinction of small and more isolated dendritic arbors in the left
hemisphere and larger and more overlapping dendritic arbors in
the right (Jung-Beeman 2005), although evoked potentials suggest that this relation may be more a product of left hemisphere
specialization for local processing than of any complementary
right hemisphere specialization for broader context (Coulson
and Van Petten 2007). Cognitively, the pattern is reflected in the
right hemisphere's involvement in strongly context-dependent
constructions, for instance, those involving frame shifts, such as
metaphor or humor (see verbal humor, neurobiology
of). Activation of the right hemisphere strengthens as one proceeds from the level of individual words to sentences to entire
discourses and as a narrative's contextual complexity builds
from its beginning to its resolution (Xu et al. 2005).
In addition, in comparison to words and sentences, discourse
uniquely activates medial prefrontal cortex, the temporo-parietal
junction, and the precuneus (at the junction of posterior parietal
and anterior occipital lobes), as well as subcortical regions
(caudate nucleus and dorsomedial thalamus) that communicate
with prefrontal cortex (Xu et al. 2005). (See Figure 1.) The involvement of these regions likely reflects discourse's demands to imagine scenes perceptually and especially visually, to place scenes and
events in spatial relation and temporal sequence, to take up and to
shift between spatial perspectives and personal points of view,
and to emote and empathize. In particular, the precuneus seems
associated with visual spatial perception and attention, medial
prefrontal cortex and its linked subcortical nuclei with perception
of events in sequence and context, the temporo-parietal junction
with theory of mind, and medial temporal lobe structures with
emotion and memory. Narrative representation can be viewed as
an emergent property of interactions among these and other structures subserving a broad array of cognitive processes.
One of the most discussed capacities involved in narrative
comprehension and production is theory of mind. Theory of
mind was first characterized as the general ability to understand
or to model the thoughts and beliefs of other people. However,
more recent neuroscientific results show that a great deal of
such social attribution can be accomplished using principally
perceptual mechanisms. These social perceptual capacities are
computationally and developmentally prior to theory of mind
and include specialized representations for qualities that typify
agency, such as autonomous movement and direction of gaze.
Such perceptual qualities underlie the attribution of volitional
mental states ("she/he/it wants" or "she/he/it wants to") and
perceptual mental states ("she/he/it sees"), attributions that
form a ubiquitous shorthand in narrative descriptions even in
the case of plainly mechanical and nonsubjective entities (for
example, "the computer doesn't see the network," or "the printer
wants attention"). Theory of mind in its most specific sense is
essential only for the attribution of belief ("she/he/it thinks/
believes/knows"), is associated with activation of brain regions
distinct from those subserving more elementary forms of social
attribution, and is distinguished from these more elementary
forms by its appearance at a later stage in child development,
at or near four years of age. This developmental connection
is significant: Theory of mind seems to arise from simpler processes that deal in sensory and especially visual data. Although
early studies associated theory of mind most strongly with the
medial frontal cortex, later work has suggested that this medial
frontal activation reflects a more general association with complex social narratives, perhaps related to contextual selection of
details that build coherence within a narrative and that engage a
work's suggestion structure to make it relevant to personal
experience and self-representation (Ferstl and von Cramon
2002). Such contextual selection may instantiate the prefrontal
cortex's more general involvement in the inhibition of responses
deemed inappropriate to the current behavioral and cognitive
context. In contrast, an experimentally based argument has been
made for a more selective association of temporo-parietal junction with the late-developing, belief-oriented variety of theory of
mind (Saxe and Powell 2006).
It remains an open question in evolutionary psychology as to what extent theory of mind may be a modular cognitive capacity independent of other aspects of cognition, versus to
what extent it may depend on developmental specialization arising in the interaction of earlier-maturing, more general capacities for social perception and executive function; recent views
on genes and language suggest that human cognitive adaptation for narrative discourse may combine these modular and
generalist perspectives, perhaps by putting to novel uses a large
collection of small modules specialized for cognitive processes
that are applicable to language but not necessarily restricted to
language (Bookheimer 2002). Proponents of the modular view
have often pointed to autism and language, or more specifically to autism's impairments in social communication, as
evidence for a modular dysfunction of theory of mind. However,
many people with autism pass theory of mind tests; the dysfunction of medial prefrontal cortex found in imaging studies of
autism seems consistent with a more general deficit in automatically engaging contextual evaluation and self-representation,
and in any case, such abnormalities within specific regions in the
autistic brain may be reflections of a more fundamental disruption in the information transfer and integration between regions.
Behaviorally observed deficits in theory of mind may thus stem
from a more general perturbation of narrative processing, and
may appear especially prominent only because theory of mind is
so frequently applied in everyday social interaction.
In addition to those regions uniquely activated by discourse
processing, most other brain regions involved in language
or higher-order cognition become more heavily recruited by
discourse than by individual sentences or words. In particular, the middle frontal gyrus, on the dorsolateral surface of the
prefrontal cortex, seems involved in placing sentential or other
discursive elements in temporal, causal, or logical sequence
(Gernsbacher and Kaschak 2003; Mar 2004). This prefrontal sequencing and coordination of ideas seems analogous to
the more concrete executive sequencing and coordination of
body movements implemented in more posterior regions of
the frontal cortex. In a computational sense, therefore, narrative comprehension can be viewed as an elaboration of motor
control or, conversely, motor control itself can be said to
have narrative character, in the sense that it fundamentally
involves sequencing and relating actions in contexts. This relation between language and action gives crucial context to evolutionary biology's efforts to explain the phylogenesis of such
an abstract cognitive capacity, in that the roots of this capacity
may rest in the very concrete domain of motor control. Along
the same lines, the anterior inferior frontal gyrus seems distinguished from neighboring cortex by an involvement in selecting
semantic relations in communication with semantic retrieval
processes in the temporal lobe, whereas posterior regions of the
inferior frontal gyrus are more immediately bound up with the
more concrete, sequencing-related details of syntax and phonology (Bookheimer 2002; Jung-Beeman 2005).
In general, the neural implementation of narrative comprehension seems to take advantage of individual capacities for
movement, sensory perception, and emotion, activating these
systems in an internal simulation of the events evoked by the
narrative. The neural implementation of narrative understanding
thus depends crucially on embodiment. At the most concrete
level, that of simulating movements, this process of comprehension engages the mirror neuron system (see mirror systems,
imitation, and language) in the ventrolateral frontal lobe
and the supplementary motor area in the dorsomedial frontal
lobe (Wilson, Molnar-Szakacs, and Iacoboni 2008).
Cognitively, narrative understanding seems to emerge from
the interaction of many specialized subsystems. Neurally, therefore, a full description of narrative processing must encompass not
only the individual brain regions engaged, but also the dynamics
with which these regions connect and interact over a wide variety of narrative processes and subprocesses. These analyses of
functional connectivity are just beginning (Karunanayaka et al.
2007), via techniques including structural equation modeling,
dynamic causal modeling, and model-free multivariate methods such as partial least squares and independent components
analysis. Initial work has demonstrated information flow
from regions of the temporal lobe adjoining the auditory cortex
(often described as wernicke's area, though the anatomical
definition of this term has never been precise) to higher-order
processing in the inferior frontal lobe (broca's area) and near
the temporo-parietal junction. From these areas, the network of
interactions becomes more complex, with the inferior frontal lobe
projecting to the dorsolateral prefrontal cortex, temporo-parietal
junction, and medial frontal cortex, implementing wide-ranging effects of core language processing on complex semantics,
sequencing and coordination of ideas, and social attribution.
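At the simplest, model-free end of this methodological spectrum, functional connectivity can be estimated as the pairwise correlation of regional time series. The sketch below runs on synthetic data (the region names, couplings, and noise levels are invented for the example); structural equation modeling and dynamic causal modeling build directed, model-based accounts on top of this kind of covariance information:

```python
import numpy as np

rng = np.random.default_rng(1)
n_t = 200  # time points per regional time series

# Synthetic BOLD-like signals: three regions share a common "story-following"
# component with differing noise; one region is left uncoupled for contrast.
driver = rng.normal(size=n_t)
regions = {
    "sup_temporal":     driver + 0.5 * rng.normal(size=n_t),
    "inf_frontal":      driver + 0.5 * rng.normal(size=n_t),
    "temporo_parietal": driver + 0.9 * rng.normal(size=n_t),
    "precuneus":        rng.normal(size=n_t),  # independent in this toy model
}

names = list(regions)
data = np.vstack([regions[name] for name in names])

# Functional connectivity matrix: Pearson correlation between every region pair.
fc = np.corrcoef(data)

for i in range(len(names)):
    for j in range(i + 1, len(names)):
        print(f"{names[i]:16s} ~ {names[j]:16s} r = {fc[i, j]:+.2f}")
```

The same correlation matrix computed on measured regional signals is the starting point for seed-based connectivity analyses; directed methods then ask which of the correlated regions drives which.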
The data now in hand suggest functional specialization and
subdivision of brain regions beyond the anatomical resolution
that has thus far been realized in multisubject functional neuroimaging. Contributing to this limited resolution is a potentially
high degree of variation in detailed functional anatomy across
individual subjects. A further challenge is an intersubject variability in information transfer between brain areas that actually
reflects individual differences in cognitive style: For instance, different individuals may make more or less use of working memory in comprehending and producing narrative, and there are
indications that these differences may be reflected in functional
connectivity with cooperating structures in the medial temporal
lobe related to long-term memory.
This connectivity frame holds out the potential for a rapprochement between connectionism and explicitly representational, modularist views of language and narrative processing,
since modular functions may reside not so much in any particular anatomical locus as in the incoming and outgoing links
among these loci: In this sense, the more closely the localization
problem is examined, the more ill-posed it may become. The
aforementioned characterizations in terms of regional functional
mapping may, therefore, be understood as a first approach to a
description in terms of regional functional interaction. It is in this
sense that the new neuroscientific study of discourse processing
is approaching an understanding of narrative connectivity in
terms of neural connectivity.
This entry is current as of 15 August 2007.
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Bookheimer, Susan. 2002. Functional MRI of language: New approaches
to understanding the cortical organization of semantic processing.
Annual Review of Neuroscience 25: 151–88.
Coulson, Seanna, and Cyma Van Petten. 2007. A special role for the right
hemisphere in metaphor comprehension? ERP evidence from hemifield presentation. Brain Research 1146: 128–45.
Ferstl, Evelyn C., and D. Yves von Cramon. 2002. What does the frontomedian cortex contribute to language processing: Coherence or theory
of mind? NeuroImage 17.3: 1599–1612.
Freeman, Walter J. 2006. A cinematographic hypothesis of cortical
dynamics in perception. International Journal of Psychophysiology
60.2: 149–61.
Gernsbacher, Morton Ann, and Michael P. Kaschak. 2003. Neuroimaging
studies of language production and comprehension. Annual Review
of Psychology 54: 91–114.
Gerrig, Richard J., and Gail McKoon. 2001. Memory processes and
experiential continuity. Psychological Science 12.1: 8–15.
Jung-Beeman, Mark. 2005. Bilateral brain processes for comprehending
natural language. Trends in Cognitive Sciences 9: 512–18.
Karunanayaka, Prasanna R., Scott K. Holland, Vincent J. Schmithorst,
Ana Solodkin, E. Elinor Chen, Jerzy P. Szaflarski, and Elena Plante.
2007. Age-related connectivity changes in fMRI data from children
listening to stories. NeuroImage 34.1: 349–60.
Kutas, Marta, and Kara D. Federmeier. 2000. Electrophysiology reveals
semantic memory use in language comprehension. Trends in
Cognitive Sciences 4: 463–70.
Mar, Raymond A. 2004. The neuropsychology of narrative: Story comprehension, story production, and their interrelation. Neuropsychologia
42: 1414–32.
Saxe, Rebecca, and Lindsey J. Powell. 2006. It's the thought that
counts: Specific brain regions for one component of theory of mind.
Psychological Science 17: 692–9.
Van Petten, Cyma, and Barbara J. Luka. 2006. Neural localization of
semantic context effects in electromagnetic and hemodynamic studies. Brain and Language 97.3: 279–93.
Wilson, Stephen M., Istvan Molnar-Szakacs, and Marco Iacoboni. 2008.
Beyond superior temporal cortex: Intersubject correlations in narrative speech comprehension. Cerebral Cortex 18.1: 230–42.
Xu, Jiang, Stefan Kemeny, Grace Park, Carol Frattali, and Allen Braun.
2005. Language in context: Emergent features of word, sentence, and
narrative comprehension. NeuroImage 25.3: 1002–15.

NARRATIVE, SCIENTIFIC APPROACHES TO


Knowledge is acquired in many ways and may exhibit many
degrees of generality, certitude, and predictive power. It can be the product of direct observation and
a small number of general assumptions, or the result of a very
elaborate and long chain of hypotheses and deductions. It can
possess a rich factual content or be almost devoid of it, but it
must always lead back to factual observations. Knowledge, in its
most developed form, called science, rests on the basic assumption that the whole universe is structured and functions according to laws that hold without exceptions, in a precise way, and
throughout all time. Science's main tool is mathematics, which
is as universal as science but is not itself a science; it is science's
language (or logic). In its turn, science (and sometimes its language, mathematics) is used as a tool by many intellectual activities and fields of study that do not have the status of a science.
Such is the case of narrative, a cognitive endeavor wherein the
methods used and the classifications arrived at are still far from
yielding scientific (factual) observations and scientific (factual)
predictions. Classifications are an important step in science; scientists generally proceed by identifying and isolating a group of
phenomena that seem related, formulating hypotheses about
their main characteristics, and trying to connect them by means
of a theory. When they succeed, a branch of science becomes
established.
The study of narrative as such has not followed this path far
enough to become a scientific discipline. But during the last 15
years, more and more scholars specializing in the study of
narrative have been using findings from neurobiology, cognitive science, and evolutionary psychology to enlarge our understanding of narrative and to ground it in the architecture of the
human brain.
The first attempts to establish narrative as an autonomous
scientific discipline took place in Russia just before the 1917
Revolution and continued until about 1930, when they were
stopped by Stalin's regime in the Soviet Union. The Russian
Formalists, as the researchers were called, made salient contributions to their field, such as the search for "literariness,"
or the formal properties defining the literary text; the distinction between plot (syuzhet) and story (fabula); the concept of ostraniene, translated into English by the neologism
"estrangement"; the notion that a literary text is a system, as
is literature itself; and the setting of boundaries between the
study of a text itself and the scientifically irrelevant study of
its production or its reception. These contributions were conceived within a research program that acknowledges the fact
that narratives are infinite in subject and presentation, but
their formal devices are limited in number while also being
universal in nature. In a way, the Russian Formalists took up
where Aristotle had left off in his Poetics and his Rhetoric on
the study of the architecture of narrative. But their work, as in
the case of Aristotle, did not go beyond the boundaries of classification and typology.
The research program developed by the French Structuralists
in the 1960s and 1970s in the guise of a narratology draws on the
work of the Russian Formalists and that of Vladimir Propp concerning the structural analysis of fairy tales (a group included
in the Aarne-Thompson Index of folktale types). The program
also consists mainly in a taxonomy, but its clarification of concepts and criteria of classification is much more advanced and
sophisticated.
The definition of narrative proposed by H. Porter Abbott is
simple, capacious, and sufficiently precise for our purpose here.
Abbott says: "Narrative is the representation of an event or a series
of events" (2002, 13). And he explains: "The difference between
events and their representation is the difference between story
(the event or sequence of events) and narrative discourse (how
the story is conveyed)" (2002, 15). From this point of view, narrative can be studied without regard to the medium through which
the event or events are represented. Because of this property,
narrative in texts, in films, in comic books, and so on can be studied by the means furnished by narratology, with a small amount
of adjustments in each case. As the most fundamental taxonomy,
narratology deals with formal universals, and in that capacity it
is indispensable for a research program seeking to establish the
scientific study of narrative.
Since the mid-1990s there has been a renewed interest in thematics and its empirical study with the purpose of determining
how narratives relate to human universals. The multidisciplinary
research by Max Louwerse, Willie van Peer, and Donald Kuiken
should be mentioned here, as well as the imaginative and solid
research program being developed by Patrick Colm Hogan on the
basis of his identification of three narrative universal structures
or genres (romantic, heroic, and sacrificial tragi-comedy) generated by emotion prototypes. Another group of approaches concerns the study of the universal features of the relation between
reader or audience and narrative. Here, empirical work, such as
that being realized by David Miall and Kuiken, as well as Deirdre
Wilson and Dan Sperber, to name but a few once again, is also
contributing to a scientific approach to very complex relations.
Evolutionary psychology and cognitive science are tools used by
these researchers in the wake of Joseph Carroll and Robert Storey.
And with respect to the universal features of rhythm, metered
speech, onomatopoeia, and prosody, the long and patient work
of Reuven Tsur has led to many fruitful results. The same is true
of the much more recent research relating music, neurobiology,
and the emotions.
Indeed, a common experiential link among these groups
of approaches is the emotions. Recent neuroscience has provided
solid evidence for the hypothesis that emotions, too, are universal. This is a very important finding, since emotions are a central
concern in the study of narrative. The discovery of a mirror neuron system in humans (after its discovery in macaque monkeys
in 1996) has opened the way to a multitude of new explorations
concerning the brain and our social behavior, including the
production and reception of narrative. The mirror neuron system in humans has been consistently reported as being related
to imitation, action observation, intention understanding, and
understanding of the emotional states of others, to mention a
few of the human faculties that are essential for the right perception of fictional narratives and, of course, for the survival and
evolution of the human race. And there seems to be even more
to it. In 2004, Vittorio Gallese, Christian Keysers, and Giacomo
Rizzolatti published a paper in which they explore the possibility that the mirror neuron system, by providing us with an experiential (precognitive) insight into other minds, could provide
"the first unifying perspective of the neural basis of social cognition" (2004, 401). The mirror neuron system is apparently the
basic mechanism that allows us to grasp the intentions of others
and to experience similar emotions (empathy). As these authors
say in their article, "A crucial element of social cognition is the
brain's capacity to directly link the first- and third-person experiences of these phenomena (i.e., link 'I do and I feel' with 'he
does and he feels')" (ibid., 396).
The question arises: Would all art then be nothing but a specific and specialized activity aimed at firing the mirror neurons
in a certain direction? Would the "will to style" displayed by
authors, composers, film directors, and so on be nothing but the
deliberate use of syntax, semantics, pragmatics, and phonemics
to trigger the mirror neuron system in specific and predetermined ways in order to elicit specific and predetermined insights
and emotions in the reader and the audience?
Frederick Aldama
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Abbott, H. Porter. 2002. The Cambridge Introduction to Narrative.
Cambridge: Cambridge University Press.
Aristotle. 1984. The Rhetoric and Poetics of Aristotle. Trans. Rhys Roberts
and Ingram Bywater. New York: McGraw-Hill.
Carroll, Joseph. 2004. Literary Darwinism: Evolution, Human Nature,
and Literature. New York: Routledge.
Gallese, Vittorio, Christian Keysers, and Giacomo Rizzolatti. 2004. A unifying view of the basis of social cognition. Trends in Cognitive Sciences
8.9: 396–403.
Hogan, Patrick Colm. 2003. The Mind and Its Stories: Narrative Universals
and Human Emotion. Cambridge: Cambridge University Press.
Louwerse, Max, and Don Kuiken, eds. 2004. The Effects of Personal
Involvement in Narrative Discourse Processes. Philadelphia: Lawrence
Erlbaum.
Louwerse, Max, and Willie van Peer, eds. 2002. Thematics: Interdisciplinary
Studies. Amsterdam: John Benjamins.
Miall, David. 2006. Literary Reading: Empirical and Theoretical Studies.
New York: Peter Lang.
Miall, David, and Donald Kuiken. 1994. Foregrounding, defamiliarization, and affect: Response to literary stories. Poetics 22: 389–407.
Propp, Vladimir. 1968. Morphology of the Folk Tale. Austin: University of
Texas Press.
Rizzolatti, Giacomo, Vittorio Gallese, Leonardo Fogassi, and Luciano Fadiga.
1996. Action recognition in the premotor cortex. Brain 119: 593–609.
Storey, Robert. 1996. Mimesis and the Human Animal: On the Biogenetic
Foundations of Literary Representation. Evanston, IL: Northwestern
University Press.
Tsur, Reuven. 1997. Poetic rhythm: Performance patterns and their
acoustic correlates. Versification 1. This electronic journal of literary
prosody is available online at: http://www.arsversificandi.net/backissues/vol1/essays/tsur.html.
Wilson, Deirdre, and Dan Sperber. 2006. Relevance theory. In
Handbook of Pragmatics, ed. Gregory Ward and Laurence R. Horn,
607–32. Oxford: Blackwell.

NARRATIVE UNIVERSALS
In one Arawak story, a tiger sees a great hunter in the forest,
changes herself into a woman, and marries him. The two have a
happy married life until the good wife suggests that they visit the
hunters family. She warns him that he must not reveal her origin to anyone. The hunter, however, tells his mother the secret.
Feeling ashamed in front of the community, the woman changes
back into a tiger and returns to the forest. The poor husband
would often go into the bush and call his wife, but there never,
never came a reply (Roth 1915, 203–4).
At first glance, the story might seem strange and clearly not
universal. A man marries a tiger and must keep it a secret from
his mother? Claude Lévi-Strauss analyzes the story in relation
to a complex of South American myths (see homologies and
transformation sets) that recount the decline from a golden
age. He also connects the tale with cannibalism (1973, 259). Lévi-Strauss's analysis does implicitly link the tale with other traditions, simply by referring to a decline from a golden age, for this
topic is found in different cultures. In the Judeo-Christian tradition, the story of the Fall is a case in point. However, in some ways,
this link only makes the story more alien. Pairing Adam and Eve
with a hunter and a tiger seems to highlight cultural difference
and pairing the eating of an apple with cannibalism seems to put
the Arawak at quite a distance from anything Western.
On the other hand, there is something deeply familiar about the story. It tells of a couple joined by attachment (witness the husband's pathetic search at the end of the story), suffering conflict due to the husband's divided loyalties to his mother and his wife; it treats shared secrets, feelings of betrayal and shame, and a concern about social origins. Moreover, there is nothing in the tale itself that suggests cannibalism, despite Lévi-Strauss's analysis. In short, it is a story that seems both strange and familiar, both culturally particular and imbued with cross-cultural concerns.

Cultural Construction, Universality, and Narrative


There is a common view in cultural studies (the interdisciplinary area of the humanities and social sciences devoted to analyzing culture) that practices, dispositions, artifacts, and communicative actions are socially constructed. This is to say that they are not innate or biologically determined but result from cultural developments. In this view, anything from individual emotions to political structures might not be analyzed in terms of relatively constant human predispositions, but rather in terms of the historically contingent organizations and imperatives of social practice or performance. Additionally, writers in cultural studies commonly understand social construction as widely variable. In the most extreme versions, this variation may be seen as limited by little beyond the laws of physics.
An alternative view, often associated with evolutionary
approaches to culture, takes a wide range of social practices to
be very narrowly constrained by genetic propensities. These propensities are thought to have resulted from adaptations that are specifically social. While writers in cultural studies tend to see social practices as quite variable, writers adopting this approach tend to see societies as manifesting a wide range of universals. Language study has been one area in which universalism has been prominent, though there has been some disagreement as to the precise evolutionary origins of language (see biolinguistics).
narratology has incorporated both tendencies. A common view in cultural studies is that narrative is socially constructed and can vary widely from culture to culture. In contrast, some researchers drawing on models from linguistics and psychology have argued that there are remarkably consistent narrative patterns across cultures. There are, however, differences between the study of narrative and the sorts of study that have occupied linguists (for example, the study of syntax). While writers in cognitive linguistics have viewed syntactic principles as resulting from general cognitive structures and processes, a common view within the field is that there are some aspects of cognitive architecture that are specially devoted to syntactic processing (see autonomy of syntax). The case for an autonomy of narrative is much weaker. It seems much more likely that narrative results from the interaction of various cognitive structures and processes that are not specially devoted to narrative. As such, narrative is a less likely candidate for simple evolutionary analysis. Put differently, if narrative results from our cognitive abilities to draw causal inferences, to attribute intentions, to imagine counterfactual or hypothetical situations, to adopt varying physical points of view, to simulate experiences, and so on, then it is less likely that there is any single adaptive function for cross-cultural narrative patterns (comparable to the commonly posited communicative function for language).
The point is consequential for a number of reasons, relating
to narrative and to other areas of study in the language sciences.
Specifically, if narrative patterns are unlikely to be genetically
coded in any detail, one of two conclusions may be drawn: One
may simply see this as further evidence for the culturalist position, further reason to believe that narrative may vary across
cultures with few limitations. However, the existence or nonexistence of universals is an empirical issue. It cannot be decided
a priori. If one believes that the evidence supports conclusions
of universality, then one is likely to draw a different conclusion
from the indirectness of the relation between narrative and
adaptation. Certainly, some universal patterns will derive more
or less directly from aspects of cognitive architecture (e.g., causal
attribution) that are defined by genetic programs resulting from
selective pressure. Others will derive from commonalities in the
physical environment. But that is not all.
Sticking close to biology, we may note that some cross-cultural patterns are likely to derive from the fact that adaptations are mechanisms, not functions, which means that there are cases where the mechanism fails. More exactly, the genetic predispositions that serve us so well in daily life do so because they set out relatively simple procedures that approximate advantageous functions. For example, although evolutionary psychologists often refer to our ability to "read minds," we do not directly know other people's intentions. Rather, we engage in complex processes of simulating and inferring those intentions. These processes have adaptive value because they approximate the function of giving us access to other people's states of mind. However, since they are mere approximative mechanisms, they are fallible, a point that is highly consequential both in life and in stories. Consider narratives that focus particular attention on the relative opacity of others' intentions (e.g., between lovers in cases of misguided jealousy). If these recur cross-culturally (as they do), they do so not because adaptive mechanisms succeed but because they fail in certain systematic ways.
nongenetic universals may also arise due to patterns in
childhood development that are not genetically programmed.
For example, we seem to have innate predispositions to emotional attachment. However, our attachment responses are not
wholly hardwired. They are shaped in some crucial ways by
childhood experiences, as a number of writers have stressed following John Bowlby. There appear to be some crucial parameters
in our early childhood experiences that have lasting effects on
the quality and durability of our attachments in later life. While
cultures may vary in the degree to which one or another parenting/attachment style predominates, it is inevitably the case that
every society has variation in parenting/attachment styles. Thus,
insofar as narratives cross-culturally focus attention on social
emotion, we would expect to find cross-cultural expression of
the same basic attachment styles.
The preceding example indicates that there are two problems with the usual framing of the division between those who
claim that there are cross-cultural universals and those who
deny that claim. First, universality does not entail innateness. A
pattern may be universal without being genetically determined.
Second, social construction does not entail cultural difference.
Though they include a genetic component, attachment styles are
socially constructed in that they result from the child's experience of parenting. Yet it seems likely that similar divisions of, for
example, secure and insecure attachment will recur everywhere,
even if they do so in different proportions. These considerations
suggest that common dichotomies regarding universality and
cultural construction are false. The point has consequences for
our understanding of universality in a range of areas, not only
narrative.
Indeed, the point goes further. Research in group dynamics and elsewhere (see network theory; self-organizing systems; pragmatics, universals in) suggests that many patterns may arise through convergent development (independent processes in different societies that give rise to parallel practices): for example, patterns in the ways social networks operate to define in-groups and out-groups, intragroup inequality, interacting subgroups, and so on. Given the importance of group antagonism, social hierarchy, and subgroup divisions for any society (and, thus, their importance for the lives of individual agents), we might expect narratives to emplot these relations frequently. Insofar as these relations derive from group dynamics, cross-cultural patterns of such emplotment would not result from genetic predispositions per se but from convergent social developments.
With these points in mind, we might return to the story of the tiger woman (or jaguar woman, in Lévi-Strauss's version). It is, as it turns out, framed by attachment (along with sexuality) and group opposition. Moreover, there are hints that the group opposition may point either toward in-group hierarchy or toward in-group/out-group antagonism. Alexandra Aikhenvald reports that at least among some Arawaks, members of one low-prestige subgroup are referred to as "people of jaguar" (2006, 12). Walter Roth cites an Arawak proverb that identifies tigers with enemies (1915, 367). Moreover, the story relies on, indeed elaborates on, the failure of mind reading, which is precisely what allows the secret and the issue of spousal loyalty to arise in the first place. So there is certainly commonality here, commonality that makes cognitive sense. But is that all there is to it? After all, we knew that there was some commonality already. Is there any greater universality to this story or to narrative patterns more broadly?
Attachment combined with sexual desire (as in pair bonding
or marriage) points toward a set of stories that recur across cultures. Perhaps we will get a better understanding of the issues if
we consider some other stories of this sort, particularly paradigmatic works from other traditions. (In a brief entry, we cannot
consider many stories, or other evidence for narrative universals.
For a range of cases, and for references to other accounts of narrative universality, see Hogan 2004.)

Four Romances
To begin, let's consider what is almost certainly the paradigm of romantic narrative in the English-speaking world: Romeo and Juliet. Romeo and Juliet fall in love. However, they are prevented from uniting by the group antagonism of their parents. With the help of a friar, they are briefly united, but then Romeo is exiled and Juliet is confined to the home. Juliet is to be married to a rival by her father. She fakes her death to escape this fate. Romeo returns, kills his rival, then commits suicide just at the moment when he might have been united with Juliet. Juliet, too, kills herself, but after their deaths, the families are reunited.
Now consider the Romance of the Western Chamber, China's most popular love comedy, both on stage and in print, beginning in the twelfth century (Idema 2001, 800). Chang and Ying-Ying fall in love. Chang goes off to take the imperial exams. Meanwhile, a rival comes to marry Ying-Ying with her mother's approval. Chang succeeds in the examination and returns to elope with Ying-Ying. He is successful due to the help of a monk. The rival commits suicide (see Idema 2001, 798–800).
The Recognition of Śakuntalā, the most revered work of Sanskrit drama, begins when Duṣyanta and Śakuntalā fall in love. Duṣyanta worries that they cannot marry due to caste (thus, an internal group hierarchy). Śakuntalā worries that they cannot marry due to her father's disapproval. Both turn out to be mistaken. They are united. However, Śakuntalā violates her obligations to a holy man, who curses her with separation from Duṣyanta. In consequence, Śakuntalā is exiled, while Duṣyanta remains at home, suffering conflict with the demands of an earlier wife (thus, a rival). Duṣyanta defeats an army of demons in battle and is subsequently reunited with Śakuntalā and their son.
In the Arab and Muslim world, few stories have been as popular and influential as that of Layla and Majnun. Layla and Majnun fall in love, but Layla's father refuses the marriage. Majnun's father tries to cure Majnun of his love madness through religion, but Majnun only calls on God to make him worship Layla more. Majnun wanders the desert, eventually trying to win Layla by force of arms. However, Layla is married to another man. When Layla and Majnun die, they are reunited in paradise.

Romance and Prototypes


Although this is only a tiny selection of narratives, it is significant in part due to their prominence in distinct narrative traditions. (On complications of establishing this distinctness in literary study, see areal distinctness and literature.) We may already begin to see the ways in which these narratives may share certain prototypical characteristics. I say "prototypical characteristics" because these examples do not suggest a set of necessary and sufficient conditions but a gradient of more or less standard cases. Specifically, one common sort of narrative that recurs cross-culturally tends toward a prototype involving the following elements.
We begin with two lovers. Their mutual interest combines sexuality and attachment. However, they face inhibition. That inhibition is frequently a matter of conflict with authority (usually parental or religious), or group division, or both. (The priming of religious figures due to the prototype may explain their surprising presence as helpers in some stories.) The group division is itself regularly one of in-group hierarchy or in-group/out-group antagonism. Works that do not involve such a conflict commonly suggest it, as in Śakuntalā. After a brief union, the lovers are separated; often, one is confined to home while the other is sent away. In tragic versions, one or both die. In comic versions, the separation may be associated with death. During this separation, one lover proves himself (or, less often, herself) worthy of the beloved, sometimes by defeating the rival (who may ultimately die). This demonstration may overturn the disapproval of the parents or society. In the end, the lovers are reunited and the conflicting families are reconciled.
Of course, individual stories must vary this pattern. However, as a standard case, it appears to be remarkably consistent across cultures and across time. The Arawak story varies the pattern more than the others we have considered. But it remains recognizable. The differences are largely a matter of order. The hunter proves himself worthy of the beloved through his successful hunting right at the outset (the point is related to Arawak cultural practices in which potential bridegrooms must prove themselves; see Roth 1915, 315–16). The conflict occurs after marriage, rather than before, and it is in part the fault of the man for violating the trust of his wife and preferring his mother's interests over hers, no matter how briefly. Is there a reason for these differences? The comic form of the romantic plot involves suggestions of death or unending separation of the lovers prior to their ultimate reunion in part because this intensifies the final joy of their union. The point is a simple matter of the psychology of emotion: the joy of an outcome is intensified by the difficulty of achieving the outcome (see Ortony, Clore, and Collins 1988, 73) and by the gradient of change from a previous emotional state. The same point holds for the Arawak story, but in reverse. Here, the atypically early union, the apparently "happily ever after" condition of the couple's married life, and the hunter's subsequent tragic error serve to intensify the pathos of the conclusion.


Understanding Narrative Universals


Perhaps surprisingly, the explanation of the Arawak story's difference from the cross-cultural prototype begins to suggest why there is a cross-cultural prototype to begin with. Cross-culturally, there are two common purposes of narrative verbal art: the communication of emotionally satisfying experiences (roughly, a psychological purpose) and the treatment of thematically significant issues, often ethical or political (roughly, a social purpose). The explanation of narrative universals bears importantly on these two elements.
Narratives involve sequences of action engaged in by intentional agents pursuing goals that we share and that engage
us emotionally. One thing that cross-cultural patterns suggest is that these narrative goals are much more limited, and
much more cross-culturally widespread, than one might have
imagined. For example, they include union with a partner in an enduring relationship that is both sexual and founded in attachment, thus romantic love (on other happiness goals, the related emotions, and the associated narrative structures, see Hogan 2004). The precise development of narratives results in part from the means necessary to intensify emotional experiences, such as creating a relatively sharp change from separation anxiety to reunion, enhancing conflict by involving people
who themselves have attachment bonds (e.g., parents and children), and so on.
Again, the development of romantic narratives also crucially
includes real social concerns. Most obviously, these involve in-group/out-group divisions and group hierarchies, which presumably result from group dynamics. But group organization
does not delimit the entire social world. Individual biological
endowments, developmental idiosyncrasies, and experiential
accidents in later life guide personal affiliations. There is, in consequence, no way of guaranteeing that personal affiliations will
conform to the principles of group hierarchization or in-group/
out-group antagonism. Societies are, then, condemned to face
conflicts between interpersonal attachments and the segregations imposed by social organization. Romantic plots tell the
story of that conflict.
In sum, there seem to be significant narrative universals
(many, of course, statistical; others absolute). These universals arise from a complex interaction of factors, including
biological endowment (e.g., in basic emotional responses), patterns in childhood development, and convergent developments
arising through group dynamics. In this way, narrative universals are in part derived from biological adaptations. However,
they are no less derived from social constructions, which are
themselves universal. An understanding of narrative universals
is important for at least three reasons: 1) Narratives are a central part of human life everywhere. Understanding narratives is
therefore crucial to understanding the human mind and human
experience. 2) The precise narrative universals we discover tell
us some surprising things about human society. For example,
it is striking that most romantic plots develop our sympathy for
the lovers, not for the society. This suggests not only that certain
sorts of conflict are inevitable in society but also that we share
a deep sympathy with individuals or couples working against
social hierarchization and group antagonism, a surprising and
in many ways hopeful fact. Finally, 3) the complex nature of

narrative universals would seem to have consequences for our
understanding of universals elsewhere and for our understanding of the place of both biology and social construction in an
account of universals.
Patrick Colm Hogan
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Aikhenvald, Alexandra. 2006. A Grammar of Tariana. Cambridge: Cambridge University Press.
Bowlby, John. 1982. Attachment and Loss. New York: Basic Books.
Hogan, Patrick Colm. 2004. The Mind and Its Stories: Narrative Universals and Human Emotion. Cambridge: Cambridge University Press.
———. Affective Narratology: The Emotional Structure of Stories. Lincoln: University of Nebraska Press. In press.
Idema, Wilt. 2001. Traditional dramatic literature. In The Columbia History of Chinese Literature, ed. Victor Mair, 785–847. New York: Columbia University Press.
Lévi-Strauss, Claude. 1973. From Honey to Ashes (Introduction to a Science of Mythology: 2). Trans. John and Doreen Weightman. New York: Harper & Row.
Ortony, Andrew, Gerald Clore, and Allan Collins. 1988. The Cognitive Structure of Emotions. Cambridge: Cambridge University Press.
Roth, Walter. 1915. An Inquiry into the Animism and Folk-Lore of the Guiana Indians: Thirtieth Annual Report of the Bureau of American Ethnology to the Secretary of the Smithsonian Institution: 1908–1909. Washington, DC: Government Printing Office, 103–453.

NARRATIVES OF PERSONAL EXPERIENCE


The study of narrative extends over a broad range of human
activities: novels, short stories, poetic and prose epic, film, folktale, interviews, oral memoirs, chronicles, histories, comic strips,
graphic novels, and other visual media. These forms of communication may draw upon the fundamental human capacity to
transfer experience from one person to another through oral
narratives of personal experience.
A focus on spontaneous recounting of experience was greatly
stimulated by the development of sociolinguistic research in
the 1960s, designed to capture the closest approximation to
the vernacular of unmonitored speech. Narratives of personal
experience were found to reduce the effects of observation to a
minimum (Labov 2001). Since then it has appeared that such
narratives are delivered with a similar organization in a wide variety of societies and cultures, as, for example, in the Portuguese of fishermen in northeastern Brazil (Maranhão 1984). The following discussion of oral narratives is based on the initial analysis of
William Labov and Joshua Waletzky (1967), as developed further
in the suggested reading.
The discussion first treats the structural organization of narrative (temporal organization, orientation, coda), then turns to
the evaluative component and finally to the construction of narrative as a folk theory of causality instrumental to the assignment
of praise and blame.

Structural Organization
A narrative is defined here as one way of recounting past events,
in which the order of narrative clauses matches the order of
events as they occurred. Example (1) is a minimal narrative organized in this way:

(1) a. Well, this man had a little too much to drink
    b. and he attacked me
    c. and a friend came in
    d. and she stopped it.

The same events could have been reported in the non-narrative order c, d, a, b, as in (2), which employs a variety of grammatical devices within a single clause.

(2) A friend of mine came in just in time to stop this person who had had a little too much to drink from attacking me.

Narrative structure is established by the existence of temporal juncture between two independent clauses. Temporal juncture is said to exist between two such clauses when a change in the order of the clauses produces a change in the interpretation of the order of the referenced events in past time. These are narrative clauses. Narrative clauses respond to a potential question, "What happened then?" and form the complicating action of the narrative.
A narrative normally begins with an orientation, introducing and identifying the participants in the action: the time, the place, and the initial behavior. The orientation section provides answers to the potential questions who? when? where? what were they doing? In the minimal narrative (1), the first clause (a) is the orientation. More information is usually provided:

(3) a. my son has a – well, it was a fairly new one then.
    b. It's a 60 cc Yamaha.
    c. and it could move pretty good.
    d. This fella and I were going down the road together

The end of a narrative is frequently signaled by a coda, a statement that returns the temporal setting to the present, precluding the question "And what happened then?":

(4) a. And you know the man who picked me out of the water?
    b. He's a detective in Union City,
    c. and I see him every now and again.

Evaluation
Most adult narratives are more than a simple reporting of events.
A variety of evaluative devices are used to establish the evaluative point of the story (Polanyi 1989). Thus, we find that narratives, which are basically an account of events that happened, frequently contain irrealis clauses (negatives, conditionals, futures) which refer to events that did not happen or might have happened or had not yet happened:

(5) And the doctor just says, "Just that much more," he says, "and you'd a been dead."

(6) I'll tell you if I had ever walloped that dog I'd have felt some bad.

(7) a. And he didn't come back.
    b. And he didn't come back.

Irrealis clauses serve to evaluate the events that actually did occur in the narrative by comparing them with an alternate stream of reality: potential events or outcomes that were not in fact realized. Frequently, such evaluative clauses are concentrated in an evaluation section, suspending the action before a critical event and establishing that event as the point of the narrative.
Evaluative clauses vary along a dimension of objectivity. At
one extreme, narrators may interrupt the narrative subjectively
by describing how they felt at the time:
(8) a. I couldn't handle any of it
    b. I was hysterical for about an hour and a half

In a more objective direction, narrators may quote themselves ("I said to myself, 'This is it'"), or, with more credibility, cite a third-party witness, as in (5). At the other extreme, objective events speak for themselves, as in the account of a plane developing motor trouble over Mexico City:
(9) And you could hear the prayer beads going in the back of the
plane.

Evaluation provides justification for the narrative's claim on a greater portion of conversational time than most turns of talk, requiring an extended return of speakership to the narrator until it is finished (Sacks 1992). Evaluation thus provides a response to the potential question "So what?" (Spanish ¿Y qué?; French Et alors?).

Narratives of personal experience normally show great variation in the length of time covered by the clauses in the orientation, complicating action, and evaluation sections, ranging from decades to minutes to seconds. Sequences of clauses of equal duration may be termed chronicles; these are not designed to report and evaluate personal experience.

Reportability and Credibility


A reportable event is one that itself justifies the delivery of the
narrative and the claim on social attention needed to deliver it.
Some events are more reportable than others. The concept of
reportability or tellability (Norrick 2005) is relative to the situation
and the relations of the narrator with the audience. At one end of
the scale, death and the danger of death are highly reportable in
almost every situation. At the other end, the fact that a person ate
a banana for lunch might be reportable only in the most relaxed
family setting. Most narratives are focused on a most reportable
event. Yet reporting this event alone does not make a narrative; it
only forms the abstract of a narrative.
For a narrative to be successful, it cannot report only the most
reportable event. It must also be credible if the narrative is not
to be rejected as a whole by the listener. There is an inverse relationship between reportability and credibility: The more reportable, the less credible. Narrators have available many resources
to enhance credibility. In general, the more objective the evaluation, the more credible the event.

Narrative Preconstruction
When a narrator has made the decision to tell a narrative, he or she must solve the fundamental and universal problem: Where should I begin? The most reportable event, which will be designated henceforth as e0, is most salient, but one cannot begin with it. Given the marked reportability of e0 and the need to establish its credibility, the narrator must answer the question "How did this (remarkable) event come about?" The answer requires a shift of focus backwards in time to a precursor event e-1, which is linked to e0 in the causal network in which events are represented in memory (Trabasso and van den Broek 1985). In traversing this network in reverse, the causal links found may be event-to-goal, goal-to-attempt, or attempt-to-outcome. The process will continue recursively to e-2, e-3, and so on, until an ordinary, mundane event e-n is reached, for which the question "Why did you do that?" is absurd, since e-n is exactly what we would expect the person to do in the situation described. The event e-n is, of course, the orientation. Thus, a narrator telling of a time he was on shore leave in Buenos Aires begins:

(10) a. Oh, I was settin' at a table drinkin'.

TRIGGERING EVENTS. Given the mundane and nonreportable character of the orientation, it follows that the first link in the causal chain is a triggering event, which drives the narrative along the chain toward the most reportable event. Thus, (10) is followed by (11):

(11) b. an' this Norwegian sailor come over
     c. an' kep' givin' me a bunch o' junk about how I was sittin' with his woman.

How ordinary situations like (10) can give rise to the reportable and violent events that followed is a mystery that narrative
analysis can only contemplate, since they are part and parcel of
the contingent character of history.

The Transformation of Experience


The participants in many narratives include protagonist, antagonist, and third-party witnesses, of which the first is the most complex. Elaborating on Goffman (1981, 144–5), one can identify many egos present: the self as original author of the narrative and its immediate animator; the self as actor; the self as generalized other (normally as "you"); the anti-self as seen by others; and the principal, the self in whose interest the story is told. That interest is normally advanced through a variety of techniques that do not require any alteration in the truthfulness of the events reported. The re-creation of the causal network involves the assignment of praise and blame for the critical events and their outcomes. Most narratives of conflict involve linguistic devices that contribute to the polarization of protagonist and antagonist, though within the family, other linguistic forms lead to the integration of participants. The devices used to adjust praise and blame include most prominently the deletion of events, an operation that can often be detected by close reading. Key elements in further manipulation are the grammatical features of voice: active versus passive, but also zero causatives that assign agency ("He drove through town with a chauffeur") or verbs that imply the exertion of authority and resistance to it ("My dad let me go with him"). Other narrative devices function to increase the impression of agency: pseudoevents that may not correspond to any physical event ("I turned to him and," "I took this girl and," "I started to hit him but").
Narrative analysis can show how the prima facie case is built to further the interests of the principal. This involves detecting insertions of pseudoevents and removing them, detecting deletions and replacing them, and exchanging excuses for the action excused. It is then possible to approximate the original chain of events on which the narrative is based. A useful exercise is to develop a complementary sub rosa case in the interests of the antagonist. The comparison of these two constructions deepens our understanding of how narrative skills are enlisted to transform the social meaning of events without violating our commitment to a faithful rendering of the past.
William Labov
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Goffman, Erving. 1981. Forms of Talk. Oxford: Blackwell.
Labov, William. 2001. The Social Stratification of English in New York City.
2d ed. Cambridge: Cambridge University Press.
———. 2003. Uncovering the event structure of narrative. Georgetown
University Round Table 2001. Ed. Deborah Tannen and James Alatis,
63–83. Washington, DC: Georgetown University Press. This article
develops the search for the events underlying the narrative.
———. 2004. Ordinary events. In Sociolinguistic Variation: Critical
Reflections, ed. C. Fought, 31–43. Oxford: Oxford University Press. An
exploration of the evaluative effect of inserting ordinary events into
narrative.
———. 2006. Narrative preconstruction. Narrative Inquiry 16: 37–45. A
fuller development of this topic.
Labov, William, and Joshua Waletzky. 1967. Narrative analysis. In Essays
on the Verbal and Visual Arts, ed. J. Helm, 12–44. Seattle: University
of Washington Press. Repr. Journal of Narrative and Life History 7
(1997): 3–38.
Maranhão, Tulio. 1984. The force of reportive narratives. Papers in
Linguistics 17 (3): 235–65.
Norrick, Neal R. 2005. The dark side of tellability. Narrative Inquiry
15: 323–44.
Polanyi, Livia. 1989. Telling the American Story. Cambridge, MA: MIT
Press.
Sacks, Harvey. 1992. Lectures on Conversation. Vols. 1 and 2. Ed. Gail
Jefferson. Oxford: Blackwell.
Trabasso, T., and P. van den Broek. 1985. Causal thinking and the
representation of narrative events. Journal of Memory and Language
24: 612–30.

NARRATOLOGY
The French term narratologie (formed in parallel with biology,
sociology, etc. to denote the study of narrative) was coined by
Tzvetan Todorov in his 1969 book Grammaire du Décaméron.
The early narratologists participated in a broader structuralist
revolution that sought to use Saussurean linguistics as a pilot
science for studying diverse forms of cultural expression, which
structuralist theorists characterized as rule-governed signifying
practices or languages in their own right (see structuralism;
Culler 1975). Likewise, narratologists such as Todorov, Roland
Barthes, Claude Bremond, Gérard Genette, and Algirdas Julien
Greimas adapted Ferdinand de Saussure's distinction between
la parole and la langue to construe particular stories as individual narrative messages supported by an underlying semiotic
code (see semiotics). And just as Saussurean linguistics privileged code over message, focusing on the structural constituents
and combinatory principles of the semiotic system of language,
rather than on situated uses of that system, structuralist narratologists privileged narrative in general over individual narratives,
emphasizing the general semiotic principles according to which
basic structural units (characters, states, events, actions, etc.) are
combined and transformed to yield specific narrative texts.
In this brief overview, I trace in further detail some of the
developments from which structuralist narratology took rise
and outline key contributions by early theorists. I also review
limitations of the structuralist approach to narrative inquiry, limitations that manifested themselves as story analysts began
to engage more fully with recent research in the language sciences, among other areas of study. To map the evolution of the
field, I draw a distinction between classical and postclassical
approaches to narratological analysis (cf. Herman 1999). Classical
narratology encompasses the tradition of research, rooted in
Russian Formalist literary theory as well as earlier precedents,
that was extended by structuralist narratologists starting in the
mid-1960s and refined and systematized up through the early
1980s by scholars such as Mieke Bal, Seymour Chatman, Wallace
Martin, Gerald Prince, and others. The Anglo-American tradition
of scholarship on fictional narrative can also be included under
the rubric of classical approaches, though for reasons of space,
this discussion focuses mainly on the Formalist-structuralist tradition. Postclassical narratology, meanwhile, designates frameworks for narrative research that build on this classical tradition
but supplement it with concepts and methods that were unavailable to story analysts during the heyday of structuralism. In
developing postclassical approaches, which not only expose the
limits but also exploit the possibilities of older models, theorists
of narrative have drawn on a range of fields, from gender theory,
philosophical ethics, and comparative media studies to sociolinguistics, the philosophy of language, and cognitive science.
Given the focus of the present encyclopedia, I concentrate here
on productive synergies between postclassical narratology and
research in the language sciences.

The (Recent) Prehistory of Narratology


The Russian Formalists authored a number of pathbreaking
studies that served as foundations for narratological research.
Crucially, the Formalists sought to create a stylistics suitable for larger verbal structures found in prose narratives of all
sorts, from Leo Tolstoi's historically panoramic novels to tightly
plotted detective novels to (Russian) fairy tales. This widened
investigative focus would prove to be a decisive development
in the history of modern-day narratology. The new focus helped
uncouple theories of narrative from theories of the novel, shifting
scholarly attention from a particular genre of literary writing to
all discourse or, in a broader interpretation, all semiotic activities
that can be construed as narratively organized. The Formalists
thus set a precedent for the transgeneric and indeed transmedial
aspirations of French structuralist theorists such as Bremond
and Barthes, who came later.
Not only was the general orientation of Formalist research
narratologically productive; more than this, specific Formalist
concepts were taken over more or less directly by structuralist
story analysts. For example, in distinguishing between "bound"
(or plot-relevant) and "free" (or non-plot-relevant) motifs,
Boris Tomashevskii provided the basis for Barthes's distinction
between "nuclei" and "catalyzers" in his 1966 "Introduction to the
Structural Analysis of Narratives" (Barthes [1966] 1977). Renamed
"kernels" and "satellites" by Chatman (1978), these terms refer to core
and peripheral elements of story content, respectively. Delete or
add to the kernel events of a story and you no longer have the
same story; delete or add to the satellites and you have the same
story told in a different way. Related to Tomashevskii's work on
free versus bound motifs, Viktor Shklovskii's early work on plot
as a structuring device established one of the grounding assumptions of structuralist narratology: namely, the fabula-sjuzhet or
story-discourse distinction (see story and discourse), that
is, the distinction between the what and the how, or what is being
told versus the manner in which it is told.
Another important precedent was furnished by Vladimir
Propp's Morphology of the Folktale ([1928] 1968), whose first
English translation appeared in 1958. Propp distinguished
between variable and invariant components of higher-order narrative structures: more specifically, between changing dramatis
personae and the unvarying plot functions performed by them
(e.g., act of villainy, punishment of the villain, etc.). In all, Propp
abstracted 31 functions, or character actions defined in terms of
their significance for the plot, from the corpus of Russian folktales
that he used as his data set; he also specified rules for their distribution in a given tale. His approach constituted the basis for later
accounts of narrative structure. For instance, extrapolating from
what Propp had termed "spheres of action," Greimas ([1966]
1983) sought to create a typology of general behavioral roles to
which particularized actors in narratives could be reduced. He
initially identified a total of six roles (which he termed actants)
underlying individual narrative actors: subject, object, sender,
receiver, helper, and opponent.
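Greimas's six-actant typology is, in effect, a small closed inventory of roles onto which particularized actors are mapped. As a concrete sketch (the enum and the sample quest-tale mapping below are illustrative assumptions, not Greimas's own notation):

```python
from enum import Enum

class Actant(Enum):
    """Greimas's six general behavioral roles underlying narrative actors."""
    SUBJECT = "subject"
    OBJECT = "object"
    SENDER = "sender"
    RECEIVER = "receiver"
    HELPER = "helper"
    OPPONENT = "opponent"

# One possible actantial reduction of a generic quest tale:
# particularized actors on the left, general roles on the right.
quest_tale = {
    "hero": Actant.SUBJECT,
    "princess": Actant.OBJECT,
    "king": Actant.SENDER,
    "hero (as beneficiary)": Actant.RECEIVER,  # one actor may fill several actants
    "magic horse": Actant.HELPER,
    "dragon": Actant.OPPONENT,
}

# In this reduction, every actant is realized at least once.
assert set(quest_tale.values()) == set(Actant)
```

The point of the reduction is exactly what Greimas intended: many different surface characters collapse onto the same small role inventory.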

Establishing the Field: Structuralist Narratology


I have already begun to discuss how the structuralist narratologists built on Russian Formalist ideas to help consolidate what I
am referring to as the classical tradition of research on narrative.
As originally conceived (cf. Barthes [1966] 1977), the new science
of narratology aimed to be not a school or method of literary criticism (that is, not a way of interpreting novels or other specifically literary narratives) but, rather, a transmedial investigation
of stories of all kinds, naturally occurring as well as artistically
elaborated, verbal (spoken or written) as well as image based,
painted as well as filmed. It also aimed to be transcultural and
transgeneric, investigating everything from legends and fables
to epics and tragedies. Ethnographic and sociological impulses,
reflecting the linguistic, anthropological, and folkloristic bases
for structuralist analysis of narrative, reveal themselves when
Barthes writes: "All classes, all human groups, have their narratives, enjoyment of which is very often shared by men with different, even opposing, cultural backgrounds" ([1966] 1977, 79).
Narratology's grounding assumption is that a common, more
or less implicit model of narrative explains peoples ability to
recognize and interpret many diverse productions and types
of artifacts as stories; the same model allows them to compare
an anecdote with a novel or an opera with an epic. In turn, the
raison d'être of narratological analysis is to develop an explicit
characterization of the model underlying peoples intuitive
knowledge about stories, in effect providing an account of what
constitutes humans' narrative COMPETENCE. Hence, having
conferred on linguistics the status of "a founding model" ([1966]
1977, 82), Barthes identifies for the narratologist the same object
of inquiry that (mutatis mutandis) Saussure had specified for the
linguist: the code or system from which the infinity of narrative
messages derives and on the basis of which they can be understood as stories in the first place.
Narratologists like the early Barthes used structuralist linguistics not just to identify their object of analysis but also to elaborate their method of inquiry. In this connection, the adaptation of
structuralist-linguistic concepts and methods was to prove both
enabling and constraining. On the positive side, the example of
linguistics did provide narratology with a productive vantage
point on stories, affording terms and categories that generated
significant new research questions. For example, the linguistic
paradigm furnished Barthes with what he characterized as the
"decisive" concept of the "level of description" (Barthes [1966]
1977, 85–88). Imported from grammatical theory, this idea suggests that a narrative is not merely a simple sum of propositions
but, rather, a complex structure that can be analyzed into hierarchical levels in the same way that a natural-language utterance
can be analyzed at the level of its syntactic, its morphological,
or its phonological representation. Barthes himself distinguishes three levels of description. At the lowest or most granular level are basic meaning-bearing elements that he termed
"functions," which can be mapped out both distributionally and
in terms of paradigmatic classes; then come characters' actions
that collocate to form narrative sequences; and finally there is
the level of narration, or the profile that narrative assumes when
viewed as a communicative process.
Likewise, Genette ([1972] 1980) drew on a broadly grammatical paradigm in using the categories of tense, mood, and
voice to characterize the relations among the story (= the basic
sequence of states, actions, and events recounted), the text on
the basis of which interpreters reconstruct that story, and the
act of narration that produces the text. Indeed, Genette's work
in the area of narrative temporality constitutes one of the truly
outstanding achievements in the field. Developing distinctions
that bear an interesting resemblance to Hans Reichenbach's
(1947) discriminations among event time, reference time, and
speech time, Genette focuses on two kinds of temporal relationships: (1) that between narration and story and
(2) that between text and story. In connection with the first,
Genette distinguishes between simultaneous, retrospective,
prospective, and intercalated modes of narration; in connection with the second, he develops the categories of duration,
order, and frequency. Duration can be computed as a ratio
between the length of time that events take to unfold in the
world of the story and the amount of text devoted to their narration, with speeds ranging from descriptive pause to scene to
summary to ellipsis. Order can be analyzed by matching the
sequence in which events are narrated against the sequence
in which they can be assumed to have occurred, yielding
chronological narration, analepses or flashbacks, and prolepses or flashforwards, together with various subcategories
of these nonchronological modes. Finally, frequency can be
calculated by measuring how many times an event is narrated
against how many times it can be assumed to have occurred in
the storyworld. In singulative narration, there is a one-to-one
correspondence between these frequency rates; in repetitive
narration, events are recounted more often than they occur;
and in iterative narration, events that happen more than once
are recounted fewer times than the frequency with which they
actually occur.
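Genette's frequency relations and his notion of duration as a ratio can be made computationally concrete. The sketch below is an illustrative simplification, not Genette's own formalism; the function names, integer comparisons, and per-word ratio are assumptions introduced here:

```python
def classify_frequency(times_narrated: int, times_occurred: int) -> str:
    """Label the relation between how often an event is told and how
    often it can be assumed to happen in the storyworld."""
    if times_narrated == times_occurred:
        return "singulative"   # told as often as it occurs
    if times_narrated > times_occurred:
        return "repetitive"    # told more often than it occurs
    return "iterative"         # recurring events told fewer times than they occur


def duration_ratio(story_seconds: float, text_words: int) -> float:
    """Story time per word of text: low values tend toward scene or
    descriptive pause, high values toward summary; no text is ellipsis."""
    if text_words == 0:
        return float("inf")    # ellipsis: story time elapses with no text at all
    return story_seconds / text_words


print(classify_frequency(3, 1))   # a single quarrel recounted three times -> repetitive
print(classify_frequency(1, 52))  # "every Sunday we walked to church" -> iterative
```

The ratio view also makes clear why Genette's speeds form a continuum rather than discrete boxes: pause, scene, summary, and ellipsis are simply regions along one axis.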
For all the gains it achieved by drawing on linguistics as a pilot
science (or, rather, as a metaphor for disciplinary practice), however, structuralist narratology was also limited by the linguistic
theories it treated as exemplary. Barthes unintentionally reveals
the limits of structuralist narratology when he remarks that "a
narrative is a long sentence, just as every constative sentence is in
a way the rough outline of a short narrative," suggesting that one
finds in narrative, "expanded and transformed proportionately,
the principal verbal categories: tenses, aspects, moods, persons"
([1966] 1977, 84). By contrast, post-Saussurean language theory
has underscored that certain features of the linguistic system,
such as conversational implicatures, discourse anaphora, and protocols for turn-taking in conversation (see adjacency pair),
emerge only at the level beyond the sentence. In
other words, attempting to bring to bear on narrative texts a
code-centered linguistics that ignores distinctive features of language in use, the early narratologists lacked crucial resources for
the analysis of stories. The problem, then, is not with the original intuition of the narratologists, namely, that linguistics can
serve as a pilot science for narratological research. The problem,
rather, is with the particular linguistic concepts they used to flesh
out that intuition.

Beyond Structuralism: Postclassical Narratology and the Sciences of Language
Ironically, the narratologists embraced Saussure's structuralist linguistics as their point of reference just when its deficiencies were becoming apparent in the domain of linguistic inquiry
itself. The limitations of the Saussurean paradigm were thrown
into relief, on the one hand, by emergent formal (e.g., generative-grammatical) models for analyzing language structure
(see generative grammar). On the other hand, powerful
tools were being developed in the wake of Ludwig Wittgenstein,
J. L. Austin, H. P. Grice, John Searle, and other post-Saussurean
language theorists interested in how contexts of language use
bear on the production and interpretation of socially situated
utterances. Theorists working in this tradition began to question
what they viewed as counterproductive modes of abstraction and
idealization in both structuralist linguistics and the Chomskyan
paradigm that displaced it. Indeed, the attempt by later narrative
scholars to incorporate ideas about language and communication that postdate structuralist research has been a major factor
in the advent of postclassical models for research on stories and
storytelling. To put the same point another way, one reason for
the shift from classical to postclassical narratology has been an
ongoing effort to move from using linguistics as a metaphor for
narrative research to using linguistic models in the actual practice of narratological inquiry.
The following are just some of the domains of narratological
research in which theorists have begun to import concepts and
methods from the modern-day sciences of language, in an effort
to build models with greater descriptive and explanatory power
than those developed by the classical narratologists. In each of
these domains, story analysts are working to adopt more narrative-appropriate tools from the language sciences, that is, tools
that can throw light on how narrative, as a distinctive kind of language use, constitutes a cognitive and communicative resource
by means of which human beings make sense of themselves, one
another, and the world.
Narrative Comprehension: To explore aspects of narrative
processing, story analysts have drawn on a range of theoretical frameworks that were unavailable to the structuralist narratologists, including artificial intelligence research (work on
knowledge representations), accounts of mental models,
cognitive linguistics, and research on text processing.
For example, Catherine Emmott (1997) presents a powerfully
integrative theory of narrative comprehension as a process of
using textual cues to build and update complex mental representations that she terms "contextual frames," which contain
information about narrative agents, their situation in time
and space, and their relationships with one another. Mark
Turner (2003), meanwhile, relates story comprehension to
more general cognitive processes that involve conceptual
blending.
Speech and Thought Representation: Narratologists have
drawn on fields such as dialectology, pragmatics,
discourse analysis (linguistic), and historical
linguistics to study aspects of speech representation in
narrative, including dialect representations and fictional
portrayals of scenes of conversational interaction. Likewise,
to study representations of characters' mental functioning, analysts have begun to work toward a rapprochement
between narratological theory and ideas from cognitive and
social psychology, research on emotion, cognitive linguistics,
and other frameworks for inquiry.
Focalization Theory: Initially given impetus by Genette's
([1972] 1980) attempt to reformulate theories of narrative
perspective or point of view in more rigorous terms, focalization theory has in recent years taken on an increasingly
interdisciplinary profile. Manfred Jahn (1996) has drawn on
the cognitive science of vision to propose a powerful account
of the perspective-marking features of narrative. Meanwhile,
David Herman (2009) uses ideas from cognitive grammar
to propose refinements to Genette's theory.
Quantitative, Corpus-Based Research: Story analysts have
begun to work with large text corpora (see corpus linguistics) to study whether the distributional facts support
accounts proposed by earlier narratologists on the basis of
their own readerly intuitions. On the one hand, hypothesis-driven approaches to corpus study use a top-down method,
attempting to map assumed categories of structure onto specific texts or corpora to test the validity of prior theories. On
the other hand, bottom-up approaches, seeking to reduce
theoretical presuppositions to a minimum, work to induce
categories and models from surface features that can be
identified through automated analysis of narrative corpora.
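The contrast between top-down and bottom-up corpus methods can be illustrated with a toy sketch. Everything here is invented for illustration: the two-sentence "corpus," the list of speech-reporting verbs standing in for a prior theoretical category, and the variable names.

```python
from collections import Counter
import re

corpus = [
    "She said that he left. He said nothing.",
    "They walked home and she told the whole story again.",
]

# Top-down: test a category assumed in advance (here, a hypothetical
# list of speech-reporting verbs) against the corpus.
REPORTING_VERBS = {"said", "told", "asked"}
tokens = [w for text in corpus for w in re.findall(r"[a-z]+", text.lower())]
reporting_count = sum(1 for w in tokens if w in REPORTING_VERBS)

# Bottom-up: induce salient features from surface frequencies alone,
# with no prior theory beyond tokenization.
most_common = Counter(tokens).most_common(3)

print(reporting_count)   # how often the assumed category occurs
print(most_common)       # what the corpus itself foregrounds
```

Real corpus narratology works at a far larger scale and with richer annotation, but the division of labor is the same: the first pass validates a prior model, the second lets categories emerge from the data.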
It is not just that narratologists have begun to engage more fully
with concepts and methods from the language sciences, however; more than this, the field is currently being revolutionized by
a greater awareness that narratology is itself one of the sciences
of language: specifically, the domain of inquiry whose focal concern is narratively organized sign systems across all media and
communicative settings. Once a subdomain of literary study,
narratology is now coming into its own as the comprehensive
science of narrative-pertinent phenomena originally envisioned
by the structuralist narratologists.
Evidence of this reconfiguration of the field can be found in
narratologists' reengagement with natural-language narratives
told in contexts of face-to-face interaction. Although William
Labov and Joshua Waletzky (1967) developed their model for the
analysis of narratives told in contexts of face-to-face communication just as structuralist narratologists were proposing their key
ideas, and although the Labovian model has been extraordinarily
influential in social-scientific research for some four decades, initially there was little interaction between sociolinguistic research
on storytelling and other traditions of narrative scholarship. But
now there is increasing interest in building an integrative theory
that can accommodate both the study of written, literary narratives and the analysis of everyday storytelling (Fludernik 1996;
Herman 2002). At the same time, among researchers concerned
with face-to-face narrative communication, there has been a shift
analogous to the one I have characterized as a transition from
classical to postclassical approaches. Precipitating this shift is
the recognition that the Labovian model captures one important
subtype of natural-language narratives, namely, stories elicited
during interviews, but does not necessarily apply equally well
to other storytelling situations, such as informal conversations
between peers, he-said-she-said gossip, conversations among
family members at the dinner table, or, for that matter, written,
literary texts (see conversation analysis).
This convergence of sociolinguistic, discourse-analytic, and
narratological research suggests that narratology is now coming
into its own as a bona fide member of the language sciences. As
such, its chief aim is to enhance our understanding of stories not
only as a means of artistic expression or a resource for communication but also as a fundamental human endowment.
David Herman
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Barthes, Roland. [1966] 1977. Introduction to the structural analysis
of narratives. In Image Music Text, trans. Stephen Heath, 79–124.
New York: Hill and Wang.
Chatman, Seymour. 1978. Story and Discourse: Narrative Structure in
Fiction and Film. Ithaca, NY: Cornell University Press.
Culler, Jonathan. 1975. Structuralist Poetics: Structuralism, Linguistics,
and the Study of Literature. Ithaca, NY: Cornell University Press.
Emmott, Catherine. 1997. Narrative Comprehension: A Discourse
Perspective. Oxford: Oxford University Press.
Fludernik, Monika. 1996. Towards a Natural Narratology. London:
Routledge.
Genette, Gérard. [1972] 1980. Narrative Discourse: An Essay in Method.
Trans. Jane E. Lewin. Ithaca, NY: Cornell University Press.
Greimas, Algirdas-Julien. [1966] 1983. Structural Semantics: An Attempt
at a Method. Trans. Danielle McDowell, Ronald Schleifer, and Alan
Velie. Lincoln: University of Nebraska Press.
Herman, David. 1999. Introduction. In Narratologies: New Perspectives
on Narrative Analysis, ed. David Herman, 1–30. Columbus: Ohio State
University Press.

———. 2002. Story Logic: Problems and Possibilities of Narrative.
Lincoln: University of Nebraska Press.
———. 2009. Cognitive approaches to narrative analysis. In Cognitive
Poetics: Goals, Gains, and Gaps, ed. Geert Brône and Jeroen Vandaele,
79–118. Berlin: Mouton de Gruyter.
Jahn, Manfred. 1996. Windows of focalization: Deconstructing and
reconstructing a narratological concept. Style 30.3: 241–67.
Labov, William, and Joshua Waletzky. 1967. Narrative analysis: Oral versions of personal experience. In Essays on the Verbal and Visual Arts,
ed. June Helm, 12–44. Seattle: University of Washington Press.
Propp, Vladimir. [1928] 1968. Morphology of the Folktale. Trans. Laurence
Scott, rev. Louis A. Wagner. Austin: University of Texas Press.
Reichenbach, Hans. 1947. Elements of Symbolic Logic. New
York: Macmillan.
Turner, Mark. 2003. Double-scope stories. In Narrative Theory and the
Cognitive Sciences, ed. David Herman, 117–42. Stanford, CA: CSLI.

NATIONALISM AND LANGUAGE


While the link between language and nationality is often presented as though it developed at some primordial point in the
past, its appearance is, in fact, quite recent. This is hardly surprising inasmuch as the conception of the nation itself is relatively
modern. Thus, the idea that language is the medium by which
nationality is established, that language is the key to the nation,
has to be traced historically. Two distinct but related contexts
may serve as examples concerning how and why the connection
was made.
In the sixteenth century, the Tudor monarchy sought to exercise its dominion over Ireland, a colony which had been nominally under English rule since 1169 but which had never quite
been successfully subjugated. Part of its centralizing project was
the imposition of English upon the whole of the island of Ireland
on the ground that the use of the native language, Gaelic, along
with other cultural factors such as behavior and dress, led the
Irish to think of themselves as being of "sundry sorts, or rather of
sundry countries" (Statutes 1786, 28H8.cxv) rather than as members of one polity united under the English crown. This stress on
the significance of linguistic difference, embodied in the Act for
the English Order, Habit and Language (1537), formed the basis
of the English policy of linguistic colonialism in Ireland, but, of
equal importance, it heralded the connection between language
and national identity. In his 1617 Itinerary, Fynes Moryson,
an English adventurer in Ireland, articulated the lesson that the
colonialists learned from their struggle to impose English language and order: "[C]ommunion or difference of language hath
always been observed a special motive to unite or alienate the
minds of all nations. And in general all nations have thought
nothing more powerful to unite minds than the community of
language" (Moryson [1617] 1903, 213). Under specific historical
conditions (the clash between an early modern nation-state and
one of its colonies), linguistic difference came to signify national
difference through the operation of military and discursive
power. The link established in this context served as a portent
of a more general connection that appeared later in Europe and
beyond.
Although his seminal account of nationalism identifies its origins in the New World in the late eighteenth and early nineteenth
centuries, Benedict Anderson also discusses the appearance of a
whole set of ethnolinguistic nationalisms in Europe immediately afterwards. Though the historical differences between the
various social movements cannot be elided, they were inspired
by a number of German post-Kantian idealist thinkers. J. G.
Herder's assertion in 1768 that each national language "forms
itself in accordance with the ethics and manner of thought of
its people" (2002, 50) was an important articulation of the link
between language and nation; by the time that William von
Humboldt gave his definition of a nation in 1836 ("a body of men
who form language in a particular way" [1988, 153]), the connection appeared almost axiomatic. In 1808, J. G. Fichte spelled out
the political significance of linguistic nationalism by arguing that
"wherever a separate language is found, there a separate nation
exists, which has the right to take charge of its independent
affairs and to govern itself" (1968, 49). The implications of the
doctrine were realized in the role that it played in national independence campaigns conducted by Greeks, Czechs, Hungarians,
Bulgarians, Ukrainians, Finns, Norwegians, Afrikaners, and
the Irish. Some postcolonial activists today, the Kenyan writer
Ngũgĩ wa Thiong'o, for example, use the same model of linguistic
nationalism in their contemporary struggle, not so much against
colonialism but in order to counter the legacy of colonial rule.
Anderson's account of the nation as an "imagined community" drew attention to the constructedness of the concept by
pointing to its precise historical origins. Yet the role of language
in the imagining of the community of the nation is also one that
arises at particular moments in history and serves specific functions; it is neither transhistorical nor general. It is also worth
noting that the conception of language underpinning this act of
imagination is one that has been criticized. Thus, M. M. Bakhtin,
in the important essay "Discourse in the Novel," points to the fact
that national languages are produced by various types of institutional forces, intellectual (linguistic theorizing), educational
(grammars and dictionaries), and political (legislation), which act
centripetally in order to create a determinate, fixed, and knowable form. As part of this process, the realities of heteroglossia
(see dialogism and heteroglossia), the social difference
inscribed in language by means of variation past and present,
have to be banished. Historians, such as E. J. Hobsbawm (1990),
have noted the historical significance of such linguistic selection
and ranking, while linguistic anthropologists have drawn attention to the fact that the homogeneous language of nationalism is
as imaginary as the community that accompanies it (Irvine and
Gal 2000).
The extent to which such insights will have an impact in
political and linguistic thought remains to be seen. It is certainly
the case, however, that the postulated relationship between language and nation is now treated much more skeptically. At the
reactionary edge of forms of linguistic nationalism, there are still
those who argue for the purity of language as a way of guaranteeing the integrity of the nation. But the very fact that the
vast majority of nations past and present have been multilingual
communities, including a number of those whose very entrance
into history depended on an emphasis on their supposed monolingualism, radically undermines the ideological case for linguistic nationalism.
Tony Crowley



WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Anderson, Benedict. 1991. Imagined Communities: Reflections on the
Origin and Spread of Nationalism. London: Verso.
Bakhtin, M. M. 1981. The Dialogic Imagination: Four Essays. Ed. Michael
Holquist, trans. Caryl Emerson and Michael Holquist. Austin: University
of Texas Press.
Barbour, Stephen, and Cathie Carmichael, eds. 2002. Language and
Nationalism in Europe. Oxford: Oxford University Press.
Fichte, J. G. [1808] 1968. Addresses to the German People. Ed. G. Armstrong
Kelly. New York: Harper.
Herder, J. G. [1768] 2002. Philosophical Writings. Ed. and trans. Michael
N. Forster. Cambridge: Cambridge University Press.
Hobsbawm, E. J. 1990. Nations and Nationalism since 1780.
Cambridge: Cambridge University Press.
Humboldt, William von. [1836] 1988. On Language: On the Diversity
of Human Language Construction and Its Influence on the Mental
Development of the Human Species. Ed. Michael Losonsky, trans. Peter
Heath. Cambridge: Cambridge University Press.
Irvine, Judith T., and Susan Gal. 2000. Language ideology and linguistic differentiation. In Regimes of Language: Ideologies, Polities and
Identities, ed. Paul Kroskrity, 35–83. Oxford, UK: James Currey.
Joseph, John. 2004. Language and Identity: National, Ethnic, Religious.
London: Palgrave.
Moryson, Fynes. [1617] 1903. Shakespeares Europe: Unpublished
Chapters of Fynes Morysons Itinerary. Ed. C. Hughes. London: Sherratt
and Hughes.
Ngũgĩ wa Thiong'o. 1986. Decolonising the Mind: The Politics of Language
in African Literature. London: James Currey.
The Statutes at Large Passed in the Parliaments Held in Ireland. 1786–
1801. 20 vols. Dublin.

NATURAL KIND TERMS


Natural kind terms (NKTs) are, to use Plato's ancient metaphor,
those terms that "carve Nature at her joints"; they are the terms
that correspond to unities and diversities in nature (Phaedrus,
265e–266b). They therefore enable lawlike generalizations,
descriptions of natural patterns, and explanations of natural
phenomena.
From this characterization of NKTs it is clear that science
strives to use such terms in its classification and explanation
of nature. It is also clear that, as a rule, NKTs are developed
together with the growth of our knowledge of nature, and they
both result from a better understanding of phenomena and
advance that understanding. For instance, the biblical classification of plants into "grass, the herb yielding seed and fruit tree
yielding fruit whose seed is in itself" (Genesis 1:11) is no longer
used in botany, which classifies some trees together with some
grass as angiosperms, the flowering plants, in contrast to some
other trees, which are gymnosperms. The same point is illustrated by the recent scientific controversy over the definition of
"planet": Scientists aimed at forming a concept that would reflect
and allow a better understanding of the different characteristics
and origins of bodies orbiting the sun.
The most common examples of NKTs are names of substances. "Gold," "water," "alcohol," and "metal" are names of natural
kinds of matter; "Homo sapiens sapiens," "primates," "mammals," "animals," and "eukaryotes" are names of natural kinds of organisms.
But often enough one finds names of natural phenomena, such
as "heat" or "pain," counted among these terms as well.

Various terms and phrases can be cited as examples of
non-natural kind terms. "Student with a long nose who visited
Malaysia" denotes a kind whose defining properties are not
related together in any lawlike regularity and is, therefore, of
no use for the understanding of nature. A term like "nonhuman"
designates a group that is too heterogeneous. Another example
often cited is that of artificial kind terms, such as "pencil" or "apartment." But this is perhaps problematic: It seems to presuppose
that humans, with their artifacts, constitute a kingdom within
a kingdom. But if Homo sapiens sapiens is a natural kind, and
as such part of nature, then terms useful for describing its life
and behavior ("apartment," for example) should perhaps count
as NKTs.
Recent philosophical discussion has concentrated on the
meaning of NKTs. Until the 1960s, philosophers spoke of these
terms as if they were synonymous with a group of identifying
descriptions of the kinds. The statement that some liquid is water,
say, would then be synonymous with the statement that it has (at
least most of) the properties that would be used, for instance, in a
good, scientifically informed dictionary to characterize water.
This description theory of NKTs is problematic. According to
it, if a scientist asks a child for a glass of water, what the scientist means by "water" is very different from what the child means
by it, and the latter cannot even understand the former. But
this is unacceptable, for fluent communication is a criterion for
understanding.
The most influential theory of the meaning of NKTs nowadays, essentialism, was developed during the 1970s by Saul
Kripke (1980) and Hilary Putnam (1975). Both claimed that the
meaning of an NKT is determined not by descriptions but by
ostensive reference to samples. Natural kinds are assumed to
have essential properties, and the NKT means "something having
the same essential properties as (most of) these samples," although,
as a rule, when introducing an NKT, people would be ignorant of
these essential properties.
Kripke also claimed that NKTs are rigid, but this seems confused. First, an NKT, say "tiger," is not rigid in the sense of designating the same particulars in every possible world, since
in different possible worlds there exist different tigers. Secondly,
it is not rigid in the sense that if it designates a particular in one
possible world, it designates it in every possible world in which it
exists: The queen bee is presumably a natural kind, but whether
larvae develop into queen bees depends on how they are fed. So
an insect that is a queen bee might not have been one, and "queen
bee" designates it only in some of the possible worlds in which it
exists. Lastly, and perhaps more importantly, if what was meant
in calling NKTs rigid is that they preserve their meaning across
possible worlds, then this is true of non-NKTs as well, such as
"student with a long nose who visited Malaysia," and it would
trivialize the meaning of rigidity (cf. Schwartz 2002).
A hypothetical example supporting essentialism that many
found convincing was developed by Putnam. He asks us to imagine a remote planet identical to ours (Twin Earth), apart from
the fact that instead of water, that is, H2O, it contains a superficially identical liquid of an entirely different composition, say
XYZ. (Let us ignore the fact that such a liquid would not quench
our thirst, and so wouldn't even be superficially like H2O.) Since
Twin Earth's liquid is superficially indistinguishable from water,
we would mistake it for water; but then, the argument continues,
because of its different essential properties (composition, in this
case), it is not water. And thus we are supposed to conclude that
the essential properties, even if unknown to us, determine the
meaning of our NKTs.
Nonetheless, this example is problematic. Ever since the
composition of water was discovered, it has been among water's known
and defining properties. Accordingly, after this discovery, we
wouldn't consider XYZ water because we would know it isn't, and
so this case does not support the claim that essential unknown
properties are sometimes involved in NKTs' meanings. On the
other hand, no case has been made for the claim that we would
have been mistaken if, before that discovery, we had considered
XYZ water. We would have been mistaken had we then claimed
that Twin Earth's liquid has the same unknown composition as
Earth's water; but the moot point is whether the claim that it is
water would then have implicitly involved such an additional
claim. Examination of actual similar cases does not support
Putnam's contention; moreover, essentialism has been shown to
be problematic in additional respects as well (cf. Ben-Yami 2001,
and additional references there). So despite its current popularity, essentialism remains far from established.

Hanoch Ben-Yami

WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Ben-Yami, H. 2001. The semantics of kind terms. Philosophical Studies
102: 155–84.
Kripke, S. 1980. Naming and Necessity. Oxford: Blackwell.
Putnam, H. 1975. The meaning of "meaning." In Mind, Language and
Reality: Philosophical Papers, II: 215–71. Cambridge: Cambridge
University Press.
Schwartz, S. P. 2002. Kinds, general terms, and rigidity. Philosophical
Studies 109: 265–77.

NECESSARY AND SUFFICIENT CONDITIONS

If we take two conditions, A and B, and we take A to be a necessary and sufficient condition (or set thereof) for B, then condition B holds when and only when condition A holds. However, if
A is merely a necessary condition (or set thereof) for condition
B, then the presence of condition A does not entail condition B.
Further, if A is merely a sufficient condition (or set thereof) for B,
then it is possible that condition B holds in the absence of condition A.
In the philosophy of language, necessary and sufficient conditions have been employed by some philosophers in response
to the question: What are the conditions for the correct application of a word?
Take our employment of the words "fear" and "afraid." When we say of
a person that he or she is afraid, we want to know how we might
judge whether that application is correct. We are here asking for
the truth conditions: That is, what needs to hold such that our
predication of Taitu-as-afraid is true? One answer might be that
Taitu must hold a belief such that she is subject to a threat and a
consequent desire to act in a way that will diminish that threat.
Alternatively, one might argue that predicating of Taitu that she
is afraid is conditional upon a particular sensation, or patterned
change in the autonomic nervous system, being elicited in Taitu.
This debate can be understood as being about the necessary
and sufficient conditions for the correct predication of Taitu-as-afraid. While some claim that a particular configuration of
changes in the autonomic nervous system is both necessary
and sufficient for predicating of Taitu that she is afraid, others
respond that while such changes might be necessary conditions,
they do not amount to sufficient conditions, for such changes can
be brought about by the administration of drugs. Thus, the truth
conditions for the predication of Taitu-as-afraid must involve
more than the existence of these bodily changes, for example,
Taitu also holding the belief such that she is subject to a threat and
the desire to act in a way that diminishes or absents that threat.
Furthermore, some might (and do) hold that conditions such as
patterned changes in the autonomic nervous system might be
sufficient for a particular token, such as Taitu being afraid on
this occasion, but not for all applications of the term "afraid." This
latter claim might then draw on Ludwig Wittgenstein's discussion of family resemblance and hold that all correct applications of a term are not necessarily conditional on something
being common to all.
Phil Hutchinson

NEGATION AND NEGATIVE POLARITY

Negation is a linguistic, cognitive, and intellectual phenomenon.
Ubiquitous and richly diverse in its manifestations, negation is
fundamentally important to all human thought. As Laurence R.
Horn and Yasuhiko Kato put it:
Negative utterances are a core feature of every system of human
communication and of no system of animal communication.
Negation and its correlates (truth-values, false messages, contradiction, and irony) can thus be seen as defining characteristics of
the human species. (2000, 1)

Cognitively, negation is elementary off-line thinking; it involves
some comparison between a real situation lacking some particular element and an imaginal situation that does not lack it. The
particular element in focus anchors and contextualizes the negative element (which, being constrained by grammar, frequently
doesn't provide enough information for a listener to determine
what its focus is intended to be). There are many different conversational and written strategies for indicating and interpreting
focus elements, and even more for modulating them.
Formally (see logic and language), a functor called by
logicians negation is the only significant monadic functor; its
behavior is described by the most basic axiom of logic, the Law
of Contradiction (¬(p ∧ ¬p), in Polish notation NKpNp, also known as The Law of
Non-Contradiction), which asserts that no proposition is both
true and not true. Pragmatically (see pragmatics), negation
provides, among many other concepts, the basic cancelation test
for presupposition, as well as the fundamental observations that
underlie theories of politeness.
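Because the negation functor is monadic and truth-functional in this formal treatment, the law can be checked exhaustively over the two truth values. A minimal sketch in Python (the function names are my own, introduced only for illustration):

```python
def NOT(p):
    """The monadic negation functor."""
    return not p

def AND(p, q):
    """Dyadic conjunction, used here only to state the law."""
    return p and q

# Law of (Non-)Contradiction: for every proposition p,
# NKpNp -- that is, not-(p and not-p) -- comes out true.
law_holds = all(NOT(AND(p, NOT(p))) for p in (True, False))
print(law_holds)  # True
```

Since a monadic functor is fully determined by its behavior on True and False, checking both cases amounts to a complete proof of the law for this truth-functional rendering.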
In natural language, negation functions as an operator,
along with quantifiers (see quantification) and modals
(see modality); operators are more basic and have more
properties than ordinary predicates or functors. In particular, operators have a scope; that is, there is always some other

Negation and Negative Polarity


element either assumed or verbally present in the discourse
to which a negative, modal, or quantifier refers. That linked element is said to be the focus or to be in the scope of the negative
(or modal; quantifiers are said to bind rather than focus on
another element).
Negation produces significant complexities and occasional
ambiguities when it interacts with other scope operators, because
the scopes can get twisted about. "Every boy didn't leave" is ambiguous, depending on the relative scope of the negative "didn't" and
the quantifier "every" (rather like "Every boy read some book," where
two different quantifiers produce ambiguity). Negation combines in idiosyncratic ways with modals; for example, in "You
may not go, and that's final!" the deontic "may not" means "not
possible," but in "This may not be the place," the epistemic "may
not" means "possibly not."
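The two readings of Every boy didn't leave can be made explicit by evaluating both scopings against a toy model; a sketch in Python (the individuals and the facts are invented for illustration):

```python
boys = {"Al", "Bo", "Cy"}
left = {"Al", "Bo"}   # invented facts: Al and Bo left, Cy stayed

# Wide-scope negation (not > every):
# "It is not the case that every boy left."
neg_over_every = not all(b in left for b in boys)

# Wide-scope quantifier (every > not):
# "Each boy is such that he didn't leave."
every_over_neg = all(b not in left for b in boys)

print(neg_over_every, every_over_neg)  # True False
```

In this model the first reading is true (Cy stayed) while the second is false (Al left), so the two scopings are truth-conditionally distinct, not merely notational variants.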
Every language develops its own idiomatic sets of negative
elements, and its own rules for using them. English negative phenomena are by far the best studied; examples include syntactic
constructions (This is it, isn't it? Not any big ones, he didn't), variation (so didn't I; ain't got none), morphology (-n't, -free, un-),
(morpho)phonology (do/don't), intonations (Riight), and lexemes sporting negation: overt (never), incorporated (doubt, lack),
calculated (few), entailed (prohibit), or presupposed (only).
Included also is a large, complex, and diverse system of negative polarity items (NPIs, like ever in He didn't ever see it), which
felicitously occur only in the scope of some negative element
(*He ever saw it). The details of what scope actually is, and of how
and which and why NPIs can occur within it, vary among specific
negative and NPI elements.
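The core licensing intuition, that an NPI is felicitous only in the scope of some negative element, can be caricatured as a linear check. A deliberately crude Python sketch (the word inventories are invented, and the precedence test is a simplification: real licensing depends on structural scope and semantic properties such as downward entailment, not mere word order):

```python
# Tiny, invented inventories for illustration only.
NEGATIVES = {"not", "didn't", "never", "nobody", "doubt"}
NPIS = {"ever", "anymore"}

def npi_licensed(tokens):
    """Accept a tokenized clause unless an NPI appears with no
    preceding negative element (a crude stand-in for being
    'in the scope of' a negative)."""
    negative_seen = False
    for t in tokens:
        if t in NEGATIVES:
            negative_seen = True
        elif t in NPIS and not negative_seen:
            return False
    return True

print(npi_licensed("he didn't ever see it".split()))  # True
print(npi_licensed("he ever saw it".split()))         # False (*He ever saw it)
```

Even this caricature correctly separates the two examples from the text, while failing on exactly the cases the entry goes on to discuss (questions, hypotheticals, comparatives), where no overt negative precedes the NPI.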
Negative polarity is a variety of negative concord (e.g., French
Je ne regrette rien, literally "I don't regret nothing"; Yiddish Ix
hob nit kin gelt, literally "I don't have no money"), but instead
of negative concord, which uses negative elements in the focus
of another negative, negative polarity uses other, non-negative
elements, which can sometimes pick up negativity by association
and occur without overt negative (could care less < couldn't care
less). An interesting typological question is whether languages
like English that lack significant negative concord develop more
negative polarity phenomena to compensate.
NPI is a term applied to lexical items, fixed phrases, or syntactic construction types that demonstrate unusual behavior
around negation. NPIs might be words or phrases that occur
only in negative-polarity contexts (fathom, in weeks) or have an
idiomatic sense in such contexts (not too bright, drink a drop);
or they might have a lexical affordance that only functions in
such contexts (need/dare (not) reply); or a specific syntactic rule
might be sensitive to negation, like subject-verb inversion with
adverb fronting in Never/*Ever/*Frequently have I seen such a
thing.
The grammatical occurrence of NPIs in an utterance is prima
facie evidence that it contains some sort of negation, and this
allows NPIs to function as indicators for various types of semantic opposition and syntactic structure. This has turned out to be a
sensitive tool in other research areas of linguistics, and linguists
using NPIs have discovered many covert negative phenomena;
for instance, NPIs can also occur in questions (Have you ever
been there?), hypothetical clauses (Tell me if he ever arrives), and
comparatives (He's better than we ever expected).

Besides NPIs, English also has positive-polarity items
(would rather, sorta) that don't occur in negative polarity contexts; possible polarity items (tell time) that can occur only
within the scope of a possible-type modal; and combinations,
like the impossible polarity item fathom that require both negative scope and a modal.
John M. Lawler
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Atlas, Jay D. 1996. Only noun phrases, pseudo-negative generalized
quantifiers, negative polarity items, and monotonicity. Journal of
Semantics 13.4: 265–328. Negative polarity infiltrates logic.
Baker, C. L. 1970. Double negatives. Linguistic Inquiry 1: 169–86.
Horn, Laurence R. 1969. A presuppositional analysis of only and even.
Chicago Linguistics Society: CLS 5: 97–108.
. 1989. A Natural History of Negation. Chicago: University of
Chicago Press. Horn's revision and extension of his 1972 dissertation.
The classical neo-Gricean analysis.
Horn, Laurence R., and Yasuhiko Kato. 2000. Introduction: Negation and
polarity at the millennium. In Studies in Negation and Polarity, ed.
Laurence R. Horn and Yasuhiko Kato, 1–19. Oxford: Oxford University
Press. An excellent survey.
Israel, Michael. 2004. The pragmatics of polarity. In The Handbook of
Pragmatics, ed. L. Horn and G. Ward, 701–23. Oxford: Blackwell.
Klima, Edward S. 1964. Negation in English. In The Structure of
Language, ed. J. Fodor and J. Katz, 246–323. Englewood Cliffs,
NJ: Prentice Hall. The first modern syntactic/semantic study.
Ladusaw, William A. 1980. Polarity Sensitivity as Inherent Scope Relations.
New York: Garland. The origins of the downward-entailment theory,
using visual metaphors like scope and focus.
Lakoff, Robin. 1970. Some reasons why there can't be any some-any
rule. Language 45: 608–15.
Lawler, John M. 1974. Ample negatives. Chicago Linguistics Society: CLS
10: 357–77. After Klima, negative polarity was developed extensively in
the generative semantics tradition (Horn, Lakoff, Ross, Lawler,
and many others), largely using a negative polarity field metaphor,
along with negative triggers and secondary triggering.
Linebarger, Marcia. 1981. The grammar of negative polarity. Ph.D. diss.,
Massachusetts Institute of Technology. A then-orthodox generative
treatment, using rule-based metaphors like NPIs being licensed.
. 1991. Negative polarity as linguistic evidence. Papers from the
Parasession on Negation. Chicago Linguistic Society: CLS 27: 165–88.
McCawley, James D. 1993. Everything That Linguists Have Always Wanted
to Know About Logic (But Were Ashamed to Ask). Chicago: University
of Chicago Press. Oxford: Blackwell. McCawley's modern generative
semantic analysis; for example, "In natural language, negation is not
truth-functional."
Ross, John R. 1973. Negginess. Paper delivered to the Winter Meeting of
the Linguistic Society of America, San Diego.
van der Wouden, Ton. 1996. Negative Contexts: Collocation, Polarity
and Multiple Negation. London and New York: Routledge. A useful
University of Groningen dissertation.
Zeijlstra, Hedde, and Jan-Philipp Soehn, eds. 2007. Proceedings of the
Workshop on Negation and Polarity. Tübingen: University Collaborative
Research Center. Recent evidence of polarity research expansion into
other languages.

NETWORK THEORY
Network theory concerns itself with the study of elements, called
vertices (e.g., words), and their connections, called edges or links
(e.g., two words are connected if one word has been elicited by
the other in a word association experiment; see Figure 1). This
theory has many applications in language sciences and is the
outcome of intersecting work of mathematicians and physicists,
who usually call it graph theory (Bollobás 1998) or complex
network theory (Newman 2003), respectively.

Figure 1. A subset of a word association network appearing in
Steyvers and Tenenbaum (2005, 50). Links go from the stimulus
to the response word. Reproduced by permission of the Cognitive
Science Society, Inc., copyright 2005.

One of the major contributions of physicists has been to
unravel the statistical properties of real networks (Newman
2003), for example, the World Wide Web or protein interaction
networks. Firstly, physicists discovered that practically all real
networks exhibited the small world phenomenon. The term
"small world" comes from the observation that everyone in the
world can be reached through a short chain of social acquaintances, although the number of people in the whole social network is huge. In the word association network, partially shown
in Figure 1, volcano is reached from ache through a chain of at
least four links, while only one link separates fire from volcano.
Secondly, physicists found that many real networks had a heterogeneous degree distribution. Loosely speaking, this property
means that there are vertices (words) with a disproportionately
large number of connections (the so-called hubs). For instance,
in the network partially shown in Figure 1, the five words with the
highest degrees are food, money, water, car, and good (Steyvers
and Tenenbaum 2005). Finally, another fundamental property of
real networks is clustering; that is, roughly speaking, if two vertices are connected to the same vertex they are likely to be directly
connected as well.
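The three properties just listed (short paths aside, degree and clustering in particular) are easy to make concrete on a toy graph. The sketch below builds a small undirected word graph (the edge list is invented for illustration, not taken from the Figure 1 data) and reports the hub and a local clustering coefficient:

```python
from collections import defaultdict

# Invented word-association edges (treated as undirected).
edges = [("fire", "volcano"), ("fire", "heat"), ("fire", "smoke"),
         ("fire", "water"), ("smoke", "volcano"), ("heat", "ache"),
         ("water", "food"), ("food", "money")]

adj = defaultdict(set)
for u, v in edges:
    adj[u].add(v)
    adj[v].add(u)

degree = {w: len(adj[w]) for w in adj}
hub = max(degree, key=degree.get)  # the disproportionately connected vertex

def clustering(w):
    """Fraction of pairs of w's neighbours that are themselves linked."""
    nbrs = list(adj[w])
    k = len(nbrs)
    if k < 2:
        return 0.0
    links = sum(1 for i in range(k) for j in range(i + 1, k)
                if nbrs[j] in adj[nbrs[i]])
    return 2 * links / (k * (k - 1))

print(hub, degree[hub], round(clustering("fire"), 3))
```

Here "fire" is the hub (degree 4), and its clustering coefficient is 1/6: of the six pairs of its neighbours, only volcano and smoke are themselves connected.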
Network theory has contributed to the study of language in
three ways: a) by characterizing the statistical properties of linguistic networks, such as networks of word association (Steyvers
and Tenenbaum 2005), thesauri (Sigman and Cecchi 2002),
and syntactic dependencies (Ferrer i Cancho, Solé, and Köhler
2004); b) by modeling the properties of these networks (Steyvers
and Tenenbaum 2005; Motter et al. 2002); and c) by proposing
abstract models that provide a further understanding of the faculty of language (Ferrer i Cancho, Riordan, and Bollobs 2005).
Although the systematic application of network theory to language is a young field (starting in the early twenty-first century)
within quantitative linguistics, it can be concluded that
the small world phenomenon, high clustering, and heterogeneous degree distribution are common properties of linguistic
networks (Mehler 2008). Most models are based on the preferential attachment principle proposed by Albert-László Barabási
and Réka Albert (1999): Vertices with many connections are
more likely to become more connected in the future than those
with few connections (Steyvers and Tenenbaum 2001, 2005;
Dorogovtsev and Mendes 2001; Motter et al. 2002).
The challenges of the application of network theory are to
explain the properties of these networks (most studies are merely
descriptive); to incorporate deeper statistical techniques, for
example, degree correlation analysis (Serrano et al. 2007); and
to extend the studies to more languages (most studies are in
English). For these reasons, it is too early to argue that the heterogeneous degree distributions and other statistical patterns
constitute "laws" of language in the sense of absolute and
statistical universals. When applied to syntactic networks,
network theory has helped to explain the origins of the properties
of the syntactic dependency structure of sentences, for example,
the exceptionality of syntactic dependency crossings (Ferrer
i Cancho 2006) and has provided new avenues for understanding syntax at the large scale of syntactic organization (Ferrer i
Cancho, Solé, and Köhler 2004), above the traditional sentence
level (see syntax, universals of).
In their pioneering application of network theory, Mark
Steyvers and Joshua B. Tenenbaum (2001, 2005) studied the
large-scale organization of various kinds of semantic networks
(e.g., word association networks, as in Figure 1) and proposed
a simple model for explaining the small-worldness, high clustering, and heterogeneous degree distribution of semantic
networks. Over time, new vertices (e.g., words) are added and
attached to existing vertices according to two principles: a)
Barabási and Albert's preferential attachment and b) differentiation. Differentiation means that a new vertex tends to mimic the
connectivity pattern of an existing vertex.
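The two growth principles can be sketched in a few lines of Python. This is a simplified reconstruction of the idea, not the authors' exact model: the seed graph, the parameter values, and the pool-based sampling scheme are my own assumptions.

```python
import random

def grow(n, m=2, p_diff=0.5, seed=0):
    """Grow an undirected graph one vertex at a time. With probability
    p_diff the newcomer 'differentiates', linking into the neighbourhood
    of a randomly chosen existing vertex; otherwise it attaches
    preferentially, picking targets with probability proportional
    to their current degree."""
    rng = random.Random(seed)
    adj = {0: {1}, 1: {0}}                # seed graph: a single edge
    for new in range(2, n):
        if rng.random() < p_diff:
            proto = rng.choice(sorted(adj))          # vertex to mimic
            pool = list(adj[proto]) or [proto]       # its neighbourhood
        else:
            pool = [u for u in adj for _ in adj[u]]  # degree-weighted pool
        targets = {rng.choice(pool) for _ in range(m)}
        adj[new] = set(targets)
        for t in targets:
            adj[t].add(new)
    return adj

g = grow(200)
top = sorted((len(nbrs) for nbrs in g.values()), reverse=True)[:5]
print(top)  # a few vertices end up with disproportionately many links
```

Listing each vertex once per incident edge in `pool` is a standard way to implement degree-proportional sampling; running the sketch shows the characteristic outcome of preferential attachment, a handful of hubs far better connected than the typical vertex.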
Network theory has shed new light on the evolution of language by defining the necessary conditions for the existence of
language (e.g., word ambiguity) and also by suggesting the
possibility that language could have appeared for free as a side
effect of communication principles (Ferrer i Cancho, Riordan,
and Bollobs 2005).
Ramon Ferrer i Cancho
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Bibliography on linguistic and cognitive networks. Available
online at: http://www.lsi.upc.edu/~rferrericancho/linguistic_and_
cognitive_networks.html.
Barabási, Albert-László, and Réka Albert. 1999. Emergence of scaling in
random networks. Science 286: 509–12.
Bollobás, Béla. 1998. Modern Graph Theory. New York: Springer. A helpful introduction to graph theory.
Dorogovtsev, Sergey, and José Fernando Mendes. 2001. Language as an
evolving word web. Proceedings of the Royal Society of London Series
B, Biological Sciences 268: 2603–6.
Ferrer i Cancho, Ramon. 2006. Why do syntactic links not cross?
Europhysics Letters 76: 1228–34.
Ferrer i Cancho, Ramon, Oliver Riordan, and Béla Bollobás. 2005.
The consequences of Zipf's law for syntax and symbolic reference.
Proceedings of the Royal Society of London Series B 272: 561–5.
Ferrer i Cancho, Ramon, Ricard V. Solé, and Reinhard Köhler. 2004.
Patterns in syntactic dependency networks. Physical Review E
69: 051915.
Mehler, Alexander. 2008. Large text networks as an object of corpus linguistic studies. In Corpus Linguistics: An International Handbook of
the Science of Language and Society, ed. Anke Lüdeling and Merja Kytö,
328–82. Berlin and New York: de Gruyter.
Motter, Adilson E., Alessandro P. S. de Moura, Ying-Cheng Lai, and
Partha Dasgupta. 2002. Topology of the conceptual network of language. Physical Review E 65: 065102.
Newman, Mark. 2003. The structure and function of complex networks.
SIAM Review 45.2: 167–256. A helpful introduction to complex network
theory.
Serrano, María Ángeles, Marián Boguñá, Romualdo Pastor-Satorras,
and Alessandro Vespignani. 2007. Correlations in complex networks.
In Large Scale Structure and Dynamics of Complex Networks: From
Information Technology to Finance and Natural Science, ed. Guido
Caldarelli and Alessandro Vespignani, 35–65. Singapore: World
Scientific.
Sigman, Mariano, and Guillermo A. Cecchi. 2002. Global organization of
the Wordnet lexicon. Proceedings of the National Academy of Sciences
USA 99: 1742–7.
Steyvers, Mark, and Joshua B. Tenenbaum. 2001. The large-scale structure of semantic networks: Statistical analyses and a model of semantic growth. Available online at: http://arxiv.org/ftp/cond-mat/
papers/0110/0110012.pdf. This is the first version of their 2005 journal
article.
. 2005. The large-scale structure of semantic networks: Statistical
analyses and a model for semantic growth. Cognitive Science
29.1: 41–78.

NEUROCHEMISTRY AND LANGUAGE


Why study a potential neurochemistry of language? There are
two reasons, one practical, the other theoretical. First, pharmacologic treatments of various speech and language disorders
depend on, and will be enhanced by, an understanding of how
selective neurochemical networks facilitate, inhibit, or mediate
language functions. Second, understanding the adaptive function of a given trait requires detailed knowledge of the design
specifications that mediate or implement the trait in question.

Design complexity is one of the hallmarks of adaptive function
and, thus, inquiry into the brain systems that support the language faculty is mandatory if one wishes to understand the evolutionary history and potential adaptive functions of language.
Neurochemical, neurophysiological, and neuroanatomical
studies define a widely distributed neural network that supports
speech and language functions. This network includes the motor
and supplementary motor area (SMA) of the prefrontal lobes;
Broca's area in the dorsal prefrontal region; Wernicke's
area in the medial temporal lobe; the anterior cingulate
gyrus and the subcortical basal ganglia; and the periaqueductal gray matter (PAG). The anterior cingulate gyrus sends efferents directly onto the PAG central gray and appears to influence
the initiation and voluntary control of vocalization. Destruction
of the central gray substance at the subcortical level or the SMA at
the cortical level can cause mutism. Patients with bilateral lesions
within the cingulate area often undergo a period of mutism, followed by slow recovery during which speech is aprosodic and
initiation of speech is rare. The anterior cingulate gyrus receives
efferents from the dopamine-rich supplementary motor area in
the cortex and sends afferents, along with other dopaminergic
fibers coming from the basal ganglia, up and into the prefrontal
regions. Thus, all of these language-related areas are interconnected via dopaminergic fibers and the prefrontal cortex.
The prefrontal cortex (PFC) constitutes approximately one-third of the human cortex and is the last part of the human
brain to become fully myelinated in ontogeny, with maturation
occurring in late childhood/early adolescence (Huttenlocher
and Dabholkar 1997). The PFC receives projections from the
mediodorsal nucleus and encompasses primary motor cortex,
as well as premotor, supplementary motor, and the dorsal and
orbital sectors of the prefrontal (proper) lobes. All of these PFC
areas are addressed by mesocortical dopaminergic projections
and play a role in language functions.
Dopamine (DA) is manufactured in the pigmented neurons of
the substantia nigra (SN) and the ventral tegmental area (VTA).
There are three major ascending dopaminergic systems: the striato-nigral tract, which ascends from the SN to the corpus striatum; the mesolimbic system, which ascends from the SN and
medial VTA to limbic sites, including the cingulate gyrus; and the
mesocortical system, which ascends from the anteromedial tegmentum and VTA to neocortical sites, including supplementary
motor area and prefrontal cortex (Nieoullon 2002; Girault and
Greengard 2004). Important language regions are linked directly
to these dopaminergic frontal lobe structures. Broca's area, for
example, is in the frontal lobes, as is the SMA. Posterior
language sites, such as Wernicke's area, the angular gyrus, and
the inferior and superior parietal lobules, are densely interconnected via the superior and inferior longitudinal fasciculi, with
meso-prefrontal dopaminergic systems.
Language-related semantic and working memory networks in humans can be modulated by dopaminergic stimulation
(Williams and Goldman-Rakic 1995; Kischka et al. 1996; Luciana
and Collins 1997; Jay 2003; Angwin et al. 2004). Dopaminergic
activation may even support key components of the sentence
comprehension system in patients with Parkinson's disease
(PD) (Grossman et al. 2001). Dopaminergic agents may be effective treatment for nonfluent aphasia. M. Albert and colleagues
(1988), using an on/off design, reported improved fluency
and naming scores in a patient with nonfluent aphasia treated
with bromocriptine (a drug that stimulates selected dopamine
receptors in the brain). Fluency and naming scores returned to
baseline after the drug was discontinued. S. R. Gupta and A. G.
Mlcoch (1992) replicated the effect of improved fluency scores
after bromocriptine in two aphasic patients, but L. Sabe and
colleagues (1995) and D. L. MacLennan and colleagues (1991)
could not document any improvement in speech and language
scores in nonfluent aphasics who were treated with bromocriptine late in the recovery process. Y. Tanaka and D. L. Bachman
(2000) conducted a double-blind, crossover study with bromocriptine. They administered the drug (57.5 mg/day for four
weeks) to 10 patients with a Broca-type aphasia. Statistically significant improvement (pre- to posttreatment) on naming and
fluency scores was documented in the mild aphasics, but not
in the severely impaired aphasics. M. Bragoni and colleagues
(2000) used a double-blind, placebo-controlled design focused
only on chronic nonfluent aphasics at a dosage of 30 mg/day,
with participants maintained at that dosage for three months.
While significant gains in verbal fluency were evidenced with
bromocriptine, these findings are based on the performance of
only five participants. Consistent, however, with the claim of
positive dopaminergic effects on fluency is the fact that the dopaminergic drug levodopa (LD) has also demonstrated beneficial
effects on speech fluency in midstage patients with Parkinson's
disease (McNamara and Durso 2000). S. Knecht and colleagues
(2004) showed that healthy volunteers given 100 mg of LD per
day exhibited more rapid and more accurate learning of verbal-visual associations than a group of controls given a placebo.
Learning effects in this carefully controlled study could not be
attributed to changes in arousal, autonomic function, motor
response times, affect, or response biases.
Acetylcholine, one of the neurotransmitters that interacts
with dopaminergic systems at the level of the cortex, has also
been implicated in language functions. Tanaka, M. Miyazaki,
and Albert (1997) documented naming and comprehension
improvement in fluent aphasics using the cholinergic agent bifemelane. They built on the work of L. Moscowitch, P. McNamara,
and Albert (1991) who reported that an anticholinesterase agent
(which boosts cholinergic activity) improved language performance in eight fluent semantic aphasics. Albert (2000) provides
an in-depth discussion of both dopaminergic influences on
nonfluent aphasia and cholinergic influences on fluent aphasia.
His conclusion is that there appears to be a strong and consistent effect of dopaminergic agents on verbal fluency and mild
effects of cholinergic agents on naming and semantic memory.
Dopaminergic effects, however, are better documented than
cholinergic effects.
One of the major regulatory genes that control dopamine's
metabolic pathways in the prefrontal cortex is the gene that
codes for the enzyme catechol-O-methyltransferase (COMT).
Significant associations between COMT variation and variation in prefrontal cognitive function have been identified (Egan
et al. 2001; Joober et al. 2002). Studies in rats, knockout mice,
and monkeys suggest that COMT is of particular importance
with respect to intrasynaptic dopamine regulation in the prefrontal cortex, where an alternative route of dopamine removal
(i.e., dopamine transporter reuptake, as in the striatum) is largely
nonexistent. In humans, the COMT gene contains a highly functional and common variation in its coding sequence that appears
to be a unique human mutation because it has not been found
in great apes. This uniquely human change in dopaminergic
functional capacity in the prefrontal cortex may
have been a factor in the evolution of the human prefrontal cortex and thereby of human speech and language functions more
generally.
An inherited deficit in spoken grammatical language among
several members of a family (family KE) in England has been
associated with a mutation in the forkhead box P2 (FOXP2) gene
on chromosome 7 (see genes and language). Persons with
the FOXP2 mutation evidence underactivity in dopaminergic
neural networks linking subcortical striatal networks with prefrontal cortical sites, including Broca's area, during word-generation tasks. FOXP2 is subject to the effects of genomic imprinting,
with relatively high expression from the paternal chromosome
(Feuk et al. 2006). Such a pattern of gene expression evolves
in the context of evolutionary conflict due to paternity uncertainty in polygynous mating systems, as is the case with most
mammals including humans. Genetic conflict occurs between
asymmetrically related kin (i.e., between mothers and offspring,
and between siblings in the context of paternity uncertainty),
with genes that are paternally expressed in offspring promoting
behaviors in offspring that are designed to monopolize resources
from the mother and exclude resources going to siblings. The
FOXP2-related defect implies that some aspects of spoken language may have evolved under pressures of genetic conflict.
This conflict view of the evolution of language is consistent
with recent findings linking handedness and cognitive/language
deficits of schizophrenics (another disorder involving dopaminergic dysfunction) to parent-of-origin effects (Francks et al.
2003), as well as other evidence identifying potential imprinting
effects on genes that regulate dopaminergic systems of the language-related areas of the prefrontal cortex. In short, investigation of dopaminergic influences on language functions leads us
into two seemingly disparate realms of inquiry: 1) the development of rational pharmacotherapeutic strategies for treatment
of language disorders (e.g., dopaminergic drugs for fluency
disorders), and 2) reconstruction of the evolutionary conflicts
that led to the emergence of speech and language functions
themselves.
Patrick McNamara
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Albert, M. 2000. Towards a neurochemistry of naming and anomia. In
Language and the Brain, ed. Y. Grodzinsky, L. Shapiro, and D. Swinney,
157–65. San Diego, CA: Academic Press.
Albert, M., D. L. Bachman, A. Morgan, and N. Helm-Estabrooks. 1988.
Pharmacotherapy for aphasia. Neurology 38: 877–9.
Angwin, A. J., H. J. Chenery, D. A. Copland, W. L. Arnott, B. E. Murdoch,
and P. A. Silburn. 2004. Dopamine and semantic activation: An
investigation of masked direct and indirect priming. Journal of the
International Neuropsychological Society 10.1: 15–25.
Bannon, M. J., E. B. Bunney, and R. H. Roth. 1981. Mesocortical dopamine neurons: Rapid transmitter turnover compared to other brain
catecholamine systems. Brain Research 218.1: 376–82.

Bragoni, M., M. Altieri, V. Di Piero, A. Padovani, C. Mostardini, and
G. L. Lenzi. 2000. Bromocriptine and speech therapy in non-fluent
chronic aphasia after stroke. Neuroscience 21.1: 19–22.
Egan, M. F., T. E. Goldberg, B. S. Kolachana, J. H. Callicott,
C. M. Mazzanti, R. E. Straub, D. Goldman, and D. R. Weinberger. 2001.
Effect of COMT Val108/158 Met genotype on frontal lobe function
and risk for schizophrenia. Proceedings of the National Academy of
Sciences USA 98: 6917–22.
Feuk, L., A. Kalervo, M. Lipsanen-Nyman, J. Skaug, K. Nakabayashi,
B. Finucane, D. Hartung, M. Innes, B. Kerem, M. J. Nowaczyk,
J. Rivlin, W. Roberts, L. Senman, A. Summers, P. Szatmari, V. Wong,
J. B. Vincent, S. Zeesman, L. R. Osborne, J. O. Cardy, J. Kere, S. W.
Scherer, and K. Hannula-Jouppi. 2006. Absence of a paternally
inherited FOXP2 gene in developmental verbal dyspraxia. American
Journal of Human Genetics 79: 965–72.
Francks, C., L. E. DeLisi, S. H. Shaw, S. E. Fisher, A. J. Richardson,
J. F. Stein, and A. P. Monaco. 2003. Parent-of-origin effects on handedness and schizophrenia susceptibility on chromosome 2p12-q11.
Human Molecular Genetics 12: 3225–30.
Gardner, E. L., and C. R. Ashby, Jr. 2000. Heterogeneity of the mesotelencephalic dopamine fibers: Physiology and pharmacology.
Neuroscience and Biobehavioral Reviews 24: 115–28.
Girault, J. A., and P. Greengard. 2004. The neurobiology of dopamine
signaling. Archives of Neurology 61: 641–4.
Greener, J., P. Enderby, and R. Whurr. 2002. Pharmacological treatment
for aphasia following stroke. Cochrane Database Systematic Reviews
4: CD000424.
Gupta, S. R., and A. G. Mlcoch. 1992. Bromocriptine treatment of nonfluent aphasia. Archives of Physical Medicine and Rehabilitation
73.4: 373–6.
Grossman, M., G. Glosser, J. Kalmanson, J. M. Morris, M. B. Stern,
and H. I. Hurtig. 2001. Dopamine supports sentence comprehension in Parkinson's disease. Journal of the Neurological Sciences
184.2: 123–30.
Huttenlocher, P. R., and A. S. Dabholkar. 1997. Regional differences in
synaptogenesis in human cerebral cortex. Journal of Comparative
Neurology 387.2: 167–78.
Jay, T. M. 2003. Dopamine: A potential substrate for synaptic plasticity
and memory mechanisms. Progress in Neurobiology 69: 375–90.
Joober, R., J. Zarate, G. Rouleau, E. Skamene, and P. Boksa. 2002.
Provisional mapping of quantitative trait loci modulating the acoustic startle response and prepulse inhibition of acoustic startle.
Neuropsychopharmacology 27: 765–81.
Kischka, U., T. Kammer, S. Maier, M. Weisbrod, M. Thimm, and
M. Spitzer. 1996. Dopaminergic modulation of semantic network activation. Neuropsychologia 34: 1107–13.
Knecht, S., C. Breitenstein, S. Bushuven, S. Wailke, S. Kamping, A. Floel,
P. Zwitserlood, and B. Ringelstein. 2004. Levodopa: Faster and better
word learning in normal humans. Annals of Neurology 56.1: 20–6.
Luciana, M., and P. Collins. 1997. Dopaminergic modulation of working
memory for spatial but not object cues in normal humans. Journal of
Cognitive Neuroscience 9: 330–47.
MacLennan, D. L., L. E. Nicholas, G. K. Morley, and R. H. Brookshire.
1991. The effects of bromocriptine on speech and language function
in a man with transcortical motor aphasia. In Clinical Aphasiology,
ed. T. E. Prescott, 145–55. Boston: College Hill.
McNamara, P., and R. Durso. 2000. Language functions in Parkinsons
disease: Evidence for neurochemistry of language. In Neurobehavior
of Language and Cognition: Studies of Normal Aging and Brain
Damage, ed. L. Obler and L. T. Conner, 201–12. New York: Kluwer
Academic.
Moscowitch, L., P. McNamara, and M. L. Albert. 1991. Neurochemical
correlates of aphasia. Neurology 41 (Supplement 1): 410.

Nieoullon, A. 2002. Dopamine and the regulation of cognition and attention. Progress in Neurobiology 67: 52–83.
Sabe, L., F. Salvarezza, A. Garcia Cuerva, R. Leiguarda, and S. Starkstein.
1995. A randomized, double-blind, placebo-controlled study of bromocriptine in nonfluent aphasia. Neurology 45: 2272–4.
Tanaka, Y., and D. L. Bachman. 2000. Pharmacotherapy of aphasia. In
Neurobehavior of Language and Cognition: Studies of Normal Aging
and Brain Damage, ed. M. Albert, L. Connor, and L. Obler, 159–78.
Boston: Kluwer Academic.
Tanaka, Y., M. Miyazaki, and M. Albert. 1997. Effects of cholinergic activity on naming in aphasia. Lancet 350: 116–17.
Thierry, A. M., J. P. Tassin, G. Blanc, L. Stinus, B. Scatton, and J. Glowinski.
1977. Discovery of the mesocortical dopaminergic system: Some
pharmacological and functional characteristics. Advances in
Biochemistry and Psychopharmacology 16: 5–12.
Williams, G. V., and P. S. Goldman-Rakic. 1995. Modulation of memory fields by dopamine D1 receptors in prefrontal cortex. Nature
376: 572–5.

NEUROIMAGING
Neuroimaging technologies provide a major source of new data
about how the language system is organized in the brain. In particular, activation imaging approaches, in which brain activity is
monitored while subjects perform some language task, allow us
to visualize various aspects of language processing in the normal
brain and test hypotheses about component language functions
or systems. Among the imaging techniques most commonly in
use for understanding language in the brain are structural magnetic resonance imaging (MRI), functional MRI (fMRI), positron
emission tomography (PET), electroencephalography (EEG),
and magnetoencephalography (MEG).

Structural Magnetic Resonance Imaging


The lesion-deficit model, where one deduces the function of
a brain region by observing what it cannot do when damaged,
forms the basis of our understanding of language organization
in the brain, originating with Paul Broca's work. Whereas Broca
had to wait until his patient's death to determine where the
lesion was located, structural or conventional MRI scanning
allows lesion-behavior correlation in vivo.
The MRI scanner is essentially composed of a large, high field
magnet that delivers magnetic pulses and records small changes
in the magnetized atoms in your brain or body. These signals are
picked up by an antenna and, through several transformations, are
translated into pictures of the brain. By altering the direction, frequency, and readout times of these magnetic perturbations, different MRI pulse sequences produce variations in the signals generated
by each tissue type. The difference between these signals is referred
to as contrast, and using these variations, the radiologist can determine what is normal brain tissue, what looks like a clot of blood or a
fatty tumor, what tissue has had a disruption in the normal diffusion
of water molecules, and so on. MRI has excellent spatial resolution,
which refers to the precision with which one can see details. Typical
MRI scans resolve 1 mm; thus, it is easy to locate even small brain
lesions that might explain a particular abnormality.
In the past few years, the sophistication of structural MRI
techniques has increased markedly, offering new approaches
for identifying structure-function correlations in the language


Figure 1. Top left: original drawing by Broca. Bottom left: preserved whole brain of Broca's patient. Right: axial MRI slice through Broca's area showing damage to insula, striatum, and underlying white matter (Dronkers et al. 2007).

system. One approach involves warping scans from different patients together, showing areas, for example, where lesions
overlap among patients with the same language impairment. An
illustrative study comes from Bates et al. (2003). Here, the authors
used a technique called voxel-based lesion-symptom mapping. For each
voxel (a three-dimensional pixel) in the brain, a t-test compares
performance on a language task between patients whose lesions
encompass that voxel and patients whose lesions do not. E. Bates and
colleagues (2003) correlated lesion location with verbal fluency
(Color Plate 5). The areas of the brain showing the most significant
differences between groups are depicted in red, indicating the brain
areas most likely responsible for the observed deficit. Contrary to
Broca's report, reduced verbal fluency was associated not primarily
with lesions of Broca's area but, rather, with lesions of the insula
and underlying white matter. Indeed, a recent MRI study of Broca's
patient confirmed involvement of these structures (see Figure 1).
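The per-voxel logic of lesion-symptom mapping can be sketched in a few lines of Python. This is a toy illustration, not the actual analysis pipeline of Bates and colleagues: the patient lesion masks and fluency scores below are invented, and a real study would involve tens of thousands of voxels with correction for multiple comparisons.

```python
import math

def two_sample_t(a, b):
    """Pooled-variance two-sample t statistic."""
    na, nb = len(a), len(b)
    ma, mb = sum(a) / na, sum(b) / nb
    va = sum((x - ma) ** 2 for x in a) / (na - 1)
    vb = sum((x - mb) ** 2 for x in b) / (nb - 1)
    pooled = ((na - 1) * va + (nb - 1) * vb) / (na + nb - 2)
    return (ma - mb) / math.sqrt(pooled * (1 / na + 1 / nb))

# Hypothetical data: each patient is a (lesion mask, fluency score) pair,
# where the mask is the set of voxel ids the lesion encompasses.
patients = [
    ({"v1", "v2"}, 2.0), ({"v1"}, 3.0), ({"v1", "v3"}, 2.5),
    ({"v3"}, 8.0), ({"v2", "v3"}, 7.5), (set(), 9.0),
]

def vlsm(patients, voxels):
    """For each voxel, compare scores of patients whose lesions
    include that voxel against those whose lesions do not."""
    stats = {}
    for v in voxels:
        lesioned = [score for mask, score in patients if v in mask]
        spared = [score for mask, score in patients if v not in mask]
        if len(lesioned) > 1 and len(spared) > 1:
            stats[v] = two_sample_t(lesioned, spared)
    return stats

stats = vlsm(patients, {"v1", "v2", "v3"})
```

In this toy data, damage to "v1" goes with low fluency, so that voxel yields a large negative t value, while "v2" does not discriminate the groups; the colored maps in Color Plate 5 are essentially images of such per-voxel statistics.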
Another approach in MRI analysis compares aspects of brain
structure, such as gray matter thickness or sulcal position, to performance on language tasks using voxel-based correlations. For
instance, L. Lu and colleagues (2007) examined the relationship
between the thickness of the cortex in the left inferior frontal
region and ability on a phonological processing task in children.
Developmental changes in gray matter in this area correlated
with improving scores on phonology tasks, indicating a dynamic
relationship between emerging brain growth and language
development (Color Plate 6).
A third structural MRI approach examines the integrity of the
white matter underlying the cortical ribbon. Diffusion tensor
imaging is an MRI approach that measures the diffusion of water
molecules in the brain. White matter (WM) fibers tend to be bundled together, lined up in parallel sheaths. Because water is more
likely to diffuse in parallel to these white matter tracts as opposed
to crossing them on the perpendicular, imaging techniques that
track diffusion will tend to emphasize the direction of these fiber
tracts. Image-processing techniques can identify the uniformity of these fiber directions in each voxel in the brain, indicating whether the WM is intact. Color Plate 7 shows an example

of tractography, where the WM tracts in and out of a language
area have been mapped. This approach can identify inputs and
outputs to language regions, indicating possible mechanisms for
distal effects of local lesions through disruption of the connecting pathways.
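The directional uniformity of diffusion in a voxel is commonly summarized by fractional anisotropy (FA), computed from the three eigenvalues of the diffusion tensor: FA is 0 when diffusion is equal in all directions and approaches 1 when it is confined to a single axis, as in a tight fiber bundle. A minimal sketch (the eigenvalues here are illustrative; real pipelines estimate the tensor at each voxel from many diffusion-weighted images):

```python
import math

def fractional_anisotropy(l1, l2, l3):
    """FA from the three eigenvalues of a diffusion tensor."""
    mean = (l1 + l2 + l3) / 3.0
    variance = (l1 - mean) ** 2 + (l2 - mean) ** 2 + (l3 - mean) ** 2
    magnitude = l1 ** 2 + l2 ** 2 + l3 ** 2
    return math.sqrt(1.5 * variance / magnitude)

fa_isotropic = fractional_anisotropy(1.0, 1.0, 1.0)   # free diffusion, e.g. CSF
fa_fiber = fractional_anisotropy(1.7, 0.2, 0.2)       # diffusion along one axis
```

High-FA voxels mark coherent white matter; tractography chains them together by following the principal eigenvector from voxel to voxel.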

Activation Imaging and Neurovascular Coupling


Several brain imaging technologies take advantage of neurovascular coupling in identifying brain regions associated with
language performance. Neurovascular coupling refers to the
fact that when neurons increase their firing rate (because that
brain area is working harder), blood flow increases to that
region. Typically, the correlation between blood flow and neuronal activity is extremely high, although the blood flow increase
is delayed in onset by several seconds and falls off gradually in
comparison to neuronal activity (Buxton et al. 2004). In several
clinical conditions, however, such as acute stroke, the two may become decoupled.
Both PET and fMRI take advantage of neurovascular coupling to
identify brain activity.
POSITRON EMISSION TOMOGRAPHY. PET scanning is an imaging tool in which a radioactively labeled compound is injected
into the body and taken up in the brain. Compounds such as
glucose or water are labeled with positron-emitting isotopes that rapidly decay; the emitted positrons collide with electrons and are annihilated, which causes the emission of two photons that
shoot off in opposite directions simultaneously. The PET scanner is composed of a ring of detectors that detect these simultaneous photons, and software reconstructs their originating
positions, revealing where the compound traveled. The resulting
PET image is a blurry picture showing the amount of radioactive
substance reaching every pixel in the brain. Different radioactive
compounds measure various brain processes, each with a characteristic half-life. For instance, 18-fluorodeoxyglucose (18FDG)
has a half-life of about 110 minutes and measures glucose metabolism. More useful for language research is the compound H215O,
radioactive water. Because of its short half-life (about 2 minutes),
H215O scans can be repeated after a delay of 10–12 minutes, up to
between 6 and 10 scans. For this reason, H215O has been used for

language activation studies, where subjects might receive several injections while performing one or more language tasks and
during control tasks (Color Plate 8).
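The arithmetic behind the repeat-scan interval is simple exponential decay: a 10–12 minute delay is five to six half-lives of 15O, after which the residual activity from the previous injection is negligible. A quick check, using the roughly 2-minute half-life cited above:

```python
def remaining_fraction(minutes, half_life_min=2.0):
    """Fraction of the initial radioactivity remaining after `minutes`,
    for an isotope with the given half-life (here, 15O at ~2 min)."""
    return 0.5 ** (minutes / half_life_min)

# After a 12-minute delay, under 2% of the previous injection's
# activity remains, so a fresh scan is not contaminated.
residual = remaining_fraction(12)
```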
One of the first PET studies to visualize language areas in vivo
was that of S. E. Petersen and colleagues (1988). In this study,
normal volunteers performed a series of language tasks ordered
hierarchically: viewing a crosshair on a screen; seeing printed words on a screen or hearing words over headphones; reading printed words or repeating heard words; and generating an action verb corresponding to a visually or auditorily presented noun. By subtracting
lower-level tasks from higher-order language tasks, the authors
isolated areas of the brain involved in word generation, while
removing unwanted effects of sensory stimulation. This subtractive logic forms the basis of activation imaging experiments.
While there are theoretical difficulties with assumptions of hierarchical organization, cognitive subtraction models remain the
mainstay of activation imaging research.
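The subtractive logic amounts to simple image arithmetic: activity in the control condition is subtracted, voxel by voxel, from activity in the task condition, leaving only what the higher task adds. A toy sketch with one-dimensional "images" (all activity values are invented for illustration; they do not come from the Petersen study):

```python
def subtract_images(task, control):
    """Voxelwise subtraction of a control image from a task image."""
    return [t - c for t, c in zip(task, control)]

# Hypothetical activity in five voxels (visual, auditory, motor,
# word-form, and generation areas) under three hierarchical tasks.
fixation  = [5.0, 1.0, 1.0, 1.0, 1.0]   # view a crosshair
read_word = [5.0, 1.0, 3.0, 4.0, 1.0]   # read printed words aloud
generate  = [5.0, 1.0, 3.0, 4.0, 6.0]   # generate verbs for nouns

reading_effect = subtract_images(read_word, fixation)
generation_effect = subtract_images(generate, read_word)
```

Because visual stimulation is common to both conditions, the first voxel cancels out of both difference images; only the generation voxel survives the highest-level subtraction, mirroring how Petersen and colleagues isolated word-generation areas.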
An important disadvantage of PET is the need to expose subjects to radioactivity; a second disadvantage is that subjects must
perform the same task for several minutes continuously to obtain
a single brain image. Also, PET is a relatively noisy methodology, and the signal-to-noise ratio is low enough that scans must
be averaged over a group of subjects. Finally, the spatial resolution of PET is low, usually about 6 mm, so that individual brain
structures cannot be resolved and the areas of significant activation are not easily localized to a specific brain structure. Most
investigators solve this problem by performing an MRI scan for
each subject, mathematically moving or registering the brain
images so they are in the same space, then overlaying the PET
activation regions onto the corresponding MRI scans to localize
the regions of activity.
FUNCTIONAL MRI. In the early 1990s, two groups independently
discovered that blood flow increases during neural activity
could be measured directly with MRI (Kwong et al. 1992; Ogawa
et al. 1992). This is due to the accident that oxygenated blood and
deoxygenated blood have slightly different magnetic properties.
During increased brain activity, an increase in blood flow is not
matched by an increase in oxygen consumption; consequently,
more oxygenated blood spills over to the venous side of the capillary bed. Scans of the brain taken during this state of increased
oxyhemoglobin concentration have slightly higher MRI signals
than those taken in the resting state. Thus, fMRI measures this
blood-oxygen-level dependent, or BOLD, signal when comparing scans taken in different cognitive states. Thanks to the discovery of ultrafast MRI scanning, typically using an approach called
echo-planar imaging or EPI, fMRI takes a complete picture of the
brain as quickly as once per second. By taking complete brain
volumes every few seconds over a period of several minutes or
more, fMRI can track those magnetic changes that correlate with
blood flow during experimental and control tasks that can be
varied with tremendous experimental complexity. Because fMRI
has significantly greater spatial and temporal resolution than
PET, it is often possible to see significant language-related brain
activity within a single individual in a matter of minutes.
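A common first-pass fMRI analysis correlates each voxel's time series with the task design. The sketch below builds a boxcar regressor (task off/on in alternating blocks) and computes a Pearson correlation per voxel; this is a simplified stand-in for real practice, where the design is convolved with a hemodynamic response function and fit within a general linear model, and where the voxel data below are invented:

```python
import math

def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)

# Block design: 10 volumes rest, 10 volumes language task, repeated.
boxcar = ([0] * 10 + [1] * 10) * 3

# Hypothetical voxel time series (arbitrary signal units).
active_voxel = [100.0 + 2.0 * b for b in boxcar]               # tracks the task
unrelated = [100.0 + ((i % 7) - 3) * 0.5 for i in range(60)]   # unrelated drift

r_active = pearson(active_voxel, boxcar)
r_unrelated = pearson(unrelated, boxcar)
```

Voxels whose BOLD signal rises and falls with the task blocks yield high correlations and light up in the activation map; voxels with unrelated fluctuations do not.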
The study of language organization in the brain has been
revolutionized by fMRI. Because these studies are relatively easy
and inexpensive to conduct, it is possible to examine language
processes of nearly any complexity, resulting in many important
findings. For instance, in the reading system, years of controversy over whether reading involves a single system or parallel systems have largely been resolved. The physical presence of two
anatomically distinct pathways differentially engaged by different reading demands and subject groups lends strong credence
to the dual-route hypothesis (Pugh et al. 1996). Another interesting set of findings explores the role of the right hemisphere
in many aspects of language processing, including affective
prosody, metaphor analysis, and contextual processing (see
Bookheimer 2002 for a review). Within the frontal lobe there
appear to be separate regions for processing aspects of expressive language, including phonological processing, syntax, and
semantic integration. Thus, fMRI appears to have resolved an
ongoing debate over whether part of Broca's area is specialized for syntax or whether its activation is secondary to the increased working-memory demands of complex syntactic structures; a recent study indicates syntax specificity in this region (Santi and Grodzinsky 2007). In general, language research from fMRI indicates a level
of organization that is far more complex, detailed, and specific
than envisioned on the basis of lesion-deficit studies. Color Plate
9 shows an fMRI exam during a series of language tasks. Even
within a single subject, clear evidence for at least nine different
brain regions contributing to language can be observed.

EEG AND MEG. While fMRI offers a tremendous advantage in
both spatial and temporal resolution over PET, the fMRI response
is extremely sluggish in comparison to neural activity measured
directly. Two other technologies offer vastly improved temporal
resolution. Electroencephalography measures the combined
electrical activity of a wide area of the brain. Because the electrical signal is generated directly by neural firing, EEG tracks the activity of neurons in essentially real time. It also has several major disadvantages. The spatial resolution is poor: Electrical signals represent
an average over many centimeters of activity, and results are generally confined to entire lobes. Further, the signals come mostly
from surface brain structures. Finally, data must be averaged over
many trials to yield an averaged electrical response to a class of
stimuli. Nonetheless, EEG is the method of choice for high temporal resolution work, and for young children or infants.
Years of EEG research reveal expected patterns of electrical responses to certain classes of stimuli. Averaged electrical
responses to a class of stimuli are referred to as event-related
potentials or ERPs. An example is the N400 response, meaning
a negatively directed signal occurring 400 milliseconds after
the stimulus. The N400 is found when the subject experiences
an anomalous or unexpected event. For instance, A. Hahne,
K. Eckstein, and A. Friederici (2004) examined ERPs in response
to semantic and syntactic violations embedded within sentences. When there was a syntactic violation, the authors found an early negative component over anterior brain regions, termed the ELAN (early left
anterior negativity). This was followed by a late positive response
(P600). Semantic violations produced the N400 response. In a
second experiment, instructions to ignore the syntactic violations and focus on the semantic task could not override the brain
response to syntactic violations. These data indicate that syntactic processing is mandatory and not under effortful control.
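Averaging over trials is what makes components like the N400 visible at all: single-trial EEG is dominated by background activity, which averages toward zero across trials while the stimulus-locked response survives. A toy simulation of that logic (the amplitude, latency, and noise values are illustrative assumptions, not empirical figures):

```python
import random

random.seed(42)

N_TRIALS, N_SAMPLES = 100, 800   # 800 samples = 800 ms at 1 kHz

def single_trial():
    """One trial: a -5 microvolt deflection around 400 ms,
    buried in Gaussian background noise (sd = 10 microvolts)."""
    return [(-5.0 if 350 <= t <= 450 else 0.0) + random.gauss(0, 10)
            for t in range(N_SAMPLES)]

trials = [single_trial() for _ in range(N_TRIALS)]

# Averaged waveform: noise shrinks by 1/sqrt(N_TRIALS), the ERP does not.
erp = [sum(tr[t] for tr in trials) / N_TRIALS for t in range(N_SAMPLES)]

baseline = sum(erp[:300]) / 300       # mean amplitude before the effect
n400 = sum(erp[350:451]) / 101        # mean amplitude in the N400 window
```

In any single trial the deflection is invisible against the noise; in the 100-trial average, the negativity around 400 ms stands out clearly against the near-zero baseline.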
While EEG cannot locate the brain structures that generate these signals, the high temporal resolution of EEG makes it possible
to analyze on-line processing of language, addressing questions
that differ fundamentally from those tapped by fMRI.
Magnetoencephalography bears many similarities to EEG
while offering improved spatial resolution. MEG measures magnetic moments created by electrical activity in the brain that are
picked up by a large magneto-detector sensitive to very small
changes in magnetic fields. These fields are closely related to neural activity and can be tracked effectively in
real time. Localizing the source of MEG signals remains a significant difficulty with this technique, and sources generated close to
the brain surface are easier to detect than those from deep structures. Nonetheless, source localization appears to be far more
precise than with EEG, while temporal resolution is equivalent.
MEG is more technically challenging than EEG and fMRI and is
also less widely available. Nonetheless, significant contributions
to the field of language continue to emerge from MEG studies.
For example, MEG studies have indicated that letter-string recognition occurs 150 msec after presentation of a word, whereas
children with dyslexia show a weaker response (Tarkiainen et al.
1999). A corresponding delayed response in superior temporal
regions may reflect an earlier-level dysfunction in letter-string
recognition areas (Salmelin, Helenius, and Service 2000).
Both EEG and MEG are limited to event-related designs that
measure the temporal dynamics of evoked responses to a brief
duration stimulus. In some cases, however, temporal resolution
isn't necessary. For instance, studying mood or drug states or
observing a cognitive strategy that evolves over time may require
longer time intervals to measure, favoring fMRI or PET.
Susan Bookheimer
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Bates, E., S. M. Wilson, A. P. Saygin, F. Dick, M. I. Sereno, R. T. Knight,
and N. F. Dronkers. 2003. Voxel-based lesion-symptom mapping. Nature
Neuroscience 6: 448–50.
Bookheimer, S. Y. 2002. Functional MRI of language: New approaches
to understanding the cortical organization of semantic processing.
Annual Review of Neuroscience 25: 151–88.
Bookheimer, S. Y., T. A. Zeffiro, T. Blaxton, W. D. Gaillard, B. Malow, and
W. H. Theodore. 1998. Regional cerebral blood flow during auditory
responsive naming: Evidence for cross-modality neural activation.
Neuroreport 9.10: 2409–13.
Bookheimer, S. Y., T. A. Zeffiro, T. Blaxton, W. D. Gaillard, and W. H.
Theodore. 1995. Regional cerebral blood flow during object naming
and word reading. Human Brain Mapping 3.2: 93–106.
Buxton, R. B., K. Uludag, D. J. Dubowitz, and T. T. Liu. 2004. Modeling
the hemodynamic response to brain activation. Neuroimage 23
(Supplement 1): S220–3.
Cohen, M. S., and S. Y. Bookheimer. 1994. Functional magnetic resonance imaging. Trends in Neurosciences 17.7: 268–77.
Dronkers, N. F., O. Plaisant, M. T. Iba-Zizen, and E. A. Cabanis. 2007.
Paul Broca's historic cases: High-resolution MR imaging of the brains
of Leborgne and Lelong. Brain 130.5: 1432–41.
Hahne, A., K. Eckstein, and A. Friederici. 2004. Brain signatures of syntactic and semantic processes during children's language development. Journal of Cognitive Neuroscience 16.7: 1302–18.
Kwong, K. K., J. W. Belliveau, D. A. Chesler, I. E. Goldberg, R. M. Weisskoff,
B. P. Poncelet, D. N. Kennedy, B. E. Hoppel, M. S. Cohen, R. Turner, H.
Cheng, T. Brady, and B. Rosen. 1992. Dynamic magnetic resonance
imaging of human brain activity during primary sensory stimulation.
Proceedings of the National Academy of Sciences 89.12: 5675–9.
Lee, A., V. Kannan, and A. E. Hillis. 2006. The contribution of neuroimaging to the study of language and aphasia. Neuropsychology Reviews
16.4: 171–83.
Lu, L., C. Leonard, P. Thompson, E. Kan, J. Jolley, S. Welcome, A. Toga,
and E. Sowell. 2007. Normal developmental changes in inferior frontal gray matter are associated with improvement in phonological processing: A longitudinal MRI analysis. Cerebral Cortex 17.5: 1092–9.
Ogawa, S., D. W. Tank, R. Menon, J. M. Ellermann, S. G. Kim, H. Merkle, and
K. Ugurbil. 1992. Intrinsic signal changes accompanying sensory stimulation: Functional brain mapping with magnetic resonance imaging.
Proceedings of the National Academy of Sciences 89.13: 5951–5.
Petersen, S. E., P. T. Fox, M. Mintun, and M. E. Raichle. 1988. Positron
emission tomographic studies of the cortical anatomy of single-word
processing. Nature 331.6157: 585–9.
Pugh, K. R., B. A. Shaywitz, S. E. Shaywitz, R. T. Constable, P. Skudlarski,
R. K. Fulbright, R. A. Bronen, D. P. Shankweiler, L. Katz, J. M. Fletcher,
and J. C. Gore. 1996. Cerebral organization of component processes in
reading. Brain 119.4: 1221–38.
Salmelin, R., P. Helenius, and E. Service. 2000. Neurophysiology of fluent and impaired reading: A magnetoencephalographic approach.
Journal of Clinical Neurophysiology 17: 163–74.
Santi, A., and Y. Grodzinsky. 2007. Working memory and syntax interact
in Broca's area. Neuroimage 37.1: 8–17.
Tarkiainen, A., P. Helenius, P. L. Hansen, P. L. Cornelissen, and
R. Salmelin. 1999. Dynamics of letter string perception in the human
occipitotemporal cortex. Brain 122: 2119–32.
Wise, R. J. 2003. Language systems in normal and aphasic human subjects: Functional imaging studies and inferences from animal studies.
British Medical Bulletin 65: 95–119.

NUMBER
Number is a grammatical feature that quantifies the denotation
of a linguistic element. It can refer to entities or events, and in
language we find both nominal number (very common, discussed in the following) and verbal number (less common, realized on the verb to indicate the number of events or the number
of participants; also called pluractionality).
Languages vary with regard to the part of their nominal inventory that is involved in the number system. In different languages,
the split into nominals that do and do not express number may
occur at different points of the animacy hierarchy: speaker (first
person pronouns) > addressee (second person pronouns) > third
person pronouns > kin > rational > human > animate > inanimate.
Furthermore, not all nouns are number differentiable. Two types of
noun are traditionally distinguished: count nouns and mass nouns,
the latter regarded as lacking the number distinction. At the level of
semantics, the count–mass distinction can be captured with two semantic features, boundedness and internal structure (Jackendoff 1991), which correspond to the distinction between temporally
bounded and unbounded events in verbal semantics. But countability is really a characteristic of nominal phrases (Allan 1980),
since many nouns can appear in both count and mass syntactic
contexts, for example, Would you like a cake/some cake? We need a
bigger table / There is not enough table for everyone to sit at.
When nominal number is found expressed on the noun or the
noun phrase as such, it is considered inherent. When found on other
elements of the noun phrase or on the verb, it is contextual. The

expressions of nominal number can involve special number words
(of different syntactic status); syntactic means (i.e., agreement,
found most commonly on demonstratives and verbs but also on
articles, adjectives, pronouns, nouns in possessive constructions,
adverbs, adpositions, and complementizers); a variety of morphological means (inflections, stem changes, zero expressions,
clitics); and lexical means (e.g., suppletion). Number is often
marked in more than one way within one language.
All nominal number systems are built on the primary opposition between singular (expressing the quantity one) and plural
(more than one). Other attested number values are dual (two),
trial (three), and paucal (a few). There may be further divisions
into paucal and greater paucal, plural and greater plural (the last
value may imply an excessive number, or all possible instances of
the referent). No genuine quadrals (four) have been found. The
largest number systems involve five values. In many languages,
the absence of plural marking does not necessarily imply the singular, but the form may be outside the number opposition and
express general number, that is, the meaning of the noun without
reference to number.
Associatives, distributives, and collectives, all sometimes
listed as additional values of number, are better analyzed as
independent features. Associativity expresses the meaning "X
and the group associated with X"; distributives indicate that
entities (whether count or mass ones), events, qualities, or locations are to be construed as distinct in space, sort, or time; and
collectives indicate that the members of a group are to be construed together as a unit. Many languages have markers for these
categories in addition to various number markers.
Anna Kibort
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Allan, Keith. 1980. Nouns and countability. Language 56: 541–67.
Corbett, Greville G. 2000. Number. Cambridge: Cambridge University
Press.
Jackendoff, Ray. 1991. Parts and boundaries. Cognition 41: 9–45.

O
OCCIPITAL LOBE
Alcmaeon of Croton, probably the first person to suggest that the
mind is located in the brain and not the heart, also suggested that
the optic nerves are light-bearing paths to the brain. His revolutionary ideas, formulated about 2,500 years ago, were ignored
by Egyptian and Greek scholars alike (most notably by Aristotle).
We now know that visual information is delivered to the occipital lobe mainly via the thalamus. The occipital lobe is the most
posterior of the four lobes of the brain (named after the four skull
bones beneath which they lie).
Until very recently, it was believed that the occipital lobe
was involved only in the processing of visual information per
se. Language was considered to take place in dedicated areas,
mainly in the frontal and temporal cortex. Thus the relevance of
the occipital lobe to language processing and language research

was only indirect in the sense that its role was limited to providing visual input processed later on by language centers in other
brain regions that contribute to certain aspects of language (such
as reading).
New data are now challenging this view of the occipital lobe
as solely processing visual information. First, several key studies
on the blind have shown the involvement of the occipital lobe
in processing other sensory modalities. In particular, there is
a clear link between occipital lobe processing in the blind and
language and verbal memory functions, where this pattern of
activation is attributed to massive reorganization of the occipital lobe in cases of blindness. Further, the involvement of the
occipital lobe in nonvisual processing has clearly been demonstrated in the sighted (i.e., under normal development of the
occipital lobe), and much of today's research is exploring the
extent to which the occipital lobe is involved in language processing under normal development. These topics are covered in
the following sections.

The Occipital Lobe and Vision


We live in a culture that relies heavily on vision and, accordingly, vision research has motivated and dominated neuroscience research. The discovery and analysis of cortical visual areas
using electrophysiological and anatomical techniques was one
of the major milestones in visual neuroscience (e.g., Hubel and
Wiesel 1963, 1965; Zeki 1978). On the basis of the vast amount
of anatomical studies in the primate, we now have a picture of
a highly diverse and hierarchically structured system (Felleman
and Van Essen 1991). This hierarchical organization originates
in the geniculostriate pathway from visual area 1 (V1, primary
visual cortex) and beyond to an array of visual areas. Along this
hierarchical organization there is an increase both in the receptive field size and in the complexity of the optimal stimulus for neurons in each area. Converging evidence suggests that the visual
cortex is structured according to several principles of organization and functional neuroanatomical schemes. Following
are descriptions of some of the most important organizing
principles:
1. Topographical Organization. Topographic mapping can
chart an orderly and gradual change in some functional property of cortical neurons laid along the cortical surface. The most
fundamental transformation in vision is retinotopy. This involves
the transformation from a Euclidean coordinate system in the
retina to polar coordinates in the visual cortex. In this transformation, each of the early visual areas maps the visual field along
two orthogonal axes: the polar angle (points that lie on a specific radius whose origin is at the fovea have an identical polar
angle) and eccentricity (the distance from the fovea,
the center of the visual field). Areas in the left central vision area
are projected to the back of the right occipital lobe, whereas the
more peripheral areas are projected more anteriorly. The retinal
points on such radii are mapped onto parallel bands across cortical areas. The sequential layout of these bands reverses when
crossing from one visual area to another, providing a way for an
accurate delineation of the borders of these retinotopic areas
(Sereno et al. 1995). There is evidence that higher-order object-related areas are also topographically organized, but the basis for this organization is still subject to debate (for a review, see Grill-Spector and Malach 2004 and the following).
2. Visual Pathways. Lesion studies in primates and human
fMRI studies both suggest that there are two processing streams
between early, retinotopic visual areas in the occipital cortex
and higher-order processing centers in the occipito-temporal
and occipito-parietal lobes. These two streams are referred to
as the ventral and dorsal streams, respectively (Ungerleider
and Haxby 1994). Because the dorsal stream is also involved
in visuo-motor transformations, a differentiation is made
between vision for action (dorsal stream) and vision for perception (ventral stream) (Goodale and Milner 1992). The ventral stream contains structures devoted to the fine analysis of
a visual scene such as form and color. Thus, it is also known
as the "what" pathway. It consists of areas V1–V4 in the occipital
lobe and several regions that belong to the lateral and ventral
temporal lobe.
3. Functional Specialization. This principle of division of
labor, which leads to a specialization of function in the various cortical areas, was originally suggested for the visual system (Zeki 1978). Electrophysiological studies in nonhuman
primates have identified organizing principles in addition
to retinotopy, such as selectivity for simple features like spatial orientation in V1 and selectivity for categories of complex
stimuli like faces or spatial layouts in inferior temporal (IT)
cortex. In the last decade or so, the use of noninvasive functional imaging, particularly fMRI, has dramatically increased
our knowledge of the functional organization of the human
visual cortex and its relation to vision, due to its ability to provide a large-scale neuro-anatomical perspective (e.g. Martin
and Chao 2001). An active debate is ongoing about the actual
organization of the ventral stream (Grill-Spector and Malach
2004). One area in the lateral occipital cortex, the lateral occipital complex (LOC; Malach et al. 1995) responds strongly to
pictures of intact objects by contrast to scrambled objects or
nonobject textures. In the ventral occipito-temporal cortex,
specialized areas for faces (fusiform face area, FFA; Kanwisher,
McDermott, and Chun 1997), scenes (Epstein and Kanwisher
1998), and human body parts (Downing et al. 2001) have been
described, as well as for visual word forms (visual word form
area, VWFA; McCandliss, Cohen, and Dehaene 2003) which
has direct implications for the discussion here. Developing a
theoretical framework that captures these specialized regions
continues to be problematic, although the notion of widely
distributed and overlapping cortical object representations
remains a likely principle of organization (Haxby et al. 2001).
The effects of perceptual expertise for certain object categories
(Gauthier et al. 2000) and, more recently, different category-related resolution needs (Grill-Spector and Malach 2004) have
also been put forward as candidate organizational principles of
the human ventral stream.
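The retinotopic transformation described in item 1, from retinal Cartesian coordinates to the orthogonal axes of eccentricity and polar angle, can be made concrete with a short sketch. The function name and coordinate conventions below are illustrative assumptions, not taken from the cited literature:

```python
import math

def retina_to_polar(x, y):
    """Map a retinal position (Cartesian coordinates, fovea at the origin)
    to the two axes along which early visual areas map the visual field:
    eccentricity (distance from the fovea) and polar angle (the angle of
    the radius on which the point lies)."""
    eccentricity = math.hypot(x, y)
    polar_angle = math.degrees(math.atan2(y, x))
    return eccentricity, polar_angle

# Points on the same radius share a polar angle but differ in
# eccentricity: the two dimensions that the cortex maps orthogonally.
print(retina_to_polar(1.0, 1.0))
print(retina_to_polar(2.0, 2.0))
```

On the cortical surface itself, eccentricity then runs roughly from posterior (central vision) to anterior (peripheral vision), as described above.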

The Occipital Lobe, Multisensory Integration, and Nonvisual Processing
The perception of objects and the perception of space are cognitive functions of prime importance. In everyday life, these functions benefit from the coordinated interplay of vision, audition,
and touch. A central theme in sensory neurophysiology is that information processing in the primary and secondary sensory areas is strictly modality-specific. According to this view, the
occipital cortex processes vision, and integration of the different senses occurs only in higher-level areas. Recent evidence
suggests that the occipital cortex does in fact process nonvisual functions. We focus later on object recognition and object
naming, although similar results were obtained in the dorsal
stream (e.g., for visuo-tactile orientation, see Zangaladze et al.
1999).
Recognition of an object can involve a wide range of cues,
for example, a characteristic color, a unique texture, or a typical
sound. However, shape is a particularly fundamental feature for
recognizing and naming objects. Surprisingly, recent neuroimaging studies have found that visual and tactile object-related
information (both contribute to shape information) converges
in a lateral occipito-temporal ventral visual stream area (LOtv;
Amedi et al. 2001). A later study found that shape and not sensory modality is indeed the crucial factor in activating these
regions. The study used visual-to-auditory sensory substitution
devices (Bach-y-Rita and Kercel 2003) in which visual images
are captured by a camera and then transformed by a predetermined algorithm into soundscapes that preserve shape information. The study found that recognizing objects by their typical
sounds or learning to associate specific soundscapes with specific objects does not activate this region. Critically, soundscapes
synthesized to preserve shape information did activate LOtv
robustly. This suggests that LOtv is driven by the presence of
shape information, rather than by the sensory modality that
provides this information (Amedi et al. 2007). It is interesting to
note that a similar phenomenon of amodal representation in
occipito-temporal cortex was found for word recognition. The
study showed that the left basal occipito-temporal area shows
specificity to word processing, regardless of the sensory modality used. This area showed selective activation to words versus
non-word letter strings when subjects read using vision or when
blind individuals read Braille using touch (Buchel, Price, and
Friston 1998).
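The visual-to-auditory transformation sketched above (an image captured by a camera and converted by a fixed algorithm into a shape-preserving soundscape) can be illustrated with a toy example. Every detail here, the left-to-right column scan, rows mapped to log-spaced sine-tone frequencies, brightness mapped to amplitude, and all parameter values, is an assumption for illustration, not the algorithm of any actual device:

```python
import math

def image_to_soundscape(image, duration=1.0, f_min=200.0, f_max=2000.0,
                        sample_rate=8000):
    """Toy sensory substitution: scan the columns of a grayscale image
    left to right over time, give each row its own sine-tone frequency
    (top row = highest), and let pixel brightness set that tone's
    amplitude. `image` is a list of rows of floats in [0, 1]."""
    n_rows, n_cols = len(image), len(image[0])
    # One frequency per row, log-spaced from f_max (top) down to f_min.
    freqs = [f_max * (f_min / f_max) ** (r / max(1, n_rows - 1))
             for r in range(n_rows)]
    samples_per_col = int(sample_rate * duration / n_cols)
    signal = []
    for c in range(n_cols):
        for s in range(samples_per_col):
            t = (c * samples_per_col + s) / sample_rate
            v = sum(image[r][c] * math.sin(2 * math.pi * freqs[r] * t)
                    for r in range(n_rows))
            signal.append(v / n_rows)
    return signal

# A diagonal line in a 4x4 image becomes a falling pitch sweep.
img = [[1.0 if r == c else 0.0 for c in range(4)] for r in range(4)]
sound = image_to_soundscape(img)
```

Because the object's contour is preserved in the time-frequency layout of the sound, shape information survives the change of modality, which is the property the studies cited above exploit.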
Another example of cross-modal interactions is the case of
integration of heard and seen speech (letter-sound association), which is another crucial function for normal development
of language abilities. This function, however, was shown to
be mediated primarily by the temporal lobe (especially in the
superior temporal sulcus and gyrus, STS and STG, respectively).
For instance, a recent fMRI study (van Atteveldt et al. 2004)
showed that bilateral STS/STG responded more strongly to
bimodal matching of letter-sound pairs than to their respective
unimodal components. Note that correspondences between
speech sounds and mouth movements are learned implicitly
and early in development by exposure to heard speech together
with the sight of the speaker. In contrast, the visual representation of spoken language by written language is a cultural artifact. Therefore, associations between letters and speech sounds
are not learned automatically but require explicit instruction.
It is interesting that the learning of new letter-sound mappings
involves the occipital cortex, whereas the auditory association
cortex is active during the processing of previously acquired
matching letter-sound combinations (Hashimoto and Sakai
2004).


Language and Verbal Processing in the Occipital Lobe


READING AND DYSLEXIA; NAMING AND ALEXIA. Reading words
and naming visual objects involves the association of visual stimuli with phonological and semantic knowledge. Damage to
the left occipital lobe can result in pure alexia: the inability to read
without loss of the ability to write or of any other major
language-related function (Damasio and Damasio 1983). More
recent neuroimaging studies support this view by showing correlations of the left occipito-temporal cortex with linguistic aspects of reading and object naming based on visual input
in normal subjects. No less informative is the study of abnormal
patterns of activation in this part of the brain in subjects with
developmental dyslexia (Schlaggar and McCandliss 2007).
VISUAL WORD FORM IN THE SIGHTED AND TACTILE BRAILLE IN THE
BLIND. An interesting example of such language-related processing in the occipital lobe is the case of visual word form in the
sighted. Related to this is the case of occipital activation during
Braille reading in the blind. In both cases, activation was found
for both words and letter strings. The specifics and significance of
these two examples are discussed in this section.
One of the most hotly debated topics in the context of modular versus general architecture organization in the occipital lobe
is the existence of the human visual word form area, which is
dedicated to the construction of visual words and thus is a key
player in our ability to read. Like other language-related areas
in the prefrontal and parietal cortex, this area also has a strong
hemispheric dominance located in the left occipito-temporal
sulcus bordering the fusiform gyrus (McCandliss, Cohen, and
Dehaene 2003). Other methodologies, such as recording field
potentials in awake humans, showed selectivity to words and
letter strings in similar parts of the left occipital cortex (Nobre,
Allison, and McCarthy 1994). Recently, a causal link between
lesions in VWFA and acquired alexia without agraphia was demonstrated in a patient who had a patch of his cortex removed in
surgery, causing activation in VWFA to disappear (Gaillard et al.
2006; Martin 2006).
As in the case of the fusiform face area, some investigators
have suggested that there is no reason to label the VWFA a separate modular brain area: A ventral occipito-temporal lesion may not be specific in causing pure alexia, and the reading disorder might not be limited to words, or could be a manifestation of a more general visual processing deficit (Price and Devlin 2003).
What happens when early blind subjects read using a different sensory modality? Recent neuroimaging studies in the blind
have demonstrated robust occipital cortex activation during
Braille reading (see blindness and language). In this case,
activation is not limited to occipito-temporal areas but stretches
all the way to the primary visual cortex (Sadato et al. 1996), and
interference in processing in the occipital cortex using transcranial magnetic stimulation (TMS) increases the error rate in
Braille reading (Cohen et al. 1997).
OCCIPITAL LOBE AND PLASTICITY IN LANGUAGE AND VERBAL
MEMORY FUNCTIONS. Recent neuroimaging studies in the blind
have demonstrated robust occipital cortex activation during
a wide variety of linguistic and specifically semantic judgment

tasks and during speech processing (see Burton, Diamond, and McDermott 2003; Amedi et al. 2003; Pascual-Leone et al. 2005; Röder et al. 2002).
For instance, robust plasticity in the left occipital areas of
the blind is evident during verbal memory tasks requiring the
retrieval of abstract words from long-term memory in early
blind individuals (Amedi et al. 2003). In this case, the observed
occipital activation occurred without introducing any tactile or
auditory sensory input. Notably, blind subjects showed superior
verbal memory capabilities, compared not only to age-matched,
sighted controls but also to reported population averages.
Furthermore, in the blind group only, a strong positive correlation was found between the magnitude of V1 activation and the
verbal memory capabilities of individual subjects.
More directly related to language processing, several studies have used a verb-generation task, in which both blind and
sighted subjects were instructed to generate a verb in response
to a noun cue. The sighted group showed activation in typical
language-related areas (e.g., Broca's area in the prefrontal
cortex, which was activated as well in the blind), but no occipital activation. The blind group, however, showed additional
robust activation in the occipital cortex (Burton, Diamond, and
McDermott 2003; Amedi et al. 2003). Furthermore, whereas research in bilingual subjects has demonstrated convergence of two languages in prefrontal cortex during semantic tasks (e.g., Crinion et al. 2006), bilingual blind subjects show additional convergence of the two languages in the posterior occipital cortex, including the primary visual cortex (Ofan
and Zohary 2006). In addition, fMRI studies have shown that
effective connectivity between the prefrontal and occipital cortex is increased in blind individuals during semantic processing.
Both early blind and sighted subjects activate a left-lateralized
fronto-temporal core semantic retrieval system. However, blind
subjects activate additional extra-striate regions, which are coupled with frontal and temporal semantic regions (Noppeney,
Friston, and Price 2003; Liu et al. 2007).
Finally, it should be pointed out that neuroimaging, at best,
establishes an association between brain activity and task performance. A causal link between occipital areas and semantic
processing was reported in a recent transcranial magnetic stimulation study. TMS targeted over the left V1 or left occipito-temporal cortex led to disruption, increasing the error rate in a
similar verb-generation task in blind but not in sighted subjects
(Amedi et al. 2004). An analysis of error types revealed that the
most common error produced by the TMS was semantic (e.g.,
"apple" would lead to the verb "jump"), whereas phonological errors
and interference with motor execution or articulation were rare.
These results suggest that processing language and verbal
memory in the blind incorporates a widespread network that
encompasses occipital visual brain areas, and that this type of
reorganization of language and memory is relevant to behavior.
It is clear, for example, that the functional and structural identity of the occipital cortex may switch from processing visual
information to processing information related to another sensory modality or even different language functions. However,
is this a unique consequence of early blindness? As shown here
in some examples, the occipital cortex may inherently possess
the computational machinery needed for nonvisual information processing. Under specific conditions, this potential may be realized. If so, visual deprivation may simply allow for the
emergence of the true potential of certain brain regions. This
hypothesis also suggests that careful task choice and experimental design (e.g., blindfolding sighted subjects for several days)
may reveal additional nonvisual, linguistic roles in the occipital
cortex in the sighted (Pascual-Leone et al. 2005).
Amir Amedi
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Amedi, A., et al. 2007. Shape conveyed by visual-to-auditory sensory substitution activates the lateral occipital complex. Nature Neuroscience 10: 687–9.
Amedi, A., R. Malach, T. Hendler, S. Peled, and E. Zohary. 2001. Visuo-haptic object-related activation in the ventral visual pathway. Nature Neuroscience 4: 324–30.
Amedi, A., N. Raz, P. Pianka, R. Malach, and E. Zohary. 2003. Early visual cortex activation correlates with superior verbal memory performance in the blind. Nature Neuroscience 6: 758–66.
Amedi, A., A. Floel, S. Knecht, E. Zohary, and L. G. Cohen. 2004. Transcranial magnetic stimulation of the occipital pole interferes with verbal processing in blind subjects. Nature Neuroscience 7: 1266–70.
Bach-y-Rita, P., and S. W. Kercel. 2003. Sensory substitution and the human-machine interface. Trends in Cognitive Sciences 7: 541–6.
Buchel, C., C. Price, and K. Friston. 1998. A multimodal language region in the ventral visual pathway. Nature 394: 274–7.
Burton, H., J. B. Diamond, and K. B. McDermott. 2003. Dissociating cortical regions activated by semantic and phonological tasks to heard words: An fMRI study in blind and sighted individuals. Journal of Neurophysiology 90: 1965–82.
Cohen, L. G., P. Celnik, A. Pascual-Leone, B. Corwell, L. Falz, et al. 1997. Functional relevance of cross-modal plasticity in blind humans. Nature 389: 180–3.
Crinion, J., et al. 2006. Language control in the bilingual brain. Science 312: 1537–40.
Damasio, A. R., and H. Damasio. 1983. The anatomic basis of pure alexia. Neurology 33: 1573–83.
Downing, P. E., Y. Jiang, M. Shuman, and N. Kanwisher. 2001. A cortical area selective for visual processing of the human body. Science 293: 2470–3.
Epstein, R., and N. Kanwisher. 1998. A cortical representation of the local visual environment. Nature 392: 598–601.
Felleman, D. J., and D. C. Van Essen. 1991. Distributed hierarchical processing in the primate cerebral cortex. Cerebral Cortex 1: 1–47.
Gaillard, R., L. Naccache, P. Pinel, S. Clemenceau, E. Volle, et al. 2006. Direct intracranial, fMRI, and lesion evidence for the causal role of left inferotemporal cortex in reading. Neuron 50: 191–204.
Gauthier, I., P. Skudlarski, J. C. Gore, and A. W. Anderson. 2000. Expertise for cars and birds recruits brain areas involved in face recognition. Nature Neuroscience 3: 191–7.
Goodale, M. A., and A. D. Milner. 1992. Separate visual pathways for perception and action. Trends in Neurosciences 15: 20–5.
Grill-Spector, K., and R. Malach. 2004. The human visual cortex. Annual Review of Neuroscience 27: 649–77.
Hashimoto, R., and K. L. Sakai. 2004. Learning letters in adulthood: Direct visualization of cortical plasticity for forming a new link between orthography and phonology. Neuron 42: 311–22.
Haxby, J. V., et al. 2001. Distributed and overlapping representations of faces and objects in ventral temporal cortex. Science 293: 2425–30.
Hubel, D. H., and T. N. Wiesel. 1963. Shape and arrangement of columns in cat's striate cortex. Journal of Physiology 165: 559–68.
Hubel, D. H., and T. N. Wiesel. 1965. Receptive fields and functional architecture in two nonstriate visual areas (18 and 19) of the cat. Journal of Neurophysiology 28: 229–89.
Kanwisher, N., J. McDermott, and M. M. Chun. 1997. The fusiform face area: A module in human extrastriate cortex specialized for face perception. Journal of Neuroscience 17: 4302–11.
Liu, Y., et al. 2007. Whole brain functional connectivity in the early blind. Brain 130: 2085–96.
Malach, R., et al. 1995. Object-related activity revealed by functional magnetic resonance imaging in human occipital cortex. Proceedings of the National Academy of Sciences 92: 8135–9.
Martin, A. 2006. Shades of Dejerine: Forging a causal link between the visual word form area and reading. Neuron 50: 173–5.
Martin, A., and L. L. Chao. 2001. Semantic memory and the brain: Structure and processes. Current Opinion in Neurobiology 11: 194–201.
McCandliss, B. D., L. Cohen, and S. Dehaene. 2003. The visual word form area: Expertise for reading in the fusiform gyrus. Trends in Cognitive Sciences 7: 293–9.
Nobre, A. C., T. Allison, and G. McCarthy. 1994. Word recognition in the human inferior temporal lobe. Nature 372: 260–3.
Noppeney, U., K. J. Friston, and C. J. Price. 2003. Effects of visual deprivation on the organization of the semantic system. Brain 126: 1620–7.
Ofan, R. H., and E. Zohary. 2006. Visual cortex activation in bilingual blind individuals during use of native and second language. Cerebral Cortex 17: 1249–59.
Pascual-Leone, A., A. Amedi, F. Fregni, and L. B. Merabet. 2005. The plastic human brain cortex. Annual Review of Neuroscience 28: 377–401.
Price, C. J., and J. T. Devlin. 2003. The myth of the visual word form area. Neuroimage 19: 473–81.
Röder, B., O. Stock, S. Bien, H. Neville, and F. Rösler. 2002. Speech processing activates visual cortex in congenitally blind humans. European Journal of Neuroscience 16: 930–6.
Sadato, N., A. Pascual-Leone, J. Grafman, V. Ibanez, M. P. Deiber, et al. 1996. Activation of the primary visual cortex by Braille reading in blind subjects. Nature 380: 526–8.
Schlaggar, B. L., and B. D. McCandliss. 2007. Development of neural systems for reading. Annual Review of Neuroscience 30: 475–503.
Sereno, M. I., et al. 1995. Borders of multiple visual areas in humans revealed by functional magnetic resonance imaging. Science 268: 889–93.
Ungerleider, L. G., and J. V. Haxby. 1994. "What" and "where" in the human brain. Current Opinion in Neurobiology 4: 157–65.
van Atteveldt, N., E. Formisano, R. Goebel, and L. Blomert. 2004. Integration of letters and speech sounds in the human brain. Neuron 43: 271–82.
Zangaladze, A., C. M. Epstein, S. T. Grafton, and K. Sathian. 1999. Involvement of visual cortex in tactile discrimination of orientation. Nature 401: 587–90.
Zeki, S. M. 1978. Functional specialization in the visual cortex of the rhesus monkey. Nature 274: 423–8.

OPTIMALITY THEORY
Optimality Theory (OT; Prince and Smolensky [1993] 2004) is a
formal theory of constraint interaction in grammar that seeks
to explain how and to what extent natural languages may vary.
In addition to this central question of generative grammar,
research in OT addresses questions of the grammars use in performance, its acquisition, and its neural realization.


ARCHITECTURE. An OT grammar maps an input specification onto an output structure. In phonology, the input is typically
an underlying form and the output the corresponding surface
form. In syntax the input is a proposition and the output is the
grammatical form that expresses that meaning (excepting ineffability: see the section on Faithfulness). Gen (generator) is a
mechanism for producing candidate outputs for any input and
freely generates all of the types of structures that are present
in any of the world's languages (McCarthy and Prince 1993).
Pruning this enormous set down to the grammatical forms is the
job of H-Eval (harmony evaluator), a procedure for evaluating
the relative well-formedness (or harmony) of candidate structural descriptions. H-Eval depends on a set of universal well-formedness constraints, Con.

COMPETITION AND CONFLICT. The fact that well-formedness constraints are violable in OT means that two constraints often conflict: Satisfying one requires violating the other. Grammaticality is therefore not equated with satisfaction of all grammatical constraints. Grammatical structures are simply structures that suffer less severe constraint violations than their ungrammatical counterparts. This means that the evaluation of grammaticality is inherently comparative. At any level of description (phonological, syntactic, etc.), the universal set of possible structural descriptions of an input I, Gen(I), forms a candidate set, a collection of competitors only the most harmonic of which are grammatical.

The competition (via H-Eval) evaluates each pair of candidates against the universal constraint set Con, which is ordered into a language-particular domination hierarchy or ranking; C1 >> C2 means that constraint C1 dominates constraint C2. In OT, constraint domination is strict: One violation of any constraint C is always worse than violating constraints ranked lower than C, regardless of how many lower-ranked constraints are violated and regardless of how severe the violations of those lower-ranked constraints are. Given two candidate structural descriptions p and q for an input I, p has higher harmony (p ≻ q) if p is preferred by the highest-ranked constraint that does not evaluate p and q as equal. A candidate p is optimal if there is no other candidate q with higher harmony. If p is optimal, then p cannot violate any constraint C unless every competing candidate that is preferred to p by C is dispreferred to p by a constraint ranked higher than C. In this sense, violations incurred by an optimal candidate structure are minimal. In sum, for every input I, harmony optimization over the candidates in Gen(I) determines (at least) one optimal, though not necessarily perfect, structural description of I, which is ipso facto declared to be a grammatical output for that input.

The only cross-linguistically varying property of the grammar is the relative ranking of the universal constraints in Con: The set of all possible grammars, then, is exactly the set of all rankings of this fixed set of constraints. Typically, any given empirical pattern that is predicted to be part of the universal typology is generated by many different (but typologically equivalent) rankings, and the number of predicted possible typological patterns is vastly smaller than the number of all possible rankings (e.g., 13 patterns vs. 40,320 rankings in Smolensky and Legendre 2006, Chapter 15).

Employing violable, conflicting constraints often means that the universal constraints can be more simply stated: Complexity emerges primarily from the interaction of simple constraints; this reduces the need for hedges or disjunctive principles that arise when universal constraints are construed as inviolable (Speas 1997). Constraint ranking naturally captures the common situation in which some phenomenon widely observed in other languages is seen in only one context in a language L (e.g., null subjects in main clauses only, as in Old French). Stipulating the limited distribution is unnecessary: In L, a lower-ranked constraint C (e.g., against null elements), often violated in grammatical forms of L, makes itself felt in those special contexts where dominating constraints do not contravene. In other languages, where C is ranked more highly, the phenomenon is seen widely.

FAITHFULNESS. The most familiar grammatical constraints are markedness constraints, which demand that the output structure meet some well-formedness condition (e.g., a subject must
be an agent; Smolensky and Legendre 2006, Chapter 15). In an
optimizing grammar, unless there is pressure for the output to
contain all and only the elements contained in the input, the
optimizing system would always simply return the best of all
structures (under the given ranking). Grammars must therefore
include (input–output) faithfulness constraints, which require
minimal structural distance between the output and the input.
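The evaluation procedure described above can be sketched in a few lines: under strict domination, comparing candidates by their violation profiles, ordered from highest- to lowest-ranked constraint, is exactly lexicographic comparison. The constraints and candidates below are invented toy examples ('+' marks an epenthesized segment, '.' separates syllables):

```python
def violations(candidate, ranked_constraints):
    """Violation profile under a ranking: a tuple of violation counts,
    highest-ranked constraint first."""
    return tuple(c(candidate) for c in ranked_constraints)

def optimal(candidates, ranked_constraints):
    """H-Eval under strict domination: lexicographic comparison of the
    profiles means one violation of a higher-ranked constraint outweighs
    any number of lower-ranked violations."""
    return min(candidates, key=lambda cand: violations(cand, ranked_constraints))

# Toy constraints: ONSET penalizes vowel-initial (onsetless) syllables;
# DEP, a faithfulness constraint, penalizes epenthesized segments ('+').
ONSET = lambda form: sum(1 for syll in form.split('.') if syll[0] in 'aeiou')
DEP = lambda form: form.count('+')

candidates = ['a.pa', 't+a.pa']  # faithful vs. epenthetic onset
print(optimal(candidates, [ONSET, DEP]))  # ONSET >> DEP: 't+a.pa' wins
print(optimal(candidates, [DEP, ONSET]))  # DEP >> ONSET: 'a.pa' wins
```

Reranking the same constraint set generates the typology: Here the two rankings yield the two possible repairs, epenthesizing an onset versus tolerating the onsetless syllable.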

OT in Theoretical Linguistics
Characterizing the set of possible natural languages minimally
requires specifying i) the mental representations that are characteristic of language, ii) the constraints that distinguish possible
from impossible linguistic systems, and iii) the formal mode of
interaction among these constraints. Concerning mental representations, OT imposes no restrictions beyond requiring that
specifications of phonological, syntactic, or semantic structure
be explicit in the sense of generative grammar. This makes OT
compatible with alternative substantive theories of particular grammar components; in syntax, for example, OT versions
of government and binding (Grimshaw 1997; Legendre,
Smolensky, and Wilson 1998), lexical-functional grammar (Bresnan 2000), and the minimalist program (Müller
1997) are flourishing. OTs main contribution concerns constraint interaction (iii) and, as a consequence, the proper formal
characterization of the constraints themselves (ii). Hence, OT
is best characterized as a meta-theory of grammatical structure
compatible with any explicit theory of linguistic representation.
It is therefore applicable to all linguistic levels, and has been
applied to phonology (McCarthy and Prince 1993; Prince and
Smolensky [1993] 2004), syntax (Legendre, Grimshaw and
Vikner 2001), semantics (Hendriks and de Hoop 2001), and
pragmatics (Blutner and Zeevat 2004; Blutner, De Hoop, and
Hendriks 2006).
According to OT, all grammatical constraints are universal
and violable (or "soft"), a claim that represents a major departure from previous approaches to the characterization of phonological and syntactic knowledge via language-particular rewrite
rules written in a universal notation, or as universal, inviolable
constraints supplemented by additional principles subject to
language-particular parameterization.

Optimality Theory
Faithfulness constraints, unique to OT, have been shown to
operate at all levels of linguistic description. In syntax, faithfulness constraints play a crucial role in accounting for language-particular ineffability, that is, syntactic structures that are simply
impossible in certain languages, for example, multiple wh-questions in Irish (Legendre, Smolensky, and Wilson 1998; Legendre
2009). In most languages, faithfulness to question operators in a
semantic input forces the optimal syntactic output to contain multiple wh-phrases. In languages like Irish, however, such faithfulness is ranked below the syntactic constraints violated by clauses
containing multiple wh-phrases, and thus no optimal syntactic
structure contains multiple wh-phrases. Multiple wh-questions
are therefore inexpressible (in a single clause).
VARIATION. OT is naturally extensible to unstable states of language, such as free variation (Anttila 1998), dialectal variation,
and diachronic change (Nagy and Reynolds 1997). Relaxing the
requirement of a complete ranking of the universal constraints, a
language L may be characterized by a single partial ranking P; L
is generated by the set of grammars S full rankings consistent
with P. These generate a set of outputs for each input. According
to Joan Bresnan, Ashwini Deo, and Devyani Sharma (2007),
intraspeaker variation in the British paradigm for be arises from
a partial ranking P among faithfulness constraints, which require an output to express input agreement features, and markedness constraints, which penalize all features in the output. A total ranking
consistent with P having more highly ranked faithfulness yields a
more highly inflected paradigm.
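The partial-ranking account can be sketched directly: enumerate the total rankings consistent with P and collect the winner under each. The constraint names and the two competing forms below are invented for illustration, loosely echoing the agreement example.

```python
# Sketch of OT variation via partial ranking: a partial order P licenses
# every total ranking consistent with it, and different total rankings can
# select different winners. Names and violations are invented.
from itertools import permutations

def optimal(candidates, ranking):
    return min(candidates,
               key=lambda c: tuple(candidates[c].get(k, 0) for k in ranking))

def variants(candidates, constraints, partial):
    """partial: set of (higher, lower) pairs every ranking must respect."""
    outs = set()
    for r in permutations(constraints):
        if all(r.index(hi) < r.index(lo) for hi, lo in partial):
            outs.add(optimal(candidates, r))
    return outs

# FAITH-AGR demands expression of input agreement; *AGR penalizes it.
candidates = {"he don't": {"FAITH-AGR": 1}, "he doesn't": {"*AGR": 1}}

# An empty P leaves the two constraints unranked: both outputs are generated.
print(variants(candidates, ["FAITH-AGR", "*AGR"], partial=set()))
# Fixing FAITH-AGR above *AGR eliminates the variation.
print(variants(candidates, ["FAITH-AGR", "*AGR"], {("FAITH-AGR", "*AGR")}))
```

A language with intraspeaker variation thus corresponds to a P that leaves the crucial constraints mutually unranked, while a categorical language corresponds to a P that fixes them.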
FURTHER DEVELOPMENTS. OT is an evolving theory; major developments include the following. Faithfulness: Output-output
faithfulness requires identity of a morpheme's exponent across
its paradigm (Burzio 1994; Benua 1995); output-output anti-faithfulness constraints achieve morphophonological alternations by
demanding nonidentical surface forms for distinct underlying
forms (Alderete 2001); sympathy theory demands faithfulness
to suboptimal candidates (McCarthy 1999b). Harmonic evaluation: Comparative markedness evaluation distinguishes constraint violations that are shared with the most faithful output
from those that are not (McCarthy 2003); targeted constraints
compare only candidates that differ in a specified way
(Wilson 2001b). Architecture: Stratal OT assumes differing rankings in a series of lexical levels (Kiparsky 2006); harmonic serialism derives the surface form from a series of small alterations,
each optimal at its point in the derivation (Prince and Smolensky
[1993] 2004; McCarthy 1999a); candidate chain theory evaluates entire derivations (McCarthy 2007); bidirectional optimization adds competition of interpretations/underlying forms to
the competition of expressions/surface forms that is standard in
OT (Smolensky 1996; Blutner 2000; Wilson 2001a). Probabilistic
formulations: In stochastic OT (Boersma 1998), each optimization ranks the constraints according to relative numerical values randomly selected for that optimization from a probability
distribution for each constraint, with a mean value determined
by that constraint's strength in the grammar. In the maximum
entropy formulation of OT (Hayes and Wilson 2008), harmony is
numerical: Each constraint has a numerical strength determining the size of the penalties it assesses to violating candidates; the probability of a candidate is proportional to the exponential of its harmony. Gaja Jarosz (2006) defines a version of OT phonology,
maximum likelihood learning of lexicons and grammars (MLG),
in which underlying forms as well as rankings have probability
distributions, defining a lexicon + grammar. The relative probability of a form has been used to model its gradient acceptability
(Boersma and Hayes 2001; Hayes and Wilson 2008).
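The maximum-entropy formulation lends itself to a short numerical sketch: harmony is minus the weighted sum of violations, and candidate probabilities are proportional to the exponential of harmony. The weights and violation profiles below are invented for illustration.

```python
# Sketch of maxent OT: each constraint has a numerical weight, harmony is
# minus the weighted violation sum, and a candidate's probability is
# proportional to exp(harmony). Weights and violations are invented.
from math import exp

def maxent_probs(candidates, weights):
    """candidates: {output: {constraint: violation count}}."""
    harmony = {c: -sum(weights[k] * v for k, v in viol.items())
               for c, viol in candidates.items()}
    z = sum(exp(h) for h in harmony.values())          # normalizing constant
    return {c: exp(h) / z for c, h in harmony.items()}

candidates = {"tab": {"NOCODA": 1}, "ta": {"MAX": 1}}
probs = maxent_probs(candidates, {"NOCODA": 1.0, "MAX": 3.0})
# The candidate violating only the weakly weighted constraint dominates,
# but the loser retains nonzero probability, unlike in strict-ranking OT.
print(probs)
```

Unlike a strict ranking, which always returns a single winner, the weighted formulation assigns every candidate some probability, which is what allows it to model gradient acceptability.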

OT and Grammar Use


OT is well suited for theories of performance. With no additional
machinery, standard OT grammars assign a structural description to all inputs, including loanword inputs violating the phonotactics of the borrowing language (Yip 1993; Davidson, Jusczyk,
and Smolensky 2006) or the initial fragment of a sentence being
processed word by word (Gibson and Broihier 1998; Stevenson
and Smolensky 2006). In the latter case, processing difficulty is
predicted to occur if the optimal parse of the initial portion of
a sentence changes substantially when a new word arrives. On
a formal level, the computational complexity of the problem
of computing optimal outputs is well studied (e.g., Tesar 1996;
J. Eisner 1997; Frank and Satta 1998; Idsardi 2006).

OT and Grammar Acquisition


OT's account of variation and change provides a natural extension
to the analysis of acquisition of phonology and syntax (Legendre
et al. 2002; Kager, Pater, and Zonneveld 2004; Legendre et al.
2004). In early child syntax, for example, a child's competence
may be characterized by a partial ranking P between faithfulness constraints and constraints penalizing syntactic structure
(Legendre et al. 2002). In some of the full rankings consistent
with P, higher-ranked faithfulness constraints lead to optimal
clauses with functional projections; in other rankings consistent
with P, lower ranking of these faithfulness constraints entails that
optimal outputs lack some or all functional projections. The variation in child production of tense and agreement marking is thus
given a principled grammatical account.
Formal and computational studies of the problem of learning
OT grammars have been extensive, including constraint learning (Hayes and Wilson 2008), ranking learning (Tesar 1998; Tesar
and Smolensky 1998; Jason Eisner 2000; Boersma and Hayes
2001; Hayes 2004; Prince and Tesar 2004), simultaneous learning
of a probabilistic phonological lexicon and a ranking by maximum likelihood estimation (Jarosz 2006), and the mathematical
logic of OT learning (Prince 2002, 2006).

OTs Neural Realization


OT has historical roots in debates (Pinker and Prince
1988; Smolensky 1988) concerning neural network (or
connectionist) cognitive models (Rumelhart, McClelland,
and the PDP Research Group 1986). In these models, networks
of abstract model neurons excite and inhibit one another as activation spreads from input neurons to output neurons. Formal
analysis of a class of networks reveals that they perform optimization: They compute mental representations (activation patterns) that maximize a numerical measure of self-consistency
or well-formedness: harmony (Smolensky 1986). Mathematical
analysis makes precise the following general picture: At a
lower level of description, spreading activation among abstract

neurons maximizes numerical harmony; at a higher level of
description, the same system computes the symbolic structural
description of the input that optimizes the harmony of an OT
grammar. Construed in these terms, the study of grammar is
fully integrated into the contemporary science of the mind/brain
(Smolensky and Legendre 2006).
Géraldine Legendre and Paul Smolensky
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Alderete, John. 2001. Dominance effects as transderivational anti-faithfulness. Phonology 18: 201–53.
Anttila, Arto. 1998. Deriving variation from grammar. In Variation,
Change, and Phonological Theory, ed. F. Hinskens, R. van Hout, and
W. L. Wetzels, 35–68. Amsterdam: Benjamins.
Benua, Laura. 1995. Output-output faithfulness. In University of
Massachusetts Occasional Papers in Linguistics 18: Papers in Optimality
Theory, ed. Jill Beckman, Laura Walsh Dickey, and Suzanne Urbanczyk,
77–136. Amherst: University of Massachusetts at Amherst, GLSA.
Blutner, Reinhard. 2000. Some aspects of optimality in natural language
interpretation. Journal of Semantics 17: 189–216.
Blutner, Reinhard, Helen De Hoop, and Petra Hendriks. 2006. Optimal
Communication. Stanford, CA: CSLI Publications.
Blutner, Reinhard, and Henk Zeevat, eds. 2004. Pragmatics in Optimality
Theory. London: Palgrave Macmillan.
Boersma, Paul. 1998. Functional Phonology: Formalizing the Interactions
between Articulatory and Perceptual Drives. The Hague: Holland
Academic Graphics.
Boersma, Paul, and Bruce Hayes. 2001. Empirical tests of the gradual
learning algorithm. Linguistic Inquiry 32: 45–86.
Bresnan, Joan. 2000. Optimal syntax. In Optimality Theory: Phonology,
Syntax and Acquisition, ed. Joost Dekkers, Frank van der Leeuw, and
Jeroen van de Weijer, 334–85. Oxford: Oxford University Press.
Bresnan, Joan, Ashwini Deo, and Devyani Sharma. 2007. Typology in
variation: A probabilistic approach to be and n't in the Survey of
English Dialects. English Language and Linguistics 11: 301–46.
Burzio, Luigi. 1994. Principles of English Stress. Cambridge: Cambridge
University Press.
Davidson, Lisa, Peter W. Jusczyk, and Paul Smolensky. 2006. Optimality
in language acquisition I: The initial and final states of the phonological grammar. In The Harmonic Mind: From Neural Computation
to Optimality-Theoretic Grammar. Vol. 2. Ed. Paul Smolensky and
Géraldine Legendre, 793–839. Cambridge, MA: MIT Press.
Eisner, J. 1997. Efficient generation in primitive optimality theory. Annual
Meeting of the Association for Computational Linguistics 35: 313–20.
Eisner, Jason. 2000. Easy and hard constraint ranking in optimality theory: Algorithms and complexity. In Finite-State Phonology: Proceedings
of the Fifth Workshop of the ACL Special Interest Group in Computational
Phonology (Sigphon), ed. Jason Eisner, Lauri Karttunen, and A. Thériault,
22–33. Morristown, NJ: Association for Computational Linguistics.
Frank, Robert, and Giorgio Satta. 1998. Optimality theory and the generative complexity of constraint violability. Computational Linguistics
24: 307–15.
Gibson, Edward, and Kevin Broihier. 1998. Optimality theory and human
sentence processing. In Is the Best Good Enough? Optimality and
Competition in Syntax, ed. Pilar Barbosa, Danny Fox, Paul Hagstrom,
Martha McGinnis, and David Pesetsky, 157–91. MIT Working Papers
in Linguistics. Cambridge, MA: MIT Press.
Grimshaw, Jane. 1997. Projection, heads, and optimality. Linguistic
Inquiry 28: 373–422.
Hayes, Bruce. 2004. Phonological acquisition in optimality theory: The
early stages. In Constraints in Phonological Acquisition, ed. René Kager, Joe Pater, and Wim Zonneveld. Cambridge: Cambridge University Press.
Hayes, Bruce, and Colin Wilson. 2008. A maximum entropy model of phonotactics and phonotactic learning. Linguistic Inquiry 39: 379–440.
Hendriks, Petra, and Helen de Hoop. 2001. Optimality theoretic semantics. Linguistics and Philosophy 24: 1–32.
Idsardi, William J. 2006. A simple proof that optimality theory is computationally intractable. Linguistic Inquiry 37: 271–5.
Jarosz, Gaja. 2006. Rich lexicons and restrictive grammars: Maximum
likelihood learning in optimality theory. Ph.D. thesis, Johns Hopkins
University.
Kager, René. 1999. Optimality Theory. Cambridge: Cambridge University
Press.
Kager, René, Joe Pater, and Wim Zonneveld, eds. 2004. Constraints in
Phonological Acquisition. Cambridge: Cambridge University Press.
Kiparsky, Paul. 2006. Paradigms and Opacity. Stanford, CA: CSLI
Publications.
Legendre, Géraldine. 2009. Ineffability in syntax. In Modeling
Ungrammaticality in Optimality Theory, ed. Curt Rice, 237–66.
London: Equinox.
Legendre, Géraldine, Jane Grimshaw, and Sten Vikner, eds. 2001.
Optimality-Theoretic Syntax. Cambridge, MA: MIT Press.
Legendre, Géraldine, Paul Hagstrom, Joan Chen-Main, Liang Tao, and
Paul Smolensky. 2004. Deriving output probabilities in child Mandarin from a dual-optimization grammar. Lingua 114: 1147–85.
Legendre, Géraldine, Paul Hagstrom, Anne Vainikka, and Marina
Todorova. 2002. Partial constraint ordering in child French syntax.
Language Acquisition 10: 189–227.
Legendre, Géraldine, Paul Smolensky, and Colin Wilson. 1998. When
is less more? Faithfulness and minimal links in wh-chains. In Is the
Best Good Enough? Optimality and Competition in Syntax, ed. Pilar
Barbosa, Danny Fox, Paul Hagstrom, Martha McGinnis, and David
Pesetsky, 249–89. MIT Working Papers in Linguistics. Cambridge,
MA: MIT Press.
McCarthy, John J. 1999a. Harmonic serialism and parallelism. NELS 30:
501–24.
. 1999b. Sympathy and phonological opacity. Phonology
16: 331–99.
. 2002. A Thematic Guide to Optimality Theory. Cambridge:
Cambridge University Press.
. 2003. Comparative markedness. Theoretical Linguistics 29:
1–51.
. 2007. Hidden Generalizations: Phonological Opacity in Optimality
Theory. London: Equinox.
McCarthy, John J., and Alan Prince. 1993. Prosodic morphology
I: Constraint interaction and satisfaction. Technical Report RuCCS-TR-3. Rutgers Center for Cognitive Science, Rutgers University, and
University of Massachusetts at Amherst.
Müller, Gereon. 1997. Partial wh-movement and optimality theory.
Linguistic Review 14: 249–306.
Nagy, Naomi, and William Reynolds. 1997. Optimality theory and variable word-final deletion in Faetar. Language Variation and Change
9: 37–55.
Pinker, Steven, and Alan Prince. 1988. On language and connectionism: Analysis of a parallel distributed processing model of language
acquisition. Cognition 28: 73–193.
Prince, Alan. 2002. Entailed ranking arguments. Manuscript, Rutgers
University, New Brunswick, NJ.
. 2006. Implication and impossibility in grammatical systems: What it is and how to find it. Manuscript, Rutgers University,
New Brunswick, NJ.
Prince, Alan, and Paul Smolensky. [1993] 2004. Optimality
Theory: Constraint Interaction in Generative Grammar. Malden,

MA: Blackwell. Original version was a technical report, Rutgers
University and University of Colorado at Boulder.
Prince, Alan, and Bruce B. Tesar. 2004. Learning phonotactic distributions. In Constraints in Phonological Acquisition, ed. René Kager, Joe
Pater, and Wim Zonneveld. Cambridge: Cambridge University Press.
Rumelhart, David E., James L. McClelland, and the PDP Research Group.
1986. Parallel Distributed Processing: Explorations in the Microstructure
of Cognition. 2 vols. Cambridge, MA: MIT Press.
Smolensky, Paul. 1986. Information processing in dynamical systems: Foundations of harmony theory. In David E. Rumelhart, James
L. McClelland, and the PDP Research Group, Parallel Distributed
Processing: Explorations in the Microstructure of Cognition, I: 194–281.
Cambridge, MA: MIT Press.
. 1988. On the proper treatment of connectionism. Behavioral
and Brain Sciences 11: 1–74.
. 1996. On the comprehension/production dilemma in child language. Linguistic Inquiry 27: 720–31.
Smolensky, Paul, and Géraldine Legendre. 2006. The Harmonic
Mind: From Neural Computation to Optimality-Theoretic Grammar.
Vol. 1. Cognitive Architecture. Vol. 2. Linguistic and Philosophical
Implications. Cambridge, MA: MIT Press.
Speas, Margaret. 1997. Optimality theory and syntax: Null pronouns and
control. In Optimality Theory: An Overview, ed. Diana Archangeli and
D. Terrence Langendoen, 171–99. Malden, MA: Blackwell.
Stevenson, Suzanne, and Paul Smolensky. 2006. Optimality in sentence processing. In The Harmonic Mind: From Neural Computation
to Optimality-Theoretic Grammar. Vol. 2. Ed. Paul Smolensky and
Géraldine Legendre, 307–38. Cambridge, MA: MIT Press.
Tesar, Bruce B. 1996. Computing optimal descriptions for optimality
theory grammars with context-free position structures. Proceedings of
the Thirty-Fourth Annual Meeting of the Association for Computational
Linguistics, 10–17. Morristown, NJ: Association for Computational
Linguistics.
. 1998. Error-driven learning in optimality theory via the efficient computation of optimal forms. In Is the Best Good Enough?
Optimality and Competition in Syntax, ed. Pilar Barbosa, Danny Fox,
Paul Hagstrom, Martha McGinnis, and David Pesetsky, 421–35. MIT
Working Papers in Linguistics. Cambridge, MA: MIT Press.
Tesar, Bruce B., and Paul Smolensky. 1998. Learnability in optimality
theory. Linguistic Inquiry 29: 229–68.
Wilson, Colin. 2001a. Bidirectional optimization and the theory of
anaphora. In Optimality-Theoretic Syntax, ed. Géraldine Legendre,
Sten Vikner, and Jane Grimshaw. Cambridge, MA: MIT Press.
. 2001b. Consonant cluster neutralisation and targeted constraints. Phonology 18: 147–97.
Yip, Moira. 1993. Cantonese loanword phonology and optimality theory. Journal of East Asian Linguistics 2: 261–91.

ORAL COMPOSITION
Oral composition broadly refers to the creation of organized verbal formulations without reliance on writing. Though in essence
a familiar process in everyday speech, it has become a quasi-technical and debated term, applied especially to bringing into
being relatively sustained examples of entextualized verbal art,
both ancient and recent. It has thus been of interest to linguists,
anthropologists, folklorists, psychologists, historians, and specialists in specific languages and cultures, also linking to work
on oral culture, performance, story, literacy, and
memory.
The central issues have been, first and most directly, how
lengthy oral poems, narratives, and other sustained verbal forms
can come into being without writing, a puzzle for those steeped
in literate traditions that assume the centrality of the written
word; and second, how this relates to performance (for performance is arguably how an oral creation exists).
Earlier approaches focused largely on nonliterate settings,
especially those characterized as "primitive" or "traditional." One
model was of spontaneous improvization by the unself-conscious "child of nature," unfettered (and unhelped) by recognized artistic conventions. Another was of unchanging tradition
from the far-distant past, not composed by living creators but
stored in the communal tribal memory.
These models were largely superseded by the influential oral-formulaic approach (also known as "the oral theory"), which came
to the fore in the mid-twentieth century. The concept of oral
composition acquired a specific meaning and became a key term
of analysis and explanation. Its classic statement in Albert Lord's
seminal The Singer of Tales (1960) used fieldwork in the 1930s in
Yugoslavia to demonstrate how lengthy oral poems were composed during performance: The singers drew on a traditional
store of formulaic phrases and themes, which enabled them,
without writing or verbatim memorization, to pour forth long
epic songs in uninterrupted flow. Variations around such formulaic phrases as, for example, "By Allah, he said, and mounted his
white horse" recur throughout the poems, providing a parallel to
the Homeric epithets like "fleet-footed Achilles" found in early,
putatively oral, Greek epics. This "special technique of composition" (Lord 1960, 17) relied not on preplanned, memorized texts
but on composition-in-performance. Contrary to literate expectations, there was no fixed "correct" version: each performance was
authentic in its own right, a unique product composed and performed on one occasion. Oral-formulaic composing was linked
to a traditional, oral mindset incompatible with literacy and the
literate mind, and once singers became literate, it was posited,
they lost the power to compose orally.
The oral-formulaic theory was enormously influential
throughout much of the later twentieth century and across a
wide span of disciplines, providing, as it apparently did, an
answer to the puzzle of verbal composition without writing.
Examples of comparable formulaic expression, and hence, it
seemed, of oral composition, were identified throughout the
globe, from early Greek epic, Old English texts, or the Bible to
living examples recorded from the field, soon also extending to
the full range of poetic genres and to prose-like forms, such as
sermons or storytelling.
Though still regarded as a classic approach, oral-formulaic
theory has been both modified and challenged, especially during the last two decades. First, it has become apparent that not
all genres of unwritten verbal art follow the oral-formulaic composition-in-performance mode, nor, as implied by the classic
oral-formulaic analysts, is oral composition a single identifiable
process. Their often somewhat generalized conclusions have not
been fully supported by the empirical evidence, for oral forms
turn out to be created in diverse ways. Some are composed before
and separated from performance. Some do, after all, involve
memorization. One much-quoted case is of the Somali poets
who spend hours, sometimes days, composing elaborate genres
of oral poetry, later delivered word for word either by themselves
or by reciters who are able to memorize poems and, without
writing, store large and exactly reproducible repertoires in their
memory over many years. Elsewhere, too, prior composition is
sometimes a long-drawn-out and carefully considered procedure, in some cases involving multiple authors and/or rehearsals before being performed. Certain women's personal songs in
mid-twentieth-century Zambia, for example, were thought out
by one woman, elaborated with her friends, worked over for days
by an expert composer, then rehearsed and memorized before
final performance. In other cases, a composer may speak aloud
words of rapid inspiration designed for later performance, to
be captured by listeners on the spot through memorizing, tape
recording, or writing (further details and discussion in Finnegan
1992, 52–87; 2007, 96–113, 179–200). Contrary to the classic oral-formulaic model, oral composing varies in different cultures,
genres, and circumstances.
Second, the assumption that literacy and orality are mutually
incompatible has been extensively challenged. By now, many
empirical examples of their interaction in both historical and
more recent times have been noted and investigated. At a more
theoretical level, there are also the current transdisciplinary critiques of the West-centered binary dichotomizing between primitive/civilized, non-Western/Western, traditional/modern, and,
alongside these, oral/literate, together with parallel challenges
to the arguably ethnocentric and ideological presuppositions of
a simple and necessary link between literacy and modernity. In
practice, it appears, there are multiple forms of literacy, interacting, therefore, in multiple ways with oral modes.
Despite challenges to some of its central presuppositions, the
legacy of the oral-formulaic school lives on. It rightly unsettled
the (literate) concept of fixed correct text, highlighted the significance of performance and audience, and, if in the (arguably)
somewhat elusive terminology of formulae, pointed up the
importance for composition of conventionalized verbal formulations in generic settings. Scholars identifying themselves with
that tradition have continued their (largely textual) examinations of oral and oral-derived texts while also reconfiguring
their approaches by attention to the specificities of aesthetic and
cultural traditions, interacting fruitfully with trends elsewhere to
produce sophisticated analyses of the complex interrelations of
oral with written composition (Amodio 2005; Foley 2002).
Although there is currently no one dominant approach to
complement the earlier oral theory, the topic of composition
without writing (or anyway, without central reliance on writing)
has continued to attract interdisciplinary interest. The focus is
now less on attempting to delineate oral composition as a single
process, or as pertaining to some special kind of culture or mentality, and more on complexity and plurality.
Oral composition is thus no longer conceptualized as primarily confined to traditional, historic, or non-Western settings
but as also including such examples as contemporary popular
songs or the spoken oratory of modern statesmen and publicists. It has also been noted how readily some long-established
oral genres are exploited in new settings, like the South African
praise poems now composed for Nelson Mandela, the national
football team, or university graduation ceremonies, and circulated not only in live performance but in writing and on radio,
CD-ROMs, and the Web. The relation between oral and literate is
now more often envisaged as continuum than as opposition or,

better, as a multifaceted spectrum of overlaps, interpenetrations,


and diversities. The now-influential concepts of entextualization
and of dialogism, here applied in particular by linguistic and
literary anthropologists, have also bridged the once-accepted
chasm between oral and literate and illumined the multiple ways
in which people construct, assemble, and interact with texts
(Barber 2007; Silverstein and Urban 1996).
The meaning of oral itself has also been enlarged and problematized. Most oral compositions, it is now increasingly noted,
are realized not just through words but through a constellation
of multimodal resources. The act of performance may include,
for example, movement, bodily enactment, visual devices, and
the variegated arts of the voice (volume, intonation, speed,
silence, timbre, atmosphere, and much else): A musical element
is essential in certain genres, an aspect often neglected in Western
scholars' propensity to privilege the verbal component. Although
music and words are in some cultures and genres taken as distinct, composed by different people, this is not always so, and
some scholars argue that language and music form a continuum
rather than a dichotomy (see Banti and Giannatasio 2004). The
substantial recent work on gesture (McNeill 2000; Kendon
2004) has also elucidated the integral relation between gesturing
and speaking. Even if below our explicit consciousness, gesture,
it seems, is a planned and patterned activity, a dimension therefore that, like music, must arguably enter into a full understanding of oral composition and performance.
Recent approaches to memory are also relevant. Historians,
anthropologists, and psychologists have drawn attention to the
frames within which remembering is actively recreated and
to the diverse social mechanisms for organizing and manipulating memory. Some cultures or genres prioritize word-for-word memorization and organize formal or informal training
in this skill; in others, different arts are emphasized, including
improvization. Generic conventions themselves provide schemas for organizing and activating memory, offering constraints
and opportunities for the creative flow of language, not only
through larger frames such as narrative, praising, or lamenting
but also by memory-enhancing devices like imageries, rhythm,
and audience (and chorus) participation and by sound-pattern
repetitions and sequences, such as rhyme, alliteration,
parallelism, or melody. In some contexts, memory is seen as
itself an aspect of creativity, eroding its apparent opposition
to composition (see further Rubin 1995; Carruthers 1990).
The upshot is that oral composition has somewhat dissolved
as a distinctive topic for analysis. It no longer stands out as something self-evidently special or puzzling but as an aspect of processes being studied from other viewpoints and as taking place
in many different forms, settings, and modalities from lengthy
art genres to the creativity of everyday conversation; from long
preplanned and rehearsed performances to extemporized
speeches; from live delivery to multimedia enactments. It is now
tied less to theories of the primitive, traditional, or, indeed, the
oral as such than to ongoing issues related to language or creativity more generally, analyzed both comparatively and in cultural
specificities.
While in one way this has undermined the idea of oral composition as a subject for direct scrutiny in its own right, in another
way this broader cross-cultural approach and the empirical
investigations it has stimulated have enabled a firmer grasp on
the complexity of the processes by which, without much or any
direct recourse to writing, people can and do produce verbal
formulations both lengthy and short, aesthetically marked and
everyday. Further, all of this has helped to challenge traditional
models of language as realized preeminently either, on the one
hand, in stable written texts or, on the other, in relatively unconstrained and perhaps trivial everyday speech. A consideration of
oral composing highlights the sustained and creative marshaling of language in situations where writing does not necessarily
lie at the core: verbal genres that are by no means outdated or
peculiar but have had a wide spread in the world, both yesterday
and today.
Ruth Finnegan
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Amodio, Mark C., ed. 2005. New Directions in Oral Theory. Tempe: Arizona
Center for Medieval and Renaissance Studies.
Banti, G., and F. Giannatasio. 2004. Poetry. In A Companion to Linguistic
Anthropology, ed. Alessandro Duranti, 290–320. Oxford: Blackwell.
Barber, Karin. 2007. Texts, Persons and Publics in Africa and Beyond.
Cambridge: Cambridge University Press.
Carruthers, Mary. 1990. The Book of Memory. Cambridge: Cambridge
University Press.
Finnegan, Ruth. 1992. Oral Poetry: Its Nature, Significance and Social
Context. 2d ed. Bloomington: Indiana University Press.
. 2007. The Oral and Beyond: Doing Things with Words in Africa.
Oxford: James Currey. Chicago: Chicago University Press.
Foley, John Miles. 1988. The Theory of Oral Composition: History and
Methodology. Bloomington: Indiana University Press.
. 2002. How to Read an Oral Poem. Urbana: University of Illinois
Press.
Kendon, Adam. 2004. Gesture: Visible Action as Utterance.
Cambridge: Cambridge University Press.
Lord, Albert B. 1960. The Singer of Tales. Cambridge: Harvard University
Press.
McNeill, David, ed. 2000. Language and Gesture. Cambridge: Cambridge
University Press.
Rubin, David C. 1995. Memory in Oral Traditions: The Cognitive
Psychology of Epic, Ballads, and Counting-Out Rhymes. Oxford: Oxford
University Press.
Silverstein, Michael, and Greg Urban, eds. 1996. Natural Histories of
Discourse. Chicago: Chicago University Press.

ORAL CULTURE
Oral culture is a conceptual construct associated primarily with
the work of Walter J. Ong, S.J., Marshall McLuhan, and Eric A.
Havelock, whereas the term oral tradition, which they also use,
is more often associated with the work of Milman Parry, Albert B.
Lord, and their many followers (see Foley 1985). Ong, McLuhan,
and Havelock use the term oral culture to refer primarily to preliterate cultures but also to characterize the thought and expression that carry over into manuscript culture and even into print
culture. Moreover, oral culture, which Ong also refers to as
primary oral culture, endures in the sense that people continue
to talk with one another. The subsequent cultural developments in manuscript culture and print culture may be seen
as cultural overlays that influence and transform the base oral

culture to certain degrees, but without ever eliminating it or
totally superseding it. In the world today, an estimated one billion people do not know how to read or write any language, and
so they live in a residual form of primary oral culture. In addition, certain cultures in the world today remain highly oral, just
as Western culture did for centuries before print culture helped
usher in what is commonly referred to as modern culture and
modernity: modern science, modern capitalism, modern
democracy, the Industrial Revolution, and the Romantic movement. The common distinction between modern culture, prominent in the West, and premodern cultures in many other areas
of the world (e.g., Turner 1969) can be understood in terms of
Ong's account of Western cultural history. Premodern cultures
are examples of what Ong has referred to as primary oral cultures
and as residual forms of primary oral cultures.
When alphabetic writing was introduced, Ong claims (and so
do McLuhan and Havelock), it did not change everything overnight. As a result, early writing such as most of the Bible (except
for the prologue to the Gospel of John) and the Homeric epics
can be seen as providing transcripts of primary oral thought
and expression. But distinctively literate forms of thought and
expression emerged in the pre-Socratics and Plato, as Havelock
explains in detail (1963, 1978, 1982). Perhaps more than anything
else, Ong sees the formal study of logic initiated by Aristotle as
involving distinctively literate thought; Ong has traced the history of the formal study of logic in his 1958 masterwork Ramus,
Method, and the Decay of Dialogue: From the Art of Discourse to
the Art of Reason (3d ed. 2004). Within the Aristotelian tradition
of medieval logic, Ong notes, new developments emerged that he
styles "the quantification of thought" (see esp. [1958] 2004, 53–91).
In a subsequent essay, he points out how these new developments contributed to the emergence of "a new state of mind" as found in modern science (Ong 1962, 72). Neither this new state
of mind nor modern science emerged in oral culture, just as the
formal study of logic developed by Aristotle has no counterpart
in oral culture.
In Ramus, Method, and the Decay of Dialogue, Ong also calls
attention to the visualist tendencies of Western philosophic
thought, which were advanced further by the development of
printed books. The visualist tendencies of ancient Greek philosophic thought have recently been further explored by Andrea
Nightingale (2004). But such visualist tendencies do not characterize oral culture. Ong ([1969] 1995) also describes oral culture
as based on an oral-aural sense of the world as event, which he
contrasts with the visual sense of the world as something seen
(as in the expression "worldview"). In World as Event: Aspects of
Chipewyan Ontology (1997), anthropologist David M. Smith
borrows Ong's expression "world as event" to help elucidate certain aspects of Chipewyan thought.
Ong associates oral culture with the cyclic forms of thought
that Mircea Eliade describes in The Myth of the Eternal Return
([1949] 2005). Lynne Ballew describes further examples of cyclic
thought in Straight and Circular: A Study of Imagery in Greek
Philosophy (1979). Ong sees the recycling of souls in the story
of Er recounted by Socrates in Platos Republic as an instance
of cyclic thought in Greek philosophy, which is to say a residual
form of oral thought in Greek philosophic thought. Conversely,
Ong (see, for example, 1967a, 61–82, 83–98, 99–126) associates
the linear accounts of time in the Bible with literacy; he likes
to style linear conceptions of history as evolutionary thought,
thereby rooting later forms of evolutionary thought in Darwin
and others within the biblical cultural tradition of the West. Independently of Ong, Donald L. Fixico, who is himself
of American Indian descent, works comfortably with these contrasts in The American Indian Mind in a Linear World: American
Indian Studies and Traditional Knowledge (2003; also see Lee
1987, 105–20).
In Manliness, Harvey C. Mansfield does not happen to refer
explicitly to oral culture, but he refers to Achilles frequently to
illustrate certain points regarding manliness (2006, 55–8, 60–1),
an ambivalent quality that he sees as needing to be disciplined
toward socially constructive ends. Male puberty rites, for example, have long been used in oral cultures to help discipline and
orient young men in socially constructive ways (see van Gennep
1960; Ong 1971, 113–41). The kind of socially constructive warrior manliness that Achilles and Agamemnon and Hector and Odysseus represent is a necessity in oral cultures: "The entire enterprise of modernity, however, could be understood as a project to keep manliness unemployed" (Mansfield 2006, 230). In
David Riesman's terminology, oral culture is "tradition-directed," whereas modernity is dominated by "inner-directedness" (1950). (For further recent studies of the historical development of inner-directedness, see Williams 1993; Brakke 2006; Cary 2000; van 't Spijker 2004; Renevey 2001; Low 2003; Connor 2006; Bloom 1998;
Kahler 1973). In Honor and the Epic Hero, Maurice B. McNamee
(1960) shows that concepts about heroic and great-spirited persons have shifted from time to time. Even though the concepts of
a heroic and magnanimous person in oral culture no longer work for modernity, we still need to formulate concepts of heroic and magnanimous persons that will.
Before concluding, we should note the critique that some
authors have made of Ongs work and related work regarding
oral culture. The critique alleges that Ong has set forth a "great divide" theory in which there is a great divide with oral culture
when literacy emerges (see, for example, Daniell 1986; but also
see Ongs 1987 letter about her article). Beth Daniell and others
who advance this critique do not accurately summarize what
Ong has said, and so their supposed critique amounts to little
more than knocking down a straw man that they have named "Ong."
(For a more detailed response to this alleged line of critique, see
Farrell 2000, 16–26; 2004).
In conclusion, in oral culture, people are culturally conditioned so that they tend to favor cyclic patterns of thought and
expression, to have a world-as-event sense of life, to put manliness to work in socially constructive ways, to use oral stories of
heroes as ways to help orient and put manliness to work, and to
use ritual process very effectively to promote and support socially
constructive behavior.
Thomas J. Farrell
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Ballew, Lynne. 1979. Straight and Circular: A Study of Imagery in Greek
Philosophy. Assen, the Netherlands: Van Gorcum.
Bloom, Harold. 1998. Shakespeare: The Invention of the Human. New
York: Riverhead Books.

Brakke, David. 2006. Demons and the Making of the Monk: Spiritual
Combat in Early Christianity. Cambridge: Harvard University Press.
Cary, Phillip. 2000. Augustines Invention of the Inner Self: The Legacy of a
Christian Platonist. New York: Oxford University Press.
Connor, James L. 2006. The Dynamism of Desire: Bernard J. F.
Lonergan, S.J., on The Spiritual Exercises of Saint Ignatius of Loyola.
St. Louis, MO: Institute of Jesuit Sources.
Daniell, Beth. 1986. "Against the great leap theory of literacy." Pre/Text 7.3/4: 181–93. Also see Ong 1987.
Draper, Jonathan A. 2004. Orality, Literacy, and Colonialism in Antiquity.
Leiden: Brill.
Draper, Jonathan A., ed. 2003. Orality, Literacy, and Colonialism in
Southern Africa. Leiden: Brill.
Eliade, Mircea. [1949] 2005. The Myth of the Eternal Return. 2d ed. Trans.
Willard R. Trask, new introduction by Jonathan Z. Smith. Princeton,
NJ: Princeton University Press.
Farrell, Thomas J. 2000. Walter Ong's Contributions to Cultural
Studies: The Phenomenology of the Word and I-Thou Communication.
Cresskill, NJ: Hampton.
Fixico, Donald L. 2003. The American Indian Mind in a Linear
World: American Indian Studies and Traditional Knowledge. New
York: Routledge.
Foley, John Miles. 1999. Homers Traditional Art. University
Park: Pennsylvania State University Press.
Foley, John Miles, ed. 1985. Oral-Formulaic Theory and Research: An
Introduction and Annotated Bibliography. New York: Garland.
Havelock, Eric A. 1963. Preface to Plato. Cambridge: Belknap Press/
Harvard University Press.
. 1978. The Greek Concept of Justice: From Its Shadow in Homer to Its
Substance in Plato. Cambridge: Harvard University Press.
. 1982. The Literate Revolution in Greece and Its Cultural
Consequences. Princeton, NJ: Princeton University Press.
Horsley, Richard A., Jonathan A. Draper, and John Miles Foley, eds.
2006. Performing the Gospel: Orality, Memory, and Mark. Minneapolis,
MN: Fortress.
Jousse, Marcel. 1990. The Oral Style. Trans. Edgard Sienaert and Richard
Whitaker. New York: Garland.
Kahler, Erich. 1973. The Inward Turn of Narrative. Trans. Richard
Winston and Clara Winston, foreword by Joseph Frank. Princeton,
NJ: Princeton University Press.
Kelber, Werner H. 1997. The Oral and the Written Gospel: The Hermeneutics
of Speaking and Writing in the Synoptic Tradition: Mark, Paul, and Q.
2d ed. Foreword by Walter J. Ong, S.J., new introduction by Werner H.
Kelber. Bloomington: Indiana University Press.
Lee, Dorothy. 1987. Freedom and Culture. Prospect Heights,
IL: Waveland.
Lord, Albert B. 1960. The Singer of Tales. Cambridge: Harvard University
Press.
Low, Anthony. 2003. Aspects of Subjectivity: Society and Individuality from
the Middle Ages to Shakespeare and Milton. Pittsburgh, PA: Duquesne
University Press.
Mansfield, Harvey C. 2006. Manliness. New Haven, CT: Yale University
Press.
McLuhan, Marshall. 1962. The Gutenberg Galaxy: The Making of
Typographic Man. Toronto: University of Toronto Press.
McNamee, Maurice B. 1960. Honor and the Epic Hero: A Study of the
Shifting Concept of Magnanimity in Philosophy and Epic Poetry.
New York: Holt, Rinehart and Winston.
Morris, Ian, and Barry Powell, eds. 1997. A New Companion to Homer.
Leiden: Brill.
Nightingale, Andrea. 2004. Spectacles of Truth in Classical Greek
Philosophy: Theoria in Its Cultural Context. Cambridge: Cambridge
University Press.



Ong, Walter J. [1958] 2004. Ramus, Method, and the Decay of
Dialogue: From the Art of Discourse to the Art of Reason. 3d ed. New
foreword by Adrian Johns. Chicago: University of Chicago Press.
. 1962. The Barbarian Within: And Other Fugitive Essays and Studies.
New York: Macmillan.
. 1967a. In the Human Grain: Further Explorations of Contemporary
Culture. New York: Macmillan.
. 1967b. The Presence of the Word: Some Prolegomena for Cultural
and Religious History. New Haven, CT: Yale University Press.
. [1969] 1995. "World as view and world as event." In Faith and Contexts, III: 69–90. Atlanta: Scholars Press. Originally printed in American Anthropologist 71 (August): 634–47.
. 1971. Rhetoric, Romance, and Technology: Studies in the Interaction
of Expression and Culture. Ithaca, NY: Cornell University Press.
. 1977. Interfaces of the Word: Studies in the Evolution of
Consciousness and Culture. Ithaca, NY: Cornell University Press.
. 1981. Fighting for Life: Contest, Sexuality, and Consciousness.
Ithaca, NY: Cornell University Press.
. 1986. Hopkins, the Self, and God. Toronto: University of Toronto
Press.
. 1987. Letter to the Editor. Pre/Text 8.1/2: 155. Comments on
Daniell 1986.
. 199299. Faith and Contexts. 4 vols. Ed. Thomas J. Farrell and
Paul A. Soukup. Atlanta: Scholars Press. Volumes now distributed by
Rowman & Littlefield.
. 2002a. An Ong Reader: Challenges for Further Inquiry. Ed. Thomas
J. Farrell and Paul A. Soukup. Cresskill, NJ: Hampton.
. 2002b. Orality and Literacy: The Technologizing of the Word. 2d ed.
New York: Routledge.
Opland, Jeff. 1983. Xhosa Oral Poetry. Cambridge: Cambridge University
Press.
Parry, Milman. 1971. The Making of Homeric Verse: The Collected Papers
of Milman Parry. Ed. Adam Parry. New York: Oxford University Press.
Renevey, Denis. 2001. Language, Self, and Love: Hermeneutics in the
Writings of Richard Rolle and the Commentaries on the Song of Songs.
Cardiff: University of Wales Press.
Riesman, David, with Reuel Denney and Nathan Glazer. 1950. The Lonely
Crowd: A Study of the Changing American Character. New Haven,
CT: Yale University Press.
Scholes, Robert, and Robert Kellogg, with a chapter by James Phelan.
2006. The Nature of Narrative. 2d ed. New York: Oxford University
Press.
Smith, David M. 1997. "World as event: Aspects of Chipewyan ontology." In Circumpolar Animism and Shamanism, ed. Takako Yamada and Takashi Irimoto, 67–91. Sapporo, Japan: Hokkaido University Press.
Turner, Victor. 1969. The Ritual Process: Structure and Anti-Structure.
Chicago: Aldine.
van Gennep, Arnold. 1960. The Rites of Passage. Trans. Monika B.
Vizedom and Gabrielle L. Caffee, introduction by Solon T. Kimball.
Chicago: University of Chicago Press.
van 't Spijker, Ineke. 2004. Fictions of the Inner Life: Religious Literature
and Formation of the Self in the Eleventh and Twelfth Centuries.
Turnhout, Belgium: Brepols.
Walker, Jeffrey. 2000. Rhetoric and Poetics in Antiquity. New York: Oxford
University Press.
Williams, Bernard. 1993. Shame and Necessity. Berkeley and Los
Angeles: University of California Press.

ORDINARY LANGUAGE PHILOSOPHY


Within the analytic tradition of contemporary Anglophone philosophy, ordinary language philosophy is set in contrast to the
view that the prescriptions of formal logic provide the means


necessary for the elimination of the confusing ambiguities of


ordinary language. Advocates of the ordinary language approach
may recognize the power, the frequent utility, and the intellectually admirable parsimony of the formalist, but they also insist
that ordinary linguistic practice as it stands is generally appropriate for our use without any peremptory need for a comprehensive and indispensable reform, and, moreover, that ordinary
usage contains helpful distinctions and nuances that would be
hurtfully eliminated if the rigors of a formalist system were to be
imposed as the ultimate standard. Peter Strawson, J. L. Austin,
and John Searle have been, in varied ways, advocates of an ordinary language approach, and undoubtedly Ludwig Wittgenstein's
move from the rigors of the Tractatus to the complexities of the
Philosophical Investigations is seminal to the entire movement.
A short entry cannot survey all the philosophers and topics
of importance in this ordinary language approach, but as prime
examples this entry will focus on one philosopher, Peter Strawson,
and on one topic, reference. We will consider the way in which
Strawson makes reference his starting point, how this single topic is
embedded in contemporary debates within the philosophies of language and logic, Strawson's own significant contributions to those
debates, and some replies he makes to his critics. Also noted will be
Bertrand Russell and Wittgenstein as prime movers in Strawson's
thought and W. V. O. Quine as a stern and characteristic critic.
The best known of all Strawson's writings is probably his early article "On referring" (1950), in which he addresses the issue of
singular reference and predication and their objects, a matter with
which he was concerned throughout his working life. This article
was written in response to Bertrand Russell's theory of definite descriptions contained in "On denoting" (1905). For Strawson,
we use a variety of expressions to refer to some individual person,
object, or event. We use singular demonstrative pronouns ("this" and "that"), proper names ("Winston Churchill"), and singular personal pronouns ("I," "you," "it"), and for what are called definite descriptions, we use the definite article followed by a noun in the singular, e.g., "the king of France." Suppose someone at present utters the sentence "The king of France is wise" (S). For Bertrand Russell, S is significant; that is, it may be true or false. But he claims that to show the
true logical form of S, it needs to be rewritten as
(1) There is a king of France.
(2) There is not more than one king of France.
(3) There is nothing that is the king of France and is not
wise.
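Russell's three clauses are standardly compressed into a single quantified formula. In modern notation (supplied here for illustration; the entry itself gives only the prose paraphrase), with Kx read as "x is at present a king of France" and Wx as "x is wise," S is analyzed as:

```latex
% Russell's analysis of "The king of France is wise":
% there is a king of France (1), at most one (2),
% and whatever is a king of France is wise (3).
\exists x \,\bigl( Kx \;\wedge\; \forall y\,(Ky \rightarrow y = x) \;\wedge\; Wx \bigr)
```

Since nothing now satisfies Kx, the existential conjunct fails and the formula as a whole is false.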
Thus, someone uttering S today would be saying something
significant but false. For Russell, we must distinguish definite
descriptions such as "the king of France" from logically proper names, for example, "Winston Churchill." The latter alone can be
subjects of sentences of a genuine subject-predicate form and
have some single object for which they stand.
Strawson thinks that Russell is wrong in this, since his account
of sentences 1–3 is neither completely nor even partially correct.
A correct account must begin by distinguishing among
a sentence
a use of a sentence
an utterance of a sentence



The sentence "The king of France is wise" can be uttered at
various times and for various purposes. We cannot say that the
sentence is true or false, only that it may be used to make a true
or false assertion. At the heart of Strawson's position is the claim that referring is not something that an expression such as "the king of France" does. Referring is, instead, characteristic of the
use of an expression. Meaning is a function of the sentence or
expression, but mentioning and referring and truth and falsity
are functions of the use of the sentence or expression.
Russell's claim is that someone at present uttering "The king of France is wise" (S) would (a) be making a true or false statement, and (b) be asserting that there exists at present one and
only one king of France. Strawson finds Russell wrong on both
counts. For Strawson, the sentence is significant, since it could
be true or false, and it could refer to a particular person. But that
does not mean that any particular use of the sentence must be
either true or false. Ordinarily, a person uttering S presupposes
the existence of the king, and his uttering S neither asserts nor
entails the king's existence. Thus, presupposition must be carefully distinguished from both assertion and entailment.
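The contrast can be put schematically. On the standard gloss of Strawsonian presupposition (a formalization supplied here, not taken from the entry), a statement presupposes what must be true for it to have a truth value at all, and a statement and its negation share their presuppositions:

```latex
% Presupposition vs. assertion and entailment (standard gloss):
S \text{ presupposes } E \quad\text{iff}\quad
  \text{both } S \text{ and } \neg S \text{ have a truth value only if } E \text{ is true.}
% Thus "The king of France is wise" and "The king of France is not wise"
% both presuppose, but neither asserts nor entails:
\exists x \,\bigl( Kx \wedge \forall y\,(Ky \rightarrow y = x) \bigr)
```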
There is, moreover, a need to distinguish rules for referring
from rules for ascribing and attributing. That distinction roughly
corresponds to the grammatical distinction between subject and
predicate. For Strawson, that irreducible distinction has been
blurred by logicians in their desire to reduce or to eliminate
altogether the referring use. He finds a prime example of that
attempted elimination in Gottfried Leibniz's effort to establish
individual identity through the use of complete individual concepts done in exclusively general terms. Strawson thinks that
Russell also strives to make logic in a narrow sense adequate for
referring to individuals.
It is particularly noteworthy that Strawson's fundamental distinctions between sentence and utterance, and between referring
and describing, are a challenge to the votaries of modern logic.
Consider such nonuniquely referring expressions as "all," "no," "some," and "some are not," that is, the four types of standard-form categorical propositions: A, E, I, and O. For the modern, only I and O
propositions have existential import. In consequence, the modern must deny some traditional doctrines, such as the square of
opposition and the validity of some forms of the syllogism. The
modern's dilemma is for Strawson a bogus one. We may simply say that the question of whether the quantificational expressions are being used to make true or false statements just does not arise unless the existential condition is fulfilled for the subject term; when it is fulfilled, all of the laws of traditional logic hold good. If we ask a literal-minded and childless man if all of his children are asleep, he will not answer either "yes" or "no" because the question simply does not arise.
For Strawson, neither Aristotelian nor Russellian rules give
the exact logic of any expression of ordinary language, since
ordinary language has no exact logic.
In light of "On referring," Strawson sets out in Introduction
to Logical Theory (1952) to remedy the failures of modern logicians to address adequately the relationships between formal
logic and the logical features of ordinary language. He begins by
noting differences among the various ways we make judgments
about what someone says. To say that a statement is "logical" is
ordinarily a commendation. There is a further and more complex

distinction when we say that a statement is "untrue" or that it is


"inconsistent." If a deductive argument is valid and its premises are true, then the conclusion is judged necessarily true under pain of
inconsistency or self-contradiction. But in all of this we must also
consider the context of statements that are made. Asked if the
results of the recent election pleased me, I may significantly reply
that they did and they didnt. Words such as vehicle and entertainment have only approximate boundaries for their appropriate use. The uses and therefore the meanings of various words
and expressions are subject to expansion and contraction.
Thus, as in "On referring," logical appraisal is properly applied
to statements, not sentences. We need, therefore, to approach
the relation between formal logic and ordinary language with
caution. In formal logic, a formula is an expression such that by
substituting words or phrases for the variables we can obtain
sentences that could be used to make statements. In the formula
"x is a younger son," to substitute "Tom" for "x" would yield a sentence that would have meaning, while to substitute "The square root of 2" would not. Thus, some substitutions would yield sentences, but not significant statements. We can talk about the
range of admissible values for a variable, but, unlike formal logic,
in ordinary language there are no precise rules for what is admissible. Once again, statements have a contextual component, and
that goes beyond the reach of formal logic.
The limits of formal logic are also manifest in its use of symbols for truth-functional connectors. Consider particularly the logical symbol ⊃. Consider "If it rains, the party will be a failure."
That suggests conditions that are neither logical nor linguistic, but are instead discovered in our experience of the world.
Compare the function of the connector in that sentence with the
function of the same connector in "If he is a younger son, then he has a brother." Similar limitations are evident in the use of other connectors. Consider the question of which connector is the appropriate one for "unless."
Again, when logicians choose the pattern for their representative rules, they employ common uses drawn from ordinary language, and then proceed to make standard what is common. In
this way, a rigidity is imposed that is foreign to the uses of ordinary language.
The logician is not a lexicographer but is concerned only with
general principles that are indifferent to subject matter. The difficulty here is that sometimes different expressions may have the
same uses; "all," "the," and "a" may have the same use in describing the basic move of the pawn in chess. Similarly, the same expression may have different uses; "not" and "not" may be used as a double negation, to emphasize, or to show necessity. The logician would eliminate this complexity and clutter and impose a system's rules to cure the perceived deficiencies of ordinary language. The logician is content with "all," "some," and "no" but has no use for "most" or "few," despite the usefulness and common employment of those terms in our ordinary reasoning.
There are further challenges to any claim for a sovereignty of
formal logic over the workings of ordinary language. Consider
the notion of logical form as a sort of verbal skeleton that remains
when all expressions, except the selected logical constants, are
eliminated from a sentence that might be used as a statement
and are replaced by variables. For Strawson, this notion of logical form is viable, but it may lead us to the mistaken conclusions



that a statement must have just one logical form, or that logical
form makes the work of the lexicographer superfluous, or that
logical features need not take into account the relevant subject
matter, or that validity depends upon form, rather than the other
way about.
The claim may be made that appropriate caution will enable
us to avoid such mistakes, and that the relation between ordinary language and formal logic might be seen minimally as peaceful coexistence, and more truly as a separation of powers that is
necessary and useful to both sides. For Strawson, the ongoing
difficulty here is that the logician is not content with being consistent but seeks the completeness of a system. That ideal is compromised by the fact that the typical truth-functional connectors
defy a single ordinary use, and the attendant complexities run
counter to a mathematical model taken as the paradigm for the
whole of logic. That paradigm appeals, but its seduction misleads, with profound consequences for the study of metaphysics
and epistemology.
Of all of the identifications between the truth-functional connectors of formal logic and ordinary words, Strawson finds conjunction and negation least troublesome, but even here there are
limitations. By the laws of formal logic, "p ∧ q" and "q ∧ p" are equivalent, but in ordinary language, the order may be essential to the meaning. Most troublesome in the identification of logical connectors and ordinary words is ⊃. The falsity of the antecedent
suffices in material implication for the truth of the statement, but
not in the corresponding hypothetical statement.
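The source of the trouble is the truth-functional definition of the horseshoe (standard notation, supplied here for illustration rather than quoted from the entry):

```latex
% Material implication is definable from negation and disjunction:
p \supset q \;\equiv\; \neg p \vee q
% Hence a false antecedent alone suffices to make p \supset q true.
```

On this definition, "If it rains, the party will be a failure" comes out true simply because it does not rain, which is not how the ordinary hypothetical is understood.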
The workings of the class system of modern quantificational
logic further compound these difficulties about the relation
between ordinary language and truth-functional logic. Modern
orthodoxy claims that once the older Aristotelian system is
cleaned up, it is simply a small part of todays quantificational
logic. Conversely, Strawson contends that with only a few reservations, the traditional rules dating from Aristotle conform to
the use of words in ordinary language, and indeed avoid some
of the incongruities of the modern's practice. Standard criticism
of tradition rests on the question of existential import, that is, for
the four moods A ("all"), E ("no"), I ("some"), and O ("some are not"),
whether there is a commitment in the tradition to the actual
existence of the members of the terms. The modern's assumption is that only I and O have such import. But consider someone's saying "All John's children are asleep." Again, if John is childless, the existential import question simply does not come
up. The existence of those children is a necessary precondition of
the statement being either true or false. The modern goes wrong
in failing to distinguish sentence from statement. The sentence
may be true or false, that is, meaningful, but in its use as a statement, the question of existential import is determined by the
context.
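The modern reading at issue can be made explicit in the standard quantificational renderings of the four forms (modern notation, supplied here rather than quoted from the entry):

```latex
% Standard modern renderings of the categorical forms:
\text{A: } \forall x\,(Sx \rightarrow Px) \qquad \text{E: } \forall x\,(Sx \rightarrow \neg Px)
\text{I: } \exists x\,(Sx \wedge Px)     \qquad \text{O: } \exists x\,(Sx \wedge \neg Px)
```

If nothing satisfies Sx, then A and E are vacuously true while I and O are false; hence subalternation and the other relations of the traditional square fail on the modern reading, which is the incongruity Strawson has in view.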
In sum, for Strawson there are two kinds of logic: the entailment rules of formal logic, which abstract from the time and
place of utterance, and the referring rules, which lay down the
contextual requirements of what a sentence presupposes. In the
study of those referring rules, we do not find the elegance and
system of formal logic, but Strawson does find a field of intellectual study unsurpassed in richness, complexity, and the power
to absorb. The two kinds of logic are interrelated, and both are
necessary in human communication.


Not all have agreed. Quine is notable for his long-standing


disagreement on the issue of singular terms and reference, the
issue that is central in Strawson's "On referring" and pervasive in all of his later writings. For Quine, singular terms are at
best superfluous, to be eliminated without loss; "to be is to be the value of a variable" (1972, 234). Here is the great divide that, on
Strawson's account, separates him from both Quine and Russell.
Quine's concern that singular terms are ambiguous in their reference is set aside by Strawson on the ground that such terms do
not refer at all; they are, instead, used by persons to make reference. If the reference is ambiguous, the responsibility rests with
the statement maker, not the term. For that matter, ambiguity
has its own uses and, indeed, its own occasional sweetness in the
ordinary language of daily life.
Strawson found strong support for his views on ordinary language in Wittgenstein's transition from his positions in the Tractatus
to those in the Philosophical Investigations. This is manifest in
Strawson's review of the latter work in Mind (1954). In sections
38–137 of the Investigations, Strawson finds an evident rejection
of the logical atomism that characterizes the Tractatus. In that
earlier work, Wittgenstein had been concerned with the idea of
the genuine names of a language, and with the idea of the simple
indestructible elements of reality that are only to be named, not
described or defined, and which are the meanings of those genuine names. These primary elements are Russell's "individuals," and the "objects" of the Tractatus. Those elements are connected
to the belief that the clarification of ordinary language depends
on an analysis in which ambiguous sentences are replaced by
ones that reflect exactly the logical form of the fact under consideration. Logic then seems to be pure, exact, and general, the
essence of the thoughts that mirror the structure of the empirical
world. That leads us to the illusion that this process of analysis
is finite, that there is a single completely resolved form for every
expression.
For Wittgenstein in the Philosophical Investigations, and for
Strawson, the cure for this illusion is to give up the search for the
very essence of language and to direct our attention, instead, to
the various ways in which language actually functions. In a well-known example, Wittgenstein asks us what is common to all of the proceedings we call "games." We cannot say that they must all
have something in common, an essence, simply because they
have a common name. There is no single element they all share.
There are only family resemblances, a network of overlappings
and crisscrosses.
What is true of games is true of linguistic activity; there is no
single use, only family resemblances. There is no exact boundary of use, although a fixed boundary could be set to serve some
particular purpose. A word or a linguistic practice need not be
exact in order to be understood and acted upon; "stand roughly here" may be serviceable enough. To say in dispraise that it is
inexact misses the mark. The demand for absolute and fixed
meanings is senseless. Whether or not there is enough precision is determined by whether the concept is used with general
agreement.
The consequence is that we are not to provide ordinary language with a necessary revision and reduction; we are simply to
describe the ways it works. If we do so, we eliminate the puzzles
that arise when "language goes on holiday," when we consider

words and sentences in abstraction from their ordinary uses. For
Wittgenstein, philosophy's proper task is simply the assembling
of a series of reminders of actual uses, with the purpose of dispelling confusions that arise in specific contexts.
That conception of philosophy is one that Strawson shares up
to a point, but he also finds that an appropriate philosophy of language provides the basis for a descriptive metaphysics, one that
is content to give an account of the actual structure of the world
of our experience. This is set in contrast to a revisionary metaphysics that vainly strives to do better. Descartes, Leibniz, and
George Berkeley are revisionary; Kant and Aristotle are descriptive. That contrast in many ways mirrors the distinction between
ordinary language philosophy and those formalist attempts that
only mar what's well.
Clifford Brown
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Austin, J. L. 1962. How to Do Things with Words. Oxford: Clarendon.
Brown, C. 2006. Peter Strawson. Montreal: McGill-Queen's University
Press.
Quine, W. V. O. 1972. Methods of Logic. New York: Holt, Rinehart and
Winston.
Russell, B. 1905. "On denoting." Mind 14.4: 479–93.
Strawson, P. 1950. "On referring." Mind 59: 21–52.
. 1952. Introduction to Logical Theory. London: Methuen.
. 1954. "Wittgenstein's Philosophical Investigations." Mind, n.s., 63.249: 70–99.
. 1974. Subjects and Predicates in Logic and Grammar. London:
Methuen.

ORIGINS OF LANGUAGE
This term and "language evolution" are sometimes used interchangeably. Here, "origins of language" will mean the earliest
emergence of a system structurally distinct from the communication systems of other animals and having at least some of the
attributes of human language; the term will not include further
evolutionary developments leading to the emergence of fully
modern language or the subsequent diversification of human
languages (see historical linguistics). Although the topic
is one that has engaged the human imagination throughout history, an orgy of armchair speculation following Darwin caused
it to fall into disrepute. However, now that advances in a variety of sciences have made possible more informed (if still inevitably speculative) approaches, the attention devoted to it has
increased annually, with perhaps an overly ebullient proliferation of theories.
The issues may be more sharply defined by considering separately three major questions to which the topic gives rise:
(1) Was language directly selected for, or an emergent product of other faculties?
(2) If it was selected for, what pressure(s) selected for it?
(3) What form did its earliest emergence take?
Other issues involve the timing of language emergence, the
modality it originally employed, and whether or not language
evolved directly from prior means of animal communication.

Direct Versus Indirect Selection


The notion that language constituted an evolutionary adaptation sensu stricto – in other words, that it arose through some selective pressure acting directly upon pre-existing genetic material – had been around since Darwin but is most cogently expressed
by Steven Pinker and Paul Bloom (1990). Arguing against the suggestion, made by Stephen Jay Gould among others, that language could be a spandrel – an accidental by-product of other evolutionary developments – these authors pointed out that the
intimate interconnections among the various parts of language
parallel a similar interconnectivity in the eye, an object universally agreed to have evolved through natural selection. Although
their approach entailed a gradual process of evolution, they did
not address the initial stage of that process nor discuss in detail
possible adaptive pressures (beyond suggesting that competition among humans was probably more influential than environmental factors).
While few if any scholars would deny that selective pressures
have played a role in the development of many prerequisites for
language, some still suggest that the emergence of language itself
was not specifically selected for. The notion that language was an
invention by human ancestors with expanded brains is still held
by some (e.g., Donald 1991). Others propose that laws of form
affecting brain structure and growth played a more significant
role than natural selection (Jenkins 2000; see also biolinguistics). Alternatively, a mutation or the modification of some prior
nonlinguistic faculty might have yielded recursion, the capacity to generate infinite structures from finite materials (Hauser,
Chomsky, and Fitch 2002), and recursion added to prehuman
conceptual structure might have sufficed to produce language.
These last two proposals imply that language emerged in more
or less its current form, without any intermediate stage between
animal communication and true language. Approaches of
this type would be strengthened if the required laws of form,
mutations, or changes of function could be precisely specified;
this has not yet been done.

Selective Pressures
Among those who see language as an adaptation, explanations
for the selective pressure involved have changed over time. Until
the 1980s, it was widely assumed that language arose for purposes of tool making and/or cooperative hunting. However, ecological studies revealed complex cooperative hunting patterns in
nonhuman species, while anthropological studies showed that
preliterate peoples made tools and taught tool making largely
without using words. Moreover, ethological studies of ape species showed highly complex societies in which individuals competed with and sought to deceive and outwit one another (Byrne
and Whiten 1992). An influential essay (Humphrey 1976) had
already suggested that higher cognitive faculties, including language, had most likely been generated through intense within-group competition.
The view that language arose from social intelligence is
nowadays shared by a majority, but it has problems. Social
competitiveness is far from unique to humans; so why has no
form of language, however rudimentary, evolved in other primate species? A unique adaptation suggests a unique pressure. Furthermore, there must surely have been a stage when language was limited to a handful of symbols with which it
would have been impossible to express any socially significant
meaning. What, in such a situation, would have reinforced language use?
Advocates of some form of social adaptive pressure – whether for gossip (see grooming, gossip, and language), sexual display, or social manipulation – have so far failed to address
such problems adequately. An alternative proposal is that some
primitive form of language developed for exchanging information about food sources among small groups of extractive foragers (Bickerton 2002). Carcasses of megafauna, in particular,
would have required the rapid recruitment of significant numbers for efficient exploitation. Nobody doubts that language,
once it had emerged, would have been used for a variety of social
functions; such functions, in turn, would have expanded language. The real, and still unanswered, question is exactly what
led to its initial emergence.
The issue is rendered still more problematic by the fact that
words are "cheap tokens" (Zahavi 1975). Since they take so little
effort to produce, and since primate species constantly engage in
deception, why would anyone have believed them, and if no one
believed them, who would have persevered in their use?

Initial Structure
While some (as noted) believe that language has always possessed its present structure, most researchers would probably
agree that some simpler form developed first – a stage generally termed protolanguage (Bickerton 1990) – and subsequently grew
more complex. Until recently, it was assumed that protolanguage,
like early-stage pidgins, consisted of a small quantity of units
(roughly equivalent in semantic coverage to modern words)
that could be concatenated, without any consistent grammatical
structure, to form brief propositions; in other words, protolanguage was compositional. This view is now challenged by
the proposal that protolanguage was synthetic, with holophrastic
units (like the units of animal communication systems) roughly
the semantic equivalents of complete propositions and not divisible into smaller meaningful units – "the whole thing means the whole thing" (Wray 2002, 118).
Defenders of a synthetic system note that (in contrast with
a compositional system) there would nowhere be any break in
continuity between language and the prelinguistic communication system of hominids (assumed to be similar to those of other
primates; see primate vocalizations), which it would at first
resemble except for productivity (holophrastic units could be
multiplied indefinitely). At a subsequent stage, chance phonetic
similarities between portions of holophrases would cause the latter to be reanalyzed into wordlike segments; these could then be
recombined to form a modern, compositional language.
It is claimed that a synthetic protolanguage would be less subject to ambiguities than a compositional one and would be better adapted for manipulation of other group members. Support
has come from computational linguists, many of whose
simulations of language evolution begin with units that represent propositional rather than lexical units (Briscoe 2002). Those
who, following Darwin and Otto Jespersen, assume a common
origin for language and music are more or less obliged to adopt
some form of the synthetic hypothesis.

A synthetic protolanguage faces many difficulties, however (Tallerman 2007). Whereas a compositional protolanguage
enables basic functions of language, such as creating new information, asking questions, and negating statements, a synthetic
protolanguage allows for none of these. Predication and displacement are equally impossible. Other problems arise at the
stage of reanalysis into a compositional system. For instance,
unless a given holophrase is equivalent to just one sentence in a
compositional language, no two people would necessarily agree
as to the meanings of its analyzed segments; yet if such equivalence exists, a compositional language must already exist, at least
mind-internally – so why is a holophrastic stage necessary? The
precise nature of protolanguage has been, and will doubtless
continue to be, hotly debated, a debate to which experimental
evidence will hopefully contribute (Bowie 2006).

Other Issues
A further controversy revolves around whether language was originally spoken or signed. Given that sign languages develop as
naturally among the deaf as do spoken ones among the hearing,
and that the hands of our closest primate relatives are more agile
and under more volitional control than their vocal organs, the
notion of a signed protolanguage is not unreasonable and has
been vigorously defended (Corballis 2002). However, even if the
original modality could be determined (and for all we know, protolanguage could originally have mixed signs and vocalizations
indiscriminately), this would not answer the questions discussed
here or tell us how language came to acquire the properties that
distinguish it from other modes of communication.
Another unresolved issue concerns the timing of emergence.
None of the evidence from the fossil record is unambiguous.
Endocasts of Homo habilis suggest a developed Broca's area,
and this has been taken to indicate an early (~2.5 million years
ago) beginning for language. But since, even today, Broca's area
subserves both linguistic and nonlinguistic functions, we cannot know what functions it performed in antecedent species.
Symbolic artifacts are sometimes used to date language origins,
but while these indicate that language already existed, they cannot tell how long before their appearance it began. Absent reliable evidence, estimates of when language originated tend to
be determined by researchers' positions on other issues. For
instance, those who believe that language emerged abruptly
more or less in its present state favor a recent date coincidental with the emergence of anatomically modern humans (~140
thousand years ago), or even later. Conversely, those who take
an adaptationist approach argue for a much earlier date, anything up to a few million years ago. The origin of language is
probably associated with some speciation event, but this issue,
like most others, is unlikely to be resolved without new sources
of evidence.
The question of continuity with prelinguistic systems is
somewhat clearer. That language evolved from some prior communicative system was, to Darwin, an article of faith, and some
subsequent authors have assumed that a commitment to gradual evolution entails such continuity, discounting the possible
capacity of mutations, changes in function, and interactions
between different faculties to produce evolutionary novelties.
But the only plausible continuist scenario is the holophrastic, synthetic model of Wray, discussed previously. If objections to
this are overcome, the case for continuity could be maintained;
otherwise, the differences between language or even protolanguage and any nonlinguistic system suggest a sharp discontinuity between the two.
Derek Bickerton
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Bickerton, Derek. 1990. Language and Species. Chicago: University of
Chicago Press.
———. 2002. Foraging versus social intelligence in the evolution of protolanguage. In The Transition to Language, ed. Alison Wray, 207–26.
Oxford: Oxford University Press.
Bowie, Jill. 2006. The evolution of meaningful combinatoriality. Paper
presented at the Sixth International Conference on the Evolution of
Language, Rome.
Briscoe, Ted, ed. 2002. Linguistic Evolution through Language Acquisition: Formal and Computational Models. Cambridge: Cambridge University Press.
Byrne, Richard, and Andrew Whiten. 1992. Cognitive evolution in primates. Man 27: 609–27.
Corballis, Michael. 2002. From Hand to Mouth: The Origins of Language.
Princeton, NJ: Princeton University Press.
Donald, Merlin. 1991. Origins of the Modern Mind. Cambridge: Harvard
University Press.
Hauser, Marc, Noam Chomsky, and Tecumseh Fitch. 2002. The faculty of language: What is it, who has it, and how did it evolve? Science
298: 1569–79.
Humphrey, Nicholas K. 1976. The social function of intellect. In
Growing Points in Ethology, ed. P. P. G. Bateson and R. A. Hinde,
303–17. Cambridge: Cambridge University Press.
Jenkins, Lyle. 2000. Biolinguistics: Exploring the Biology of Language.
Cambridge: Cambridge University Press.
Kirby, Simon, and Morton H. Christiansen, eds. 2003. Language Evolution.
Oxford: Oxford University Press. A collection of position papers by
leading scholars in the field.
Pinker, Steven, and Paul Bloom. 1990. Natural language and natural
selection. Behavioral and Brain Sciences 13: 707–26.
Tallerman, Maggie. 2007. Did our ancestors speak a holistic protolanguage? Lingua 117: 579–604.
Wray, Alison. 2002. Dual processing in protolanguage: Performance
without competence. In The Transition to Language, ed. A. Wray,
113–37. Oxford: Oxford University Press.
Zahavi, Amotz. 1975. Mate selection – a selection for a handicap. Journal of Theoretical Biology 53: 205–14.

OVERREGULARIZATIONS
Overregularizations like "runned" and "mans" have played a major
role in the language development literature for more than 40 years
(see also children's grammatical errors; morphology,
acquisition of; syntax, acquisition of). Once brought
into focus, overregularizations comprised prime examples of
the way in which children's use of general grammatical rule knowledge (the regular past tense rule of adding -ed) could productively overwhelm the word-specific knowledge they gained from actual input, a prime example of using rules to go beyond
the input. Less attention was paid to the way in which children
would get rid of overregularizations, but results indicated that
there is a verb-by-verb competition between the regular rule and the irregular form, and that experience eventually settles in favor of the irregular form.
In a crucial 1986 paper, however, D. E. Rumelhart and J. L.
McClelland showed that newly developed connectionist
networks could simulate the rise and decline of overregularizations in children's speech without the use of general rules or syntactic symbols like "verb." Briefly stated, connectionist networks
hypothesize connective paths between the constituent features
of present and past forms. Through feedback about correctness,
eventually the right feature-to-feature connections get sorted
out. With skillful design, such networks can simulate the temporal courses through which children pass without ever using
any general rule statement at all. In such a model, there is an
implicit competition between irregular and regular past forms, but the
competition is really among connections of features. There is no
general rule, no general reference to "verb" as a category. Irregular and regular forms are produced by a single overall network process. So these are called single-process models, as opposed to dual-process models (general rule vs. individual lexical entry).
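The single-process idea can be sketched in a few lines of code. The following is only a toy illustration, not the Rumelhart–McClelland model itself: a one-layer logistic network maps letter-bigram features of a verb stem to a choice between the regular -ed pattern and a stored irregular form. The verb list, feature scheme, training settings, and the nonce stem "blick" are all invented for the example; the point is only that one weighted network, with no explicit rule or "verb" symbol, can both store irregulars and regularize novel items.

```python
import math
import random

def features(stem):
    # Letter-bigram features of the stem, with word-boundary markers.
    s = f"#{stem}#"
    return {s[i:i + 2] for i in range(len(s) - 1)}

# Toy training data: (stem, 1 if past tense is regular "+ed", else 0).
TRAIN = [
    ("walk", 1), ("talk", 1), ("jump", 1), ("play", 1), ("want", 1),
    ("need", 1), ("look", 1), ("call", 1),
    ("go", 0), ("sing", 0), ("ring", 0), ("run", 0), ("come", 0),
]

def train(data, epochs=200, lr=0.5):
    # One-layer logistic regression over bigram features: a single
    # process handles both regular and irregular verbs.
    w = {}
    random.seed(0)
    for _ in range(epochs):
        random.shuffle(data)
        for stem, y in data:
            fs = features(stem)
            z = sum(w.get(f, 0.0) for f in fs)
            p = 1.0 / (1.0 + math.exp(-z))
            for f in fs:  # gradient step on log-loss
                w[f] = w.get(f, 0.0) + lr * (y - p)
    return w

def predicts_regular(w, stem):
    return sum(w.get(f, 0.0) for f in features(stem)) > 0

w = train(list(TRAIN))
print(predicts_regular(w, "sing"))   # False: the stored irregular pattern wins
print(predicts_regular(w, "blick"))  # True: a novel stem gets regularized
```

Because "blick" shares its final "k#" bigram only with regular verbs in the toy data, the network regularizes it, while the features of "sing" all point toward the irregular class, mimicking competition without any stated rule.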
Rumelhart and McClelland's paper instigated a series of
simulations, arguments, criticisms, and new simulations that
continue to this day (e.g., McClelland and Patterson 2002;
Plunkett and Marchman 1991, 1993; Pinker 1999). Perhaps the
most prominent empirical data were introduced by G. Marcus
and colleagues (1992), who analyzed longitudinal studies of four
children and cross-sectional studies of hundreds more. They
argued that in any competition account, one would expect that
overregularizations would originally occur at a high rate before
experience wore them down. But in their analysis of the longitudinal and cross-sectional subjects, they found that overall preschool year rates seemed very low, around .04 to .06 (or .02 to .10,
depending on the method).
This means, they argue, that in actuality, children probably
know the irregular form is correct as soon as they learn it. This
knowledge is available because an innate general heuristic
called blocking tells children that if two forms are possible but
only one is heard, choose the heard one. The actually heard
irregular form thus has an innate heuristic preferred status.
Overregularizations only occur if the child does not know the
irregular form, or if the child temporarily cannot remember the
irregular form and the regular rule intrudes itself. Such retrieval
errors are posited to be inherently rare, for some unstated
reason.
The blocking hypothesis requires a general reference to alternative rules and, in practice, to regular rule versus irregular individual lexical patterns. So blocking contradicts connectionist
formulations in many ways. If there is no competition, connectionist models cannot be correct, as they presuppose competition of some sort. If blocking in particular is correct, it requires
statement at general symbolic and rule levels, and so network
formulations are inadequate.
M. Maratsos (2000), however, has argued against the empirical conclusions of Marcus and colleagues (1992). Using sampling arguments, he notes that for frequent irregular verbs,
which dominate overall tabulations, even in a competition
model children would probably hear hundreds of correct inputs
within a week or few weeks after the competition started, and
so overregularizations would fall to near zero very quickly; the

result would be an overall rate of near zero in a sample of two to three years. Our samples are so small (usually an hour a week)
that they would fail to catch these occurrences. Only less frequent verbs, discounted by Marcus and colleagues, might show
evidence of strong overregularization. In fact, that they do was
shown for R. Brown's two low-overregularizing subjects, Adam and Sarah (see, e.g., Brown 1973). For Adam, for example, the average
overregularization rate was a strong 55 percent for his 21 lower-frequency verbs. The same rate was found even in samples after
the child first produced the correct irregular form of a verb.
Arguments from sampling considerations indicated that such
overregularizations were still persisting after tens or even hundreds of uses. Recent work from a more intensively recorded
subject, Peter (Maslen et al. 2004), has strongly supported these
analyses and extended them to noun plurals. These data indicate that overregularizations do often appear frequently after
the irregular past is known, contrary to blocking. Our samples' restrictions just make it difficult to capture them for the more
frequent irregular verbs whose numbers dominate overall
rates.
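Maratsos's sampling point can be illustrated with a small Monte Carlo sketch. All the numbers here (uses per day, the length and rate of the error burst, one recorded hour out of roughly 98 waking hours a week) are invented for illustration, and each use is sampled independently rather than in hour-long blocks; the point is only that a brief burst of high-rate overregularization on a frequent verb all but disappears in a sparse sample.

```python
import random

def simulate(days=730, uses_per_day=20, error_days=14,
             error_rate=0.5, sample_prob=1 / 98, seed=0):
    """Count overregularizations visible in a sparse recording sample.

    A child uses a frequent irregular verb uses_per_day times a day.
    Overregularization errors occur only during the first error_days
    days (the brief competition burst), at error_rate. Each use is
    independently recorded with probability sample_prob (about one
    hour out of 98 waking hours per week)."""
    rng = random.Random(seed)
    sampled_uses = sampled_errors = 0
    for day in range(days):
        for _ in range(uses_per_day):
            is_error = day < error_days and rng.random() < error_rate
            if rng.random() < sample_prob:  # did the recorder catch this use?
                sampled_uses += 1
                sampled_errors += is_error
    return sampled_errors, sampled_uses

errs, uses = simulate()
# The true rate during the burst was 50 percent, yet the rate visible in
# the two-year sample is near zero, and only a handful of errors (often
# none at all) are ever recorded.
print(errs, uses, round(errs / uses, 3))
```

Under these assumptions the sample contains on the order of 150 recorded uses but only one or two recorded errors, so the overall observed rate sits near zero even though the competition model posits a high initial error rate.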
Suppose these analyses do indeed indicate that the low-rate
blocking account is incorrect. Do they also show that the connectionist account is therefore correct? Actually, they only indicate
that a competition process of some sort is involved. As noted,
older rule-based models also assumed a competition between
regular rule and individual entry. The current association of
competition with connectionism and with non-rule models thus
reflects current disputes, not the basic analytic problem. The
conflict between connectionist and rule-based approaches will
thus have to be resolved ultimately, if it can be, using other data
and arguments.
Michael Maratsos
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Brown, R. 1973. A First Language: The Early Stages. Cambridge: Harvard
University Press.
Maratsos, M. 2000. More overregularizations after all: New data and discussion on Marcus, Pinker, Ullman, Hollander, Rosen & Xu. Journal
of Child Language 27: 183–212.
Marchman, V., and E. Bates. 1994. Continuity in lexical and morphological development: A test of the critical mass hypothesis. Journal of
Child Language 21: 317–18.
Marcus, G., S. Pinker, M. Ullman, M. Hollander, T. Rosen, and F. Xu.
1992. Overregularizations in language acquisition. Monographs of
the Society for Research in Child Development 57, serial no. 228.
Maslen, R. J., A. L. Theakston, E. V. Lieven, and M. Tomasello. 2004.
A dense corpus study of past tense and plural overregularization in English. Journal of Speech, Language, and Hearing Research
47: 1319–33.
McClelland, James, and Karalyn Patterson. 2002. Rules or connections
in past-tense inflection: What does the evidence rule out? Trends in
Cognitive Sciences 6: 465–72.
Pinker, S. 1999. Words and Rules: The Ingredients of Language. New
York: Basic Books.
Plunkett, K., and B. Marchman. 1991. U-shaped learning and frequency
effects in a multi-layered perceptron: Implications for language acquisition. Cognition 28: 73193.
———. 1993. From rote learning to system building: Acquiring verb morphology in children and connectionist nets. Cognition 48: 35–59.

Rumelhart, D. E., and J. L. McClelland. 1986. On learning the past tenses
of English verbs. In Parallel Distributed Processing: Explorations in
the Microstructure of Cognition. Vol. 2: Psychological and Biological
Models. Ed. J. L. McClelland, D. E. Rumelhart, and the PDP Research
Group, 216–71. Cambridge, MA: Bradford Books/MIT Press.

P
PARABLE
Standard definitions, such as the one given in the Oxford English
Dictionary, conceive of parable as a literary term; it is said to be
the expression of one story through another. Literary historians
have modified this conception by placing limits on the kind of
story that counts as parable, attempting to distinguish it from, for
example, fable or allegory.
There are, however, even among literary scholars, some who
see parable as a much larger phenomenon, belonging not merely
to expression and not exclusively to historical genres but, rather,
as C. S. Lewis (1936, 44) observed, to "mind in general." (See also
Louis MacNeice's discussion of literary critical perspectives on parable in MacNeice [1963] 1965, 5.)
For the language sciences, parable is not only, or even chiefly,
a kind of story; it is not an expression at all but, rather, a mental faculty that allows the human mind to integrate two conceptual stories or narratives into a third story, thereby creating a
conceptual blending network that has emergent meaning.
Straight history, or the observation of human interaction,
often can serve as the material for such parabolic blending. For
example, Sun Tzu's The Art of War treats 13 aspects of warfare.
It has been studied in the West by military strategists since the
eighteenth century. Written in the sixth century B.C. in China, it
precedes by a couple of millennia the origin of modern business
management. But in the 1980s, it underwent extensive parabolic rendering in numerous books and articles for the purpose
of offering guidance to twentieth-century graduate students of
business and investment on how to conduct their professional
lives.
Parable frequently blends two stories that have strong conflicts in their content. It is a scientific riddle why human beings
should be able to activate two conflicting stories simultaneously,
given the evident risks of mental confusion, distraction, and
error. Yet, uniquely among species, human beings can evidently
not only activate fundamentally conflicting stories simultaneously and construct connections between them but also blend
them to create emergent meaning. This ability to blend two conceptual arrays with strong conflicts in their framing structure is
central to higher-order human cognition and is a hallmark of the
cognitively modern human mind. It is known as double-scope
blending (Fauconnier and Turner 2002).
Consider a parable from the Fourth Gospel. In John 10:11–18,
Jesus presents Himself as the good shepherd, who lays down
His life for the sheep, in contrast to the hired hand, who does
not care for the sheep and flees in the face of the wolf. He says
the Father loves Him because He lays down his life and that no
one takes it from Him. Rather, He has the power to lay it down and take it up again. The clash between the story of the shepherd and the blend Jesus proposes is astonishing. It is quite
implausible that a shepherd would choose to die defending the
sheep, because then the sheep would be without a defender.
Yet this consequence is not projected to the blend: The actual
shepherd cannot return after being killed to look out for the
flock, but in Jesus's blend, He can. The emergent structure in the blend is crucial: Jesus's narrative blends dying
with physical manipulation of an object. (Physical manipulation is at the root of human understanding. See Chapter 4 of
Turner 1996, "Actors Are Movers and Manipulators.") In the
story of manipulation, we can lay down an object and pick it
back up. Blending manipulation of a physical object with the
state of being alive or dead, Jesus achieves the remarkable
ability of self-revival.
As discussed in Chapter 4 ("Analogy") of Cognitive Dimensions
of Social Science (Turner 2001), almost all the mental achievements analyzed by analogy theorists as analogy involve considerable unrecognized blending. In general, analogy involves
dynamic forging of mental spaces, construction of connections between them, and blending of the mental spaces to create
a conceptual integration network of spreading coherence, whose
final version contains a set of what are recognized, after the fact
in the rearview mirror, as systematic, even obvious analogical
connections. But those analogical connections are more often
the outcome of conceptual blending than its preconditions. Put
differently, what is commonly discussed as analogy manifests
the faculty for parable.
It is also important to recognize that a parable is not, in general, a conceptual metaphor for understanding one conceptual domain in terms of another. Consider 2 Samuel 12. The
prophet Nathan creates an elaborate blend in which a rich man
is blended with King David, a poor man is blended with Uriah
the Hittite, Uriah's wife Bathsheba is blended with a favored ewe
lamb, and there is a traveler who comes to dinner. The point of
the complex blend is that David has done wrong. The source
and target are complicated, drawing on many conceptual
domains, and the principal connection is that in both of them,
one man abuses another and deserves punishment. No general
conceptual metaphor provides this set of cross-space connections. Most of them are not metaphoric.
Parable as a form of literary expression might be of interest
to historians, anthropologists, and critics. But parable as a species-specific mental faculty that can activate, connect, and blend
sharply conflicting stories to produce new emergent meaning is
a far larger and more fundamental topic, posing one of the central riddles of the cognitive and language sciences.
Mark Turner
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Fauconnier, Gilles, and Mark Turner. 2002. The Way We Think: Conceptual
Blending and the Minds Hidden Complexities. New York: Basic
Books.
Lewis, C. S. 1936. The Allegory of Love. Oxford: Oxford University Press.
MacNeice, Louis. [1963] 1965. The Varieties of Parable [The Clark
Lectures]. Cambridge: Cambridge University Press.
Turner, Mark. 1996. The Literary Mind: The Origins of Thought and
Language. New York: Oxford University Press.

. 2001. Cognitive Dimensions of Social Science: The Way We Think


About Politics, Economics, Law, and Society. New York: Oxford
University Press.

PARALANGUAGE
Nuances, connotations, and innuendos, which are integral characteristics of verbal communication, are given the vague term
paralanguage. These meanings arise from sources both within
and outside of standard linguistic structure. Linguistic elements
words, word order, semantics, grammar can be utilized
for paralinguistic communication. These combine with variations
of speech melody in ways that often defy structural description.
Paralanguage (as the term implies) both draws on and lies over
the known and describable ortholinguistic levels of phonetics,
phonology, morphology and lexicon, syntax, and semantics. All of these elements can be harnessed for paralinguistic
communication, as is well known from baby talk, from connotational meaning differences in terms such as skinny, slim, slender, and from word-order choices such as Herman Melville's "That inscrutable thing is chiefly what I hate." In addition, emotion,
attitude, intention, mood, psychological state, personality,
and personal identity can be communicated without referring
to words. Because of the power of the intonational contribution to paralanguage, the notion of two channels in the speech
signal has been invoked, but their intimate interplay has been
emphasized (Bolinger 1964). Words can communicate emotions,
but when prosody does so using a different channel, the paralinguistic intent overrides the ortholinguistic content, as in Im not
angry! spoken with increased pitch, amplitude, and rate.
Much of paralanguage is carried over longer stretches of
utterance than the phonetic element or the single word. Lexical
and syntactic choices may interact with intonational features
with a cumulative effect. Formulaic and nonliteral expressions
may be called into play. "It has come to my attention that your stonewalling is holding up the works" contains conventional and metaphoric utterances that build to a message more fraught with paralinguistic content than "I've learned that your hesitation is contributing to a delay." Although subtle contrasts can be
conveyed on short utterances (see "Nine ways of saying yes" in
Crystal 1995), paralanguage prefers a larger canvas. Repetition of
words (Shakespeare's "a little, little grave") may have a powerful
paralinguistic effect. Movement from low to high pitch across an
intonational unit displays surprise or amazement; temporal units
are stretched to express sadness or disappointment; increased
intensity signals aggression or thematic emphasis; voice quality
becomes creaky to communicate victimization or breathy to
signal excitement.
Prosody, a major vehicle of paralanguage, can be decomposed into measurable elements: timing, pitch, amplitude, and
voice quality. These measures combine into complex patterns,
such that associating acoustic cues with paralinguistic meanings
is far from straightforward. "John didn't drive the car" can be
intoned with sadness, happiness, fear, or disgust, and may enfold
attitudes such as incredulity, relief, perplexity, or amusement.
Contradiction or denial, and conversational presumptions, such
as sincerity and truthfulness, are carried by phrasal intonation.
Take a common paralinguistic trope, sarcasm (see irony), in "That was a good effort." We know sarcasm when we hear it, but
exactly what in the signal conveys that the speaker is intending
to communicate the opposite of the usual lexical meanings is difficult to specify. In one version, the sarcastic utterance utilizes
higher pitch and greater amplitude on the first word followed
by falling intonation, pharyngeal voice quality, tensed vocal
tract, and spread lips. While morphological, lexical, and syntactic meanings can be structurally analyzed using units, features,
and rules, paralinguistic meanings constitute a brew of unstable,
fleeting, and subjective qualities. These paralinguistic qualities
shade into one another, and they impinge on purely linguistic
uses of prosodic contrasts, as in question and statement intonation. The auditory-acoustic cues that comprise paralanguage are
graded, in that they are not perceptually allocated by the listener
into discrete, contrastive categories as are the acoustic signals
for phonetic and lexical elements. Using deft combinations and
placements of prosodic cues, a speaker can communicate more
or less fear, gradations of perplexity, and degrees of denial.
The development of the pragmatics of communication,
a branch of linguistics that studies language use in conversation
(see conversation analysis), jokes (see verbal humor),
and storytelling, has advanced understanding of paralanguage.
Communicative elements such as turn-taking, inference, and
theme (topic of the discourse), and how they are signaled by the
speaker and comprehended by the listener, are investigated. The
fields of prosody and pragmatics have provided another valuable
impetus for the productive study of paralanguage: investigation
of the communicative competence of right hemisphere language processing in humans. While it has long been known
that the left hemisphere modulates language processing,
studies of pragmatics and prosody indicate involvement of the
right hemisphere in processing emotions and attitudes, inference and theme. The notion of two channels, ortholinguistic and
paralinguistic, is supported by the model that allocates processing to left and right hemispheres, respectively. Paralinguistic
nuances are intimately woven into the propositional message, so much so that synthesized speech, which typically lacks them, is often judged as unpleasant.
One aim of speech synthesis is to produce more natural-sounding speech, which means infusing it with paralanguage, a challenging but worthy goal.
Diana Van Lancker Sidtis
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Bolinger, Dwight. 1964. Around the edge of language: Intonation. In
Intonation, ed. D. Bolinger. Harmondsworth, UK: Penguin Books.
Crystal, David. 1995. Nine ways of saying yes. In The Cambridge
Encyclopedia of the English Language, ed. David Crystal, 248.
Cambridge: Cambridge University Press.
Kreiman, Jody, Diana Van Lancker Sidtis, and Bruce Gerratt. 2005.
Perception of voice quality. In Handbook of Speech Perception, ed.
David Pisoni and Robert Remez, 338–62. Malden, MA: Blackwell.
Van Lancker Sidtis, Diana. 2007. The relation of human language to
human emotion. In Handbook of the Neuroscience of Language,
ed. Brigitte Stemmer and Harry Whitaker. San Diego, CA: Academic
Press.
Williams, C., and K. Stevens. 1972. Emotions and speech: Some
acoustical correlates. Journal of the Acoustical Society of America
52.4B: 1238–50.


PARAMETERS
The term parameter is used in linguistics on analogy with its
usage in mathematics and engineering. In mathematics, the
parameters of a function are those aspects of the function that
are held constant when defining a particular function, but which
can vary in a larger context so as to characterize a family of similar functions. For example, the function for a line in analytic
geometry is f(x) = mx + b, with x the variable and m and b parameters of the function (the slope and the y-intercept). In the definition of any one line, the parameters m and b are held constant,
while the value of x varies, giving the different points on the same
line. In a broader context, however, the parameters m and b can
vary so as to define a family of similar functions: the set of all
lines. In the same way, parameters in linguistics are properties
of a grammatical system that are held constant when characterizing one particular human language, but which are allowed to
take different values in a broader context so as to characterize
a whole family of possible human languages. The idea that the
observed variation in human languages can be understood as
the fixing of certain parameters within an otherwise innate and
invariant system of principles (universal grammar) is most
commonly associated with the Chomskyan approach to formal
generative linguistics (see generative grammar). As a result,
this approach is sometimes called the principles and parameters theory. The idea is, however, a very general one, and it
can also be used in the context of other views about the nature of
the human language faculty.
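The mathematical analogy lends itself to a small illustration. The sketch below is ours, not the entry's (the name make_line is invented for the example): fixing m and b picks out one member of the family of lines, just as fixing a grammar's parameters picks out one language from the family of possible human languages.

```python
# A family of functions: each choice of the parameters (m, b)
# defines one particular line; varying them ranges over all lines.
def make_line(m, b):
    """Fix the parameters m (slope) and b (y-intercept)."""
    def line(x):          # x remains the variable
        return m * x + b
    return line

f = make_line(2, 1)       # one member of the family: f(x) = 2x + 1
g = make_line(-1, 3)      # another member: g(x) = -x + 3

print(f(0), f(5))   # 1 11
print(g(0), g(5))   # 3 -2
```

Within one line, m and b are held constant while x varies; across the family, m and b vary, which is exactly the role parameters play across the family of possible languages.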
This notion of linguistic parameters was introduced into linguistic theory by Chomsky (1981) and Rizzi (1982), during the
government and binding period. The paradigmatic case
was the pro-drop parameter (or null subject parameter). It was
observed that languages like Spanish and Italian differ from
French and English in several ways that appear to be interrelated. First, Spanish and Italian allow the subject pronoun of a
finite clause to be omitted, whereas French and English do not:
(1) a. Verrà. (Italian: He/she will come)
    b. *Came. (English)

Second, Spanish and Italian allow the subject to come after the
verb as well as before it, whereas French and English generally
do not:
(2) a. Verrà Gianni. (Italian: Will-come Gianni)
    b. *Came John. (English)

Third, the subject of an embedded sentence in Spanish and Italian can be moved to the beginning of the sentence as a whole,
even when there is an overt complementizer, whereas in French
and English some sort of accommodation is needed in sentences
like these:
(3) a. Chi credi che verrà? (Italian)
       Who you-think that will-come
    b. *Who do you think that came? (English)

Although these are clearly three distinct properties of the languages in question, they have a common theme: Informally put,
French and English require that there be an overt noun phrase
in the canonical subject position immediately before the finite verb, whereas Spanish and Italian do not. This difference in the
syntax of subjects was also related to a morphological difference: The agreement morphology on the finite verb is rich
enough to uniquely identify which pronoun would be in the
subject position in Spanish and Italian, whereas in French and
English it is not. The universal syntactic condition, then, is that
finite clauses require subjects (the extended projection principle); the parameter concerns exactly what kind of subject is
necessary to fulfill this condition. In Italian and Spanish, the rich
agreement on the verb means that null or displaced subjects are
permissible because (roughly) much of the information concerning the sort of subject it was is locally available on the finite
verb. In French and English, the agreement on the verb is of little
help, and so an overt subject in the canonical subject position
is required. A parameter, then, is a way of giving a unified theoretical account of the systematic differences that distinguish one class of languages from another.
While the pro-drop parameter was the first important parameter to be proposed, it is by now not considered the best case. A look at a wider range of languages (both nonstandard dialects of the Romance languages and languages from other families) quickly showed that the properties in (1)–(3) do not correlate
with one another as closely as was thought (Jaeggli and Safir
1989). This implies that the pro-drop parameter as it was originally conceived is either false or highly oversimplified.
That does not mean that the idea of a parameter was ill-conceived, however. The current paradigmatic example is what is
sometimes called the head directionality parameter (terminologies vary). This can be stated as an open factor in the principles
of phrase structure (see x-bar theory). Roughly put, when a word-level category X merges with a phrase Y to create a phrase of type X, there are two ways that the elements can be ordered: The order can be X-Y within XP, or it can be Y-X. Setting
the parameter in the first way gives head-initial languages like
English, in which complementizers come before embedded
clauses, tense particles come before verb phrases, verbs come
before their objects, prepositions come before their objects, and
so on:
(4) John will think that Mary showed a picture to Sue.

Setting the parameter in the second way gives head-final languages like Japanese, in which complementizers come after
embedded clauses, tense particles come after verb phrases, verbs
come after their objects, prepositions come after their objects,
and so on:
(5) Taroo-ga Hiro-ga Hanako-ni syasin-o miseta to omotte iru.
    Taro-SUBJ Hiro-SUBJ Hanako-to picture-OBJ show that thinking be
    'Taro is thinking that Hiro showed a picture to Hanako.'

In this way, a parametric theory can account for many of the most robust Greenbergian universals (Greenberg 1963; Dryer
1992) concerning word order in an elegant way. These two
very common and stable language types fall out of a simple and unitary choice that is made in the precise formulation of a universal principle of language.
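The effect of such a binary setting can be illustrated with a toy linearization routine. The code below is a deliberately simplified sketch of our own (the function and the miniature tree are invented for illustration, not a claim about any real grammar formalism): a single head_initial flag flips every head-complement pair, yielding an English-like or a Japanese-like order from the same underlying structure.

```python
# Toy illustration of a head-directionality parameter: the same
# hierarchical structure is linearized head-first or head-last.
def linearize(tree, head_initial):
    """tree is either a word (str) or a (head, complement) pair."""
    if isinstance(tree, str):
        return [tree]
    head, comp = tree
    head_part = linearize(head, head_initial)
    comp_part = linearize(comp, head_initial)
    return head_part + comp_part if head_initial else comp_part + head_part

# A complementizer taking a clause: "that [showed pictures]"
clause = ("that", ("showed", "pictures"))

print(" ".join(linearize(clause, head_initial=True)))
# head-initial (English-like): that showed pictures
print(" ".join(linearize(clause, head_initial=False)))
# head-final (Japanese-like): pictures showed that
```

The second output mirrors the Japanese pattern in (5), where the object precedes the verb and the complementizer to follows its clause.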
Parameters vary widely in the range and scope of the effects
that they are supposed to capture. Some theorists have proposed

parameters that are intended to account for the large-scale differences among the major classes of languages discovered by
typology. The head directionality parameter is a parameter of
this sort. Another early example was Ken Hale's (1983) nonconfigurationality parameter, which was designed to explain why
Australian languages like Warlpiri tolerate free word order and
discontinuous phrases, whereas languages like English do not.
Similarly, parameters have been proposed to capture the difference between ergative languages (like Basque and Eskimo), in
which the object of a transitive clause is treated in some respects
like the subject of an intransitive clause, and accusative languages (like English and most Indo-European languages), in
which all subjects are treated similarly. These proposals range
from radical differences in how syntactic structure is initially
constructed (Marantz 1984) to relatively minor differences in
how case and agreement morphology are assigned in a simple
sentence (Bittner and Hale 1996). Mark Baker (1996) proposes
a polysynthesis parameter that attempts to give a unified characterization of the difference between many Native American languages, in which a large part of the expressive burden is placed
on verbal morphology, and languages like English, in which the
primary expressive burden is borne by syntactic combination.
Taken together, some set of parameters such as these might
characterize the major linguistic types we observe.
Other parameters operate on a smaller scale, defining the differences between historically related languages or dialects.
The pro-drop parameter was a parameter of this sort, distinguishing French from Italian. Another example is the parameter that
determines whether the subject of a clause moves from its original position inside the verb phrase to the highest position in the
clause or not; this accounts for the difference between English,
which has subject-finite verb-object word order, and Celtic languages like Welsh, which have finite verb-subject-object word
order (Koopman and Sportiche 1991). Jean-Yves Pollock (1989)
argues that there is a parameter that says that verbs move to a
higher position in French than they do in English; this accounts
for a cluster of subtle word-order differences having to do with the
placement of verbs, negation, and adverbs in the two languages
(e.g., John kisses often Mary is normal French but bad English).
A third example is Jonathan Bobaljik and Dianne Jonas's (1996) proposal that some Germanic languages have an extra position available for subjects that other Germanic languages don't have; this makes sentences like "There have some trolls eaten Christmas pudding" possible in some Germanic languages but
not others, among other things. (See Baker 2001 for a general
overview of these parameters and several others.)
In the early days of parametric theory, it was thought that
virtually any syntactic principle could be parametrized, and
parameters were proposed that were relevant not only to X-bar
theory but also to movement, the theory of binding, and even
the projection principle. On that view, there would be a modest number of parameters (dozens or perhaps hundreds), each
of which would have a relatively large impact on the language
generated. But this view has been questioned in more recent
work. Hagit Borer (1984) proposed almost immediately that the syntactic principles themselves are invariant, and that what is parameterized are the features associated with individual lexical items.
Rather than saying that the syntax of French is different from the syntax of English in that verbs raise to the tense/infl node in French, this view says that the lexicon of French is different from
the lexicon of English in that French has tenses that require the
verb to move into them, whereas English does not.
Borer's view has the conceptual advantage that it largely
reduces the learning of syntax to the learning of individual lexical
items. It also suggests that there might be thousands of parameters, rather than dozens, because each distinct lexical item is
a possible locus of parametric variation (see especially Kayne
2005). Each individual parameter, however, will affect only a
relatively narrow part of the grammar since it is limited to those
structures in which a particular item appears. This view is compatible with the fragmentation of the pro-drop parameter, which
is now seen as a cluster of small-scale distinctions, each of which
can vary independently of the others, giving one the flexibility to
describe the various intermediate patterns found in the dialects
of southern France and northern Italy. As a result, Borer's view
has been championed by Richard Kayne (2005) as the one that
is supported by his methodology of comparing closely related
dialects.
Baker (1996, 2008), however, argues that there may also be
syntactic parameters in more or less the original sense, in addition to the fine-grained lexical parameters. Taken strictly, Borer's
view does not really account for the unity of the head directionality parameter. Even the smaller-scale parameters do not seem to
vary lexical item by lexical item. For example, it is not the case
that some tenses trigger verb-adverb-object order in French and
others do not; rather, all the different tenses trigger that order in
French, whereas none of the English tenses do. Perhaps, then, the
proper locus of much parameterization is neither the individual
lexical item nor the syntactic principle, but rather a natural class
of lexical items. How to state this and what its implications are
continue to be topics of discussion.
Mark C. Baker
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Baker, Mark. 1996. The Polysynthesis Parameter. New York: Oxford
University Press.
———. 2001. The Atoms of Language. New York: Basic Books. A broad
overview of the notion of a parameter in linguistic theory, written for
a general audience.
———. 2008. The macroparameter in a microparametric world. In
The Limits of Syntactic Variation, ed. Theresa Biberauer, 351–74.
Amsterdam: John Benjamins. Provides an overview and argument for
large-scale parameters.
Bittner, Maria, and Kenneth Hale. 1996. Ergativity: Toward a theory of a
heterogeneous class. Linguistic Inquiry 27: 531–604.
Bobaljik, Jonathan, and Dianne Jonas. 1996. Subject positions and the
roles of TP. Linguistic Inquiry 27: 195–236.
Borer, Hagit. 1984. Parametric Syntax: Case Studies in Semitic and
Romance Languages. Dordrecht, the Netherlands: Foris.
Chomsky, Noam. 1981. Lectures on Government and Binding. Dordrecht,
the Netherlands: Foris. The reference that started it all.
Dryer, Matthew. 1992. The Greenbergian word order correlations.
Language 68: 81–138.
Greenberg, Joseph. 1963. Universals of Language. Cambridge, MA: MIT
Press.
Hale, Kenneth. 1983. Warlpiri and the grammar of nonconfigurational
languages. Natural Language and Linguistic Theory 1: 5–49.


Jaeggli, Osvaldo, and Kenneth Safir. 1989. The null subject parameter
and parametric theory. In The Null Subject Parameter, ed. Osvaldo
Jaeggli and Kenneth Safir, 1–44. Dordrecht, the Netherlands: Kluwer.
Kayne, Richard. 2005. Some notes on comparative syntax, with special reference to English and French. In The Oxford Handbook of
Comparative Syntax, ed. Guglielmo Cinque and Richard Kayne, 3–69.
New York: Oxford University Press. A detailed discussion of general
considerations and very small-scale parameters.
Koopman, Hilda, and Dominique Sportiche. 1991. The position of subjects. Lingua 85: 211–58.
Marantz, Alec. 1984. On the Nature of Grammatical Relations. Cambridge,
MA: MIT Press.
Pollock, Jean-Yves. 1989. Verb movement, universal grammar, and the
structure of IP. Linguistic Inquiry 20: 365–424.
Rizzi, Luigi. 1982. Issues in Italian Syntax. Dordrecht, the
Netherlands: Foris.

PARIETAL LOBE
Anatomy
The parietal lobe is situated superior to the occipital lobe and
posterior to the frontal lobe. More specifically, it extends from the
central sulcus anteriorly, to the imaginary boundary of the parieto-occipital fissure posteriorly, to the sylvian fissure (perisylvian cortex) inferiorly. The parietal lobe(s) can be further subdivided into three main areas: 1) the somatosensory strip, also known as the postcentral gyrus (Brodmann's area [BA] 1, 2, 3, 43); 2) the superior parietal lobule (BA 5); and 3) the inferior parietal lobule (which includes BA 39, the angular gyrus, and BA 40, the supramarginal gyrus). The latter two areas are separated by the
intraparietal sulcus (see Figure 1). Medially, the parietal lobe(s)
comprises the postcentral gyrus extension of the paracentral lobule, the precuneus, and part of the cingulate gyrus (see Figure 2).

Physiology
There are two parietal lobes, one in each hemisphere, which are
divided functionally on the basis of dominance. The dominant
lobe is typically the left one and the nondominant the right.
There are many different non-language functions performed
by the parietal lobe, for example, perception and localization of
touch, pressure, pain, and temperature on the opposite side of
the body, and visuospatial processing. The variety of language-related functions associated with the parietal lobe will be especially highlighted in the context of non-language functions.
The dominant parietal lobe is involved primarily in integrating sensory information to create a particular perception. The
inferior portion of this lobe, particularly the supramarginal gyrus
and angular gyrus, is involved in structuring information for
reading and writing (see writing and reading, neurobiology of), performing mathematical calculations, and perceiving
objects normally.
Damage to the dominant lobe can result in apraxia (motor planning deficit), aphasia (language disorder), agnosia (abnormal
perception of objects), and sensory impairment (e.g., touch, pain).
Lesions to the inferior portion of the dominant lobe involving the
angular gyrus can result in Gerstmann's syndrome, which is characterized by left-right confusion, difficulty pointing to named
fingers (finger agnosia), impaired writing ability (agraphia), and
inability to perform mathematical calculations (acalculia).


Figure 1.

Figure 2.

The nondominant parietal lobe, however, is involved in a different set of functions that are mostly non-language related. In
particular, this region is responsible for visuospatial functions as
it receives and integrates input from the visual system (occipital
lobe) to make sense of the spatial order of the world around us.
M. A. Eckert and colleagues (2005) found that Williams syndrome,
whose phenotype (visuospatial deficits) and genotype (deletion
on chromosome 7) are well characterized, is linked to superior
parietal impairment. Williams syndrome, thus, may provide a
valuable system for understanding parietal lobe function.
Damage to the right parietal lobe can result in a constellation
of deficits involving spatial and body relations. Bilateral lesions
may result in Balint's syndrome, which affects both visual attention and motor skills. If both the parietal and temporal lobes are
damaged, memory impairments and personality changes may
result. Specifically, if this damage occurs on the dominant (left)
side, it may result in verbal memory deficits and difficulty in the
retrieval of strings of numbers. If the damage is on the right side,
it will affect nonverbal memory functions and will significantly
impair personality.

History
For more than a century, the exact role of the parietal lobe has
been debated by neuroanatomists and psychologists, with much of the research involving humans with brain damage and animal studies using rhesus monkeys. Sir William Turner (1873) is considered the first to describe in detail the intraparietal sulcus (BA
40). Before being scientifically discredited, phrenologists proposed that damage or disease to the parietal lobe(s) was a major
cause of melancholia (depression), and parietal eminence was
believed to relate to cautiousness (Hollander 1902). Due to the
wide variety of symptoms reported from brain damage studies,
the parietal lobes were accurately but vaguely thought to be a
general association area combining all the information from
various functions, specifically visuospatial and attention; however, details as to how this function occurred physiologically
were lacking until modern times.
Josef Gerstmann ([1924] 1971) first described finger agnosia
in a patient with a left parietal stroke, and the effects of various
lesions on the parietal cortex were identified and cataloged in
detail by John McFie and Oliver L. Zangwill (1960). Much of the
early parietal research was pioneered by scientists Macdonald
Critchley and later Juhani Hyvärinen in their respective works The Parietal Lobes (1953) and The Parietal Cortex of Monkey and Man (1982).
A great deal of neurological investigation has been conducted
on rhesus monkeys, and there appears to be significant overlap
between the human and monkey parietal lobe in both function and form, though it is noteworthy that differences have been
identified, such as larger parietal cortex, asymmetry of the lobes,
and more neural subdivisions in humans (Kolb and Whishaw
1990).

Language
Continuing the classical connectionist tradition of Hugo
Liepmann, Norman Geschwind (1965) championed the simplified yet controversial position that the parietal lobe acts as the "association area of association areas." Neural tissue damage to
this area often results in the classical disconnection syndromes, for example, apraxia.
Aleksandr Romanovich Luria (1973) considered the parietal
cortex one piece in his two-part model of mental activity, stating that it was important for understanding reception, analysis,
and storage of information. Lesions to the left parietal lobe were
understood to result in afferent motor aphasias (difficulty in
finding the correct articulatory positions for specific phonemes),
particularly lesioned primary and secondary sensory areas
affecting speech motor control and lesioned tertiary sensory
area resulting in aphasia (the loss of speech production and/or
comprehension).
Recent research, such as that of Gregory Hickok (2000), suggests that the inferior parietal lobe serves as the connection
between phonological representations and motor control for
those representations, that is, the auditory-motor interface,
which is part of a larger network of interfaces and systems
subserving language function. Marco Catani, Derek Jones, and Dominic ffytche (2005), in a significant paper, confirm the analysis
that includes the inferior parietal lobe in the use and possibly
the acquisition of language via a new circuit connecting the
traditional language areas of broca and wernicke. It has been labeled "Geschwind's territory" in honor of Geschwind's
original proposal that the parietal lobe is critical to language
function.
In sum, the left parietal cortex has particular areas that are
responsible for various linguistic functions. However, there are
other extralinguistic processes that the parietal lobe is known for
as well.

Extralinguistic Processes
ATTENTION. The function of the parietal lobe in attention mechanisms has long been discussed. Michael Posner
and Steven E. Petersen (1990) outlined the different subsystems
of attention: a) orientation to sensory events (not conscious), b)
signal detection for focal processing (conscious), and c) maintenance of a vigilant state (conscious). From the available neurocognitive evidence, the researchers assert that the posterior
parietal lobe plays an important role in attention mechanisms,
specifically in orientation and signal detection that are essential
for linguistic processing. Earlier, Luria (1973) identified this parietal region that mediates attention as an involuntary orienting
system. However, the posterior parietal attentional mechanisms
are greatly impacted by the frontal regions that subserve alerting
mechanisms as well.
MEMORY. Traditionally, episodic memory, or declarative memory, has been attributed to the medial temporal lobe (MTL); however, recent evidence (Wagner et al. 2005) suggests that the parietal lobes may have a role to play in it as well. The role of
declarative memory in language has been attributed to word
learning or vocabulary storage. Wagner and his colleagues suggest three theories explaining the contributions of the parietal
lobe in episodic memory retrieval. They highlight that, indeed,
the parietal lobe does not play an independent role in this
retrieval; rather, it mediates the major pathways in which the
MTL subserves episodic memory.
In sum, a number of neurolinguistic positions have been
taken from available neuropsychological and brain-mapping
data (Stein 1989). These include the parietal lobe as a) a sensorimotor association area, such that the posterior parietal cortex (PPC) becomes a junction of somaesthetic and visual information that interacts in a complex fashion; b) a sensorimotor integration area, which is very similar to the previous theory except for the addition of an actual integrative function; c) a command apparatus that is actually able to initiate a motor activity from the accumulated sensory information (though Stein proposes that, although the parietal lobe may be involved in some motor processes, the process is more likely one of maintenance than of initiation); and d) a region for directing attention to stimuli of interest. Here, the PPC and the pathways
it receives are postulated to direct attentional focus to the target stimulus while coordinating and communicating with the
inferotemporal cortex. J. Stein advocates that the PPC does not
have a single narrow neurocognitive focus; nevertheless, it could
have a common underlying function that integrates its multifaceted involvement in cognitive as well as automatic linguistic and
extralinguistic processes.
Overall, the parietal lobe is crucial for several language functions, most importantly, naming, semantic processing, and
phonological shaping of words, as well as reading and writing.
In addition, it mediates attention and memory, both essential at
different levels of language processing.
Yael Neumann, Hia Datta, and Daniel P. Rubino
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Baddeley, Alan, Susan Gathercole, and Costanza Papagno. 1998. The
phonological loop as a language learning device. Psychological Review
105: 158–73.
Catani, Marco, Derek K. Jones, and Dominic H. ffytche. 2005. Perisylvian language networks of the human brain. Annals of Neurology 57: 8–16.
Critchley, Macdonald. 1953. The Parietal Lobes. London: Edward Arnold.
Eckert, M. A., B. S. Hu, S. Eliez, U. Bellugi, A. Galaburda, J. Korenberg, D.
Mills, and A. L. Reiss. 2005. Evidence for superior parietal impairment
in Williams syndrome. Neurology 64: 152–3.
Gerstmann, Josef. [1924] 1971. Fingeragnosie: Eine umschriebene
Störung der Orientierung am eigenen Körper. Wiener Klinische
Wochenschrift 37: 1010–12. Trans. in Archives of Neurology
24: 475–6.
Geschwind, Norman. 1965. Disconnection syndromes in animals and
man. Brain 88: 237–94.
Hickok, Gregory. 2000. Speech perception, conduction aphasia, and
the functional neuroanatomy of language. In Language and the
Brain: Representation and Processing, ed. Y. Grodzinsky, L. Shapiro,
and D. Swinney, 87–104. San Diego, CA: Academic Press.
Hollander, Bernard. 1902. Scientific Phrenology. London: Grant
Richards.
Hyvärinen, Juhani. 1982. The Parietal Cortex of Monkey and Man: Studies
of Brain Function. Berlin: Springer-Verlag.
Joseph, Rhawn. 2000. Neuropsychiatry, Neuropsychology, Clinical
Neuroscience. New York: Academic Press.
Kandel, Eric R., James H. Schwartz, and Thomas M. Jessell. 1991.
Principles of Neural Science. 3d ed. New York: Elsevier.
Kolb, Bryan, and Ian Q. Whishaw. 1990. Fundamentals of Human
Neuropsychology. New York: Freeman.
Luria, Aleksandr Romanovich. 1973. The Working Brain: An Introduction
to Neuropsychology. New York: Basic Books.
McFie, John, and Oliver L. Zangwill. 1960. Visual-constructive disabilities associated with lesions of the left cerebral hemisphere. Brain
83: 243–60.
Posner, Michael, and Steven E. Petersen. 1990. The attention system of the
human brain. Annual Review of Neuroscience 13: 25–42.
Stein, J. F. 1989. Representation of egocentric space in the posterior parietal cortex. Experimental Physiology 74: 583–606.
Turner, William. 1873. The Convolutions of the Brain in Relation to
Intelligence. Yorkshire: The West Riding Lunatic Asylum Medical
Reports.
Wagner, Anthony D., Benjamin J. Shannon, Itamar Kahn, and Randy
L. Buckner. 2005. Parietal lobe contributions to episodic memory
retrieval. Trends in Cognitive Sciences 9.9: 445–53.

PARSING, HUMAN
In general, parsing refers to breaking something into its
constituent parts. Thus, machines (see parsing, machine) and
humans can decompose a message (such as print or spoken language) into phrases, words, and morphemes. Most commonly, human parsing has been considered in the context of sentence
processing, particularly its syntactic and semantic aspects.
Language, whether spoken, written, or signed, can also be described
in terms of smaller functional units, including syllables, phonemes, features, and gestures.
An understanding of grammatical constraints has guided the
development of descriptive representations (e.g., sentence diagrams) and formal systems of language structure and use. Parsing
models also have been influenced by linguistic, psycholinguistic,
and cognitive theory and by techniques used in computational
linguistics, natural language processing, and speech recognition (Chomsky 1965; Bresnan 1982; Jurafsky and Martin 2000).
Representative approaches include linguistic, statistical, connectionist, and dynamical systems models (Charniak 1993; Hale
2006; McClelland and St. John 1989; Steedman 1999; Tabor and
Tanenhaus 1998; see also self-organizing systems).
The scale at which we can break the signal into pieces depends
upon both our attention to detail and our descriptive goals, as
can be seen in numerous psychological studies that range from
ambiguity resolution (Frazier and Fodor 1978; Frazier 1987) to
assessment of our ability to perceive, produce, and use information at various levels of description. Parsing linguistic information
is not restricted to sound and print but can include a consideration of the gestures underlying the production of language by
voice (the coordinated movement of speech articulators, such as
the tongue body, tongue tip, jaw, and lips; see speech production) and sign (manual, facial, and body orientation) (Battison
1978; Browman and Goldstein 1990; Fowler and Brown 2000).
Philip Rubin

WORKS CITED AND SUGGESTIONS FOR FURTHER READING


Battison, Robbin. 1978. Lexical Borrowing in American Sign Language.
Silver Spring, MD: Linstok.
Bresnan, Joan. 1982. The Mental Representation of Grammatical Relations.
Cambridge, MA: MIT Press.
Browman, Catherine P., and Louis Goldstein. 1990. Articulatory gestures
as phonological units. Phonology 6: 201–51.
Charniak, Eugene. 1993. Statistical Language Learning. Cambridge,
MA: MIT Press.
Chomsky, Noam. 1965. Aspects of the Theory of Syntax. Cambridge,
MA: MIT Press.
Fowler, Carol A., and Julie M. Brown. 2000. Perceptual parsing of acoustic consequences of velum lowering from information for vowels.
Perception and Psychophysics 62: 21–32.
Frazier, Lyn. 1987. Sentence processing: A tutorial review. In Attention
and Performance. Vol. 12: The Psychology of Reading. Ed. M. Coltheart,
559–86. Hillsdale, NJ: Lawrence Erlbaum.
Frazier, Lyn, and Janet Dean Fodor. 1978. The sausage machine: A new
two-stage parsing model. Cognition 6: 291–325.
Hale, John. 2006. Uncertainty about the rest of the sentence. Cognitive
Science 30: 643–72.
Jurafsky, Daniel, and James H. Martin. 2000. Speech and Language
Processing: An Introduction to Natural Language Processing,
Computational Linguistics, and Speech Recognition. Upper Saddle
River, NJ: Prentice Hall.
McClelland, James, and Mark St. John. 1989. Sentence comprehension: A PDP approach. Language and Cognitive Processes 4: 287–336.
Steedman, M. 1999. Connectionist sentence processing in perspective.
Cognitive Science 23: 615–34.
Tabor, W., and M. Tanenhaus. 1998. Dynamical models of sentence processing. Cognitive Science 23: 491–515.

PARSING, MACHINE
The query "Over which strait in North Wales did Thomas Telford
build a suspension bridge?" illustrates the fact that natural languages have complex syntactic structures. Comparison of the
question with the answer "He built a suspension bridge over the
Menai Strait" reveals that the phrases including strait occur in
different positions in the two utterances, and that the verb positions are quite different, leading linguists to propose a constituent structure like (1) for the question:
(1)

[[over/Preposition [which/Determiner [strait/Noun [in/Preposition [North/Noun Wales/Noun ]NN ]PP ]N1 ]NP ]PP [did/Vaux Thomas_Telford/NP [build/Verb [a/Determiner [suspension/Noun bridge/Noun ]Noun ]NP t/PP ]VP ]Sinv ]Sq

A parser is a program that analyzes sentences in order to figure
out their structure, using a list of rules describing the structure of
the language, such as:
(2)

S → NP VP
Sq → PP Sinv
Sinv → Vaux NP PP
PP → Preposition NP
NP → Determiner N1
etc.

Most parsers begin by determining the part of speech of each
word (see word classes). A bottom-up parser then attempts to
group the words into phrases and phrases into clauses, according
to the grammar rules, keeping track of multiple possible analyses
because of the extensive ambiguity of natural languages. Top-down parsers, though perhaps less intuitive, are more frequently
used: They essentially work by attempting to generate the input
sentence.
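The bottom-up/top-down contrast can be made concrete with a small sketch. The grammar, lexicon, and function names below are invented for illustration (they are simpler than the rules in (2) and are not any cited parser); the sketch is top-down in exactly the sense just described, in that it tries to generate the input sentence from the start symbol, keeping every analysis that consumes all the words.

```python
# A toy top-down parser (illustrative rules and names, not from the entry).
# Note that this naive strategy would loop forever on left-recursive rules,
# so the toy grammar deliberately avoids them.

GRAMMAR = {
    "S":  [["NP", "VP"]],
    "NP": [["Determiner", "Noun"], ["Noun"]],
    "VP": [["Verb", "NP"]],
}
LEXICON = {  # step one: part-of-speech assignment
    "the": "Determiner", "a": "Determiner",
    "engineer": "Noun", "bridge": "Noun",
    "built": "Verb",
}

def expand(symbols, words):
    """Yield (trees, remaining_words) for each way `symbols` derives a prefix of `words`."""
    if not symbols:
        yield [], words
        return
    first, rest = symbols[0], symbols[1:]
    if first in GRAMMAR:  # nonterminal: try each rule top-down
        for rhs in GRAMMAR[first]:
            for subtrees, leftover in expand(rhs, words):
                for more, final in expand(rest, leftover):
                    yield [(first, subtrees)] + more, final
    elif words and LEXICON.get(words[0]) == first:  # preterminal must match next word
        for more, final in expand(rest, words[1:]):
            yield [(first, words[0])] + more, final

def parses(sentence):
    """All complete parse trees for the sentence; an empty list means no analysis."""
    return [trees[0] for trees, left in expand(["S"], sentence.split()) if not left]

print(len(parses("the engineer built a bridge")))  # 1
print(len(parses("built the engineer")))           # 0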
Standard parsing algorithms for analyzing any artificial
language (such as programming languages) have been developed and can be used with any context-free grammar. Thus,
linguists can write the grammar rules: They do not need to be
programmers. But linguists' grammars of natural languages
often make use of additional devices, such as agreement or
subcategory features (as with Sinv and Vaux in (1) and (2), denoting inverted sentences and auxiliary verbs). generative
grammars, therefore, usually augment constituent structure
with additional information: There may be labels to uniquely
identify individuals or additional levels of information, such as
meanings. Work on feature-based frameworks such as lexical-functional grammar and head-driven phrase structure grammar has gone hand in hand with the development
of complementary parsing methods.
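The agreement features just mentioned are often handled by unification: merging two feature bundles and rejecting the combination if any feature clashes. The following is a generic toy sketch of that idea; the function name, feature names (num, per), and lexical entries are assumptions for illustration, not the actual operation of lexical-functional grammar or head-driven phrase structure grammar.

```python
def unify(features1, features2):
    """Merge two feature bundles; return None if any feature clashes."""
    merged = dict(features1)
    for feat, val in features2.items():
        if feat in merged and merged[feat] != val:
            return None  # e.g. singular vs. plural: agreement fails
        merged[feat] = val
    return merged

# Hypothetical lexical entries (not from any cited grammar):
subject = {"num": "sg", "per": 3}      # "the engineer"
print(unify(subject, {"num": "sg"}))   # merged bundle: agreement succeeds
print(unify(subject, {"num": "pl"}))   # None: agreement fails
```

In a feature-based parser, a check of this kind is applied every time a rule combines constituents, so that, for example, a singular subject cannot combine with a plural verb.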
John Coleman
SUGGESTION FOR FURTHER READING
Jurafsky, Daniel, and James H. Martin. 2000. Speech and Language
Processing. Upper Saddle River, NJ: Prentice Hall.

PASSING THEORIES
Passing theories are utterance-specific formal semantic theories; they specify the correct interpretation, or literal meaning,
of particular linguistic utterances: sentences uttered by particular speakers at particular times. The expression "passing theory" was coined by Donald Davidson in his 1986 paper "A Nice Derangement of Epitaphs," which was part of his attack on accounts of linguistic communication essentially involving conventionally determined, shared meanings ([1986] 2005).
According to Davidson, expressions like "language," "meaning," or "sentence" are theoretical terms used for describing, or explaining, successful linguistic communication (cf. [1992] 2001, 108f).
For communicative success, regular or conventional use of
words is not necessary; what is necessary is only that the hearer
understand what the speaker intends to mean. For instance, if by
the words "a nice derangement of epitaphs" the speaker intends to mean "a nice arrangement of epithets" and the hearer understands that, we have a case of successful linguistic communication. Davidson suggests characterizing communicative success
in terms of the semantic intentions of the speaker. These he construes as intentions to be interpreted in a particular way on a particular occasion and by a particular hearer. Moreover, they are of
a Gricean, self-referential form (see communicative intention): A semantic intention is an intention to achieve the end
of being interpreted in a certain way by means of the intention's being recognized by the hearer (Davidson [1986] 2005, 92f).
Any utterance is made with a number of intentions that can be
ordered in terms of means to ends; the first intention in such
a sequence (as ordered by "in order to") specifies its literal, or "first," meaning. "A nice derangement of epitaphs" thus literally means "a nice arrangement of epithets" if uttered with the relevant
semantic intention and understood accordingly. According to
Davidson, this does not obliterate the distinction between literal meaning and speaker's meaning; speaker's meaning (for instance, metaphorical meaning) always comes later in the order of intentions.
According to Davidson, Tarski-style theories of truth
(T-theories) can be used as formal semantic theories. To specify the literal meaning of any utterance, be it ever so idiosyncratic, a full T-theory is required. In the case of malapropisms
and other novel or idiosyncratic use, these theories will be of a
transient, passing character; they might not hold for more than
a single utterance. If they hold for a certain utterance, Davidson
speaks of speaker and hearer sharing a passing theory (for
that utterance). Prior theories, on the other hand, specify the
interpretations speakers expect hearers to make, and hearers are
prepared to make, prior to actual utterances (cf. Davidson [1986]
2005, 101ff).
Davidson then uses the terminology of prior and passing
theories to renew his argument against any account of linguistic
competence essentially involving the prior mastery of a system
of shared semantic and syntactic conventions or rules: "[S]haring such a previously mastered ability [is] neither necessary nor sufficient for successful linguistic communication" (Davidson
[1994] 2005, 110; cf. also [1982] 1984). To model successful linguistic communication, systematic semantic theories of passing and prior nature are required, but sharing of prior theories
is not sufficient for successful linguistic communication. Even if
speaker and hearer share a prior theory, the ability to interpret
in accordance with that theory does not account for those cases
of successful communication where words are used in novel or
idiosyncratic ways. Nor is a shared prior theory necessary for
communication to succeed all that is necessary is that the passing theory be shared. Sharing passing theories, however, does
not amount to sharing a previously mastered ability: "In conclusion, then, I want to urge that linguistic communication does not require, though it very often makes use of, rule-governed repetition; and in that case, convention does not help explain what is basic to linguistic communication, though it may describe a usual, though contingent, feature" (Davidson [1982] 1984, 280).
Davidson's 1986 paper has been heavily criticized, among others by Michael Dummett. Part of the criticism is due to the provocative formulation Davidson gives there to his conclusion: "[T]here is no such thing as a language, not if a language is anything like what many philosophers and linguists have supposed" ([1986] 2005, 107). A controversy between Davidson and
Dummett ensued regarding the questions of whether the notion
of an idiolect is to be explained in terms of a communal language
or the other way around, and whether meaning is essentially normative or prescriptive. Davidson argues that "any obligation we owe to conformity is contingent on the desire to be understood" ([1994] 2005, 118), and he explicitly opposes those forms of social meaning externalism (such as Tyler Burge's), according to
which the literal meaning of words is essentially a matter of the
linguistic practices of the community surrounding the speaker
(Davidson [1994] 2005, 119). Just as for Gricean accounts of
meaning, there are also issues of psychological realism that arise for Davidson's account of successful linguistic communication
in terms of the complicated semantic intentions of the speaker.
Kathrin Glüer
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Bar-On, Dorit, and M. Risjord. 1992. Is there such a thing as a language?
Canadian Journal of Philosophy 22: 163–90.
Davidson, Donald. [1982] 1984. Communication and convention.
In Inquiries into Truth and Interpretation, 265–80. Oxford: Clarendon.
———. [1986] 2005. A nice derangement of epitaphs. In Truth, Language,
and History, 89–108. Oxford: Clarendon.
———. [1992] 2001. The second person. In Subjective, Intersubjective,
Objective, 107–22. Oxford: Clarendon.
———. [1994] 2005. The social aspect of language. In Truth, Language,
and History, 109–26. Oxford: Clarendon.
Dummett, Michael. 1986. A nice derangement of epitaphs: Some comments
on Davidson and Hacking. In Truth and Interpretation: Perspectives
on the Philosophy of Donald Davidson, ed. E. Lepore, 459–76.
Oxford: Blackwell.
Glüer, Kathrin. 2001. Dreams and nightmares: Conventions, norms,
and meaning in Davidson's philosophy of language. In Interpreting
Davidson, ed. P. Kotatko, P. Pagin, and G. Segal, 53–74. Stanford,
CA: CSLI.
Pietroski, Paul. 1994. A defense of derangement. Canadian Journal of
Philosophy 24: 95–118.

PERFORMANCE
The study of performance investigates communicative practices
in their sociocultural contexts from three perspectives. First, it
foregrounds the performativity of communicative forms and
practices as modes of action or means of accomplishing social
ends. Second, it directs attention to the poetics of communicative practice or to the forms of verbal artistry through which communicative acts are crafted and communicative skill is displayed.
Third, it focuses attention on performances as a special class of
events, such as rituals, spectacles, festivals, or fairs, in which a
society's symbols and values are publicly displayed, interpreted,
and transformed. Within language study, the first and second
perspectives have been foregrounded.
The contemporary focus on the poetics and performance of
communicative practice emerged in the subdiscipline of linguistic anthropology from a line of inquiry called the ethnography of
speaking. Developed by Dell Hymes and his students during the
1960s and 1970s, the ethnography of speaking highlights performance in two linked ways: as speaking practice and as artfully
marked ways of speaking (Bauman and Sherzer 1975). Its centerpiece is what Hymes called the "speech event" or "communicative event," a framework that allowed scholars to analyze multiple
components of language in use, including setting, participants,
ends (goals, purposes), act sequences, key (tone, tenor), instrumentalities (channel, code), norms, and genres (the SPEAKING
acronym provides a mnemonic) (Hymes 1967). The interest was
not simply in cataloging these components but, rather, in understanding how speakers use language within the conduct of social
life. In highlighting the emergent and creative nature of speech
performance, the ethnography of speaking focused attention
on linguistic forms as resources for living, in Kenneth Burke's
([1941] 1973) sense. Further, it proposed a new unit of study, the

speech community, defined as an "organization of diversity" that had to be constituted and managed via performances, rather
than as a preexisting homogeneous entity.
In his concern with how language functions in society,
Hymes was inspired by the work of the prewar Prague School
(1929–38) and, in particular, by Roman Jakobson (1896–1982).
Working against Russian Formalism's emphasis on the inner
laws and formal structure of text without regard for context,
the Prague School focused attention on the multifunctionality
of language. Jakobson (1960), building on work by Karl Bühler
and Jan Mukařovský, identified six constitutive factors of a communicative event and postulated that each factor was associated
with a particular language function. Jakobson's constitutive factors include addresser, addressee, context, message, contact,
and code; he termed their associated functions expressive (or
emotive), conative, referential, poetic, phatic, and metalingual.
Thus, for instance, an utterance (such as "eee-gads!") that directs
attention to the addresser (speaker) would be associated with
the expressive function, and so on. This model provided a basis
from which scholars could investigate the relationships among
form, function, and meaning.
In attending to speaking as a social accomplishment, the ethnography of speaking opened the way for studies of language
as an arena for the performance of social identities (see identity, language and). Earlier studies tended to focus on the
organization of communicative life in small, often face-to-face
communities, highlighting the differential distribution of linguistic resources by age, gender, ethnicity, or other status
markers (see Bauman and Sherzer 1974). Later works consider
how particular linguistic performances are both embedded in
and help to shape wider political or cultural formations, such
as race relations, subcultural or national identities, multiculturalism, secularism, and the like. Linguistic anthropology's
historical emphasis on ways of speaking, strategies of voicing,
participation structures, and orientation to audiences made the
field especially amenable to the approach of the Russian literary theorist Mikhail Bakhtin (1981), whose work on dialogism
and heteroglossia inspired studies in areas including language ideology, genre, and intertextuality (Silverstein
and Urban 1996).
By foregrounding speaking as a social performance, the ethnography of speaking countered an alternative use of the term
"performance" proposed by Noam Chomsky (1965). Drawing
on a distinction made by Ferdinand de Saussure ([1907] 1959)
between language (langue) and speech (parole), Chomsky
defined performance as the incomplete and imperfect realization
of language by particular speakers. He opposed performance to
competence, an internalized set of general rules that constitute
ones knowledge of a language, abstracted from particularities of
performance. In contrast, theorists of performance, along with
many linguistic anthropologists and sociolinguists, emphasize communicative competence, understood not as a hypothetical capacity for language but as the contextually grounded and
culturally acquired ability to speak in socially appropriate ways
(Bauman 1977, 11). Here, speaking is understood as a creative
and emergent act through which social life is accomplished. As
such, speaking is inherently risky; it involves skill and accountability and is subject to critical evaluation.

Richard Bauman highlights the dimensions of risk, responsibility, and accountability in what has become a classic definition of performance: "Performance as a mode of spoken verbal communication consists in the assumption of responsibility to an audience for a display of communicative competence" (1977, 11). Inspired by Hymes, Bauman has been particularly interested in the forms of verbal artistry through which communicative skill is put on display. His work generated a pivotal shift in
folklore studies from a classificatory concern with texts independent of their contexts of use to an interest in the performance of
verbal art as a constitutive ingredient of social life. Performance
in this sense may range from sustained, full performance to a
fleeting "breakthrough into performance," with hedged or negotiated performance lying somewhere in between (Bauman 2004, 110; the phrase "breakthrough into performance" comes from
Hymes 1981). Both Bauman and the interactional sociologist
Erving Goffman have been interested in how performances are
framed or keyed, but whereas Goffman's approach is dramaturgical, highlighting how social actors move from back stage
regions to perform the face work associated with an array of
social roles (Goffman 1959), Bauman's interest lies in poetics,
voice, and genre as verbal resources for the accomplishment of
social ends.
Thus far, performance has been considered from two related
vantage points, each grounded in particular disciplinary perspectives: Performance as speaking practice has been a focus
of linguistic anthropology and sociolinguistics; performance
as verbal art has been highlighted in folkloristics and linguistic
anthropology. A third approach views performance as a special
class of marked events in which a society's symbols are displayed for commentary, interpretation, or transformation. This
approach, pioneered by Victor Turner (1967, 1969), is less concerned with language per se. Through its focus on collective representations, cultural symbolism, and collective effervescence,
or communitas, it is located in a Durkheimian paradigm, with
inspiration from Arnold Van Gennep's work on rites of passage.
Increasingly, however, scholars are drawing on aspects of all
three approaches. One example of how the three approaches
may be productively considered together is Jane Goodman's
analysis of a children's performance in the Kabyle Berber region
of Algeria (2005).
The performance in question took place at a wedding,
understood as a festive occasion in which villagers suspended
interpersonal or political conflicts and came together to collectively celebrate the new union. The wedding was set apart from
everyday life by various formal markers: location (an outdoor
public square), timing (late evening), dress, music (traditional
band), and activities (dance). Special forms of verbal art also
marked the occasion: A hired poet recited a poem after henna
was applied to the groom's hand; older village women sang
traditional songs to mark transitions. Wedding guests danced
to show support for the new couple. In this village, men and
women shared the same dancing space but typically danced
sequentially rather than concurrently; in no case did they
dance as couples. One summer, however, village youth active
in the national Berber Cultural Movement formed a mixed-gender children's chorus as a way of changing gender relations
in the community and, more broadly, fostering a commitment to forms of social relationship aligned with the democratic
aspirations of their movement (the Berber Cultural Movement
was a minority ethnolinguistic, subnational, and secular
opposition movement in a majority Arabo-Islamist nation).
A chorus offered a way of teaching children new gender roles
while displaying new modes of gender interaction to the wider
community. To accomplish this, the young men created a new,
highly marked event within the already marked wedding: They
mounted a stage, rented microphones, hung lights, and thus
configured an entirely new relationship between performers
and audience, placing the guests in an unfamiliar spectator
role. The children sang political songs that, while well known,
were not typically associated with weddings. This repertoire
provided the backdrop for yet a third performance: An adolescent girl recited a poem on gender relations written by her
brother (the chorus director), a novel form of verbal art that until then had no possibility of public performance in the village. Yet the girl appeared to be only partially invested in serving as a spokesperson for her brother's text (she "animated" the
text, in Goffman's sense); at one point, she stumbled over the
words, and her brother prompted her, mouthing the words
from the sidelines. The event culminated in a rousing dance in
which the children spontaneously organized themselves into
malefemale couples, a transformation of gender roles in dance
that galvanized the audience for nearly an hour.
This multilayered performance highlights the use of verbal
art (songs, poems) alongside other performance modes to effect
a transformation of the social relations of gender. It also illustrates differential relations to linguistic resources and linguistic authority (a concern of the ethnography of speaking): The
children's chorus had access to political repertoires but not to
women's traditional songs or henna poems. A young man could
fashion himself as the author of a poem; a young woman could
only "animate" it, and was subjected to her brother's corrective
voicing from the sidelines. Further, it shows how the participant
structure (first made salient in Hymes's SPEAKING model) was
both creatively altered for political ends and amenable to multiple interpretations. Putting girls on a public stage constituted
a display of political commitment to democracy for the young
men; for the girls, in contrast, their appearance on stage was a
highly controversial and far more ambivalent deviation from the
social norms of female performance. Beyond gender considerations, this performance clearly reoriented what was typically
framed as a purely local event to wider ethnolinguistic and subnational concerns. Yet embedding this political orientation into
the already sanctioned frame of the wedding entailed less risk
(and ensured greater audience) than mounting a stand-alone
political event might have done.
In sum, the study of performance provides a point of entry
for research into social life as it is constituted, critiqued, and
transformed through communicative practices. It highlights the
emergent, creative, and transformative nature of language use in
a sociocultural context. Finally, performance offers a compelling
vantage point on the mutually constitutive relationship between
seemingly microlevel practices and wider processes, ideologies,
and political formations.
Jane E. Goodman



WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Bakhtin, Mikhail. 1981. The Dialogic Imagination. Ed. Michael Holquist,
trans. Caryl Emerson and Michael Holquist. Austin: University of Texas
Press.
Bauman, Richard. 1977. Verbal Art as Performance. Prospect Heights,
IL: Waveland.
———. 2004. A World of Others' Words: Cross-Cultural Perspectives on
Intertextuality. Malden, MA, and Oxford: Blackwell.
Bauman, Richard, and Joel Sherzer, eds. 1974. Explorations in the
Ethnography of Speaking. London and New York: Cambridge University
Press.
———. 1975. The ethnography of speaking. Annual Review of
Anthropology 4: 95–119.
Burke, Kenneth. [1941] 1973. The Philosophy of Literary Form: Studies in
Symbolic Action. 3d ed. Berkeley and London: University of California
Press.
Chomsky, Noam. 1965. Aspects of the Theory of Syntax. Cambridge,
MA: MIT Press.
Goffman, Erving. 1959. The Presentation of Self in Everyday Life. New
York: Doubleday.
Goodman, Jane. 2005. Berber Culture on the World Stage: From Village to
Video. Bloomington: Indiana University Press.
Gumperz, John J., and Dell Hymes, eds. 1986. Directions in
Sociolinguistics: The Ethnography of Communication. Oxford and New
York: Blackwell.
Hymes, Dell. 1967. Models of the interaction of language and social setting. Journal of Social Issues 23.2: 8–28.
———. 1981. "In Vain I Tried to Tell You": Essays in Native American Ethnopoetics. Philadelphia: University of Pennsylvania Press.
Jakobson, Roman. 1960. Concluding statement: Linguistics and poetics.
In Style in Language, ed. T. A. Sebeok, 350–77. Cambridge, MA: MIT
Press.
———. 1990. On Language. Ed. Linda R. Waugh and Monique Monville-Burston. Cambridge, MA: Harvard University Press.
Saussure, Ferdinand de. [1907] 1959. Course in General Linguistics.
Ed. Charles Bally and Albert Sechehaye, trans. Wade Baskin. New
York: Philosophical Library.
Silverstein, Michael, and Greg Urban. 1996. Natural Histories of Discourse.
Chicago: University of Chicago Press.
Turner, Victor. 1967. The Forest of Symbols: Aspects of Ndembu Ritual.
Ithaca, NY: Cornell University Press.
———. 1969. The Ritual Process: Structure and Anti-Structure.
Chicago: Aldine.

PERFORMATIVE AND CONSTATIVE


The distinction between performative and constative utterances
was first introduced by J. L. Austin, and is illustrative of a reaction
within language philosophy to the doctrine of logical positivism: This paradigm holds that sentence meaning can be
captured in terms of truth conditions (see truth conditional
semantics) and logical relations, and that sentences that cannot be thus verified are essentially meaningless. In contrast,
ordinary language philosophy, as conceived by philosophers such as J. L. Austin, Peter Strawson, and H. P. Grice (see
cooperative principle), examines language in use, and thus
lies at the basis of the development of modern pragmatics.
Austin observes that there are utterances, such as "I (hereby)
bequeath my watch to my brother," for which any evaluation in
terms of truth and falsity is irrelevant; this type of utterance he
labeled (at least at the outset) performatives, or utterances that perform an action, as opposed to constatives, which describe a
state of affairs.
The seminal source for this distinction is Austin (1962), published posthumously as a written record of lectures delivered in
1955 (and based on earlier, largely unpublished ideas). This is
important for two reasons. First of all, much of Austin's thinking
is actually contemporary with (though probably largely uninfluenced by) Ludwig Wittgenstein's ideas on language-games.
Secondly, the 1962 monograph records an evolution in Austin's
thinking, in which he starts from a distinction between two utterance classes and ends up drawing the conclusion that this distinction is untenable and that all utterances perform. In order
to understand this major shift, it is necessary to trace the evolution in his model in some detail.
Constatives are defined as utterances that have truth conditions, the prototype case being descriptions of states of affairs
(e.g., "The sun comes up in the East"). In contrast, Austin's original performatives do not have truth conditions, in that they do
not commit the speaker's beliefs to the proposition expressed.
Utterances such as "I hereby baptize this child John Doe" do
things: They perform actions, in that they change reality from
one in which a child named John Doe does not exist to one in
which such a child does exist. Performatives do not have truth
conditions but, rather, felicity conditions; that is, they are
only performed successfully (or "happily," to use Austin's term) in
specific circumstances. For instance, baptizing is only performed
happily if the speaker has the proper authority to perform the
procedure (e.g., a priest), if the procedure is carried out correctly
and completely (using the appropriate verbal format), and if the
parties involved carry out any necessary subsequent conduct.
The question arises whether performatives and constatives have any formal identifying features, such as grammatical
or lexical devices, that provide cues for the hearer about their
pragmatic status. Austin originally thought that so-called performative verbs might be a good candidate; in an utterance like
"I (hereby) promise I'll finish the essay on time," the matrix verb
promise marks the performance of the action of promising.
Constatives appear to lack such a marker. Since performative
verbs are easily identified (they are first person present tense
indicative, they collocate with "hereby," etc.), they may function as
powerful cues and can be said (following Searle 1969) to function
as illocutionary force indicating devices (or IFIDs).
The assumption that performatives must have performative
verbs proves untenable, however. A slightly variant formulation of the aforementioned promise, such as "I'll finish the essay
on time," is functionally very similar, if not identical, to the version with promise. The problem is that the second version does
not contain a performative verb; therefore, if one assumes that
both versions perform the same action, it must necessarily follow
that performative utterances do not need to have explicit performative verbs. This leads Austin to posit a distinction between
explicit and implicit (or, as he called the latter, primary) performatives.
We have now lost any kind of formal marking of performatives
since both constatives and implicit performatives lack overt performative verbs. In fact, once one posits the existence of implicit
performatives, the possibility is raised that a constative such as
"The sun rises in the East" is, in fact, an implicit version of the more explicit "I (hereby) state that the sun rises in the East" (since state
here has all the characteristics of a performative verb). What is
more, one could claim that statements also have felicity conditions in the sense that they are only uttered happily if the speaker
is reasonably sure about the truth of the proposition expressed.
Conversely, many performatives need to bear some relation to
actual facts and, thus, have at least some propositional content.
The question thus arises whether constatives are similar to performatives in that they also perform an action. Austin admits
that they do, namely, the action of committing the speaker to the
truth of the proposition: "Once we realize that what we have to study is not the sentence but the issuing of an utterance in a speech situation, there can hardly be any longer a possibility of not seeing that stating is performing an act" (Austin 1962, 139).
The performative-constative distinction thus becomes untenable, and one can only conclude that all utterances are actions.
Austins original two utterance classes are then merely subclasses of acts performed through language, or speech-acts,
which consist of three distinct types of act: locution, illocution,
and perlocution.
The locutionary act can be more or less equated to the
semantic meaning of the utterance and is roughly equivalent
to "uttering a certain sentence with a certain sense and reference" (Austin 1962, 109). Illocutionary acts are utterances
"which have a certain (conventional) force" (1962, 109), such
as baptizing, promising, and all of Austin's original performatives, but also former constatives such as informing and stating.
Perlocutionary acts, finally, are "what we bring about or achieve
by saying something" (1962, 109), that is, the consequences that
utterances trigger (which may, but need not be verbal), such as
convincing, deterring, or frightening. In short, all utterances perform three different acts: the act of saying something (locution),
what the speakers intention is in saying something (illocution),
and what its consequences are by saying something (perlocution). In much of the subsequent literature, the term speech-act
has become virtually synonymous with the illocutionary force of
the utterance, but it is important to stress that, for Austin, performing a speech-act involves performing all three kinds of act
simultaneously.
Since all utterances are now considered to be performative
speech-acts, the question is raised as to how many different
classes of speech-acts can be distinguished on linguistic grounds.
Out of the three acts involved, locution does not provide any useful distinguishing criteria since the same propositional content
can be employed for creating various speech-acts; neither does
perlocution since the perlocutionary effect of a speech-act is difficult to predict. However, utterances do differ systematically
with regard to their illocutionary force and, thus, presumably
have different felicity conditions. It should, therefore, be possible
to develop a new taxonomy of illocutionary acts based on these
felicity conditions or the linguistic realizations thereof.
Austin did, in fact, develop a rudimentary taxonomy, but it
was left to his pupil J. R. Searle to come up with a more systematic
classification (see Searle 1979). Searle distinguishes five major
classes of illocutionary act:
(i) Representatives (e.g., stating, describing, concluding),
which commit the speaker to the truth of the expressed
proposition;

(ii) Directives (e.g., requests, suggestions, commands),
which consist of attempts by the speaker to get the hearer to
do something;
(iii) Commissives (e.g., promising, threatening, offering),
which commit the speaker to some future course of action;
(iv) Expressives (e.g., apologizing, congratulating, thanking),
which express a psychological state;
(v) Declarations (e.g., declaring war, baptizing, christening), which effect immediate changes in some institutional
state of affairs, typically relying on elaborate extralinguistic
institutions.
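Searle's taxonomy amounts to a partition of performative verbs into five classes, which can be sketched as a small lookup table. A minimal sketch: the class names and example verbs are those listed above, while the `classify` helper is an illustrative assumption, not any standard notation.

```python
# A toy lookup table for Searle's five classes of illocutionary act.
# Verb lists are the examples cited above only, not an exhaustive inventory.
SPEECH_ACT_CLASSES = {
    "representative": {"state", "describe", "conclude"},
    "directive": {"request", "suggest", "command"},
    "commissive": {"promise", "threaten", "offer"},
    "expressive": {"apologize", "congratulate", "thank"},
    "declaration": {"declare war", "baptize", "christen"},
}

def classify(verb):
    """Return the Searlean class of a performative verb, or None if unlisted."""
    for act_class, verbs in SPEECH_ACT_CLASSES.items():
        if verb in verbs:
            return act_class
    return None

print(classify("promise"))    # commissive
print(classify("apologize"))  # expressive
```

Note that a table like this presupposes exactly the problem discussed next: some verbs (a complaint, for instance) plausibly belong to more than one class.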
This classification, despite having been hugely influential,
raises some serious problems. First of all, some speech-acts seem
to belong to more than one category: A complaint such as I'm upset that you forgot to put the trash out presumably expresses the speaker's psychological state (expressive) but might also be interpreted as an attempt to get the hearer to take the trash out (directive). Secondly, illocutionary acts reflect the communicative
intention of the speaker, who hopes that this intention will be
recognized and interpreted accurately by the hearer. Again, this
raises the question as to how hearers are able to do so. Searle's answer is the performative hypothesis, whereby every utterance U has an underlying format of the form I (hereby) Vp you (that) U, Vp representing the (explicit or implicit) performative verb. This still leaves open the question of how hearers know that, for instance, The door is standing wide open is the implicit version of I apologize for leaving the door open, rather than of I am complaining that you
left the door open (or, for that matter, an indirect version of the
request Could you close the door?) The three traditional sentence
types (declarative, interrogative, imperative) potentially offer
some help by functioning as IFIDs (illocutionary force indicating devices), as may some lexical reflexes
associated with certain illocutionary acts (e.g., please appears to
co-occur exclusively with directives). However, the fact remains
that most speech-act realizations contain neither a performative
verb nor any other IFID. Such utterances, which exhibit no overt
structural marking of their speech-act status (as in The door is
standing wide open when intended as a request), Searle labels
indirect speech-acts. However, since most usages of speech-acts
appear to be indirect rather than direct, it remains unexplained
how hearers are capable of computing the speakers intended
illocutionary force in the absence of structural signals. A possible
explanation is that people rely on contextual cues, working out
the implicit meaning by relying on Grice's cooperative principle
through conversational implicatures.
The fact remains that Searle's classification offers little help in
assigning speech-act status to stretches of verbal interaction in
ethnographic data. Ultimately, it could be argued, the interpretation of an utterance will depend on the speech event in which it
occurs, that is, "the culturally recognized social activity in which language plays a specific, and often rather specialized, role" (Levinson 1983, 279). In classroom interactions, for instance,
teacher questions regularly violate Searle's sincerity condition
since the speaker already knows the answer and is thus not sincere in trying to obtain a missing piece of information.
A radically different approach to the interpretation problem (i.e., how the speaker's communicative intention is recognized in the absence of linguistic cues) is offered by the
ethnomethodological paradigm of conversation analysis.
Consider the following exchange:
S: Another glass of wine would hit the spot.
H: I don't think so mate, you've had enough.

S's utterance, despite being a declarative, is clearly not interpreted by H as simply stating a fact; rather, H's response (a refusal to comply) shows that it was interpreted as a request for another glass of wine. The basis for interpretation here lies in the conversational sequencing of the two contributions: They are linked by conditional relevance, by virtue of being two parts of a request-refusal adjacency pair. The question as to the intended illocutionary force of S's turn becomes moot in this approach; what matters is that H has clearly interpreted it as request-like, having provided an appropriate second part to the adjacency pair. Of course, H might provide an incorrect interpretation, but if this is the case, it will become apparent in the subsequent interaction. Such an inductive approach avoids some of
the pitfalls inherent in attempts to classify speech-acts according
to the nonobservable, and therefore unfalsifiable, intentions of
the speaker.
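The sequencing logic can be sketched as a minimal table of adjacency-pair types. This is a hedged illustration only: the pair inventory and the `interpret` helper are assumptions introduced here, not part of conversation-analytic notation.

```python
# First pair parts and the second pair parts they make conditionally relevant.
# The inventory below is illustrative, not exhaustive.
ADJACENCY_PAIRS = {
    "request": {"compliance", "refusal"},
    "offer": {"acceptance", "refusal"},
    "question": {"answer"},
}

def interpret(first_part, second_part):
    """Judge whether H's turn is a fitted second pair part for S's first part."""
    return second_part in ADJACENCY_PAIRS.get(first_part, set())

# H's refusal ratifies a request-like reading of S's turn,
# whatever S's "real" intention was:
print(interpret("request", "refusal"))  # True
```

The point of the inductive approach is that classification is read off the observable second turn, not off the speaker's unobservable intention.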
Ronald Geluykens
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Austin, J. L. 1962. How to Do Things with Words. Oxford: Oxford University
Press.
Levinson, S. C. 1983. Pragmatics. Cambridge: Cambridge University
Press.
Searle, J. R. 1969. Speech Acts: An Essay in the Philosophy of Language.
Cambridge: Cambridge University Press.
. 1979. Expression and Meaning: Studies in the Theory of Speech Acts.
Cambridge: Cambridge University Press.

PERISYLVIAN CORTEX
Beginning in the late nineteenth century, the application of deficit-lesion correlations based on autopsy material to the problem of the regional specialization of the brain for language established that human language requires parts of the association cortex in the lateral portion of one cerebral hemisphere (Broca 1861; Wernicke [1874] 1969), usually the left in right-handed individuals (Broca 1865). This cortex surrounds the Sylvian fissure and runs from the pars triangularis and opercularis of the inferior frontal gyrus (Brodmann's areas [BA] 45, 44: Broca's area), through the angular and supramarginal gyri (BA 39 and 40) into the superior temporal gyrus (BA 22: Wernicke's area) in the dominant hemisphere (Figure 1).

Classical Clinical Models of the Functional Neuroanatomy of Perisylvian Cortex for Language
The first theories of the functional neuroanatomy of language
pertained to this cortical region. The pioneers of aphasiology (Paul Broca, Karl Wernicke, John Hughlings Jackson, and other neurologists) described patients with lesions in the left inferior frontal lobe whose speech was hesitant and poorly articulated, and other patients with lesions more posteriorly, in the
superior temporal lobe, who had disturbances of comprehension and fluent speech with sound and word substitutions
(see aphasia). These correlations led to the theory that language comprehension went on in unimodal auditory association cortex (wernicke's area, BA 22) adjacent to the primary auditory cortex (Heschl's gyrus, BA 41), and motor speech planning went on in unimodal motor association cortex in broca's area (BA 44 and 45) adjacent to the primary motor cortex (BA 4).
These theories incorporated the only principle that has ever been
articulated regarding the localization of a language operation.
According to this principle, language operations are localized in
relation to their sensory-motor requirements. Speech planning
goes on in Broca's area because it is immediately adjacent to the motor area responsible for movement of the articulators, and Wernicke's area is involved in comprehension because it
is immediately adjacent to the primary auditory cortex. These
ideas and models were extended by Norman Geschwind and his
colleagues in the 1960s and 1970s. Geschwind (1965) added the
hypothesis that word meaning was localized in the inferior
parietal lobe (BA 39 and 40) because word meanings consist
of associations between sounds and properties of objects, and
the inferior parietal lobe is an area of multimodal association
cortex to which fibers from unimodal association cortex related
to audition, vision, and somesthesis project.

Figure 1. A depiction of the left hemisphere of the brain, showing the main language areas: Broca's area, Wernicke's area, and the inferior parietal lobe.
Despite its widespread clinical use, however, this model has
serious limitations. It deals only with words, not other levels of
the language code. From a linguistic and psycholinguistic
point of view, the syndromes are all composed of many processing deficits, which are different in different patients. The syndromes themselves do not provide a guide to the localization of
more specific components of the language processing system.
As reviewed in the following, Geschwind's critical contribution regarding the role of the parietal lobe receives no empirical
support.

Linguistically Oriented Models of the Functional Neuroanatomy of the Perisylvian Cortex for Language
Since approximately 1975, psychologists and linguists have
approached language disorders and their neural basis in a more
systematic fashion, informed by models of language structure
and function. I briefly review two areas of work that relate these
models to the functional neuroanatomy of the perisylvian cortex
and other brain regions.
LEXICAL SEMANTIC PROCESSING. As noted, traditional neurological models of the neural basis for word meaning maintained that the meanings of words consist of sets of neural
correlates of the physical properties that are associated with a
heard word (Wernicke [1874] 1969), all converging in the inferior parietal lobe (Geschwind 1965). It is now known that most
lesions in the inferior parietal lobe do not affect word meaning
(Hart and Gordon 1990), and functional neuroimaging studies designed to activate word meanings do not tend to activate
this region (see the following). A. Damasio (1989), therefore,
modified this model, suggesting that the meanings of words included retroactivation of neural patterns in unimodal association and primary sensory cortex. Evidence for this comes
from functional neuroimaging results that reveal activation
for different classes of words in different areas, each related
to the sensory-motor associations of the word (frontal cortex
for verbs and manipulable objects; inferior temporal cortex for
concrete nouns) (see Caramazza and Mahon 2006, for review).
However, it is not clear that these activations reflect the meaning of words, rather than properties commonly associated with
words. Word meanings include much more than sensory and
motor associations; the essence of word meaning is itself quite
mysterious (Fodor 1998). In any event, word meanings are part
of a network that relates a word to a complex set of concepts
and contexts (Tulving 1972).
There is evidence that a critical part of this semantic network
is located outside the perisylvian cortex, in the anterior inferior
temporal lobes. Patients with semantic dementia, a degenerative disease that affects the anterior inferior temporal lobe, and
herpes encephalitis, with somewhat more posterior lesions, have
initially selective and ongoing major problems with many aspects
of semantic memory (Davies et al. 2005; Gorno-Tempini et al.
2004; Warrington and Shallice 1984). Activation studies have
implicated the inferior temporal cortex in representing concepts
and word meanings (Caramazza and Mahon 2006). Some studies of the neural generators for the N400 event-related potential
(ERP) wave, which reflects some aspect of semantic processing
(Kutas and Hillyard 1980; Holcomb and Neville 1990), present
evidence that this wave originates in the inferior temporal lobe
(Nobre and McCarthy 1995), though perhaps more posteriorly
than the lesion studies would suggest. Other brain areas that
have been suggested as loci for semantic processing (the inferior frontal lobe: Petersen et al. 1988; Dapretto and Bookheimer
1999) are much less clearly related to this function.
In the past two decades, studies of impairments of word
meaning and functional neuroimaging have suggested a finer-grained set of distinctions within the class of objects. Both deficits and functional activation studies have suggested that there
are unique neural loci for the representation of categories such
as tools (frontal association cortex), animals and foods (lateral
inferior temporal lobe), and faces (medial inferior temporal lobe)
(see Caramazza and Mahon 2006, for review). Debate continues
as to whether such divisions and localizations reflect different
co-occurrences of properties of objects within these classes or
innate, neurally localized human capacities to divide the world
along these lines.
SYNTACTIC PROCESSING. Most researchers also subscribe to
localizationist views regarding aspects of syntactic processing. A well-known hypothesis is the trace deletion hypothesis
(Grodzinsky 2000), which claims that patients with lesions in Broca's area have deficits affecting certain moved constituents (traces in Chomsky's theory). The evidence supporting these
models is based on correlating deficits in syntactic comprehension to lesions. However, there are two issues that such data must
face. First, it is often not clear whether a patient has a deficit in
a particular parsing operation or a reduction in the resources
available to accomplish syntactically based comprehension.
Second, there is virtually no consistency in an individual patient's
performance across tasks, raising questions about whether a
patient who fails on a particular structure has a parsing deficit
(Caplan, DeDe, and Michaud 2006; Caplan, Waters, DeDe, et al. 2007).
Assuming that patients' performances reflect deficits in
particular parsing operations, the relation of these deficits to
lesions does not support invariant localization models. We have
recently reported the most detailed study of patients with lesions
whose syntactic comprehension has been assessed (Caplan,
Waters, Kennedy, et al. 2007). Lesion size in multiple, cytoarchitectonically different small areas of cortex both within and outside the perisylvian area, not connected by
major fiber tracts, predicted performance, ruling out invariant
localization as the mode of neural organization for the operations supporting this function that were assessed. At the same
time, patients who performed at similar levels behaviorally had
lesions of very different sizes in larger areas of the brain (such
as the perisylvian association cortex, or the entire left hemispheric cortex) in which it has been suggested that syntactic
processing might be distributed, and patients with equivalent
lesion sizes in these larger areas varied greatly in their level of
performance, arguing that syntactic processing in comprehension is not distributed in these areas. The data are consistent
with a model in which the neural tissue that is responsible for
the operations underlying sentence comprehension and syntactic processing is localized in different neural regions in different
individuals.

Functional neuroimaging studies have been said to provide
evidence for the localization of specific parsing and interpretive
operations in Brocas area (Ben-Shachar et al. 2003; Ben-Shachar,
Palti, and Grodzinsky 2004; Bornkessel, Fiebach, and Friederici
2005; Fiebach, Schlesewsky, and Lohmann 2005). However,
most neuroimaging studies actually show multiple cortical areas
of activation in tasks that involve syntactic processing, and different areas have been activated in different tasks. Overall, these
data also suggest variation in the localization of the areas that
are sufficient to support syntactic processing within the language
area across the adult population, although invariant localization
models are not ruled out (Caplan, Chen, and Waters 2008).

Overview
The left perisylvian association cortex appears to be the most
important brain region supporting human language. However,
it is not the sole area involved in these abilities. How this area
and other brain regions act to support particular language operations is not yet understood. There is evidence for both localization of some functions in subparts of this region and other
brain areas, and for either multifocal or distributed involvement of brain areas in other language functions. It may be that
some higher-level principles are operative in this domain. For
instance, content-addressable activation and associative operations such as those that underlie phoneme recognition, lexical
access, and lexical semantic activation, may be invariantly
localized, while combinatorial computational operations such
as those that constitute the syntax of natural language may not
be. However, many aspects of these topics remain to be studied
with tools of modern cognitive neuroscience.
David Caplan
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Ben-Shachar, M., T. Hendler, I. Kahn, D. Ben-Bashat, and Y. Grodzinsky. 2003. The neural reality of syntactic transformations: Evidence from fMRI. Psychological Science 14: 433–40.
Ben-Shachar, M., D. Palti, and Y. Grodzinsky. 2004. The neural correlates of syntactic movement: Converging evidence from two fMRI experiments. NeuroImage 21: 1320–36.
Bornkessel, I., C. Fiebach, and A. Friederici. 2005. On the cost of syntactic ambiguity in human language comprehension: An individual differences approach. Cognitive Brain Research 21: 11–21.
Broca, P. 1861. Remarques sur le siège de la faculté de la parole articulée, suivies d'une observation d'aphémie (perte de parole). Bulletin de la Société d'Anatomie (Paris) 36: 330–57.
Broca, P. 1865. Sur le siège de la faculté du langage articulé. Bulletin de la Société d'Anthropologie 6: 337–93.
Caplan, D., E. Chen, and G. Waters. 2008. Task-dependent and task-independent neurovascular responses to syntactic processing. Cortex 44: 257–75.
Caplan, D., G. DeDe, and J. Michaud. 2006. Task-independent and task-specific syntactic deficits in aphasic comprehension. Aphasiology 20: 893–920.
Caplan, D., G. Waters, G. DeDe, J. Michaud, and A. Reddy. 2007. A study of syntactic processing in aphasia I: Behavioral (psycholinguistic) aspects. Brain and Language 101: 103–50.
Caplan, D., G. Waters, D. Kennedy, N. Alpert, N. Makris, G. DeDe, J. Michaud, and A. Reddy. 2007. A study of syntactic processing in aphasia II: Neurological aspects. Brain and Language 101: 151–77.
Caramazza, A., and B. Mahon. 2006. The organization of conceptual knowledge in the brain: The future's past and some future directions. Cognitive Neuropsychology 23: 13–38.
Damasio, A. 1989. Time-locked multiregional retroactivation: A systems-level proposal for the neural substrates of recall and recognition. Cognition 33: 25–62.
Dapretto, M., and S. Y. Bookheimer. 1999. Form and content: Dissociating syntax and semantics in sentence comprehension. Neuron 24: 427–32.
Davies, R. R., J. R. Hodges, J. J. Kril, K. Patterson, G. M. Halliday, and J. H. Xuereb. 2005. The pathological basis of semantic dementia. Brain 128: 1984–95.
Fiebach, C. J., M. Schlesewsky, and G. Lohmann. 2005. Revisiting the role of Broca's area in sentence processing: Syntactic integration versus syntactic working memory. Human Brain Mapping 24: 79–91.
Fodor, J. A. 1998. Concepts. Oxford: Oxford University Press.
Geschwind, N. 1965. Disconnection syndromes in animals and man. Brain 88: 237–94, 585–644.
Gorno-Tempini, M. L., N. F. Dronkers, K. P. Rankin, J. M. Ogar, L. Phengrasamy, H. J. Rosen, J. K. Johnson, M. W. Weiner, and B. L. Miller. 2004. Cognition and anatomy in three variants of primary progressive aphasia. Annals of Neurology 55: 335–46.
Grodzinsky, Y. 2000. The neurology of syntax: Language use without Broca's area. Behavioral and Brain Sciences 23: 1–71.
Hart, J., Jr., and B. Gordon. 1990. Delineation of single-word semantic comprehension deficits in aphasia, with anatomical correlation. Annals of Neurology 27: 226–33.
Holcomb, P. J., and H. J. Neville. 1990. Auditory and visual semantic priming in lexical decision: A comparison using event-related brain potentials. Language and Cognitive Processes 5.4: 281–312.
Kutas, M., and S. A. Hillyard. 1980. Reading senseless sentences: Brain potentials reflect semantic incongruity. Science 207: 203–4.
Nobre, A. C., and G. McCarthy. 1995. Language-related field potentials in the anterior-medial temporal lobe: II. Effects of word type and semantic priming. Journal of Neuroscience 15: 1090–8.
Petersen, S. E., P. T. Fox, M. Posner, M. Mintun, and M. Raichle. 1988. Positron emission tomographic studies of the cortical anatomy of single-word processing. Nature 331: 585–9.
Tulving, E. 1972. Episodic and semantic memory. In Organization of Memory, ed. E. Tulving and W. Donaldson, 381–403. New York: Academic Press.
Warrington, E., and T. Shallice. 1984. Category specific semantic impairments. Brain 107: 829–53.
Wernicke, K. [1874] 1969. The aphasic symptom complex: A psychological study on a neurological basis. Breslau: Cohn and Weigert. Repr. in Boston Studies in the Philosophy of Science. Vol. 4. Ed. R. S. Cohen and M. W. Wartofsky, 34–97. Boston: Reidel.

PERLOCUTION
In pragmatics, perlocution refers to the effect speech-acts
have on the hearer (H). J. L. Austin (1962) distinguishes three
types of act that utterances perform simultaneously: locution
(roughly equivalent to the meaning in a propositional sense),
illocution (the intended force of the speech-act), and perlocution. Austin characterizes perlocution as follows: "Saying something will often, or even normally, produce certain consequential effects upon the feelings, thoughts, or actions of the audience, or of the speaker, or of other persons: and it may be done with the design, intention, or purpose of producing them" (1962, 101). H's reaction to an illocutionary act might be verbal
(e.g., asking a question might prompt an answer), or nonverbal

(e.g., an insult may result in a slap in the face), but also an internal
psychological or emotional state (e.g., a threat might result in H
being frightened or angry).
Although Austin intended perlocution to be an integral part
of a speech-act, later developments of speech-act theory have
focused almost exclusively on illocution, that is, the speaker's mental state or intention (e.g., Searle 1969). As a result, the term speech-act has become virtually synonymous with illocutionary force. This is perhaps unsurprising, given that perlocutions
do not always consist of observable behavior (and might therefore
be argued to fall outside a linguistic theory of pragmatics; but see
Gu 1993). Moreover, perlocutions are hard to classify: Not only
do certain illocutions allow for a range of possible perlocutions
(a request, for instance, may result in either compliance or rejection by the hearer); there is often no way of knowing whether the
achieved perlocution is actually the one the speaker (S) intended
to achieve (a warning, say, may be intended to make the hearer
(H) take evasive action but may only result in frightening him/
her). Nevertheless, it is easy to demonstrate that perlocutions are
intrinsic parts of speech-acts since their successful performance
often depends on them. As Austin points out, an utterance such
as I bet you 10 dollars the Knicks will win by 5 points is felicitous
only if it receives uptake, that is, if H acknowledges and accepts
the bet (see felicity conditions).
conversation analysis (Sacks 1992) offers a potential
alternative, inductive approach to (verbal) perlocutions based on
local sequential organization. Consider the following exchange:
S: Have a cookie
H: ehm no thanks I've just had dinner

In this exchange, S's contribution can be labeled an offer by virtue of its being the first part of an offer-refusal (or offer-acceptance) adjacency pair. If H recognizes S's utterance as such, he/she will have to provide a sequentially appropriate response (or perlocution). The second part is thus conditionally relevant given the first part.
Ronald Geluykens
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Austin, J. L. 1962. How to Do Things with Words. Oxford: Clarendon
Press.
Gu, Yueguo. 1993. The impasse of perlocution. Journal of Pragmatics 20: 405–32.
Sacks, Harvey. 1992. Lectures on Conversation. Oxford: Blackwell.
Searle, J. R. 1969. Speech Acts: An Essay in the Philosophy of Language.
Cambridge: Cambridge University Press.

PERSON
Person is a morphosyntactic property of nominal phrases (nouns
and pronouns) used to indicate the discourse role of their referent. English personal pronouns show three person distinctions: first person, indicating speakers (I, we); second person,
indicating addressees (you); and third person, indicating discourse nonparticipants (he, she, it, they). Some languages also
distinguish inclusive and exclusive we: Ojibwa has kiinawint for
groups including speakers and addressees, and niinawint for
groups including speakers but excluding addressees. Further
divisions include an impersonal category and a sentient/nonsentient third person opposition. Like number and gender
marking, person can also be indicated on agreeing elements,
particularly finite verbs. Present tense English verbs show only
third person singular agreement (walk-s), while agreement
on Italian indicative verbs distinguishes three persons in both
singular (parl-o 'I speak', parl-i 'you speak', parl-a 'he/she/it speaks') and plural.
Linguistic phenomena related to person include morphological categories of pronouns and agreement; partial morphological syncretisms among person categories, in pronouns
or agreement; interactions of person with the ordering of pronominal clitics; interactions of person with case, agreement,
or structural position; and surprising restrictions on person combinations, usually involving direct and indirect objects (the *me
lui effect, or Person Case Constraint). Such phenomena form the
empirical basis of morphosyntactic theories of person.
There are three principal theoretical approaches to person. A
traditional insight represents person categories within a hierarchy of nominals influencing pronoun morphosyntax, for example, case and agreement marking in transitive clauses (Dixon
1994, 85). Cross-linguistically, third person is the least marked,
ranking below first and second. For example, in Georgian, first
and second person objects are indexed by verbal morphology,
while verbs with third person objects resemble intransitives. In
Dyirbal, first and second person pronouns have nominative/
accusative case marking, while third person pronouns, proper
names, and common nouns show an ergative/absolutive opposition. Some scholars rank first person highest (Zwicky 1977),
while others regard the ranking of first and second person as
variable.
Another approach seeks to derive morphosyntactic effects by
representing person as a complex category built from elemental
features. One such feature analysis locates person features
such as [participant], [speaker], and [addressee] within a universal geometry of privative pronominal features, in which the availability of one feature may depend on the presence of another.
An influential paper by Heidi Harley and Elizabeth Ritter (2002)
outlines this approach. Another type of analysis treats person
features as binary rather than privative; this allows the grammar to refer to negative values, such as [−speaker]. Robert Rolf Noyer
(1997) makes a significant case for the binary-feature analysis.
A third approach, potentially compatible with the second,
associates different persons with different syntactic representations (Ritter 1995; Déchaine and Wiltschko 2002; Béjar 2003).
Within the featural approach, most commentators assume
the existence of features corresponding to first and second person. However, third person is widely treated as simply lacking
such features (Zwicky 1977; Noyer 1997). This analysis correctly
predicts certain limits on the typology of person categories
(Greenberg 1966). As noted, some languages have separate categories for inclusive and exclusive we, whose use depends on
whether addressees are included. Thus, [addressee] is a distinctive feature; inclusive ([speaker, addressee]) has it, while first
person ([speaker]) does not. However, there is no parallel contrast between categories whose use depends on whether nonparticipants are included. For example, no known languages have


separate categories for inclusive and exclusive plural you, whose
use depends on whether nonparticipants are included. Such
observations imply that there is no third person feature, therefore no categories such as [speaker, addressee, nonparticipant],
[speaker, nonparticipant], or [addressee, nonparticipant]. Third
person pronouns thus refer to nonparticipants by default, lacking the features that allow reference to discourse participants.
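The privative analysis can be made concrete with a short sketch. The feature names follow the entry; the category labels and the `category` helper are illustrative assumptions, not standard notation.

```python
from itertools import combinations

# Privative person features: a category is just the set of features it bears,
# and third person is the empty set -- no person features at all.
CATEGORIES = {
    frozenset({"speaker", "addressee"}): "inclusive",
    frozenset({"speaker"}): "first person (exclusive)",
    frozenset({"addressee"}): "second person",
    frozenset(): "third person",
}

def category(features):
    """Map a feature bundle to its person category; the empty bundle is third."""
    return CATEGORIES[frozenset(features)]

# [speaker] and [addressee] generate exactly the four attested categories.
# With no [nonparticipant] feature, no bundle can separate an "inclusive you"
# from plain second person -- matching the typological gap noted above.
features = ["speaker", "addressee"]
bundles = [frozenset(c) for r in range(3) for c in combinations(features, r)]
print(sorted(category(b) for b in bundles))
```

The sketch makes the prediction mechanical: any putative category referring to nonparticipants would require a feature the inventory simply does not contain.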
Nevertheless, some phenomena seem to require reference to
nonparticipants, for example, syncretism in Mam pronominal
enclitics (Noyer 1997) or the Spanish spurious se rule (Bonet 1991). An obvious solution is to permit limited reference to negative values, such as [−speaker, −addressee]. The success of the
privative approach depends on identifying plausible alternative
analyses for such cases.
Although the [speaker] and [addressee] features are sufficient to generate the four main person categories attested cross-linguistically, there is evidence for an additional [participant]
feature, shared by first and second person (Farkas 1990; Noyer
1997; Halle 1997). For example, while Winnebago agreement distinguishes first and second person, free personal pronouns only
distinguish participants from nonparticipants (nee 'I or you', ee 'he/she').
The argument against a [nonparticipant] feature also applies
to the [addressee] feature in languages without an inclusive category (McGinnis 2005). Such languages treat the inclusive as first
person, not second (Zwicky 1977; Noyer 1997). Thus, in such
languages, [addressee] is non-distinctive: There can only be an
opposition between [speaker] and non-[speaker] participants,
not between [addressee] and non-[addressee] participants. If
[nonparticipant] is nonexistent because it is never distinctive,
then [addressee] is likewise nonexistent in languages without
an inclusive category. This suggests that the morphosyntactic
contrast between first and second person is sufficient to activate
[speaker], while [addressee] can be activated only by an additional contrast between inclusive and first person. In such cases,
[addressee] is indeed necessary to capture widespread (and non-default) syncretisms between inclusive and second person, most famously identified in Algonquian languages but common among languages with an inclusive category. For example, the
inclusive pronoun in Ojibwa (kiinawint) shows syncretism with
both second person (kiin, plural kiinawaa) and first (niin, plural
niinawint), but not with third (wiin, plural wiinawaa).
Martha McGinnis
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Béjar, Susana. 2003. Phi-syntax: A theory of agreement. Ph.D. diss., University of Toronto.
Bonet, Eulàlia. 1991. Morphology after syntax: Pronominal clitics in Romance. Ph.D. diss., Massachusetts Institute of Technology.
Déchaine, Rose-Marie, and Martina Wiltschko. 2002. Decomposing pronouns. Linguistic Inquiry 33: 409–42.
Dixon, R. M. W. 1994. Ergativity. Cambridge: Cambridge University Press.
Farkas, Donka. 1990. Two cases of underspecification in morphology. Linguistic Inquiry 21: 539–50.
Greenberg, Joseph. 1966. Universals of Language. The Hague: Mouton.
Halle, Morris. 1997. Distributed morphology: Impoverishment and fission. In MITWPL 30: Papers at the Interface, ed. Benjamin Bruening, Yoonjung Kang, and Martha McGinnis, 425–49. Cambridge, MA: MITWPL.
Harley, Heidi, and Elizabeth Ritter. 2002. Person and number in pronouns: A feature-geometric analysis. Language 78: 482–526.
McGinnis, Martha. 2005. On markedness asymmetries in person and number. Language 81: 699–718.
Noyer, Robert Rolf. 1997. Features, Positions, and Affixes in Autonomous Morphological Structure. New York: Garland.
Ritter, Elizabeth. 1995. On the syntactic category of pronouns and agreement. Natural Language and Linguistic Theory 13: 405–43.
Zwicky, Arnold M. 1977. Hierarchies of person. Chicago Linguistic Society 13: 714–33.

PHILOLOGY AND HERMENEUTICS


This entry briefly outlines some aspects of the study of linguistics
leading up to the twentieth century. As two of the earliest, most
thoroughgoing attempts in the West to understand written texts
and spoken discourse, philology and hermeneutics represent
vital precursors of today's language science. Still synonymous
with classical studies and historical linguistics, philology as both word and practice can be traced to ancient Greece
and Rome. While it is likewise based on a Greek word and while
the problem of interpretation engaged many ancient thinkers,
hermeneutics is often narrowly associated with vigorous philosophical debates centered in late eighteenth-century Germany
and originating in Reformation treatises on the right interpretation of scripture. Today, the heritage of philology and hermeneutics persists in the modern organization of university disciplines,
as well as in many indispensable scholarly monuments, such as
The Oxford English Dictionary.
Philology implies love of language and once stood for linguistics. Hermeneutics can be defined more specifically as the
art (or science) of interpretation. The progress from amateur art
to professional science marks the history of both. In their heydays, philology and hermeneutics were deemed central to all disciplines, whether scientific or humanistic; at other times, either
discipline could also be reduced to trivial pedantry. Among their
more prescient discoveries are Sir William Joness hypothesis of a
common genetic origin for the evolution of all Indo-European
languages nearly a century before Charles Darwins On the
Origin of Species and the hermeneutic circle, the feedback-like
cycle of interpretation formulated by Friedrich Ast almost 150
years before the birth of cybernetics.
The historical survey to follow highlights the respective origins, development, and interrelations of philology and hermeneutics and is singularly appropriate, given the historical
predilection of both fields. Because of the limitations of space, the
focus remains on the European intellectual tradition. However,
the theme emphasized here, that early investigations of language
sometimes uncannily anticipated modern scientific paradigms,
applies equally to non-Western traditions. In South Asia, for
instance, the classical Sanskrit grammar of Panini (ca. sixth to
fifth cent. b.c.) strongly prefigures generative grammar.
Interest in the nature and origins of human language goes back to the earliest Western literature, such as the Tower of Babel in Genesis. There is also a fascinating folktale retold by Herodotus, in which an Egyptian pharaoh isolates two children from birth in order to see what language they will speak, presumably the world's oldest. Nevertheless, although both philology and hermeneutics have Greek roots, neither was avowed as a primary concern of leading classical philosophers such as Plato and Aristotle. In classical Greek, the keyword logos signified discourse in many diverse senses, including "speech" (both language and oration), "argument" (a single proposition or an entire line of reasoning), "prose," "story," "history," "reason," and "thought." Eventually, philologia, like philomatheia, would imply studiousness, love of learning in general, since all learning at that time revolved around gaining written (and mathematical) literacy, but Socrates could be called a philologos in the more original sense of "fond of speaking": he famously refused to write down his ideas. Plato, on the other hand, had fewer compunctions about writing. (In order to elevate written dialogue to full-blown dialectic, Plato may himself have coined philosophia, "philosophy," as a more rigorous alternative.) The first classical figure to embrace the title philologos was Eratosthenes, "the second Plato," who was one of the librarians of Alexandria and a true philomath: He wrote on such diverse fields as geometry, history, philosophy, poetry, and literary criticism.
In the classical era, hermeneia, "interpretation" (sometimes in the sense of translation), was a secondary philosophical concern, recalling the subsidiary status of the messenger god Hermes (Roman Mercury). Today, readers of Aristotle's On Interpretation (probably not Aristotle's title) may be disappointed to find that this short treatise deals exclusively with the logic of propositions. Similarly, Plato's dialogue Cratylus is mired in a shortsighted attempt to show that the names of things may be both conventional and natural, as if individual letters could somehow coherently imitate reality. (Socrates' commitment to sound symbolism is satirized in Aristophanes' comedy Clouds.) Nevertheless, the ancient world made great strides in one particular area, namely, grammar (grammatike), which ranged from the teaching of literacy (including to non-native speakers), to scholarly description and cataloging of word forms, to literary and textual criticism, to more rarefied philosophical concerns. Like philology, grammar could entail a very wide disciplinary spectrum. In addition to the question of whether language was a product of nature (physis) or convention (nomos), an equally central and ultimately more fruitful debate among grammarians revolved around whether language should be understood in terms of analogy or anomaly: Analogia implied that language was ultimately patterned and governed by regularity, and anomalia that language was irreparably disorganized and marred by exceptions. To analogy can be traced the systematicity that still dominates language science (to say nothing of the legacy of prescriptivist "correctness" in language use), and anomaly can be thanked for introducing an honestly empirical dimension to linguistic studies.
In the century after Plato, Stoic philosophers elevated the study of language to a separate philosophical concern, but their treatises have largely been lost. Under the Ptolemies, the Hellenistic librarians of Alexandria refined and advanced all earlier knowledge of language in their quest to amass, catalog, and edit as many texts in as many fields of knowledge as possible. This included gathering descriptive word lists of various Greek dialects, as well as making detailed analyses of orthography, parts of speech (see word classes), morphology, and verbal tense and aspect. All these advances were authoritatively compiled by Dionysius Thrax in his Treatise on Grammar (ca. 100 b.c.), which was so influential that it was often called simply The Manual (and was thereby probably subject to extensive later revision by others). Some of this work, such as the eightfold division of parts of speech and the treatment of Greek nominal and verbal systems, still appears in twentieth-century textbooks.

In Rome, the Greek grammatical heritage was appropriated by writers from Varro (On the Latin Language, first cent. b.c., only partially preserved) down to Priscian (fifth to sixth cent. a.d.), whose exhaustive Principles of Grammar (ca. 500), fortuitously designed to assist the Greek speakers of the longer-lived eastern Roman empire, would become the ultimate authority for learning Latin throughout medieval Europe. Although (as is so often the case) much Latin grammatical theory slavishly followed Greek models, it was impossible to ignore obvious differences between the two languages (e.g., Latin's lack of an article, one past tense fewer, and one additional case).

Since the dominant unit of linguistic analysis of the time was the word, and less so the sentence, the primary achievements of classical language science lay in its descriptive and pragmatic dimensions, particularly in linguistic pedagogy and the accurate preservation, understanding, and annotation of written texts. For instance, we have the Hellenistic era to thank for the invention of such scholarly staples as footnotes, commentaries, critical editions, dictionaries, encyclopedias, and library catalogs. On the other hand, investigations of phonetics and syntax, though found in some early classical theorists, remained rudimentary. And sadly, despite the story of King Mithridates of Pontus (or Mithradates VI, 120–63 b.c.), who was fluent in all 22 languages of his subjects, there was almost no formal ethnographic study of the many other now-extinct languages of the Mediterranean region; non-Greek speakers were simply "barbarians" (barbaroi, "babblers"). Lexicographical work was driven by the need to translate Greek and Latin, as well as to comprehend archaic texts (e.g., Homer), and many word lists have been preserved as hermeneumata ("translations") and lists of glosses (glossaries, from glossai, "unfamiliar words"). Although prodigious effort from Socrates on was invested in etymology (the pursuit of a word's etymon, "truth"), this was almost a complete failure, since ancient philologists did not yet grasp how important phonology and rules of sound change are for tracing the historical roots of words. The results ranged from the fanciful to the ridiculous. Thus, Latin lignum "wood" hid a potential ignis "fire"; lepus "hare" was "light-foot" (compounding levis + pes); and words could stem from their opposites: bellum "war" was so named for being not at all bellum "beautiful." Much of this dubious heritage was compiled by Isidore of Seville (sixth to seventh cent. a.d.), whose Etymologiae remained influential throughout the medieval period. Many such classical and medieval compilations remain secondarily valuable, however, because they often preserve the sole remaining fragments of hundreds of ancient texts.
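The systematic sound change that these ancient etymologists lacked can be illustrated with a toy fragment of Grimm's law, the nineteenth-century discovery that Indo-European voiceless stops correspond regularly to Germanic fricatives. The letter-by-letter mapping below is a deliberate simplification for illustration, applied to ordinary Latin spellings rather than reconstructed forms:

```python
# Toy fragment of Grimm's law: Indo-European voiceless stops (here in
# Latin spelling) correspond regularly to Germanic fricatives. The
# mapping over plain letters is a simplification for illustration.
GRIMM = {"p": "f", "t": "th", "k": "h", "c": "h"}

def shift(word: str) -> str:
    """Apply the stop-to-fricative correspondence letter by letter."""
    return "".join(GRIMM.get(ch, ch) for ch in word)

# Latin/English cognate pairs whose consonants obey the correspondence:
for latin, english in [("pater", "father"), ("tres", "three"), ("cornu", "horn")]:
    print(f"{latin} -> {shift(latin)} (cf. English {english})")
```

Unlike the one-off guesses of lignum/ignis, such a correspondence applies across whole sets of words at once, which is what makes it testable.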
As a time of consolidation and preservation of the Greco-Roman heritage, the Middle Ages made relatively few significant contributions to the study of language, as for many centuries the Latin culture of Europe lagged behind the Greek learning of the eastern Roman or Byzantine empire and the Arabic scholarship of Moorish Spain. Based on Varro's lost writings on the disciplines, Martianus Capella's Marriage of Philology and Mercury (fifth cent. a.d.) formalized the division of the seven liberal arts (the lettered trivium of grammar, logic, and rhetoric, and the numeric quadrivium of geometry, arithmetic, music, and astronomy), cementing the philological basis of Western education for more than a millennium. Capella personified Philology as the mother of the liberal arts (from Latin ars, better translated today as "science"), and the first art was grammar, the learning of literacy through the close study and imitation of classic texts. The advent of Christianity did not entirely displace the pagan past, but instead brought new urgency to the problem of how to comprehend this legacy in the context of the new worldview. One result was the famous multileveled system of allegory, a hermeneutics that invited medieval thinkers to integrate three competing cultural systems: the Hebrew Bible and the Jewish religion; Greco-Roman mythology, literature, and history; and orthodox Latin Christianity. Also known as typology, allegorical interpretation was not limited to biblical texts but could be extended to read "types" (emblems, characters) everywhere in God's creation, including the natural world (the "second book" after the Bible).
Although it foreshadows modern linguistic procedures, medieval allegory now seems as empty as classical etymologizing. It is not overly unfair to the philology of the Latin Middle Ages to say that it is bracketed by its two greatest authors, its first and its last: Augustine and Dante. Certainly, there were important contributions to the understanding of language in between these landmarks, such as the brilliant attempt at orthographical reform via phonetic analysis by the so-called First Grammarian of twelfth-century Iceland, but it seems typical that this work was forgotten until the nineteenth century. And there were the scholastic authors of speculative grammars (often under the rubric of modi significandi, the means of signifying) who began the ongoing search for universal principles in language. Yet long before, around the fifth-century fall of Rome, Augustine brilliantly anticipated modern semiotics in On Christian Doctrine, and he was the first ever to consider the problem of childhood language acquisition in the autobiographical Confessions. Augustine also helped Christianize Capella's seven pagan liberal arts. Meanwhile, the Latin language itself was undergoing change, and Augustine could no longer hear the vowel quantities that underlay Virgil's poetic meter: the Romance languages were slowly differentiating across Europe. A millennium later, Dante (who also took the theory and practice of medieval allegory to new heights in his Divine Comedy and elsewhere) wrote a milestone work on language entitled On the Eloquence of the Vernacular (ca. 1305). Though necessarily and paradoxically written in Latin, this unfinished treatise argued for the propriety of using vernacular languages like Italian in literature, and it offers the earliest mapping of European languages based on differences that seem to have evolved over time. It was the first articulation of the problem of language change.
The fact that Johannes Gutenberg worked simultaneously on printing his famous Bible alongside an edition of the still-ubiquitous Latin grammar of Donatus (fourth cent. a.d.) reveals how the classical world still dominated the early Renaissance. Soon, the rediscovery and promulgation of less-digested ancient texts and ideas caused a surge in textual criticism and the study of languages. In 1440, Lorenzo Valla used historical-linguistic evidence to demonstrate that The Donation of Constantine, a lucrative grant to the church, was a forgery, thereby founding the forensic philology of diplomatics. The Renaissance humanists also revived the learning of Greek, along with Arabic and Hebrew (considered the original human language), and in the wake of Dante, various vernacular languages of Europe and even some languages of foreign lands received grammars of their own. The languages of the world began to be surveyed, and Joseph Justus Scaliger sharpened Dante's analysis of the language families of Europe (Diatriba de Europaeorum Linguis, 1599). Meanwhile, the fundamentals of human thought explored by René Descartes and John Locke also inspired such works as the Port-Royal General and Rational Grammar (1660) and utopian attempts at inventing universal communication systems, such as John Wilkins's Essay towards a Real Character and a Philosophical Language (1668). The Italian rhetorician Giambattista Vico argued for what he called The New Science (1725; revised 1744), an ambitious philological recreation of the history of human mental and cultural development via a succession of master tropes embodied in ancient language, laws, and other social institutions. In short, the Enlightenment brought a return to Eratosthenes' multidisciplinary philology: The famous French Encyclopédie of Denis Diderot and others (1751–72) cites philologie as a universal discipline bridging the sciences and the humanities.
The year 1786 is justly remembered as a watershed in the history of linguistics: It is the date of the famous paper of the legal scholar William Jones to the Asiatic Society in Calcutta.
Assigned as a colonial judge in the subcontinent, Jones had set about learning Sanskrit, the ancient language in which India's religious and legal texts are preserved, much as Latin had served for Europe. After only a few months of study, Jones's brilliant surmise was that certain obvious similarities among Sanskrit, Latin, Greek, and other European languages implied a common ancestor, which, crucially, might no longer survive. Such groupings had been noticed before, as by Dante and J. J. Scaliger, but had been explained by the mechanisms of borrowing or decay, rather than by the process of gradual and divergent evolution from a now-dead proto-language. The modern discipline of historical and comparative linguistics had been born, and the Enlightenment's passionate but effete search for language origins was given a fresh scientific direction: the problem of proto-linguistic reconstruction.
The year 1768 marked the birth of Friedrich Schleiermacher, so influential in the field of hermeneutics. Since the Reformation, increasing philological concern had been brought to bear on the text of the Bible. Though a philological monument in its own right, Jerome's Latin Vulgate (trans. ca. 380–405) was no longer sufficient for the new commentaries and vernacular translations desired by the Reformers who knew the original Hebrew and Greek. This biblical hermeneutics would develop into the influential "higher criticism," one of the troubling scientific advances that precipitated the Victorian crisis of faith. Higher criticism described scripture not as an inspired and inerrant document but as a layered tissue of competing sources that had been edited together at some intermediate time. Stemming from the patterns of stylistic differences in biblical accounts (e.g., the varying names for God) first noticed by Reformation commentators, the documentary hypothesis suggests that the canonical five books of Moses (the Pentateuch) are carefully patched together from a number of distinct source texts.
Just as Dante had thought fit to apply sacred allegory to his own secular literary production, so did Enlightenment students of the Bible acknowledge that no special method of interpretation should be required for the word of God. As the hermeneutic theorist Johann August Ernesti put it in 1761, the verbal sense of Scripture must be determined "in the same way in which we ascertain that of other books" (quoted in Palmer 1969, 38). The parallel development of secular higher criticism was also underway. In his Introduction to the Correct Interpretation of Reasonable Discourses and Books (1742), Johann Martin Chladenius became the first hermeneuticist to argue for the importance of point of view (Sehe-Punckt) in interpreting historical texts. Similarly, the classical scholar Friedrich August Wolf, who famously insisted on taking his doctoral degree in philology rather than philosophy, published his Prolegomena to Homer (1795), which asked the still-vexed Homeric question: Was there really a single author behind the Iliad and Odyssey? The concept of the linguistic family tree of William Jones also found application in secular textual editing, as Karl Lachmann (1793–1851) perfected the method of stemmatics to posit nonextant archetypes from which various groups of manuscripts descended, and thus to help eliminate a text's accumulated errors.
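Lachmann's guiding principle, that witnesses sharing distinctive copying errors descend from a common (possibly lost) exemplar, can be caricatured in a few lines. The manuscripts and error labels below are invented for illustration, and the single-link grouping is far cruder than real stemmatic analysis:

```python
# Toy sketch of the "common errors" principle of stemmatics:
# manuscripts that share distinctive errors are grouped under a
# presumed common ancestor. All data here are invented.
witnesses = {
    "A": {"err1", "err2"},
    "B": {"err1", "err2", "err3"},
    "C": {"err4"},
    "D": {"err4", "err5"},
}

def shared_errors(x: str, y: str) -> set[str]:
    """Errors common to two witnesses, evidence of a shared exemplar."""
    return witnesses[x] & witnesses[y]

families: list[list[str]] = []
for w in witnesses:
    for fam in families:
        # Join an existing family if w shares an error with any member.
        if any(shared_errors(w, m) for m in fam):
            fam.append(w)
            break
    else:
        families.append([w])

print(families)  # [['A', 'B'], ['C', 'D']]
```

Here A and B share err1 and err2, so they are assigned to one family descending from a hypothetical archetype, while C and D form another; the archetypes themselves need not survive.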
While Wolf and others developed Altertumswissenschaft (classical scholarship) and biblical critics analyzed scripture, Schleiermacher, who himself published on both classical and biblical philology, elevated hermeneutics to a general practice that would ultimately take it far away from traditional philological concerns. (The primary sources for Schleiermacher's general hermeneutics are the detailed outlines he prepared for his university lectures, notes partly published in 1819.) Hermeneutics followed this philosophical direction throughout the nineteenth century; Wilhelm Dilthey, for example, located hermeneutics as the supporting discipline for the university's Geisteswissenschaften (human sciences, literally "sciences of the spirit"). Following the phenomenology of Edmund Husserl, the hermeneutic project was furthered by Martin Heidegger and has continued down to the present in a debate between Hans-Georg Gadamer and Jürgen Habermas, with wider theoretical ripples still being felt in the French and Anglo-American discourses of modernism and postmodernism. General hermeneutics grew to be concerned not only with interpretation per se but also with the very nature of understanding, being, and reality itself. Today, hermeneutics is more at home with the purer varieties of literary theory and aesthetics than with traditional philology's "lower criticism."
Philology became an increasingly technical mode of historical linguistics during the nineteenth century. Comparative philologists such as Rasmus Rask, Jacob Grimm, and Franz Bopp assembled exhaustive phonological and morphological data on modern and ancient global languages in order to trace their development and interrelationships (sometimes with a troublingly Orientalist attitude; see Boeckh [1886] 1968, 10, 44). Eventually, philology's scientific hypertrophy drove the foundation of separate humanistic departments devoted to texts as literature. Another great paradigm split was marked by the publication of the one-time philologist Ferdinand de Saussure's Course in General Linguistics (1916). Perhaps a victim of its own success, diachronic philology, which so carefully traced the evolution of parole, eventually yielded its disciplinary headship of language study to Saussure's synchronic langue (see synchrony and diachrony and structuralism).

Presently partitioned among various university disciplines, philology and hermeneutics still govern the fields of medieval and classical studies, historical linguistics, literary theory and criticism, textual editing, lexicography, prosody and metrics, and many others (see Cerquiglini [1989] 1999; Gumbrecht 2003). Today, though the normal science of language emphasizes such synchronic contexts as society, psychology, and the brain, there is little doubt that philology and hermeneutics will persist and reappear, like Hermes and Mercury, in many new guises in the future.
Christopher M. Kuipers
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Boeckh, August. [1886] 1968. On Interpretation and Criticism. Ed. and
trans. John Paul Pritchard. Norman: University of Oklahoma Press.
Cerquiglini, Bernard. [1989] 1999. In Praise of the Variant: A Critical
History of Philology. Trans. Betsy Wing. Baltimore: Johns Hopkins
University Press.
Gumbrecht, Hans Ulrich. 2003. The Powers of Philology: Dynamics of
Textual Scholarship. Urbana: University of Illinois Press.
Mueller-Vollmer, Kurt, ed. 1985. The Hermeneutics Reader: Texts of the German Tradition from the Enlightenment to the Present. New York: Continuum.
Ormiston, Gayle L., and Alan D. Schrift, eds. 1990. The Hermeneutic Tradition: From Ast to Ricoeur. Albany: State University of New York Press.
Palmer, Richard E. 1969. Hermeneutics: Interpretation Theory in
Schleiermacher, Dilthey, Heidegger, and Gadamer. Evanston,
IL: Northwestern University Press. Still the standard introduction.
Robins, R. H. 1997. A Short History of Linguistics. 4th ed. London:
Longman.

PHONEME
The phoneme is the smallest unit of speech that discriminates
one word from another in a particular language. Phonemes
are represented by symbols between slashes, thus /p/ or /b/.
Phonemes may have alternate forms, called allophones. For
example, in English, the same phoneme /p/ is produced differently in pit and spit. Minimal pairs are used to determine whether
two speech sounds are allophones or separate phonemes. For
example, in English, the phonemes /p/ and /b/ distinguish the
word pull from bull, and /t/ and /d/ distinguish the word bat
from bad.
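The minimal-pair test just described can be sketched as a simple comparison over phonemic transcriptions. The segment symbols below are simplified stand-ins, not full IPA:

```python
# Sketch of the minimal-pair test: two words of equal length that
# differ in exactly one segment position show that the differing
# segments contrast, i.e., are separate phonemes.
def is_minimal_pair(w1: list[str], w2: list[str]) -> bool:
    """True if the two segment sequences differ in exactly one position."""
    if len(w1) != len(w2):
        return False
    return sum(a != b for a, b in zip(w1, w2)) == 1

print(is_minimal_pair(["p", "u", "l"], ["b", "u", "l"]))    # True: /p/ vs. /b/
print(is_minimal_pair(["b", "ae", "t"], ["b", "ae", "d"]))  # True: /t/ vs. /d/
print(is_minimal_pair(["p", "u", "l"], ["b", "ae", "t"]))   # False: several differences
```

Pairs differing in more than one segment, like pull and bat, establish nothing about any single contrast, which is why the test requires exactly one point of difference.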
Miwako Hisagi

PHONETICS
What Is Phonetics?
Phonetics is the area of language science research that studies the articulation, acoustic properties, and auditory perception of speech units (see speech production, acoustic phonetics, articulatory phonetics, and speech perception, respectively). More specifically, phonetics can be understood as linguistically informed speech science, and research phoneticians are generally trained linguists who bring to bear their knowledge of the structural properties of language. Rather than focusing only on one particular language or on universal anatomical properties of hearing or articulation, a phonetician has a special interest in understanding the full range of distinct possibilities in human speech or signed communication. Because of the important role that linguistics plays in phonetic study, phonetics most often finds itself housed academically as a linguistic discipline, though sometimes it finds its home in engineering, psychology, or a language-specific setting. The most prominent textbook used in educating phoneticians is P. Ladefoged's A Course in Phonetics (2006), now in its fifth edition.
Within linguistics, phonetics is related to the field of phonology, another area of theoretical linguistic research. Linguists vary in their opinions regarding the degree of distinctness and the areas of overlap between the phenomena considered to be the objects of phonetic versus phonological research. Both are concerned with the component speech units or building blocks into which words can be divided. However, the general view is that phonetics investigates measurable, physical properties of these speech units, such as their precise articulation, their detailed and contextually dependent acoustic properties, and cross-linguistic variation in these physical properties. Phonology, in contrast, is generally concerned with how these speech units are combined or organized into acceptable word forms within a language (e.g., allowable sequences) and with the underlying principles of organization shared across languages (see phonology, universals of). On analogy to chemistry, phonetics investigates subatomic structure, and phonology studies the formation of molecules out of basic atoms. Traditionally, phonological structure has been viewed as cognitive or grammatical, while phonetic structure has been viewed as purely physical and implementational. However, the dividing line between cognitive and physical has blurred or dissolved over the years (e.g., Browman and Goldstein 1995).

Well-Known Theoretical Puzzles in Phonetics


There are a number of well-known puzzles in the area of phonetics whose empirical and theoretical consideration has helped lead to our current understanding of some fundamental aspects of the linguistic speech system. As one example, phoneticians have an abiding interest in understanding how to reconcile a linguistic view of speech as composed of concatenated symbolic units with its physical realization in articulation and acoustics, in which there are no silences, separations, or obvious criteria for segmentation between these units. We can refer to this puzzle as lack of segmentability. It is famously acknowledged in Charles Francis Hockett's (1955) Easter egg analogy, which describes the phonetic speech production processes as making a smeared mess out of neat Easter eggs moving through a wringer. Gradually, however, the field has come to understand that, rather than a mess, the speech produced by humans is governed by lawful, albeit complex, physical properties. A second puzzle that has been much discussed in phonetic research is the puzzle of lack of invariance. This refers to the difficulty of reconciling the linguist's view that language calls on a small fixed set of phonological (or contrastive) units in organizing its words with the observation that there are no invariant properties of these units in the speech signal. Indeed, experiments using sinewave synthesis have shown that even signals completely lacking normal speech cues can nevertheless be perceived as speech and understood. One, but not the only, source of lack of segmentability and lack of invariance is the phenomenon of coarticulation. This refers to the fact that neighboring speech sounds are, in fact, articulatorily coproduced in time and thus interact with one another and mutually shape the speech signal. Consequently, phonological units in natural speech are realized in a highly variable, context-dependent fashion.

Speech perception research in both children and adults probes, in part, how human listeners are able to recover phonological units from the speech signal (see, e.g., speech perception in infants) and engage in lexical access (word identification; see word recognition, auditory) (Pisoni and Remez 2005). This involves understanding how listeners deal with variability in phonetic form and how prior speech and language experience shapes these processes. Investigation of these puzzles has informed phoneticians' theoretical views regarding the fundamental nature of speech units.

Ways of Doing Phonetic Research


There are a number of areas of inquiry in the field of phonetics, and these generally fall under the purview of articulatory phonetics, acoustic phonetics, or speech perception. We encounter some issues related to each of these areas in the following, but first, it is worthwhile to consider the two general approaches to phonetic research. The first focuses on the description, classification, and transcription of speech sounds; the second is experimental phonetics.

Traditionally, the first approach was carried out by ear, thanks to the carefully trained abilities of phoneticians, often trained in a direct line of descent from one practitioner to another. The International Phonetic Association is a more than century-old organization whose aim is to promote the scientific study of phonetics and its practical applications. The association has provided, with regular updates over the years, a consensus International Phonetic Alphabet (referred to as the IPA, as is the association itself) that serves as a notational standard for the phonetic transcription of all sounds known to exist contrastively in the world's languages (and many noncontrastive variations of these sounds) (IPA 1999). The latest version of the IPA was published in 2005 and is displayed in Figure 1.

This transcription system is a standard reference in the field of phonetics and has been an important tool for description and classification. Phoneticians doing work of this sort must determine what the linguistically relevant speech categories are, that is, what counts as linguistically "the same" and "different," and what principled (or idiosyncratic) variation is observed among these speech units. As can be seen from the IPA chart, phoneticians have identified important dimensions of variation, in particular, for consonants:

601

Phonetics

Figure 1. The IPA Chart. Reprinted with permission from the International Phonetic Association. Copyright 2005 by
International Phonetic Association.

place in the vocal tract at which a consonant is articulated or


creates its constriction;
manner of articulation, which refers generally to the type of
constriction: complete closure for stops, narrow closure for
fricatives, constrictions having nasal or lateral airflow; and
voicing, whether the vocal folds are vibrating or not.
For vowels, the variations are captured in a continuous plane whose dimensions can be identified with auditory properties called:

height [high-mid-low], related to the lowest resonant frequency (the first formant) of the vowel; and

backness [front-central-back], related to the distance between the first and second resonant frequencies (formants) of a vowel.

Rounding, that is, lip protrusion or compression, is also encoded in the symbol choice itself.
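The two vowel dimensions just described can be sketched numerically: height varies inversely with the first formant (F1), and backness with the spacing between the first and second formants. The Hz thresholds below are rough illustrative values, not standards:

```python
# Numeric sketch of the vowel dimensions described above: height is
# related (inversely) to F1, and backness to the F2-F1 distance.
# The Hz cutoffs are invented for illustration, not reference values.
def classify_vowel(f1: float, f2: float) -> tuple[str, str]:
    """Map formant frequencies (Hz) to coarse height/backness labels."""
    height = "high" if f1 < 400 else ("low" if f1 > 600 else "mid")
    spread = f2 - f1  # a large F2-F1 gap signals a front vowel
    backness = "front" if spread > 1200 else ("back" if spread < 700 else "central")
    return height, backness

print(classify_vowel(280, 2250))  # /i/-like formants: ('high', 'front')
print(classify_vowel(750, 1100))  # open-back /a/-like formants: ('low', 'back')
print(classify_vowel(300, 870))   # /u/-like formants: ('high', 'back')
```

Real vowel formants vary considerably with speaker and context, which is why phoneticians treat the vowel space as a continuous plane rather than a grid of fixed values.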
In addition, in order to adequately describe speech units, the mechanism by which the air moves in the vocal tract must be identified. All languages use pulmonic sounds, with air flowing out from the lungs, but some languages also move air by laryngeal (glottalic) or tongue (velaric) maneuvers. Other important linguistic properties of speech units can include distinctions in tone (i.e., placement in and/or movement through the speaker's pitch range), phonation type (i.e., the mode or quality of vocal fold vibration and the amount of laryngeal airflow), and VOT (voice onset time: the temporal coordination of an oral constriction with a laryngeal event).

Figure 2. A spectrogram of the sentence "There are no silences here." In a spectrogram, time is displayed on the x-axis, frequency (in Hz) on the y-axis, and amplitude in grayscale darkness.
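A spectrogram like the one in Figure 2 can be computed from any digitized signal. The sketch below uses numpy/scipy, an assumption of this illustration rather than a tool prescribed by the text, and substitutes a synthetic 440 Hz tone for real speech:

```python
# Minimal sketch of computing a spectrogram (cf. Figure 2). A pure
# 440 Hz tone stands in for a recorded sentence; with speech, the
# dark bands of Sxx would instead trace formants over time.
import numpy as np
from scipy.signal import spectrogram

fs = 16000                        # sampling rate (Hz), typical for speech work
t = np.arange(0, 1.0, 1 / fs)     # one second of signal
x = np.sin(2 * np.pi * 440 * t)   # synthetic tone instead of speech

# f: frequency axis (the y-axis), times: time axis (the x-axis),
# Sxx: power per (frequency, time) cell, drawn as grayscale darkness
f, times, Sxx = spectrogram(x, fs=fs)

peak = f[Sxx.mean(axis=1).argmax()]  # frequency bin with the most energy
print(peak)  # close to 440 Hz, within one bin's resolution
```

Averaging power over time and locating the strongest frequency bin recovers the tone, the same kind of reading phoneticians perform visually when tracing formants in a speech spectrogram.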
Much of the most important phonetic work of this sort has
been done in the field by phoneticians working with native
speakers of languages (Ladefoged 2003), often languages that
are poorly documented or possibly endangered. The most
authoritative description of the consonants and vowels of the
worlds languages can be found in The Sounds of the Worlds
Languages (Ladefoged and Maddieson 1996). In addition to
describing the range of possible variation in the units used to
build human speech, phoneticians also address the question of
universal properties of human speech systems. Finally, descriptive phonetics can also address variation within a language, such
as geographical dialect variation. This is one type of sociophonetics (other types include investigations of gender, age, or
class, for example). For English, an impressive example of this
type of phonetic investigation can be found in the Atlas of North
American English (Labov, Ash, and Boberg 2006).
Whereas descriptive phonetics was traditionally done by ear,
a wide variety of current instrumental techniques is brought
to bear as well. Instrumental phonetics might utilize acoustic
analysis such as digitized waveforms and spectrograms (see
Figure 2); pitch and formant tracking; articulatory analysis, such
as provided by laryngoscopy, palatography, magnetometry,
ultrasound, and MRI; and perceptual information such as that
provided by discrimination and categorization experiments and
even eye-tracking and neuroimaging.
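The spectrographic analysis mentioned above rests on short-time Fourier analysis: the signal is cut into overlapping windowed frames and each frame's magnitude spectrum becomes one column of the display. A minimal numpy sketch (window length, hop size, and the test tone are arbitrary choices, not phonetic standards):

```python
import numpy as np

def spectrogram(signal, fs, frame_len=256, hop=128):
    """Magnitude spectrogram: time on one axis, frequency on the
    other, amplitude as the cell value (shown as grayscale darkness
    in a plot such as Figure 2)."""
    window = np.hanning(frame_len)
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = signal[start:start + frame_len] * window
        frames.append(np.abs(np.fft.rfft(frame)))
    freqs = np.fft.rfftfreq(frame_len, d=1.0 / fs)  # y-axis (Hz)
    times = np.arange(len(frames)) * hop / fs       # x-axis (s)
    return times, freqs, np.array(frames).T

# A pure 1000 Hz tone should show energy concentrated near 1000 Hz.
fs = 8000
t = np.arange(fs) / fs
times, freqs, S = spectrogram(np.sin(2 * np.pi * 1000 * t), fs)
peak_hz = freqs[S[:, 0].argmax()]
```

In a real display, formants appear as dark horizontal bands because the amplitude values are largest near the vocal-tract resonances.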
Work on the other general type of phonetic research, experimental phonetics, also utilizes a wide variety of instrumental approaches, but in this case, the data characterize human
behavior in the processes of producing and perceiving speech,
or reflect quantitative rather than purely qualitative properties of speech. Experimental phonetics often investigates
how linguistic variables, such as segmental context, syllable structure, or prosody, influence the detailed properties
of speech, such as its timing, articulation, spectral characteristics, or intonation. Alternatively, it might examine how

nonlinguistic variables, such as age, gender, speaking rate and


style, affect, or language background, influence these detailed
speech properties.
In experimental phonetics, the development of speech synthesis played a critical role in researchers' ability to design and
execute speech perception experiments by allowing for stimuli
with well-controlled acoustic properties. This ushered in a new
era of experimental speech perception research that examines
how humans utilize all of the myriad informational cues present in the acoustic signal. Another particular body of experimental work called laboratory phonology seeks to inform questions
of linguistic representation and processes in phonology via
experimental phonetic data. This work generally takes a cognitive science perspective and has been archived in the multivolume Papers in Laboratory Phonology collection (arising from a
regular Conference on Laboratory Phonology, which has met
every other year since 1987). Browman and Goldstein (1991) and
Beckman and Edwards (1994) provide classic examples of this
type of phonetics.

Other Areas of Phonetic Inquiry


Other important areas of inquiry in the field of phonetics include
investigation of the biomechanics or functional behavior and
coordination of the moving vocal tract (Saltzman and Munhall
1989; Guenther 1995), the role of audition and auditory processing
in speech communication (e.g., psychophysics of speech),
and the vocal tract as a sound-producing device, often characterized in terms of source-filter theory (Stevens 1998; Fant 1960).
Source-filter theory has provided a sophisticated mathematical
understanding of how noise sources at the larynx and along the
vocal tract are shaped by the geometry of the vocal tract and its
particular resonance properties to yield the output speech. The
nonlinear properties of the articulatory-acoustic mapping have
been argued to be important in understanding constraints on
the sound inventories of languages (Stevens 1989). Other phoneticians focus on listener-oriented motivations, such as maximizing auditory distinctions in shaping sound systems, rather
than speaker-generated influences. Clearly, speech systems are
adaptive to communicative and situational demands (Lindblom
1990). Speaker-listener interactions may give rise to change in
word forms over time, that is, diachronically (see phonology,
evolution of; synchrony and diachrony), and they may
give rise to synchronic adjustments specific to the interlocutors
and the situation.
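The source-filter account can be given a minimal numerical illustration: a glottal impulse-train source is passed through a single second-order resonator standing in for one vocal-tract resonance. All parameter values below (sampling rate, pitch, formant frequency, pole radius) are arbitrary illustrative choices:

```python
import numpy as np

fs = 8000                        # sampling rate (Hz), illustrative
f0, formant, r = 100, 500, 0.95  # pitch, one formant, pole radius

# Source: glottal impulse train at the fundamental frequency.
source = np.zeros(fs)
source[::fs // f0] = 1.0

# Filter: two-pole resonator modeling one vocal-tract resonance.
theta = 2 * np.pi * formant / fs
a1, a2 = 2 * r * np.cos(theta), -r * r
output = np.zeros_like(source)
for n in range(len(source)):
    output[n] = source[n]
    if n >= 1:
        output[n] += a1 * output[n - 1]
    if n >= 2:
        output[n] += a2 * output[n - 2]

# The output spectrum peaks near the resonance (formant) frequency:
# the flat harmonic spectrum of the source is shaped by the filter.
spectrum = np.abs(np.fft.rfft(output))
freqs = np.fft.rfftfreq(len(output), d=1.0 / fs)
peak = freqs[1:][spectrum[1:].argmax()]
```

A full synthesizer in this spirit cascades several such resonators (one per formant) and uses a more realistic glottal pulse shape, but the division of labor between source and filter is the same.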



Connections to Other Fields


Phonetics is an interdisciplinary area of linguistics; for example, we have sketched its connection to phonology. It also can
closely tie into other areas of experimental linguistics, particularly psycholinguistic research on spoken language production and processing and neurolinguistic research on brain
function. Knowledge of the phonetic properties of languages
and of the characteristics of the speech signal is critical to the
design of experimental linguistic and neurolinguistic research
programs that examine speech production and processing (see
phonetics and phonology, neurobiology of; brain
and language). Such speech-related work might address lexical access, speech production planning, neural localization of
various functions related to speaking and speech understanding
(see lexical processing, neurobiology of), the integration of visuofacial and auditory information, and the relation of
action and perception (see mirror systems, imitation, and
language).
Furthermore, there are many fields outside of linguistics
on which phonetics has a direct bearing. In the area of speech
technology, linguistic phonetic knowledge can contribute,
sometimes directly and sometimes indirectly, to machine speech
synthesis and recognition (see voice interaction design).
And conversely, much early work in acoustic phonetics grew
out of the efforts of speech engineers, for example, at Bell
Laboratories, Haskins Laboratories, Massachusetts Institute
of Technology, the Joint Speech Research Unit in England, the
Speech Transmission Laboratory in Sweden, and the Advanced
Telecommunications Research (ATR) Institute International
in Japan. Currently, in the field of speech engineering, there is
interest in capturing linguistic knowledge in ways that will allow
better system performance with conversational interfaces and
with audiovisual speech.
Phonetic science also has utility in forensics, and forensic
phonetics is a recognized area of applied science (see forensic
linguistics). Forensic experts bring both instrumental and
expert-listening techniques to the determination of whether a
suspect's voice is a likely or unlikely match to forensic evidence
that investigators have in hand. It should be noted, however, that
there is no unique identifier in the voice of an individual that is
analogous to a fingerprint. Phoneticians are frequently called on
in such speaker-identification cases to provide expert knowledge
and testimony as to the many subtle properties that may distinguish one individual's speech from that of another.
Another field outside of linguistics is often, in fact, an individual's first contact with phonetics: second language pedagogy
(see bilingual education). An accurate understanding of
how a language's speech sounds are articulated proves helpful
in pronunciation instruction. Instrumental techniques for displaying feedback on articulation, speech acoustics, or linguistic
categorization can also help in training production and perception of non-native linguistic contrasts.
The paramount area of the influence of theoretical phonology and phonetics on pedagogy is in the teaching of reading
(see teaching reading). Linguists from diverse backgrounds
and groups have taken a leadership position in emphasizing
the importance of phonemic awareness (see phonological awareness) for the acquisition of reading skills and for
understanding dyslexia. It is critical for educational success
that reading teachers are made aware of the importance of
characteristic differences between speech and reading, of how
speech knowledge can be leveraged in the teaching of reading,
and of how interference from the phonetic properties of native
languages can influence the acquisition of reading in non-native
languages (Rayner et al. 2002).
A synergistic relationship exists between phonetics and the
field of biomedical imaging. Advances in imaging of the vocal
tract and larynx have greatly illuminated our understanding of
speech production. In turn, new techniques for upper airway
and laryngeal imaging and image analysis have been developed
by phoneticians. These techniques can be incorporated into the
field of clinical phonetics and speech pathology. Traditional
types of descriptive and instrumental phonetics have also
found utility in the understanding of clinical challenges such
as apraxia, stuttering, phonological disorders, and voice disorders. Indeed, the National Institute on Deafness and Other
Communication Disorders is one of the largest funding sources
for phonetic research. Currently, there is enormous interest in
making cochlear implants as successful as possible for their user
populations. Knowledge of the acoustic properties of speech and
of methods for assessing perception adds to the broad body of
technological, engineering, and audiological knowledge currently contributing to this effort.
Phonetics is one of the foundational areas of linguistic research
and language science. It focuses on the descriptive, quantitative,
and behavioral aspects of speech production, transmission, and
perception. Phonetic knowledge helps guide our understanding
of the phonological representations and patterning observed in
human language. Phonetics also makes interdisciplinary contact with speech technology, biomedical imaging, forensics, and
pedagogical and clinical fields.
Dani Byrd
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Beckman, Mary E., and Jan Edwards. 1994. Articulatory evidence for differentiating stress categories. In Phonological Structure and Phonetic
Form: Papers in Laboratory Phonology. Vol. 3. Ed. Patricia A. Keating,
7–33. Cambridge: Cambridge University Press.
Browman, C. P., and L. Goldstein. 1991. Tiers in articulatory phonology,
with some implications for casual speech. In Papers in Laboratory
Phonology. Vol. 1: Between the Grammar and the Physics of Speech.
Ed. J. Kingston and M. E. Beckman, 341–76. Cambridge: Cambridge
University Press.
———. 1995. Dynamics and articulatory phonology. In Mind as
Motion: Explorations in the Dynamics of Cognition, ed. Robert F. Port
and Timothy Van Gelder, 175–93. Cambridge, MA: MIT Press.
Fant, Gunnar. 1960. Acoustic Theory of Speech Production. The
Hague: Mouton.
Guenther, F. H. 1995. Speech sound acquisition, coarticulation, and rate
effects in a neural network model of speech production. Psychological
Review 102: 594–621.
Hardcastle, William J., and John Laver, eds. 1997. The Handbook of
Phonetic Sciences. Oxford: Blackwell.
Haskins Laboratories. A speech and reading laboratory in New Haven,
CT, that maintains a Web site at http://www.haskins.yale.edu.
Hockett, Charles Francis. 1955. A Manual of Phonology.
Baltimore: Waverly.



IPA (International Phonetic Association). 1999. Handbook of the
International Phonetic Association: A Guide to the Use of the
International Phonetic Alphabet. Cambridge: Cambridge University
Press. The association maintains a Web site at http://www.arts.gla.
ac.uk/IPA/ipa.html.
Johnson, Keith. 2003. Acoustic and Auditory Phonetics. 2d ed.
Oxford: Blackwell.
Labov, W., Ash, S., and C. Boberg. 2006. Atlas of North American
English: Phonetics, Phonology and Sound Change. Berlin: Walter de
Gruyter. Available online at: http://www.langsci.ucl.ac.uk/ipa/.
Ladefoged, P. 2003. Phonetic Data Analysis: An Introduction to
Instrumental Phonetic Fieldwork. Oxford: Blackwell.
———. 2006. A Course in Phonetics. 5th ed. Boston: Thomson
Wadsworth.
Ladefoged, Peter, and Ian Maddieson. 1996. The Sounds of the World's
Languages. Oxford: Blackwell.
Lindblom, B. 1990. Explaining phonetic variation: A sketch of the H&H
theory. In Speech Production and Speech Modeling, ed. W. Hardcastle
and A. Marchal, 403–39. Dordrecht, the Netherlands: Kluwer.
Miller, J., ed. 1991. Papers in Speech Communication. A three-volume series published by the Acoustical Society of America (New York) through
the American Institute of Physics.
Pisoni, D. , and R. Remez, eds. 2005. Handbook of Speech Perception.
Malden, MA: Blackwell.
Rayner, K., B. R. Foorman, C. A. Perfetti, D. Pesetsky, and M. Seidenberg.
2002. How should reading be taught? Scientific American
286: 84–91.
Saltzman, E. L., and K. G. Munhall. 1989. A dynamical approach to
gestural patterning in speech production. Ecological Psychology
1: 333–82.
Stevens, K. 1989. On the quantal nature of speech. Journal of Phonetics
17: 3–45.
Stevens, K. 1998. Acoustic Phonetics. Cambridge, MA: MIT Press.

PHONETICS AND PHONOLOGY, NEUROBIOLOGY OF


The study of the neurobiology of phonetics and phonology focuses on the brain mechanisms that support perception and production of linguistic phonological forms. This entry
describes the neural structures and processes underlying phonetic and phonological processing and briefly discusses four
current theoretical controversies that neurophysiological data
can help address. First, are there invariant relationships between
acoustic properties and phonological categories? Second,
does speech have some special status apart from other acoustic
information? Third, is there a critical period for language-specific learning? Fourth, to what degree does biology constrain
the nature of phonological systems?

Neurobiological Underpinnings
The physiology (or function) of phonological processing is
described in terms of the structures (anatomy) activated in
processing and the function of these structures. A comprehensive understanding of the physiology requires explication
at the micro- and macrolevels of processing. The microlevel
describes the microstructures and their processing (neuron,
axon, synaptic potential), which are general to brain function,
whereas the macrolevel focuses on larger-scale structures and
processes specific to a particular motor, sensory, or cognitive
process (e.g., phonetic processing). Several points concerning

the microlevel are necessary for understanding how neurobiological methods are used to examine phonetics and phonology
(see Kandel, Schwartz, and Jessell 2000; Shafer and Garrido-Nag 2007, for greater detail).
First, brain function is carried out in terms of electrochemical messages between neurons. neuroimaging methods index different aspects of these processes and the metabolic processes
that support these. Electrophysiological methods (electroencephalogram [EEG], magnetoencephalogram [MEG]) record
changes in electrical potential at the scalp. These changes are the
result of the synchronous firing of large assemblies of neurons.
Functional magnetic resonance imaging (fMRI) and positron
emission tomography (PET) measure changes in the metabolism of oxygen, and PET can also measure changes in the chemical aspect of the electrochemical signals sent between neurons.
These changes in electrochemical and metabolic measures are
used to make inferences about timing and localization of neural
activity related to some stimulus or event.
A second point is that different brain regions have distinctive
structure in terms of neurons and connectivity and that these
distinctions are the basis of Korbinian Brodmann's classification
system. For example, primary auditory cortex (Brodmann's area
[BA] 41) has a thick layer of neurons specialized to receive information from the peripheral auditory system. These neurons then
send signals to other cortical regions but not directly back to the
periphery for motor responses. Ultimately, phonological functioning will need to be described in terms of connectivity at this
neural level for a complete understanding of the brain-behavior
relationship.
At the macrolevel, neurobiology of phonetics/phonology is
described in terms of the activated brain regions and the timing
of activation of these regions in perception or production (see
speech perception and speech production). These brain
regions are referred to by Brodmann's areas, by names describing
function (e.g., primary auditory cortex), by the scientist involved
in identifying the regions (e.g., broca's area), or by some term
describing an attribute of the regions (e.g., Greek hippocampus
for a region that is shaped like a seahorse).
The principal brain structures involved in phonetic/phonological perception are found in the perisylvian cortex and
include primary (BA 41) and secondary (BA 42) auditory cortex
for processing the acoustic-phonetic aspects of speech (Scott and
Wise 2004) (see Color Plate 10). Sound in general (e.g., noise)
activates bilateral regions of the dorsal plane of the superior
temporal gyrus (STG) and regions of the lateral STG. In contrast
with noise, temporally complex signals, including speech, more
strongly activate the dorsal region of STG, and the lateral STG
activation extends more ventrally. Auditory information identified as speech compared to non-speech leads to increased activation of regions of the STG and superior temporal sulcus (STS)
that are more anterior and ventral (inferior). The left STS appears
to be active in mapping speech onto lexical-semantic representation. In contrast, the right STS shows sensitivity to melodic
features. The left planum temporale (PT, in superior posterior
temporal cortex) is believed to have a special role in phonetic/
phonological processing and appears to support a motor/sensory interface for acoustic information. A left-greater-than-right
asymmetry is generally stronger for speech than non-speech (see
left hemisphere and right hemisphere). Anterior regions
are also activated in speech perception. The left prefrontal cortex (BA 46) is activated in phonological processing in accessing, sequencing, and monitoring phonemes and processing
transitions from consonants to vowels or vowels to consonants.
Articulation of phonetic information is supported by motor (BA
4) and premotor/Brocas area cortex (BA 6, BA 44/BA 45).
Recent models have organized these observations into a simple framework in which the more dorsal regions (i.e., posterior
and superior) are active in auditory-motor integration during
speech perception and the more ventral regions (anterior and
inferior) are more involved in the speech-meaning interface.
This indicates that the phonetic aspects of processing, which
are independent of meaning, will be carried out in more dorsal
regions of the auditory and motor cortex, whereas the phonological aspects, which are the basis of meaningful distinctions, are
processed in more ventral areas of the auditory cortex. The exact
roles of STG, STS, and the two hemispheres in phonetic and phonological processing have not been definitively established yet,
but it is known that these areas are all important in speech processing (Poeppel and Hickok 2004).
Anterior and posterior brain regions involved in phonetic
and phonological processing communicate directly via bundles
of fibers (axons), such as the arcuate fasciculus, but also via more
indirect routes, including the basal ganglia, thalamus, and
cerebellum. These additional structures are involved in general functions related to information processing, motor planning, and coordination and will not be discussed further here.
The timing of activation of levels of phonetic and phonological
processing has largely been provided by EEG and MEG measures. The timing of auditory processes can be roughly related
to levels of processing in the primary and secondary auditory
cortex and to the timing of more basic (e.g., signal detection)
versus higher-level cognitive processes (phonological discrimination). The principal method used to investigate these processes
is event-related potentials (ERPs). The EEG/MEG is time-locked
to a stimulus of interest (e.g., ba), and this stimulus is delivered multiple times (anywhere from 20 to 10,000, depending on
the ERP component of interest). The portion of the EEG/MEG
time-locked to the stimulus is averaged to remove noise (i.e.,
activity produced by unrelated processes).
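The time-locked averaging procedure can be sketched numerically: a fixed response buried in trial-by-trial noise emerges as epochs are averaged, since noise that is not locked to the stimulus cancels. The waveform shape, noise level, and trial count below are simulated values chosen only for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
n_trials, n_samples = 200, 300        # simulated trials and time points

# A fixed "ERP" waveform: a negativity peaking mid-epoch.
t = np.arange(n_samples)
erp = -np.exp(-((t - 150) ** 2) / (2 * 20.0 ** 2))

# Each trial buries that waveform in noise several times its size.
trials = erp + rng.normal(0, 3.0, size=(n_trials, n_samples))

# Averaging time-locked epochs cancels activity not locked to the
# stimulus; the residual noise shrinks roughly as 1/sqrt(n_trials).
average = trials.mean(axis=0)

single_trial_err = np.abs(trials[0] - erp).mean()
averaged_err = np.abs(average - erp).mean()
```

With 200 trials the averaged waveform tracks the underlying response far more closely than any single trial, which is why components such as P1, N1, and the MMN are only visible after averaging.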
ERPs are described in terms of the latency, polarity, and
topography of peaks that vary with some stimulus property or
cognitive process. These identified peaks are often referred to as
components. The P and N in a component label refer to positive
and negative polarity, respectively, and the number indicates
the approximate peak latency (e.g., N400) or the position in a
sequence (e.g., N2).
Studies of auditory processing have shown that auditory
information enters primary cortical regions between 10 and 50
ms following contact with the outer ear and that a frontocentral
positivity peaking around 50 ms (P1 component) and negativity peaking around 100 ms (N1 component) index activity in the
primary and secondary auditory cortex. Neurobiological studies
with animals suggest that P1 indexes input from the periphery
into the superior temporal plane of the auditory cortex and that
N1 reflects activity of neurons in the secondary auditory cortex
receiving information from other cortical regions. P1 and N1
appear to index acoustic levels of processing. To date, there is
no clear evidence that language experience at the phonological
level directly affects processing in the time range of the P1 and
N1 components.
ERP components occurring later in time are related to higher-level cognitive processes. Those showing modulation by phonological experience include mismatch negativity (MMN), N2b,
P3b, and N400 (Näätänen 2001; Kujala et al. 2004). Listeners
show more robust MMNs (peaking between 100 and 300 ms and
indicating preattentive, automatic processing) in discriminating
pairs of sounds with which they have had experience (Näätänen
2001). Specifically, the MMN is smaller or later to a contrast in
speech sounds if the speech sounds are assimilated into one phonological category for listeners (e.g., Japanese listeners' perception of English [l] vs. [r]) or if the speech sounds are assimilated
into two categories, but one or both sounds are poor exemplars
of these categories (e.g., English listeners' perception of Hindi
retroflex [ɖa] versus [ba]; Shafer, Schwartz, and Kurtzberg 2004).
The later components, N2b, P3b, and N400, are observed when
a participant is asked to actively discriminate a speech contrast.
No discernible N2b, P3b, or N400 is observed if discrimination
is very difficult (chance performance). If discrimination is better than chance but more difficult than for native listeners, then
these components are later and larger than those found for the
native group. For example, English speakers showed reasonably
good discrimination of Japanese (JP) vowel duration (taado
versus tado), but a later and larger P3b component compared
to native Japanese listeners (Hisagi 2007).
Integrating the knowledge of location obtained from fMRI/
PET and timing obtained from EEG/MEG indicates that acoustic-phonetic processing occurs in primary and secondary auditory
cortical regions between 10 and 100 ms, followed by phonological aspects of processing, presumably in more ventral regions,
between 100 and 400 ms. This model is supported by studies
localizing the sources of N1 and the phonologically elicited N400
(Kujala et al. 2004).

Lack of Invariance Problem


A major theoretical debate in speech perception over the past 40
years has been the relationship between acoustic and phonological properties. Speech with similar acoustic properties may be
assigned to different phonological categories, and, conversely,
speech with different acoustic properties is sometimes assigned
to the same phonological category. Much research focused on
discovering invariant properties of speech sound categories that
would allow for precise categorization has failed to do so.
A recent model can be used to illustrate how neurophysiological data can address the lack of invariance issue. In this
model, speech is categorized and identified by an active process of hypothesis testing (e.g., Magnuson and Nusbaum 2007).
Different types of information are used with regard to the type
and amount of sensory and lexical information available. For
example, clear auditory-speech information and knowledge
of the possible phoneme categories of a language lead to reliance on auditory information in categorization. More ambiguous auditory-speech information can lead to greater reliance on
visual information (e.g., lip closure for [p] but not [t]). In other
words, there are many routes to phonological categorization.



If this model is viable, then neurophysiological data will show
whether different sensory and motor cortices are activated when
speech is more versus less clear and when other information
(e.g., visual) is available. Several recent studies have shown more
involvement of the motor cortex and visual sensory areas for
ambiguous acoustic speech information when facial information is available, and less activation of these regions when only
the speech signal is available (e.g., Skipper, Nusbaum, and Small
2006).
In summary, this example illustrates the importance of
neurophysiological data for addressing long-standing theoretical controversies.

Does Speech Have Some Special Status Apart from Other Acoustic Information?
Over the past forty years, there has been a debate regarding
whether speech requires a special type of auditory processing
specific to humans. Behavioral studies have delivered mixed
answers to this question. For example, studies have shown that
speech (in particular, consonants) is perceived categorically,
rather than continuously, and used this to argue for special status. On the other hand, other species (e.g., chinchillas) are shown
to perceive speech categorically, and complex non-speech auditory sounds can be categorically perceived.
Neurophysiological data can help examine this question by
determining whether the same structures and processes support processing of speech and non-speech. The current available
data suggest that in one sense, speech and non-speech are similar. The same auditory cortical regions are activated in processing speech and non-speech, as described previously (also see
Dehaene-Lambertz and Gliga 2004). Furthermore, the sensory-motor links found for speech are similar to those seen for other
sensory-motor links (e.g., tool manipulation using visual and
motor regions) and seen in other species (see Skipper, Nusbaum,
and Small 2006).
In another sense, the neurophysiological data suggest
that the processing of speech differs from that of non-speech.
Specifically, as described previously, more ventral areas (lateral
and anterior superior temporal gyrus) become involved in phonological processing of speech sounds because these sounds
are relevant for making meaning contrasts. It is possible that
humans are the only species that fractionate sound symbols into
subcomponents (phonemes) that can be manipulated to create
novel symbols, and, in this way, speech is special.

Is There a Critical Period for Language-Specific Learning?


Researchers have long noted that learning a second language
late in life typically results in a stronger non-native accent and
poorer speech perception in the second language (see Strange
and Shafer 2008; see also second language acquisition). One
explanation for this pattern is that there is a critical or sensitive period in which phonological information must be learned
in order to lead to native-like performance (see phonology,
acquisition of). Some research suggests a gradual loss of ability to alter phonological categories up to puberty.
The reason for this change in ability is unknown. It could be
that the auditory cortex is altered at an early level so that it loses
the ability (or resolution) to respond to non-native contrasts.

Alternatively, listeners may have difficulty refocusing their attention to the relevant cues needed for rapid processing of the second language (Strange and Shafer 2008).
Neurophysiological data can address this question by
examining where in the nervous system differences in processing are found for first and second language learners. The current research has not shown differences earlier than the MMN
response. Furthermore, a recent study from our laboratory suggests that attention plays a role in loss of ability to learn novel
categories. Specifically, listeners learn to automatically attend to
relevant cues in their first language and can only overcome these
weightings with great attentional effort. This result suggests that
the loss of sensitivity in adjusting to novel phonological categories by second language learners is not directly due to a closure
of a critical period for changing the sensitivity or resolution of the
primary and secondary auditory cortex; rather, it is due, at least in part, to
attentional issues (Hisagi 2007).
These findings do not answer all the questions regarding
critical and sensitive periods for setting up phonetic and phonological categories since second language learners acquired
categories for a first language early in life. Recent research examining the neurophysiological and behavioral consequences of
deprivation of hearing, which is reversed by cochlear implants,
will have much to contribute for addressing this question. Recent
advances have led to implantation at earlier ages, which is allowing researchers to compare the quality of phonological processing across different ages of first exposure to speech information.
Improvements in these implanted devices will also allow examination of how the quality of auditory-speech input impacts phonetic and phonological systems. This emerging area of research
is likely to provide less ambiguous evidence regarding a critical
or sensitive period for speech.

To What Degree Does Biology Constrain the Nature of Phonological Systems?
A classic debate in linguistics concerns the extent to which language is innate. A more useful way to ask this question is what
biological constraints are placed on the nature of phonological
systems and how environmental input contributes to constructing these systems. Across languages, there are common patterns.
For example, all languages contrast /i/ (heap), /u/ (hoop), and
/a/ (hop) (although there can be slight variations in the actual
production of these sounds), and some languages only contrast
these three vowels. However, there is no existing language that
only contrasts /i/ (in bead), /ɪ/ (in bid), and /ɛ/ (in bed) without
also contrasting /i/, /u/, and /a/. It is possible that these universal patterns are due to biological constraints. On the other hand,
they may be attributed to environmental factors. Examination of
the evidence suggests that the system is constrained by an interaction of biological and environmental constraints. For example,
/i/, /u/, and /a/ are perceptually more distinct than /i/ (in bead),
/ɪ/ (in bid), and /ɛ/ (in bed), and this is a property of the auditory
system; the environment (input) leads to less salient distinctions
included in some languages, but many possible distinctions are
never found in languages.
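The greater perceptual separation of the corner vowels can be illustrated in the formant plane: the closest pair among /i/, /u/, and /a/ is still farther apart than the closest pair among the front vowels of bead, bid, and bed. The (F1, F2) values below are ballpark figures chosen for illustration, not measured reference data:

```python
import math

# Approximate (F1, F2) values in Hz -- illustrative, not reference
# data. "I" and "E" stand in for the lax vowels of bid and bed.
corner = {"i": (300, 2300), "u": (300, 800), "a": (750, 1200)}
front = {"i": (300, 2300), "I": (400, 2000), "E": (550, 1800)}

def min_pairwise(vowels):
    """Smallest Euclidean distance between any two vowels (Hz)."""
    pts = list(vowels.values())
    return min(math.dist(p, q)
               for i, p in enumerate(pts) for q in pts[i + 1:])

corner_gap = min_pairwise(corner)  # closest pair among /i u a/
front_gap = min_pairwise(front)    # closest pair among bead/bid/bed
```

Even in this crude linear-Hz space the corner-vowel gap is more than twice the front-vowel gap; an auditory (e.g., bark-scaled) distance would sharpen the contrast further.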
Neurobiological data will aid in further elucidating how development of phonological systems is constrained by instructions
from the genetic code and emerges from patterns in the input. In
particular, examination of the way that genetic variation affects
the development of speech processing and its neurophysiological substrate will help us understand the contributions of biology
and the environment. For example, studies of congenitally deaf
populations have revealed that some brain regions that are typically specialized for audition (e.g., regions of secondary auditory
cortex) are used in visual processing and thus are highly sensitive to input. In contrast, more primary regions specialized for
audition (primary auditory cortex and subcortical areas) do not
reorganize to take on nonauditory functions and are thus less
sensitive to input. An understanding of the relationship among
the genetic code, neural connectivity, and plasticity of auditory
and language-association brain regions will help to create realistic models of phonetic/phonological development and processing, which in turn will help to answer how biology and the
environment contribute to the development of this system.

Conclusion
This entry illustrated the importance of neurobiological data in
addressing significant questions concerning phonetic and phonological processing. In particular, an understanding of the neurobiology supporting phonetic and phonological processing will
allow researchers to construct better models of processing and to
address questions related to first and second language learning
and disorders (such as dyslexia and aphasia) attributable to
deficits in phonological processing.
Valerie Shafer
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Dehaene-Lambertz, G., and T. Gliga. 2004. Common neural basis for
phoneme processing in infants and adults. Journal of Cognitive
Neuroscience 16: 1375–87.
Hisagi, M. 2007. Perception of Japanese temporally-cued phonetic contrasts by Japanese and American English listeners: Behavioral and electrophysiological measures. Ph.D. diss., City University of New York.
Kandel, E., J. Schwartz, and T. Jessell. 2000. Principles of Neural Science.
New York: McGraw-Hill.
Kujala, A., K. Alho, E. Service, R. J. Ilmoniemi, and J. F. Connolly. 2004.
Activation in the anterior left auditory cortex associated with phonological analysis of speech input: Localization of the phonological
mismatch negativity response with MEG. Cognitive Brain Research
21: 106–13.
Magnuson, J. S., and H. C. Nusbaum. 2007. Acoustic differences, listener expectations, and the perceptual accommodation of talker variability. Journal of Experimental Psychology: Human Perception and
Performance 33: 391–409.
Näätänen, Risto. 2001. The perception of speech sounds by the human
brain as reflected by the mismatch negativity (MMN) and its magnetic
equivalent (MMNm). Psychophysiology 38: 1–21.
Poeppel, David, and Gregory Hickok. 2004. Towards a new functional
anatomy of language. Cognition 92: 1–12.
Scott, S., and R. Wise. 2004. The functional neuroanatomy of prelexical
processing in speech perception. Cognition 92: 13–45.
Shafer, V. L., and K. Garrido-Nag. 2007. The neurodevelopmental bases
of language. In The Handbook of Language Development, ed. M. Shatz
and E. Hoff, 21–45. Oxford: Blackwell.
Shafer, V. L., R. G. Schwartz, and D. Kurtzberg. 2004. Language-specific
memory traces of consonants in the brain. Cognitive Brain Research
18: 242–54.

Skipper, Jeremy I., Howard C. Nusbaum, and Steven L. Small. 2006.
Lending a helping hand to hearing: Another motor theory of speech
perception. In Action to Language via the Mirror Neuron System, ed.
Michael A. Arbib, 250–84. Cambridge: Cambridge University Press.
Strange, W., and V. L. Shafer. 2008. Speech perception in second language
learners: The re-education of selective perception. In Phonology and
Second Language Acquisition, ed. M. Zampini and J. Hansen, 153–92.
Cambridge: Cambridge University Press.

PHONOLOGICAL AWARENESS
Phonological awareness encompasses the broad class of abilities that enable one to attend to, isolate, identify, and manipulate
the speech sounds in spoken words. The domain of phonological awareness abilities can be subdivided into two levels. The
first, phonological sensitivity, pertains to conscious awareness of larger, more salient sound structures within words,
including rhymes and syllable structures (i.e., syllables and
subsyllabic units) (Scarborough and Brady 2002). (Rhymes,
defined at the word level, consist of the stressed vowel and
what follows [e.g., be/we; feather/weather]; subsyllabic units
include onsets, i.e., the portion of each syllable preceding the
vowel [e.g., be; spot; magnet], and rimes, i.e., the remaining portion [e.g., be; spot; magnet]). The second level of phonological
awareness, phoneme awareness, refers to explicit awareness of
the individual phonemes making up words. Generally, children
acquire at least some degree of phonological sensitivity prior
to phoneme awareness (see phonology, acquisition of ).
However, questions remain as to whether attainment of phonological sensitivity is a necessary prerequisite for the development of phoneme awareness (Gillon 2005). When children
begin to acquire phoneme awareness, they usually first are able
to isolate and identify the external phonemes (i.e., the beginning and/or final phonemes in words). Ultimately, proficiency
in phoneme awareness entails the ability to segment, identify,
and blend all of the individual phonemes, including those
within consonant clusters (e.g., in words such as blast).
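The onset/rime division described above can be made concrete. The following sketch (my own rough orthographic approximation, not part of this entry) splits a written syllable at its first vowel letter:

```python
# Split a written syllable into onset and rime at the first vowel letter.
# This is an orthographic stand-in for the phonological units defined
# above; the function name and vowel set are illustrative assumptions.

VOWEL_LETTERS = set("aeiou")

def onset_rime(syllable):
    for i, ch in enumerate(syllable):
        if ch in VOWEL_LETTERS:
            return syllable[:i], syllable[i:]
    return syllable, ""  # no vowel letter found: treat the whole string as onset

print(onset_rime("be"))     # ('b', 'e')
print(onset_rime("spot"))   # ('sp', 'ot')
print(onset_rime("blast"))  # ('bl', 'ast')
```

A real phonological analysis would operate on phoneme strings rather than spelling, but the division point, before the vowel, is the same.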
The significance of phoneme awareness stems from its role
in reading acquisition (see writing and reading, acquisition of). Understanding that spoken words are made up of
individual speech sounds provides a conceptual foundation for
understanding the alphabetic principle (i.e., that letters correspond with phonemes). This awareness, in turn, facilitates
learning to read and spell. The relationship between phoneme
awareness and literacy development is reciprocal: With
some emergent awareness of phonemes, the student can start
to acquire letter–sound knowledge. In turn, awareness of phonemes is heightened by experience with print.
Since the concept of phoneme awareness was established in
the 1970s (e.g., Liberman 1971), evidence for the significance of
phoneme awareness for reading achievement has accrued from
correlational, prediction, and training studies. At all ages, including adulthood, less-skilled readers demonstrate weaker performance on phoneme awareness measures than better-reading
peers, whether the same age or younger reading-age controls.
Prediction studies with kindergarten students document that
phoneme awareness performance is one of the strongest predictors of their subsequent reading achievement, particularly
for decoding and word recognition skills, but also for reading

Phonology
comprehension (see teaching reading). Most compelling,
intervention studies confirm a causal link between instruction in
phoneme awareness and increased success at learning to read,
with greater benefits when discovery of phonemes is linked with
letter knowledge (Ehri et al. 2001).
Susan A. Brady
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Ehri, L. C., S. R. Nunes, D. M. Willows, B. Schuster, Z. Yaghoub-Zadeh,
and T. Shanahan. 2001. Phonemic awareness instruction helps
children learn to read: Evidence from the National Reading Panel's
meta-analysis. Reading Research Quarterly 36.3: 250–87.
Gillon, G. 2005. Phonological Awareness: From Research to Practice.
New York: Guilford.
Liberman, I. Y. 1971. Basic research in speech and lateralization of
language: Some implications for reading disability. Bulletin of the
Orton Society 21: 71–87.
Scarborough, H. S., and S. A. Brady. 2002. Toward a common terminology for talking about speech and reading: A glossary of the "phon" words
and some related terms. Journal of Literacy Research 34: 299–334.

PHONOLOGY
As opposed to phonetics, which deals with the properties
of sounds from a language-independent point of view, phonology constitutes the study of the sound structure of units
(morphemes, words, phrases, utterances) within individual
languages. Its goal is to elucidate the system of distinctions in
sound that differentiate such units within a particular language,
and the range of realizations of a given unit's sound structure as
a function of the shape of other units in its context. These two
goals – the study of invariants of sound structure and of the variation shown by these elements in combination – are obviously
closely related, but attention has tended to shift between them
over time.
Late nineteenth- and early twentieth-century study of sound
structure focused on the details of sound production. As these
studies (in both articulatory phonetics and acoustic
phonetics) became more sophisticated, however, it was
increasingly apparent that the resulting explosion of data about
sound properties was obscuring, rather than enhancing, scholars' understanding of the way sound is organized for linguistic
purposes. Much that is measurable in the speech signal is predictable, internal to the system of a given language, even though
exactly comparable properties may serve to distinguish items
from one another in a different language.
Vowels in English, for example, are relatively longer before
certain consonants than before others, but the difference in the
vowels of, for example, cod and cot is entirely predictable from
this principle alone. By contrast, an exactly parallel difference
between the vowels of kaade 'dip' and kade 'envious' in Finnish
serves as the sole difference between these words. A focus on
phonetic features alone fails to reveal the role played by sound
properties within a language.
The result of this insight was the development within various theories of structuralism of attempts (Anderson 1985)
to define the phoneme, a presumed minimal unit of contrast
within the sound system of a single language. While there is
considerable diversity among these views, it is fair to say that
by and large, they focused on the elucidation of the contrastive
properties of elements of surface phonetic form to the exclusion
of other aspects of sound structure.

The Development of Modern Phonology


Poststructuralist theories fall broadly within the tradition of generative phonology, associated in its origins with Noam Chomsky
and Morris Halle (1968). The distinguishing character of this
view was its attention not simply to surface contrasts but also to
patterns of alternation in shape, and its positing of an abstract
underlying representation (where contrasts among elements
are characterized) that is related to surface phonetic form by a
system of rewriting rules. Each of these rules represents a single
generalization about the realization of phonological elements
(e.g., "Vowels are long before voiced obstruents"). Much of the
theoretical discussion in the 1960s and early 1970s concerned
the role of an explicit formalism for these rules.
The rules were presumed to apply in a sequence, with each
applying to the result of all previous rules. As a consequence,
some of the generalizations represented by individual rules may
only be valid at an abstract level and not true of all surface forms,
to the extent that subsequent changes obscure the conditioning factors of a rule or its effects, leading to the opacity of the rule in
question. For example, in many varieties of American English,
the medial consonants of words like ladder and latter are both
pronounced as the same voiced flap [D]. The vowels of the initial syllables of such words continue to differ in length, however,
reflecting the abstract difference in voicing between /d/ and /t/,
even though that difference is obscured by the (subsequent)
application of a rule of flapping that renders the vowel-length
rule opaque. Much attention was paid in this period to the theories of rule ordering necessary for describing such phenomena.
In the years immediately following the publication of Chomsky
and Halle (1968), a number of scholars reacted strongly to the
perceived abstractness of the underlying phonological representations to which it appeared to lead. Various proposals that
intended to restrain this aspect of the theory appeared, some of
them based on the idea that if the rules themselves could be constrained so as to permit only highly natural ones, drawn from
some substantively constrained universal set, the underlying
representations would thereby be forced to be closer to surface
forms. Others proposed to constrain the relation between phonological and phonetic representation directly (again, often in
the name of naturalness).
In general, these attempts to limit the power of phonological
systems by fiat ran into apparent counterexamples that deprived
them of their appeal. Other developments in phonological theorizing shifted scholars' attention away from this issue while also
leading (as somewhat unintentional by-products) to a general
reduction in the degree of abstractness of representation. Some
of these elaborations and reorientations of the program of generative phonology are sketched here.
AUTOSEGMENTAL PHONOLOGY. The bulk of research during the
classical period of generative phonology was concerned with
segmental phenomena (although the main goal of Chomsky
and Halle 1968 was an account of English stress). In the early
1970s, attempts to describe the phonology of tonal systems led
to important changes in assumptions about representations and
a concurrent shift of attention on the part of phonologists.
The classical theory had assumed that phonological (and
phonetic) representations were given in the form of a simple
matrix, where each row represented a phonological distinctive
feature and the columns represented successive segments. Such
a representation is based on the assumption that there is a one-to-one relation between the specifications for any given feature
and those for all other features, since each column contains
exactly one specification for each feature.
Tonal phenomena, however, made it clear that features need
not be synchronized in this way: A given feature specification
might take as its scope either more or less than a single segment.
A classic example of this, offered by W. Leben, is found in Mende,
where each word bears one of a limited set of tonal patterns,
regardless of the number of syllables on which this pattern is
realized. Thus, the tone pattern high-low appears on a single
syllable in mbû (and thus the low has scope over only the last half
of the vowel), on two in ngílà, and on three in félàmà (where the
single low of the pattern takes scope over two vowels). This led
to the development of autosegmental representations, in which
feature specifications were linked by lines of association (subject
to specific constraints), rather than all being aligned into segments. The extension of this insight to other phenomena, and its
consolidation, essentially displaced the earlier concerns of rule
notation and ordering in phonologists attention.
METRICAL PHONOLOGY. A similar development took place in
the analysis of stress and the study of the syllable. The analysis
in Chomsky and Halle (1968) treated stress as simply one more
phonological feature, with a value assigned to some (but not all)
of the segments in the representation of a word. This account was
forced to attribute a number of basic properties to the feature
[Stress], however, that had no obvious correlates in the behavior
of other features.
It became possible to rationalize these properties by viewing
stress not as a segmental feature but as a relational property of
the organization of syllables into larger structures. This, in turn,
required the recognition of syllables as significant structural
units: a notion that was explicitly rejected in the earlier theory
in favor of an attempt to reformulate all apparent syllable-based
generalizations in terms of segmental structure alone. The organization of segments into syllables and these, in turn, into larger
units called feet, which themselves are organized into phonological words (and phrases, etc.), allows for the elimination of the
anomalous character of segmentalized stress. The study within
metrical phonology of these units, their internal organization,
and their relation to one another completed the enrichment of
the notion of phonological representation begun within autosegmental phonology.
FEATURE GEOMETRY. A standard theme of classical generative
phonology was that of natural classes of phonological segments,
groups of segments that function in some parallel fashion in phonological rules to the exclusion of others. It was originally hoped
that the analysis of segments into distinctive features would provide the solution to this issue: Segments sharing a feature (or set
of features) were thereby characterized as similar to one another,
and thus predicted to behave in the same way in rules.
It soon became apparent, however, that feature analysis
by itself does not exhaust this matter. When nasal consonants
assimilate in place of articulation to a following obstruent, for
instance, each individual place is specified by a distinct feature
(or set of features), and the overall unity of the process as one
applying exactly to all and only nasals, regardless of their place
of articulation, is not expressed. Nothing in the notation, that is,
makes it clear that a rule assimilating labiality, coronality, and
velarity is more coherent in some sense than one assimilating
labiality, voicing, and nasality.
The response to this problem was a program to treat the
features themselves as organized into a hierarchy, such that all
place-of-articulation features (for example), and no others, are
daughters of a unitary node [Place]. On that approach, place
assimilation could be viewed as a unitary association of the
[Place] node itself, rather than individually to each of its various possible values, while no such single unit corresponds to the
hypothetical alternative. Attention focused on such problems of
the internal geometry of the feature system generally led to the
assumption that the way to approach them was to assume that
the theory of rules should be limited to a very simple set of reassociations and deletions within the autosegmental structure of
an utterance, and that a single, universal feature hierarchy could
be specified on the basis of which all observed natural rules
(and no unnatural ones) could be formulated. Arguments
for and against specific proposals about such a hierarchy have
drawn considerable attention, though it is perhaps notable that
the theoretical assumptions underlying the program have been
much less discussed.
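The gain from a class node can be made concrete with nested feature structures. The representations below are a deliberately minimal sketch (the feature names and segment contents are my own assumptions): because all place features hang off a single [Place] node, place assimilation is one operation on one node, whatever place features it happens to contain.

```python
import copy

# Segments as feature structures in which all place features are grouped
# under a single class node, "Place" (contents are illustrative only).
n_coronal = {"nasal": True,  "Place": {"coronal": True}}
p_labial  = {"nasal": False, "Place": {"labial": True}}

def assimilate_place(target, trigger):
    # Place assimilation = relinking the target's entire Place node to the
    # trigger's: a single step, regardless of which features are inside it.
    out = copy.deepcopy(target)
    out["Place"] = copy.deepcopy(trigger["Place"])
    return out

assimilated = assimilate_place(n_coronal, p_labial)
print(assimilated)  # {'nasal': True, 'Place': {'labial': True}}
```

By contrast, the "incoherent" rule mentioned above (assimilating labiality, voicing, and nasality) would have to reach into three unrelated nodes, and the hierarchy provides no single unit for it to copy.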
LEXICAL PHONOLOGY. In classical generative phonology, the
interface between word structure and sound structure is quite
simple. morphological elements are combined into words
in the syntax, these elements are provided with phonological
(underlying) forms, and the resulting syntactically organized
labeled, bracketed structure serves as the input to the phonology.
At least some of the phonological rules were assumed to apply
according to the principle of the cycle, based on this structure,
in a uniform way. To the extent that morphological elements display different phonological properties in their combinations with
others, this was represented as differences within an inventory of
boundary elements separating them from adjacent items.
Originating from the apparent generalization that elements
with the same phonological behavior (hence, associated with the
same boundary type) tend to appear adjacent to one another,
the theory of lexical phonology proposed a substantial revision to
this architecture. Instead of constructing the entire representation once and for all and then submitting it to the phonology for
realization, this view proposed that the lexicon of morphological
elements is divided into multiple strata or levels. Basic roots can
combine with elements of the first stratum; after each such morphological addition, the resulting form is subject to adjustment
by the rules of a corresponding level of the phonology, and the
output is then eligible to serve as the input to further morphological elaboration. At some point, addition of elements from the
first stratum is replaced by use of the morphology and phonology
of the next, and, from then on, no further elements from the initial stratum can be added. This process continues (perhaps vacuously) through all of the strata of the lexicon, yielding a potential
surface word. All of the words in a given syntactic structure are
then subject to adjustment by another set of postlexical phonological processes.
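The stratal architecture amounts to a loop over levels, with phonological adjustment after each morphological addition and postlexical rules applying to the finished word. The sketch below uses invented toy affixes and a toy degemination rule (nothing here is real morphophonology of any language):

```python
# Schematic lexical phonology: each stratum pairs morphological additions
# with its own phonological rules; once a stratum is left behind, its
# affixes can no longer be added. Affixes and rules are placeholders.

def degeminate(form):
    # toy rule: collapse identical adjacent consonants
    out = form[0]
    for ch in form[1:]:
        if ch != out[-1] or ch in "aeiou":
            out += ch
    return out

STRATA = [
    (["al"], [degeminate]),  # stratum 1: its affixes and its phonology
    (["ly"], [degeminate]),  # stratum 2: likewise
]

def postlexical(form):
    # toy word-level rule: devoice a final d
    return form[:-1] + "t" if form.endswith("d") else form

def build(root):
    form = root
    for affixes, rules in STRATA:   # strata are traversed strictly in order
        for affix in affixes:       # morphological addition ...
            form += affix
            for rule in rules:      # ... then cyclic phonological adjustment
                form = rule(form)
    return postlexical(form)        # finally, postlexical phonology

print(build("grad"))  # 'gradaly' -- the l+l cluster created at stratum 2 is repaired
```

The key architectural point survives the toy scale: phonology applies to the output of each morphological step, so a phonologically derived property is available to later morphology.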
There are a number of further points that characterize this
view, including proposed differences in the properties of lexical
and postlexical rules and the relations between rules on one level
and those on the others. The central point for a broader theory
of grammar, however, is probably the replacement of a syntax-based (but purely phonological) notion of cyclic rule application
by a repeated cycle of morphological addition and phonological adjustment. This results, for example, in the possibility that
a phonologically derived property (on one cycle) can be relevant
to the conditioning of a morphological operation (on a following
cycle), a possibility that has been shown to be quite real.
OPTIMALITY THEORY. In the early 1990s, a much more radical
challenge to the classical model was presented by the development of optimality theory (OT), a view of phonology based
on a system of ranked, violable constraints on surface shape,
as opposed to a system of ordered rules deriving the phonetics
from an underlying phonological representation. These constraints govern (in the standard formulation) a one-step relation
between underlying and surface representations (cf. underlying structure and surface structure), with no intermediate stages of the sort produced in a rule-based description. The
constraints can be divided into general classes: a) markedness
constraints, which express universally preferred configurations,
and b) faithfulness constraints, requiring that contrasts present
in the phonological representation be preserved in the surface
form. In general, these are in conflict, and the ranking of the constraints governs the resolution of those conflicts, in conformity
with general principles of grammar.
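A ranked-constraint evaluation can be sketched in a few lines. The two constraints below (a no-coda markedness constraint and a crude length-based faithfulness constraint) and the candidate set are toy assumptions of mine, not a standard analysis; the point is only that ranking is a lexicographic comparison of violation profiles:

```python
# Toy OT evaluation: candidates are scored on each constraint, and the
# ranking compares violation profiles lexicographically, so one violation
# of a high-ranked constraint outweighs any number on lower-ranked ones.

def no_coda(cand, underlying):
    # markedness: penalize syllables ending in a consonant ("." = syllable break)
    return sum(1 for s in cand.split(".") if s and s[-1] not in "aeiou")

def faith(cand, underlying):
    # faithfulness: penalize segments deleted from or inserted into the input
    return abs(len(cand.replace(".", "")) - len(underlying))

def evaluate(underlying, candidates, ranking):
    # the optimal candidate minimizes its violation profile under the ranking
    return min(candidates, key=lambda c: tuple(con(c, underlying) for con in ranking))

candidates = ["pat", "pa", "pa.ta"]  # faithful, deletion, epenthesis

print(evaluate("pat", candidates, [faith, no_coda]))  # 'pat' -- coda tolerated
print(evaluate("pat", candidates, [no_coda, faith]))  # 'pa'  -- coda repaired
```

Reranking the same two constraints yields different surface patterns, which is how OT models cross-linguistic variation without changing the constraints themselves.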
Initially, OT seemed to offer its greatest promise in the analysis of stress, syllable structure, and related phenomena, but subsequent development has encompassed a full range of segmental
and other facts. Descriptions in constraint-based terms are at
least superficially very different from those couched in terms of
traditional rules, and theoretical discussion in phonology since
their introduction has been largely dominated by comparisons
of the two frameworks.

Current Approaches to Phonology


The central issues in phonology in the first decade of the twenty-first century concern the comparative merits of OT and
rule-based descriptions. On the one hand, constraint-based
formulations seem much better equipped to describe global
properties of phonological systems. It was noted in work from
the classical period of generative phonology that multiple distinct processes in an individual language may all have the effect
of ensuring (or avoiding) a single characteristic property of surface form, but no satisfactory account of the unity displayed by
these "conspiracies" was ever achieved. OT, in contrast, provides
a very direct description of such facts.
In some ways, the surface constraint approach goes beyond
anything available in principle to the rule-based theory. For

example, when languages accommodate loan words to the surface patterns of other words of the language, the adjustments
needed to achieve this may include changes that do not correspond to any rule of the phonology of native forms. Constraints
accomplish this directly and without further stipulation, whereas
a system of rules may have to be arbitrarily extended to account
for loanword adaptation.
On the other hand, some of the same issues that rule-based
phonology dealt with (and at least largely resolved) have resurfaced as serious challenges to the architecture of grammar generally assumed in constraint-based theories. Most important
among these is the problem of opaque generalizations. The
standard model of OT assumes that its constraints apply directly
to surface forms and govern a single-stage mapping between
these and underlying phonological representations, and so
has no place for generalizations that crucially apply to any sort
of intermediate level. Nonetheless, a number of compelling
examples of such phenomena have been demonstrated, and
some sort of accommodation of these facts must be provided by
an adequate phonological theory.
Some responses to this challenge have attempted to maintain
the standard OT model by introducing new sorts of constraints.
Mechanisms such as output-output constraints or sympathy theory, however, have not generally succeeded in dealing with all of
the relevant phenomena and have been shown to produce new
difficulties of their own.
One approach that seems promising is that of stratal OT,
an architecture that grafts a constraint-based account onto the
standard model of lexical phonology. The result is a framework
in which the phonological mapping at each stage is a one-step
process governed by a constraint system. Since the model is
built on a cyclic interaction of phonology and morphology, however, it also provides for multiple successive stages in the overall
derivation, thus accommodating opacity to the extent it can be
related to morphological structure (as in the best-established
examples).
Examples also seem to exist in which the specific changes
through which a language achieves conformity with a general constraint on surface forms do not follow directly from
the content of the constraint (together with other interacting
generalizations). In such a case, something like a rewriting
rule might be necessary, as a supplement to the constraint
system – a notion that is clearly antithetical to the basic philosophy of OT.
A quite different problem concerns the very nature of the
universals of phonological structure (see phonology, universals of). Phonological theorizing has generally accepted
the premise that generalizations that are true of phonological
systems in general result from the cognitive organization of the
human language faculty and, thus, must be incorporated in some
way into the architecture of phonological theory. Recently, however, it has been argued that at least some such typological regularities result not from the content of a universal grammar
constraining synchronic systems but, rather, from the universals
of language change (see language change, universals of)
governing the diachronic developments resulting in the systems
we observe. To the extent that this is true, it requires investigators to examine closely the arguments for incorporating any
particular regularity into phonological theory per se, as opposed
to seeking its basis elsewhere.

understand the system that relates childrens stored representations to their productions, and to formalize the developmental
paths that children follow.

Conclusion
While there have, of course, been other trends not covered here,
it seems fair to say that the bulk of the theoretical discussion in
phonology from the 1960s to the present has been devoted to
the elaboration and refinement of the generative program of
Chomsky and Halle (1968). The most recent developments in
that tradition, involving the wholesale replacement of rules by
constraints as the mechanism for expressing regularities of a language's sound pattern, have shown great promise but cannot yet
be considered wholly consolidated. Apparently, some appropriate synthesis of the classical and OT models remains to be found,
and it is that search that dominates discussion today.
Stephen R. Anderson
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Anderson, Stephen R. 1985. Phonology in the Twentieth Century: Theories
of Rules and Theories of Representations. Chicago: University of Chicago
Press. Describes the development of phonological theory, from its origins through the classical period of generative phonology.
Chomsky, Noam, and Morris Halle. 1968. The Sound Pattern of English.
New York: Harper and Row.
Gussenhoven, Carlos, and Haike Jacobs. 2005. Understanding Phonology.
2d ed. New York: Oxford University Press. Lucid elementary introduction to current phonology.
Kager, René. 1999. Optimality Theory: A Textbook. Cambridge: Cambridge
University Press. Introduces the main ideas of optimality theory in
phonology and their implementation.
Kenstowicz, Michael. 1994. Phonology in Generative Grammar.
Cambridge, MA: Blackwell. Provides a comprehensive description of
the principal themes in phonology up to the introduction of optimality
theory.

PHONOLOGY, ACQUISITION OF
A diversity of issues informs work in the field of phonological
acquisition, as it encompasses both first (L1) and second (L2)
language acquisition examined by researchers in linguistics,
psychology, speech–language pathology, and language education. In L1, there are questions such as how the acquisition
of phonology interfaces with perceptual and motor development (Locke 1993), and how an examination of disordered
development can illuminate the normally developing grammar (Bernhardt and Stemberger 1998; Dinnsen 1999). In L2,
there are questions as to whether the acquisition process is
fundamentally like L1 acquisition (Flege 1995), or whether L2
grammars are in some sense impaired due to, for example, L1
constraints that impede native-like attainment (see Brown 1998
on perception).
Due to space constraints, this entry focuses on L1, although
many of the same issues arise for L2. The acquisition of phonology is examined from the perspective of generative grammar;
thus, a principal theme is to examine how acquisition research
has used linguistic theory to inform development. This theme
considers the starting hypothesis to be that children's productions are largely system driven: Acquisition research strives to


Children's Grammars as Possible Grammars


The focus of research on phonological acquisition is on the shapes
of early grammars in the segmental and prosodic domains; thus,
it parallels research on end-state (adult) grammars. (Segmental
phonology is concerned with individual speech sounds, prosodic
phonology with larger units including syllables and feet.) There is
typically a comparison drawn between the shapes of developing
grammars and some end-state grammar. Order of emergence
of segmental (Dinnsen 1992) and prosodic complexity (Fikkert
1994; Levelt, Schiller, and Levelt 1999/2000), as well as error patterns observed in the segmental and prosodic domains, whether
these patterns are expressed through rules (Smith 1973; Ingram
1974), templates (Macken 1992; Fikkert 1994), or constraints
(Pater and Barlow 2003; Goad and Rose 2004), are all considered
in relation to some adult grammar.
One exception is a body of research that views children's
grammars as self-contained systems subject to their own constraints (Stoel-Gammon and Cooper 1984; Vihman 1996). This
research program developed in response to the observation that
children's grammars are not simply reduced versions of the
target grammar; indeed, variation across learners is rampant
(Ferguson and Farwell 1975).
While children's grammars may be self-organizing in that
they contain processes not present in the target language, they
can still be viewed as possible grammars (White 1982; Pinker
1984) if these processes have correlates in other adult languages.
The notion of possible grammar thus requires that, at each
stage, children's grammars respect the constraints of adult grammars, even if they bear little resemblance to the target system. In
optimality theory (OT) (Prince and Smolensky [1993] 2004),
for example, alternate routes observed across learners, as well as
stages in the development of a single learner, are viewed from
the perspective of the typological options that adult languages
display: Both are accounted for by different rankings of the same
constraints.

Markedness
Although children take different paths to the adult grammar, early
phonologies are also strikingly similar (Jakobson [1941] 1968).
As Roman Jakobson emphasizes, these similarities reflect cross-linguistically unmarked properties. markedness constrains the
shapes of linguistic systems such that less complex properties are
favored. For example, there is a well-documented preference for
consonant+vowel (CV) syllables among children (Ingram 1978;
cf. Grijzenhout and Joppen-Hellwig 2002); this is also a syllable
shape that no end-state grammar forbids (Jakobson 1962). Since
unmarked patterns are systematically observed across learners,
one might reasonably infer that they reflect early grammatical
organization. However, markedness has not always been well
integrated into the theory of grammar (as part of the theory of
representations or formulation of rules/constraints). This raises the question of whether markedness should instead be part of the theory of acquisition, which interfaces with, but is independent of, the theory of grammar.

Phonology, Acquisition of
Table 1.

Stage   Ambient form    Grammar             Stored form   Grammar             Produced form
1       [əwei]          M >> F-perc, F      /wei/         M >> F-perc, F      [wei]
2       [əwei]          F-perc >> M >> F    /əwei/        F-perc >> M >> F    [wei]
3       [əwei]          F-perc, F >> M      /əwei/        F-perc, F >> M      [əwei]

Perception Versus Production


Most work in phonological acquisition has focused on production; indeed, researchers typically assume that children accurately perceive the ambient input. This is due, in part, to the
observation that prelinguistic infants can perceptually discriminate perhaps all contrasts exploited by the world's languages
(Eimas et al. 1971; Werker et al. 1981). This ability largely declines
by age one (cf. Best, McRoberts, and Sithole 1988), coinciding
with a reorganization of perceptual categories according to what
is contrastive in the target language (Werker and Tees 1984). As
children start to speak around age one, it would appear that perception is complete by the onset of production.
Research on phonemic perception, which requires the ability to form sound–meaning pairings, has challenged this view
(Shvachkin [1948] 1973; Edwards 1974; Brown and Matthews
1997). Although experiments examining minimal contrasts
between native-language sounds have revealed that perceptual
development is mostly complete by age two, some contrasts
develop as late as three. Even age three is probably conservative
because, for consonant perception, this research has focused
almost exclusively on word-initial position. Since contrasts in
other positions are harder to discriminate, many non-target
patterns that children's productions display could reflect perceptual miscoding, rather than production constraints (Macken
1980).
If perception and production both reflect aspects of children's
competence, both must be included in the grammar (cf. Hale
and Reiss 1998). However, the time lag observed (production
trails perception) has suggested to some researchers that they
form independent (interacting) grammatical modules (see Menn
and Matthei 1992). This approach, though, cannot predict that
perception and production abilities develop in a similar order.
The latter favors the postulation of a single grammar if the time
lag can be built in. In Pater (2004), this is accomplished by introducing perception-specific faithfulness constraints into OT.


An advantage of OT is that the formal devices for expressing phonological generalizations include a set of markedness constraints. Most researchers have proposed that learners begin
acquisition with a ranking wherein markedness constraints dominate faithfulness (which favor identity between inputs [stored
representations] and outputs) (e.g., Demuth 1995; Gnanadesikan
[1995] 2004; Smolensky 1996; Pater 1997; Ota 2003; cf. Hale and
Reiss 1998). Throughout development, constraints are reranked
to yield more marked outputs. However, many paths can be followed, as there are many options for what to rerank. Thus, the
idea that grammars are initially unmarked is not inconsistent
with their being self-organizing.


As shown in Table 1, at Stage 1, both perception-specific faithfulness (F-perc) and general faithfulness (F) are outranked
by markedness (M). The result is unmarked forms stored in perception and uttered in production. In the example provided, the
ambient form [əwei] 'away' undergoes truncation of the pretonic syllable (an unstressed syllable immediately preceding a stressed syllable) in both components of the child's grammar; accordingly, words of this shape are perceived and produced without this syllable. At Stage 2, the child's perceptual abilities
become more target-like (i.e., he/she learns to correctly identify information in the ambient language); this indicates that
the relevant markedness constraints have been demoted below
perception-specific faithfulness. General faithfulness is still outranked, yielding a mismatch between what the child perceives
and what he/she produces. At Stage 3, markedness is demoted
below general faithfulness, and the form is correctly produced.
The perception-production time lag results because forms that
are correctly perceived at Stage 2 are not correctly produced until
Stage 3.
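The three-stage reranking just described can be sketched computationally. The following toy (an illustrative assumption, not Pater's implementation) evaluates the faithful and truncated candidates for [əwei] under each stage's ranking, with "@" standing in for the schwa:

```python
# Toy version of the Table 1 rankings (a sketch; the constraint
# definitions are simplified stand-ins, with "@" for schwa).

def truncate(form):
    """Drop a pretonic (initial unstressed) syllable, here just "@"."""
    return form[1:] if form.startswith("@") else form

def violations(name, inp, out, mode):
    if name == "M":       # markedness: no unfooted pretonic syllable
        return 1 if out.startswith("@") else 0
    if name == "F-perc":  # faithfulness assessed in perception only
        return len(inp) - len(out) if mode == "perception" else 0
    if name == "F":       # general faithfulness: penalize deletion
        return len(inp) - len(out)

def optimal(inp, ranking, mode):
    cands = {inp, truncate(inp)}
    return min(cands, key=lambda c: tuple(violations(n, inp, c, mode)
                                          for n in ranking))

stages = {1: ["M", "F-perc", "F"],       # Stage 1: M >> F-perc, F
          2: ["F-perc", "M", "F"],       # Stage 2: F-perc >> M >> F
          3: ["F-perc", "F", "M"]}       # Stage 3: F-perc, F >> M

for s, ranking in stages.items():
    stored = optimal("@wei", ranking, "perception")
    produced = optimal(stored, ranking, "production")
    print(s, stored, produced)
# -> 1 wei wei
# -> 2 @wei wei
# -> 3 @wei @wei
```

The run reproduces the time lag: the stored form becomes target-like at Stage 2, one reranking step before the produced form does at Stage 3.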

Phonological Theory and Phonological Acquisition


As the preceding discussion reveals, research in phonological
acquisition has been directly impacted by thinking in generative
phonology. Modern generative phonology began with Chomsky and Halle's (1968) Sound Pattern of English (SPE). Although more
recent work has situated the shapes of developing grammars
within the typological range manifested by adult systems, this
was less the case in the SPE-based literature. Much of this work
used SPE as a tool only, in part because, with the formal apparatus employed by the theory, it was difficult to constrain what
a possible grammar is: developing or end state. And although
the theory contained an evaluation metric to guide learners in
selecting the most highly valued among descriptively adequate
grammars, rules for unattested processes were as easy to formalize as rules for commonly attested processes. Finally, SPE contained no workable theory of markedness and, thus, children's grammars could not be considered relative to some notion of "optimal."
To facilitate a comparison between SPE and later theories,
we draw on truncation, further exemplified in (1) from Amahl,
age 2.60 (Smith 1973; [b,g] are voiceless unaspirated lenis stops).
(The discussion focuses on the stage when perception is target-like and truncation is restricted to production.)
(1)

[geip] 'escape'
[banə] 'banana'

In SPE, every deviation from adult forms required one or more rules, and so there was little in common between the rule sets for developing and target systems. To capture truncation, Neil
Smith (1973) provides the rules below, neither of which operates
in the adult grammar:
(2)

R14: V[−stress] → ∅ / # (C) ___ C V[+stress]
R16: [+sonorant] → ∅ / [+consonantal] ___

R14 deletes initial vowels in words like escape. For consonant-initial forms like banana, the result is [bnanə], which then undergoes R16, yielding [banə].
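The two ordered rules can be approximated as string rewrites. The sketch below uses an ad hoc ASCII encoding ("@" for schwa, capital vowels marking stress) and deliberately simplified vowel and sonorant classes; it is an illustration, not Smith's formalism:

```python
import re

# R14: delete a word-initial unstressed vowel (optionally preceded by a
# consonant) when consonants plus a stressed vowel follow.
# Encoding: "@" = schwa, uppercase vowel = stressed vowel.
def r14(word):
    return re.sub(r'^([^aeiou@AEIOU]?)[aeiou@](?=[^aeiou@AEIOU]+[AEIOU])',
                  r'\1', word)

# R16: delete a sonorant immediately following a consonant.
def r16(word):
    return re.sub(r'(?<=[^aeiou@AEIOU])[mnlrw]', '', word)

print(r14("@skEip"))        # escape: initial schwa deleted
print(r16(r14("b@nAn@")))   # banana: R14 then R16 apply in order
# Further child processes (cluster reduction, lenis stops) would be
# needed to reach Amahl's actual surface forms.
```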
Since SPE employed linear representations, the theory did not offer any insight into why pretonic rather than posttonic syllables delete: the initial unstressed syllable of escape is lost, while the final unstressed syllable of tiger is retained.
The development of nonlinear phonology (see Goldsmith 1995
for an overview), notably the move to highly articulated prosodic
representations, led to significant breakthroughs in understanding this asymmetry. In trochaic languages, where the foot (the rhythmic unit in which stress is assigned) is left-headed (stress-initial), escape cannot form a single foot, [əs(keip)Ft]Wd, whereas tiger can, [(taigə)Ft]Wd.
Much work in nonlinear phonology has explored the idea
that prosodically defined templates constrain output shape
(McCarthy and Prince 1995). Paula Fikkert (1994) proposes that
templates, which at early developmental stages reflect what
is unmarked, are responsible for truncation. If the child's productions are limited to one foot, circumscribed from the adult
output, this template will determine which material is preserved
from the adult form and which is deleted:
Adult output: [əs (keip)Ft ]Wd  →  Child output: [(keip)Ft ]Wd
(The one-foot template is circumscribed from the adult output: the unfooted syllable [əs] is lost, while the foot [keip] survives intact.)

In contrast to SPE, nonlinear phonology reveals the relationship between target and truncated forms, and the role that
markedness plays in shaping outputs. The material inside the
foot survives, as syllables organized by feet ([keip]) are less
marked than those linking directly to the word ([əs]). One problem with the templatic approach, however, is that it is too rigid: If
the segments predicted to survive are precisely those delimited
by the constituent that serves to organize them in the adult form,
it becomes difficult to capture the observation that material from
the truncated syllable can also survive. For example, in Amahl's pronunciation of banana in (1), onset selection favors [b], replacing [n] from the stressed syllable; that is, his production is [banə], not *[nanə] as expected from adult [bə(nanə)Ft]Wd (see Kehoe
and Stoel-Gammon 1997 for other problems with the templatic
approach).
This problem is rectified in OT. First, there are no templates;
templatic effects arise from the interaction of markedness constraints. Second, segmental content (e.g., labial preservation)
is the responsibility of faithfulness constraints. Finally, all constraints are interranked; thus, the co-occurrence of truncation
and onset selection is not unexpected (see Pater 1997).


Table 2.

                        ParseSyll   Max[lab]-IO   Max-IO   I-Contig
a.    [bə(nanə)Ft]Wd       *!
b.    [(nanə)Ft]Wd                      *!          **
c. ☞  [(banə)Ft]Wd                                  **          *

To illustrate, concerning truncation in Table 2, the constraint ParseSyllable (syllables are parsed into feet), along
with other markedness constraints, must be satisfied at the
expense of the lower-ranked faithfulness constraint Max-IO
(every segment in the input has a correspondent in the output). Fully faithful (a) is thus eliminated because the initial syllable is unfooted. Concerning onset selection, Max[labial]-IO
(every [labial] in the input has a correspondent in the output) must be ranked over I-Contiguity (the portion of the
input standing in correspondence forms a contiguous string).
Preservation of [labial] in banana will thus be favored, (c),
even though the result violates I-Contig through morpheme-internal segment deletion.
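The tableau's evaluation logic can be made concrete in a few lines (an illustrative sketch using the same ad hoc "@"-for-schwa encoding; the constraint definitions are simplified stand-ins for the formal ones):

```python
# Evaluate the Table 2 candidates for input /b@nan@/ 'banana' under the
# ranking ParseSyll >> Max[lab]-IO >> Max-IO >> I-Contig. Parentheses
# mark the foot; lower violation tuples are more harmonic.

INPUT = "b@nan@"
CANDIDATES = ["b@(nan@)", "(nan@)", "(ban@)"]  # a, b, c

def profile(cand):
    out = cand.replace("(", "").replace(")", "")
    unfooted = cand.split("(")[0] + cand.split(")")[-1]
    parse_syll = sum(ch in "a@" for ch in unfooted)  # unfooted syllables
    max_io = len(INPUT) - len(out)                   # deleted segments
    max_lab = int("b" not in out)                    # deleted labial (just /b/ here)
    i_contig = int(out not in INPUT)                 # non-contiguous preservation
    return (parse_syll, max_lab, max_io, i_contig)

winner = min(CANDIDATES, key=profile)
print(winner)  # -> (ban@)
```

Candidate (a) loses on ParseSyll, (b) on Max[lab]-IO, so (c) wins despite its I-Contig violation, exactly as in the prose above.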
OT has had a major impact on acquisition research.
Phonological processes are now generally expressed through
constraints, rather than rules, as this provides a better conceptualization of the observation that markedness shapes early
grammars. As discussed, children's productions become more
target-like when markedness constraints are demoted below
faithfulness. A similar idea, that development is best viewed as
the gradual relaxing of constraints, had been proposed earlier
(Stampe 1969; Menn 1980), but it was difficult to formally implement it in the rule-based frameworks of the time.
OT seems to provide an appealing view of the initial state and
of development; researchers can address important questions,
such as how the theory may restrict what a possible developing grammar is, and how, in turn, data from development may
inform the theory. However, this is not to say that OT has solved
all problems in phonological acquisition. One understudied
problem is rogue behavior. We have been assuming that children's grammars are possible grammars, thereby ignoring the
fact that some commonly attested processes, notably consonant harmony (CH), have no adult analogs (Drachman 1978). In
CH, consonants share place over vowels of any quality (Vihman
1978), as seen in (3) for Amahl, age 2.60 (Smith 1973):
(3)

[gaigə] 'tiger'
[gɔːk] 'stroke'

Some recent accounts of CH (Goad 1997; Rose 2000) incorrectly predict that the process should be attested in adult grammars; others (Pater 1997) appeal to child-specific constraints,
thereby challenging the notion that children's grammars are possible grammars. Neither of these approaches questions whether
CH is truly grammar-driven nor addresses, more generally, the
criteria that should factor into the determination concerning
what is grammar-driven and what is not. I leave these questions
to future work.
Heather Goad

WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Bernhardt, Barbara, and Joe Stemberger. 1998. Handbook of Phonological Development from the Perspective of Constraint-Based Nonlinear Phonology. San Diego, CA: Academic Press.
Best, Catherine, Gerald McRoberts, and Nomathemba Sithole. 1988. Examination of perceptual reorganization for nonnative speech contrasts: Zulu click discrimination by English-speaking adults and infants. Journal of Experimental Psychology: Human Perception and Performance 14: 345–60.
Brown, Cindy. 1998. The role of the L1 grammar in the L2 acquisition of segmental structure. Second Language Research 14: 136–93.
Brown, Cindy, and John Matthews. 1997. The role of feature geometry in the development of phonetic contrasts. In Focus on Phonological Acquisition, ed. S. J. Hannahs and Martha Young-Scholten, 67–112. Amsterdam: Benjamins.
Chomsky, Noam, and Morris Halle. 1968. The Sound Pattern of English.
New York: Harper & Row.
Demuth, Katherine. 1995. Markedness and the development of prosodic structure. Proceedings of NELS 25: 13–25.
Dinnsen, Daniel. 1992. Variation in developing and fully developed phonetic inventories. In Phonological Development: Models, Research, Implications, ed. Charles Ferguson, Lise Menn, and Carol Stoel-Gammon, 191–210. Timonium, MD: York.
———. 1999. Some empirical and theoretical issues in disordered child phonology. In Handbook of Child Language Acquisition, ed. William Ritchie and Tej Bhatia, 647–704. San Diego, CA: Academic Press.
Drachman, Gaberell. 1978. Child language and language change: A conjecture and some refutations. In Recent Developments in Historical Phonology, ed. Jacek Fisiak, 123–44. The Hague: Mouton.
Edwards, Mary Louise. 1974. Perception and production in child phonology: The testing of four hypotheses. Journal of Child Language 1: 205–19.
Eimas, Peter, Einar Siqueland, Peter Jusczyk, and James Vigorito. 1971. Speech perception in infants. Science 171: 303–6.
Ferguson, Charles, and Carol Farwell. 1975. Words and sounds in early language acquisition. Language 51: 419–39.
Fikkert, Paula. 1994. On the Acquisition of Prosodic Structure. The Hague: Holland Academic Graphics.
Flege, James. 1995. Second language speech learning: Theory, findings, and problems. In Speech Perception and Linguistic Experience: Theoretical and Methodological Issues, ed. Winifred Strange, 233–77. Timonium, MD: York.
Gnanadesikan, Amalia. [1995] 2004. Markedness and faithfulness constraints in child phonology. In Constraints in Phonological Development, ed. René Kager, Joe Pater, and Wim Zonneveld, 73–108. Cambridge: Cambridge University Press.
Goad, Heather. 1997. Consonant harmony in child language: An optimality-theoretic account. In Focus on Phonological Acquisition, ed. S. J. Hannahs and Martha Young-Scholten, 113–42. Amsterdam: Benjamins.
Goad, Heather, and Yvan Rose. 2004. Input elaboration, head faithfulness and evidence for representation in the acquisition of left-edge clusters in West Germanic. In Constraints in Phonological Development, ed. René Kager, Joe Pater, and Wim Zonneveld, 109–57. Cambridge: Cambridge University Press.
Goldsmith, John, ed. 1995. The Handbook of Phonological Theory. Oxford: Blackwell.
Grijzenhout, Janet, and Sandra Joppen-Hellwig. 2002. The lack of onsets in German child phonology. In The Process of Language Acquisition, ed. Ingeborg Lasser, 319–39. Frankfurt am Main: Peter Lang.
Hale, Mark, and Charles Reiss. 1998. Formal and empirical arguments concerning phonological acquisition. Linguistic Inquiry 29: 656–83.

Ingram, David. 1974. Phonological rules in young children. Journal of Child Language 1: 49–64.
———. 1978. The role of the syllable in phonological development. In Syllables and Segments, ed. Alan Bell and Joan B. Hooper, 143–55. Amsterdam: North-Holland.
Jakobson, Roman. [1941] 1968. Child Language, Aphasia and Phonological Universals. Trans. Allan Keiler. The Hague: Mouton.
———. 1962. Selected Writings. Vol. 1. Phonological Studies. The Hague: Mouton.
Kehoe, Margaret, and Carol Stoel-Gammon. 1997. The acquisition of prosodic structure: An investigation of current accounts of children's prosodic development. Language 73: 113–44.
Levelt, Clara, Niels Schiller, and Willem Levelt. 1999/2000. The acquisition of syllable types. Language Acquisition 8: 237–64.
Locke, John. 1993. The Child's Path to Spoken Language. Cambridge: Harvard University Press.
Macken, Marlys. 1980. The child's lexical representation: The puzzle-puddle-pickle evidence. Journal of Linguistics 16: 1–17.
———. 1992. Where's phonology? In Phonological Development: Models, Research, Implications, ed. Charles Ferguson, Lise Menn, and Carol Stoel-Gammon, 249–69. Timonium, MD: York.
McCarthy, John, and Alan Prince. 1995. Prosodic morphology. In Goldsmith 1995, 318–66.
Menn, Lise. 1980. Child phonology and phonological theory. In Child Phonology. Vol. 1: Production. Ed. Grace Yeni-Komshian, James Kavanaugh, and Charles Ferguson, 23–42. New York: Academic Press.
Menn, Lise, and Edward Matthei. 1992. The two-lexicon account of child phonology. In Phonological Development: Models, Research, Implications, ed. Charles Ferguson, Lise Menn, and Carol Stoel-Gammon, 211–47. Timonium, MD: York.
Ota, Mits. 2003. The Development of Prosodic Structure in Early Words. Amsterdam: Benjamins.
Pater, Joe. 1997. Minimal violation and phonological development. Language Acquisition 6: 201–53.
———. 2004. Bridging the gap between receptive and productive development with minimally violable constraints. In Constraints in Phonological Development, ed. René Kager, Joe Pater, and Wim Zonneveld, 219–44. Cambridge: Cambridge University Press.
Pater, Joe, and Jessica Barlow. 2003. Constraint conflict in cluster reduction. Journal of Child Language 30: 487–526.
Pinker, Steven. 1984. Language Learnability and Language Development. Cambridge: Harvard University Press.
Prince, Alan, and Paul Smolensky. [1993] 2004. Optimality Theory: Constraint Interaction in Generative Grammar. Oxford: Blackwell.
Rose, Yvan. 2000. Headedness and Prosodic Licensing in the L1 Acquisition of Phonology. Ph.D. diss., McGill University.
Shvachkin, N. Kh. [1948] 1973. The development of phonemic speech perception in early childhood. In Studies of Child Language Development, ed. Charles Ferguson and Dan Slobin, 91–127. New York: Holt, Rinehart and Winston.
Smith, Neil. 1973. The Acquisition of Phonology: A Case Study. Cambridge: Cambridge University Press.
Smolensky, Paul. 1996. On the comprehension/production dilemma in child language. Linguistic Inquiry 27: 720–31.
Stampe, David. 1969. The acquisition of phonetic representation. Chicago Linguistic Society 5: 433–44.
Stoel-Gammon, Carol, and Judith Cooper. 1984. Patterns of early lexical and phonological development. Journal of Child Language 11: 247–71.
Vihman, Marilyn. 1978. Consonant harmony: Its scope and function in child language. In Universals of Human Language. Vol. 2: Phonology. Ed. Charles Ferguson, 281–334. Stanford, CA: Stanford University Press.

———. 1996. Phonological Development: The Origins of Language in the Child. Oxford: Blackwell.
Werker, Janet, John Gilbert, Keith Humphrey, and Richard Tees. 1981. Developmental aspects of cross-language speech perception. Child Development 52: 349–53.
Werker, Janet, and Richard Tees. 1984. Cross-language speech perception: Evidence for perceptual reorganization during the first year of life. Infant Behavior and Development 7: 49–63.
White, Lydia. 1982. Grammatical Theory and Language Acquisition. Dordrecht, the Netherlands: Foris.

PHONOLOGY, EVOLUTION OF
True Phonology: How Did It Evolve?
phonology is the study of how languages use segmental and
prosodic categories to build spoken words and signal their differences in meaning. Many animals that communicate vocally
use distinct sound patterns to signal different meanings, but
their repertoires are typically small and closed sets. By contrast,
humans have large vocabularies and learn new words all of
their lives. From infancy to adolescence, children acquire lexical entries at a remarkably fast rate. The vocabulary size of high
school students has been conservatively estimated at 60,000 root forms, a number that implies an average acquisition rate of more than 10 words per day (Miller 1991).
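The implied rate follows from simple arithmetic (the figure of roughly 16 years of learning up to the end of high school is an assumption of this sketch):

```python
# Back-of-the-envelope check of the acquisition rate implied by a
# 60,000-root-form vocabulary at the end of high school (taking about
# 16 years, i.e. 16 x 365 days, of word learning).
words_per_day = 60_000 / (16 * 365)
print(round(words_per_day, 1))  # -> 10.3
```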
This difference is linked to the uniquely human method of
coding information: the combinatorial use of discrete entities.
Combinatorial structure, the hallmark of true language, creates
the conditions for open-ended lexical and syntactic systems
that provide the foundation for the singular expressive power of
human languages. How did it evolve?
We focus on two areas of empirical research. One is the study
of human cognition. The other is the investigation of the phonetic
signal space from which all phonological patterns are drawn.
The first theme highlights man's rich semantic abilities. The
second looks for phenomena that presage combinatorial sound
structure.

Cognitive Growth
The virtually infinite set of meanings encodable by language
raises the question of how man's cognitive capacity evolved
from skills not unlike those of present-day apes. How do we picture the transition from a nonhuman to a human primate mind?
What was the role of language?
According to Merlin Donald's synthesis of neurobiological, psychological, archeological, and anthropological evidence
(1991), our ancestors broke away from the stimulus-driven
behavior of apes in two steps. First, during the period of Homo
erectus (from 1.5 million years ago), an adaptation called mimesis
occurred, a communicative culture allowing individuals to share
mental states and begin to represent reality in new and expanding ways.
Mimetic behavior is an ability to voluntarily access and
retrieve motor memories and to rehearse and model them for
communication with others. The whole body is used as a representational device, as in imitating vocal, manual, and postural
movements for a communicative purpose. Mimesis involved
major changes of motor and memory mechanisms based on
existing capacities.


Communication during this period was based on gestures. Spoken language emerged in the second transition. In Donald's scenario, it takes until the end of the period associated with
archaic Homo sapiens (45,000 years ago) for spoken language to
appear.
If mimesis was a basically gestural mode of communication,
would it not imply a proto-language that was signed, rather than
spoken (cf. Arbib 2005)? Donald assumes that as mimetic messages grew more elaborate, they eventually reached a complexity that favored faster and more precise ways of communicating.
The vocal/auditory modality offered an independent, omnidirectional channel useful at a distance and in the dark. It did not
impede locomotion, gestures, or manual work. The vocal system
came to be exploited more and more and further adaptations
occurred: first lexical invention and high-speed phonological
speech, syntax later.

Specializations of the Vocal/Auditory Modality


A number of comparative studies have been undertaken in
attempts to evaluate the adaptive significance of novel features of human anatomy (see speech anatomy, evolution
of): for example, disappearance of air sacs (Hewitt, MacLarnon,
and Jones 2002), bigger hypoglossal and vertebral canals, smaller
masticatory muscle mass, genetic changes, and uniqueness of
craniofacial sensorimotor system (Fitch 2000; Kent 2004).
Perhaps the most conclusive example of a speech-related
adaptation is the descent of the larynx, which makes swallowing
more hazardous but expands the space of possible sound qualities (Lieberman 1991; Carré, Lindblom, and MacNeilage 1995).
A human ability, central to language but curiously absent
in primates, is vocal imitation. A beginning of a neural account
of imitation was suggested by the discovery of mirror neurons.
First identified in the macaque's premotor cortex, these neurons
discharge when the monkey manipulates objects and when it
observes other monkeys or humans do the same. Neurons that
respond to sound and to communicative or ingestive actions
have also been identified. Although there is no direct evidence
for a human mirror system, brain stimulation and imaging studies indicate increased activity in speech muscles when subjects
listen to speech (Hurley and Chater 2005; see mirror systems,
imitation, and language).

Signals for Speaker, Listener, and Learner


In technical jargon, phonology has been characterized as providing an impedance match between semantics and phonetics in
the sense that it succeeds in coding a large number of meanings
despite its use of only a small set of phonetic dimensions (Bellugi
and Studdert-Kennedy 1980). How was this match achieved?
GESTURES AS BASIC UNITS. One answer is that the building
blocks of speech are phonetic gestures, units corresponding to
the discrete articulators. The argument is that, evolutionarily,
as holistic utterances were processed by the mirror system, they
came to be parsed into the basic articulators of the vocal tract
and their preferred, natural motions. Data from early speech
have been used to argue that these units, when properly timed
and modulated in amplitude, produce the vowels and consonants of the ambient adult input (Studdert-Kennedy 2005).

PHYLOGENY OF THE SYLLABLE. The frame/content theory
(MacNeilage 2008) offers an evolutionary account of the syllable.
Syllables are universally associated with open–close alternations
of the mandible, vowels being open and consonants closed articulations. This movement has a parallel in childrens babbles,
which resemble consonant-vowel sequences such as [bababa],
but are in no way organized in terms of discrete segments. Rather,
their syllabic and segmental character arises fortuitously from
adding phonation to the open–close jaw motion. This rhythmically repeated up-and-down movement is also found in so-called lip-smacks, a facio-visual behavior in primates often combined
with phonation during grooming.
Accordingly, the evolutionary path to the syllable began in
deep prehistoric time when mammal biomechanics evolved for
feeding. A second stage was the use of this machinery in primate
communication. In a third step, this primate mechanism was co-opted for speech by scaffolding early phonology on its pseudo-syllables and pseudo-segments.
QUANTAL THEORY. The acoustic consequences of a continuous
articulatory movement are often noncontinuous, as illustrated by the pseudo-segmental character of babbling. In the
babble example, the jaw moves continuously but the acoustics
shows an abrupt change from a vowel-like to a stoplike pattern.
This quantal jump illustrates a general fact about the phonetic
space. The mapping of articulation onto acoustic parameters
creates a set of acoustic patterns that forms a number of disjoint
subspaces, rather than a single continuous, coherent space.
Within each such subregion, sound quality is homogeneous.
Voiced and voiceless sounds, as well as different manners of
articulation (e.g., stops, nasals, fricatives, trills), exemplify such
distinct subspaces (Stevens 1989).
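A toy articulation-to-acoustics map (a generic sketch, not Stevens's actual model) illustrates the quantal idea: equal steps along the articulatory continuum produce negligible acoustic change inside a plateau but a large jump across the boundary.

```python
import math

def acoustic(x):
    """Toy mapping from an articulatory parameter x in [0, 1] (e.g.,
    jaw height) to an acoustic output: two near-flat plateaus with a
    quantal jump around x = 0.5."""
    return math.tanh(20 * (x - 0.5))

# Equal articulatory steps, very unequal acoustic steps:
within_plateau = abs(acoustic(0.2) - acoustic(0.1))      # tiny change
across_boundary = abs(acoustic(0.55) - acoustic(0.45))   # abrupt jump
print(within_plateau < 0.01, across_boundary > 1.0)  # -> True True
```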
USER-BASED CONSTRAINTS: ON-LINE SPEECH. The human voice
is an expressive instrument that undergoes moment-to-moment
retuning by many nonlinguistic factors. Consequently, the phonetic patterns conveying linguistically the same utterance exhibit
great variability. However, the need for messages to be both intelligible and pronounceable imposes a systematic distribution on
phonetic variations, placing them between clear hyperforms and
reduced hypoforms. This view portrays speaker–listener interactions as a tug-of-war between the listener's need for comprehension and the speaker's tendency to simplify. There is a great deal of
experimental evidence for this view of speech (Lindblom 1990).
USER-BASED CONSTRAINTS: PHONOLOGY. These user-based constraints also leave their mark on phonology, as is evident from
typological data on strengthening and weakening processes in
phonological rules and sound changes (Kiparsky 1988) and from
attempts to simulate segment inventories. These studies indicate
that systemic selections have been favored that simultaneously
optimize distinctiveness and articulatory ease. An example of
the effect of these conditions is the size principle: The larger the
system, the greater the proportion of articulatorily complex segments (Lindblom and Maddieson 1988).
SELF-ORGANIZATION. These user-based constraints in conjunction with the quantal nature of the signal space help explain

why phonologies do not recruit more of a human's total sound-making capabilities (e.g., mouth sounds and other non-speech
vocalizations; Catford 1982) but prefer practically the same small
set of phonetic properties. However, the study of these constraints
only partially illuminates the roots of combinatorial coding. This
topic has been explicitly addressed in computer modeling experiments. One such study shows how discrete phonetic targets and
reuse can emerge from a dynamic systems network of agents
(speaker/listener models) whose vocalizations, initially randomly distributed in phonetic space, tend to converge (driven
by a magnet-like dominance of the patterns heard most often) on
a few targets (Oudeyer 2006).
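The convergence dynamic can be sketched in a few lines (a toy illustration in the spirit of such models, not Oudeyer's implementation):

```python
import random

# Toy "imitation game": agents hold prototype points in a 1-D phonetic
# space; on each interaction the listener pulls its nearest prototype
# toward the heard token, so shared targets gradually emerge from
# initially random vocalizations.

random.seed(1)
N_AGENTS, N_PROTOS, STEPS, RATE = 10, 3, 5000, 0.2

agents = [[random.random() for _ in range(N_PROTOS)] for _ in range(N_AGENTS)]
initial_range = max(max(a) for a in agents) - min(min(a) for a in agents)

for _ in range(STEPS):
    speaker, listener = random.sample(range(N_AGENTS), 2)
    token = random.choice(agents[speaker])   # speaker utters a prototype
    protos = agents[listener]
    i = min(range(N_PROTOS), key=lambda j: abs(protos[j] - token))
    protos[i] += RATE * (token - protos[i])  # magnet-like attraction

final_range = max(max(a) for a in agents) - min(min(a) for a in agents)
# The population's vocalizations contract toward a few shared values:
# final_range never exceeds initial_range under this update rule.
```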
TARGETS AND MOTOR EQUIVALENCE. Traditionally, the basic units
of speech have been assumed to be targets, the intertarget transitions being primarily determined by the response characteristics
of the production system. Speech, like other movements, exhibits motor equivalence: the ability of motor systems to compensate
and reach a given goal irrespective of initial conditions. This view
implies that the end state of phonetic learning is a set of context-independent targets and a system capable of motor equivalence.
It moreover suggests that once a target has been learned in one
context, it can immediately be reused in other contexts, since the
motor equivalence capability handles the new trajectory. Also, it
means that, developmentally, discrete segments derive from the
emergent targets and recombination from motor equivalence.
A further relevant observation on the target hypothesis is that
linguistic systems with phonemically coded vocabularies would
be learned faster, more easily, and in an open-ended manner
than repertoires based on holistic forms (Lindblom 2007).

Conclusion
Where does combinatorial structure come from? From prespecifications in our genetic endowment? Or from a modality-independent principle shared by sign and speech and perhaps also
operating in genetics and chemistry (cf. the particulate principle
[Abler 1989])? Or from a mutually reinforcing interplay between
cognitive growth and a suite of conditions entailed by communicating by vocal sounds?
In view of the materials reviewed here, a positive treatment of
the last possibility appears within reach. More lexical inventions
imply an increasing number of sound–meaning pairs. The linking of phonetic shapes with distinct meanings would be subject
to numerous user-based constraints and processes shaping the
intrinsic content of lexical entries, fractionating them into discrete units and facilitating unit recombination. Sound structure
could, thus, plausibly have evolved in response to the expressive
needs associated with growing semantic abilities and as a process of phonetically biased scaling, self-organizing without
any formal a priori or modality-independent blueprint.
Björn Lindblom
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Abler, William L. 1989. On the particulate principle of self-diversifying systems. Journal of Social and Biological Structures 12: 1–13.
Arbib, Michael. 2005. The mirror system hypothesis: How did protolanguage evolve? In Language Origins: Perspectives on Evolution, ed. Maggie Tallerman, 21–47. New York: Oxford University Press.

Bellugi, U., and M. Studdert-Kennedy. 1980. Signed and Spoken Language: Constraints on Linguistic Form. (Dahlem Konferenzen.) Weinheim, Germany: Verlag Chemie GmbH.
Carré, René, B. Lindblom, and P. MacNeilage. 1995. Rôle de l'acoustique dans l'évolution du conduit vocal humain. Comptes Rendus de l'Académie des Sciences (Paris) t. 320, série IIb: 471–76.
Catford, John C. 1982. Fundamental Problems in Phonetics. Bloomington: Indiana University Press.
Donald, Merlin. 1991. Origins of the Modern Mind. Cambridge: Harvard University Press.
Fitch, W. Tecumseh. 2000. The evolution of speech: A comparative review. Trends in Cognitive Science 4.3: 258–67.
Hewitt, G., A. MacLarnon, and K. E. Jones. 2002. The functions of laryngeal air sacs in primates: A new hypothesis. Folia Primatol 73: 70–94.
Hurley, Susan, and Nick Chater. 2005. Perspectives on Imitation: From
Neuroscience to Social Science. Vols. 1, 2. Cambridge, MA: MIT Press.
Kent, Ray D. 2004. Development, pathology and remediation of
speech. In From Sound to Sense: 50+ Years of Discoveries in Speech
Communication, ed J. Slifka et al. Cambridge, MA: Research
Laboratories of Electronics, MIT.
Kiparsky, Paul. 1988. Phonological change. In Linguistics: The Cambridge
Survey. Vol. 1. Ed. F. J. Newmeyer, 363415. Cambridge: Cambridge
University Press.
Lieberman, Philip. 1991. Uniquely Human. Cambridge: Harvard
University Press.
Lindblom, Bjrn. 1990. Explaining phonetic variation: A sketch of the H&H
theory. In Speech Production and Speech Modeling, ed. W. Hardcastle
and A. Marchal, 40339. Dordrecht, the Netherlands: Kluwer.
. 2007. The target hypothesis, dynamic specification and segmental independence. In Syllable Development: The Frame/Content
Theory and Beyond, ed. B. Davis and K. Zajd. Hillsdale, NJ: Lawrence
Erlbaum.
Lindblom, B., and I. Maddieson. 1988. Phonetic universals in consonant systems. Language, Speech and Mind, ed. Larry M. Hyman and
C. N. Li, 6278. London and New York: Routledge.
MacNeilage, Peter F. 2008. The Origin of Speech. New York: Oxford
University Press.
Miller, George A. 1991. The Science of Words. New York: Freeman.
Oudeyer, Pierre-Yves. 2006. Self-Organization in the Evolution of Speech.
New York: Oxford University Press.
Stevens, Kenneth N. 1989. On the quantal nature of speech. J Phonetics
17: 346.
Studdert-Kennedy, Michael. 2005. How did language go discrete? In
Language Origins: Perspectives on Evolution, ed. Maggie Tallerman,
4867. New York: Oxford University Press.

PHONOLOGY, UNIVERSALS OF
Phonological universals are those aspects of languages' sound systems that are found either in all or most human languages or in diverse languages where their presence cannot be accounted for by inheritance from a common parent language, geographical proximity, or borrowing. They are often referred to as unmarked or default conditions in languages' phonologies when these terms imply reference to common cross-language patterns. There are universal patterns in languages' 1) sound inventories, including their prosodies, 2) sequential constraints (how sounds are sequenced), and 3) sound changes and the phonological alternations they create within a given language.
They are of interest because they give insight into the physical
factors that shape human speech, help to elucidate mechanisms
of sound change, and, perhaps, suggest something about the supposed human innate capacity for language. There is a sizable literature on phonological universals (e.g., Greenberg, Ferguson,
and Moravcsik 1978; Maddieson 1984), and it will not be possible
in this limited space to discuss and exemplify more than a few
of those that have been discovered. What is more important and
what will be emphasized here is a consideration of the explanation for phonological universals. The best evidence presented so
far points to their phonetic origin.
A caveat: Phonological universals, as with any other phonological generalization, are inevitably stated in terms of a traditional pretheoretic taxonomy. One should always be alert to the
possibility that the taxonomic terms devised for purely practical
and descriptive purposes may not conform to the true essence
of speech, just as, for example, a pretheoretic category for living
animals of those that fly would result in a heterogeneous class
that included birds, bats, flying fish, and winged insects to the
exclusion of penguins, ostriches, emus, and kiwis.

Universals Deriving from Speech Aerodynamics


All languages have consonants and vowels. Among consonants, all languages employ stops. Among stops, voiceless stops
are the default; that is, if a language employs voiced stops it
will also have voiceless stops, but not the reverse. This can be
explained by the aerodynamic voicing constraint (AVC) (Ohala
1983): Voicing requires air flow through the approximated vocal
cords, and this requires a positive pressure differential between
the subglottal and the oral air pressures. During obstruents, the
flowing air is blocked by the consonantal closure so that air accumulates in the oral cavity, thus increasing the oral air pressure
above the glottis such that eventually the required pressure differential diminishes, thereby reducing transglottal airflow below
the level needed for vocal cord vibration. Another universal pattern explained in part by the AVC is that among languages that
do have voiced stops, it is often the case that the back-articulated
stop is missing, for example, as in Dutch and Thai. This is because
insofar as the AVC can be ameliorated, it is due to the compliance of the surfaces of the vocal tract to the impinging oral air
pressure. The magnitude of this compliance is greatest for labial
obstruents (due to expandability of the cheeks), less for apicals,
and least for velars, which have the least surface area exposed
to the oral pressure. These factors also help to explain the kind
of sound change that occurred in Nubian, now manifested as
a morphophonemic alternation, whereby geminated voiced
stops become voiceless at all places of articulation except labial
(Table 1; data from Bell 1971).
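The compliance account can be made concrete with a toy numerical sketch. All pressures, flows, and compliance figures below are invented for illustration; only the ordering of compliance (labial > apical > velar) reflects the argument in the text:

```python
# Toy model of the aerodynamic voicing constraint (AVC).
# All quantities are illustrative assumptions, not measured values.
P_SUB = 8.0      # subglottal pressure, cm H2O (assumed constant)
THRESHOLD = 2.0  # minimum transglottal pressure drop for voicing (assumed)

def voicing_duration(compliance, airflow=100.0, dt=0.001, max_t=0.2):
    """Time (s) that voicing can continue behind a stop closure.
    `compliance` models how much the vocal-tract surfaces yield:
    high for labials (expandable cheeks), low for velars."""
    p_oral, t = 0.0, 0.0
    while t < max_t:
        if P_SUB - p_oral < THRESHOLD:  # transglottal drop too small
            return t                    # voicing ceases here
        # Glottal airflow accumulates behind the closure; compliant
        # walls absorb some of the volume and slow the pressure rise.
        p_oral += airflow * dt / (1.0 + compliance)
        t += dt
    return max_t

# Higher compliance sustains voicing longer: labial > apical > velar,
# matching the typical gap at voiced [g] rather than [b].
for place, c in [("labial", 4.0), ("apical", 2.0), ("velar", 0.5)]:
    print(place, round(voicing_duration(c), 3))
```

However crude, the sketch reproduces the asymmetry the AVC predicts: the velar closure, with the least compliant walls, silences voicing soonest.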
Among languages that have both voiced and voiceless stops,
there are many that have only voiceless fricatives (e.g., Thai,
Galician, Taba). Again, the AVC is part of the explanation: As

Table 1.
Noun stem    Stem + 'and'    English gloss
/fab/        /fab:n/         father
/sgd/        /sgt:n/         scorpion
/ka/         /ka:n/          donkey
/mg/         /mk:n/          dog

Table 2.
Language          Voiceless and voiced    Voiced only
Awadi             i, u, e                 a, o
Campa                                     o, e, a
Chatino           i, u                    o, e, a
Dagur             i, u, e                 o, a
Huichol           i, ɨ, e                 u, a
Serbo-Croatian    i, u                    e, o, a
Tadjik            i, u, a                 e, o, u
Tunica            i, e, ɛ, a, ɔ, o
Uzbek             i, u                    e, , o, a

Table 3.
a. Sundanese:
(form lost)    to wet
bhr            to be rich
hkn            to inform
msih           to love

mentioned, optimal conditions for voicing require oral pressure as low as possible (with respect to subglottal pressure), but optimal conditions for generating frication (turbulence) at an oral constriction require oral pressure as high as possible (with respect to atmospheric pressure). These conditions are contradictory. Thus, voiced fricatives are less common than voiceless ones. Phonetically, in languages that have both voiced and voiceless fricatives (e.g., English, French, Italian), the frication noise of voiced fricatives is always weaker than that of voiceless fricatives.
Although all languages have voiced vowels, some languages feature voiceless vowels as well, though these are often contextually determined, for example, word-finally or in the environment of voiceless consonants. In any case, it seems always to be the case that a voiceless vowel has a voiced counterpart. J. H. Greenberg (1969) provided a survey of the incidence of voiceless vowels in several languages. He found a virtually uniform pattern: Voiceless vowels appear as the counterparts of vowels higher in the vowel space. (See Table 2.) The explanation for this also requires reference to the AVC. Among vowels, high, close vowels like [i] and [u] are almost obstruents. If articulated sufficiently close, they impede the exiting airflow almost as much as fricatives. This, in combination with other factors that could create a slightly open glottis via coarticulation, such as appearing in word- (and thus utterance-) final position or near voiceless obstruents, can lead to the vowel being voiceless. The same factors apply to glides (approximants) that are high and close, like [j] and [w], and account for the frequent devoicing and fricativization that gives rise through sound change to such dialectal alternations in English as Tuesday [tʰjuzdi] ~ [tʃuzdi], lieutenant [lɛwtɛnənt] ~ [lɛftɛnənt], and truck [tɹʌk] ~ [tʃɹʌk] (and similar patterns in many other languages). The same factors frequently lead to the affrication of stops before high close vowels or glides, as in Japanese, for example, or the sound change that converted Benjamin Franklin's natural [nætjuɹəl] to the modern pronunciation [nætʃɹəl].
Aerodynamic factors also explain patterns of nasal prosody
in languages as diverse as Sundanese (spoken in the Indonesian
archipelago) and Tereno (spoken in the Mato Grosso, Brazil).
As shown in Table 3, in these (and other) languages, the presence of a nasal consonant induces nasalization on all vowels and

b. Tereno:
1st person     3rd person
mbiho          piho           I/he went
anaao          ahjaao         I/he desire(s)
nzo            iso            I/he hoed
wgu            owoku          my/his house
jo             ajo            my/his brother
(form lost)    emou           my/his word
nza            iha            my/his name

Sources: For Sundanese: Robins 1957; for Tereno: Bendor-Samuel 1960, 1966.

glides following, unless blocked by a buccal obstruent (that is, one made in the oral cavity, from the uvular-velar region to the lips). Nonbuccal obstruents such as the glottal fricative [h] or the glottal stop [ʔ] do not block it. This follows from a straightforward physiological constraint: Buccal obstruents, insofar as they require the buildup of oral pressure, cannot tolerate venting of this pressure via an open velic port. The nonbuccal obstruents require a pressure buildup in a cavity that does not access the velic port, and so whether the velic port is open or closed is irrelevant to their production.
Among fricatives, the most common are the apical s-like fricatives (Maddieson 1984). This stems from a combination of aerodynamic and anatomical factors. Apical fricatives have relatively
long and intense noise in the high frequencies (3 to 8 kHz) and
are thus easily detected and are distinct from all other speech
sounds. This is due to the fact that the approximation of the
tongue apex at or near the alveolar ridge enables the generation
of a relatively focused high-velocity air jet, which itself generates
noise, but the air jet is also directed at the incisors, which act as a
baffle and cause the generation of more high-frequency noise as
the air hits the teeth surface (this is why s sounds are impaired
in the speech of juveniles when they lose their primary teeth and
before the growth of their permanent teeth). Additionally, the small space between the tongue apex and the lips constitutes a resonator that reinforces high frequencies.
The existence and properties of a resonator downstream of the point where turbulent noise is generated underlie another marked asymmetry in the incidence of stop types. We saw previously that in languages that have both voiced and voiceless stops, the voiced velar stop [g] is often missing. Among voiceless stops, the bilabial [p] is often missing (Sherman 1975), for example, in Arabic, in Aleut (except for loanwords), and in Proto-Celtic. Noise generated by air turbulence at the lips has no downstream resonator to amplify it.

Table 4.
Kpelle: [w] patterns with velars in nasal assimilation:

Indefinite     Definite
(form lost)    `mi       wax
lu             `nui      fog, mist
(form lost)    `il       dog
we             `wei      white clay

Notes: Melanesian: m > ŋ / __w:
Common Melanesian /limwa/ hand ~ Fijian /linga/ (= phonetic [liŋwa])
/mala/ ~ /mwala/ ~ /ŋwala/ (name of the Mala Island in different dialects of the island)
Sources: For Kpelle: Welmers 1962; for Melanesian: Ivens 1931.

Figure 1. A schematic representation of the resonating cavities during the production of different nasal consonants. The solid line demarcates the main pharyngeal-nasal cavity, which is the same for all such nasals. What differentiates one nasal from another is the effect produced by the oral resonator, which branches off this main cavity. Even though a labial velar consonant has two main constrictions, it is only the rearmost, the velar constriction, that matters, and it thus sounds similar to the velar nasal [ŋ].

Virtually all languages employ nasal consonants (Ferguson 1963). However, there are never more place distinctions among nasals than among obstruents, and there are often fewer. The acoustics of nasals probably account for this. All nasal consonants have in common the pharynx-plus-nasal air space. What differentiates one nasal consonant from another is the effect of the oral cavity, which branches off the nasal-pharyngeal cavity (see Figure 1). Thus, although nasals are highly distinct as a class from non-nasals, they are auditorily very similar to one another. This also partly accounts for the frequent pattern whereby nasals assimilate in place to a following stop, for example, English incredible [ɪŋkʰɹɛdəbl] < in- (neg. prefix) + credible; Latin quinctus > quintus (where original n = [ŋ] > [n] / __t).
An interesting cross-language pattern is the character of nasal assimilation to labial velar consonants such as [kp], [gb], and [w], that is, segments that have equal constrictions in the labial and velar regions. The nasal that appears before such segments is invariably a velar [ŋ], not the labial [m], for example, in Kpelle and Melanesian (Table 4). The explanation for this pattern can be seen in Figure 1. What matters for the place of articulation of a nasal consonant is the first buccal constriction encountered from the nasal-pharyngeal cavity. In a labial velar, this is the velar constriction; the labial constriction, being beyond that, is acoustically largely irrelevant (Ohala and Ohala 1993).

Phonotactics
The conventional view of common cross-language sound sequencing, or phonotactics, is couched in terms of what's called the sonority hierarchy (attributed to E. Sievers and O. Jespersen), whereby the favored pattern at syllable onset shows sounds sequenced in the following order (where omissions are possible): stop + fricative + nasal + liquid (i.e., non-nasal continuant) + glide + vowel, and at syllable offset, the reverse. The English words swamp and tryst, the French words plume [plym] and soir [swaʁ], and the Czech Psov (name of a city) [psof] would thus adhere to this generalization. But there are reasons to be skeptical of the sonority hierarchy. First, there is no empirical content to the term sonority; it has never been adequately defined. Second, it ignores such very common clusters as /sp/, /st/, and so on in syllable-initial position and /ps/, /ts/, and so on in syllable-final position. Third, it ignores cross-language prohibitions of onset sequences like /tl/, /dl/, /ji/, /wu/, /twu/, and /bji/, that is, sequences that have similar elements.
John Ohala and H. Kawasaki-Fukumori (1997) suggest replacing a) the one-dimensional concept of sonority with a multidimensional measure, where the similarity of sounds is a function of acoustic amplitude, formant frequencies and spectral shape in general, degree of periodicity (whether from fricatives or stop bursts), and even fundamental frequency; and b) the notion of the fixed hierarchy with a measure of the degree of similarity of sounds according to (a). The more similar two sounds are, the less common such sequences will be; the greater the difference between sounds, the more common. By this criterion, initial sequences like /sp-/ and final sequences like /-kst/ (in English text) are normal, and initial /tl/, /ji/, /wu/, and so forth are less preferred.
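The traditional sonority check is simple enough to state as code. The sketch below is illustrative only (the segment classes and the strict-rise criterion are simplifications, not part of the original account), and it reproduces two of the objections just noted: it rejects the common /sp-/ onset and accepts the widely prohibited /tl-/ onset:

```python
# Toy check of the traditional sonority hierarchy for syllable onsets.
# The segment classification below is a simplification for illustration.
SONORITY = {"stop": 0, "fricative": 1, "nasal": 2, "liquid": 3, "glide": 4, "vowel": 5}

CLASS = {
    "p": "stop", "t": "stop", "k": "stop", "b": "stop", "d": "stop",
    "s": "fricative", "f": "fricative",
    "m": "nasal", "n": "nasal",
    "l": "liquid", "r": "liquid",
    "w": "glide", "j": "glide",
    "a": "vowel", "i": "vowel", "u": "vowel", "y": "vowel",
}

def onset_ok(segments):
    """True if sonority rises strictly through the onset (the 'favored'
    pattern); a syllable offset would be checked with the mirror image."""
    ranks = [SONORITY[CLASS[s]] for s in segments]
    return all(a < b for a, b in zip(ranks, ranks[1:]))

print(onset_ok(["p", "l", "y"]))  # French plume: conforms
print(onset_ok(["s", "p", "a"]))  # common /sp-/ cluster: rejected (objection two)
print(onset_ok(["t", "l", "a"]))  # widely prohibited /tl-/: accepted (objection three)
```

A similarity-based alternative of the Ohala and Kawasaki-Fukumori sort would instead score each pair of adjacent segments on several acoustic dimensions and penalize sequences whose members are too alike.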

Sound Changes
Of the thousands of regular sound changes that have been
identified using the comparative method in historical phonology, certain ones are recognized as showing independent
cross-language incidence. One such is velar palatalization, k > tʃ, ts, s / __i (j) (and similar changes involving the voiced velar /g/), for example, English cheese [tʃʰiz] from Latin caseus (cf. Dutch kaas); Ikalanga [tʃi-ledu] chin < Proto-Bantu *ki-dedu.
Traditionally, the causes of sound change were attributed to two
opposite tendencies: speakers striving for ease of articulation,
which would lead to assimilations and reductions, and speakers striving to speak more clearly, which would lead to exaggeration of articulation and augmentation of pronunciation.
There is no doubt that speakers do alter their pronunciation
in these manners, but it may be seriously questioned whether these changed forms replace previous norms of pronunciation. There is no evidence for this. There is, however, an alternative scenario of sound change that does have empirical support: listeners' errors. There have been numerous speech-perception
experiments, some involving natural speech, which revealed errors that mirrored sound change; for example, in a study by Winitz, Scheib, and Reeds (1972) in which listeners heard a fragment of consonant-vowel (CV) syllables, [kʰi] was misidentified as [tʰi] 47 percent of the time, paralleling the change in place found in velar palatalization. Ohala (1981) has elaborated a theory of sound change based on listeners' misperception or misparsing of the speech signal. Such common sound changes include VN > Ṽ, for example, Sanskrit danta tooth > Hindi /dãt/, Latin bon- good > French /bõ/, and the assimilation of place in C1C2 consonant clusters, Latin scriptu > Italian scritto, English congress [kʰaŋgɹəs] < (ultimately) Latin com- together + gradi to walk.

Phonological Universals and Universal Grammar


It has also been proposed that phonological universals arise from humans' genetic endowment in the form of what's called universal grammar (Pertz and Bever 1975). Such claims have been disputed by those who find phonological universals rooted in the physical and physiological attributes of all human speakers and hearers. The dust has not settled on this issue as yet.
John Ohala
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Bell, H. 1971. The phonology of Nobiin Nubian. African Language Review 9: 115–59.
Bendor-Samuel, J. T. 1960. Some problems of segmentation in the phonological analysis of Tereno. Word 16: 348–55.
Bendor-Samuel, J. T. 1966. Some prosodic features in Terena. In In Memory of J. R. Firth, ed. C. E. Bazell, J. C. Catford, M. A. K. Halliday, and R. H. Robins, 30–9. London: Longmans.
Ferguson, C. A. 1963. Some assumptions about nasals. In Universals of Language, ed. J. H. Greenberg, 42–7. Cambridge, MA: MIT Press.
Greenberg, J. H. 1969. Some methods of dynamic comparison in linguistics. In Substance and Structure of Language, ed. J. Puhvel, 147–203. Berkeley and Los Angeles: University of California Press.
Greenberg, J. H. 1970. Some generalizations concerning glottalic consonants, especially implosives. International Journal of American Linguistics 36: 123–45.
Greenberg, J. H., C. A. Ferguson, and E. A. Moravcsik, eds. 1978. Universals of Human Language. Vol. 2, Phonology. Stanford, CA: Stanford University Press.
Ivens, W. G. 1931. A grammar of the language of Kwara'ae, North Mala, Solomon Islands. Bulletin of the School of Oriental Studies 6: 679–700.
Maddieson, I. 1984. Patterns of Sounds. Cambridge: Cambridge University Press.
Ohala, J. J. 1981. The listener as a source of sound change. In Papers from the Parasession on Language and Behavior, ed. C. S. Masek, R. A. Hendrick, and M. F. Miller, 178–203. Chicago: Chicago Linguistic Society.
Ohala, J. J. 1983. The origin of sound patterns in vocal tract constraints. In The Production of Speech, ed. P. F. MacNeilage, 189–216. New York: Springer-Verlag.
Ohala, J. J., and H. Kawasaki-Fukumori. 1997. Alternatives to the sonority hierarchy for explaining segmental sequential constraints. In Language and Its Ecology: Essays in Memory of Einar Haugen, ed. S. Eliasson and E. H. Jahr, 343–65. Trends in Linguistics: Studies and Monographs, Vol. 100. Berlin: Mouton de Gruyter.
Ohala, J. J., and J. Lorentz. 1977. The story of [w]: An exercise in the phonetic explanation for sound patterns. Berkeley Linguistics Society, Proceedings, Annual Meeting 3: 577–99.
Ohala, J. J., and M. Ohala. 1993. The phonetics of nasal phonology: Theorems and data. In Nasals, Nasalization, and the Velum, ed. M. K. Huffman and R. A. Krakow, 225–49. San Diego, CA: Academic Press.
Pertz, D. L., and T. G. Bever. 1975. Sensitivity to phonological universals in children and adolescents. Language 39: 347–70.
Robins, R. H. 1957. Vowel nasality in Sundanese. In Studies in Linguistic Analysis, 87–103. Oxford: Blackwell.
Sherman, D. 1975. Stop and fricative systems: A discussion of paradigmatic gaps and the question of language sampling. Stanford Working Papers in Language Universals 17: 1–31.
Welmers, W. E. 1962. The phonology of Kpelle. Journal of African Languages 1: 69–93.
Winitz, H., M. E. Scheib, and J. A. Reeds. 1972. Identification of stops and vowels for the burst portion of /p,t,k/ isolated from conversational speech. Journal of the Acoustical Society of America 51.4: 1309–17.

PHRASE STRUCTURE
It is an ancient observation that natural language syntax is hierarchically organized. As can be seen from a variety of diagnostics,
the words comprising a sentence do not behave as beads on
a string but group into successively larger units, or constituents.
Phrase structure (PS) is a formal representation of this constituent structure. PS is typically depicted as a tree-structured
graph (Figure 1), which encodes three sorts of structural information: i) dominance, specifying the words and constituents that a
constituent contains within it (e.g., as shown by vertical placement
in the figure, prepositional phrase (PP) dominates on and television); ii) precedence, specifying the temporal orderings among the
words and constituents (e.g., as shown by horizontal position, the
constituent most fans precedes the constituent watched the game
on television); and iii) labeling, specifying the grammatical category of each word and constituent (e.g., the constituent the game
is a noun phrase (NP)). In PS-based approaches, this structural
information plays an important role in defining the conditions
under which grammatical dependencies may obtain (see agreement, anaphora, binding, and case), and PS is often taken to
be the input to transformational operations (see movement and
transformational grammar). Further, PS representations
serve as the interface between syntax and semantics, as they provide the structural information necessary for interpretation (see
compositionality, thematic roles, and logical form).
A fundamental question concerns how the range of possible
PS is specified in a grammar. The earliest answer comes from
Noam Chomsky (1957), who suggests that PS is generated by a
set of phrase structure rules, like the following:
1. S → NP VP
2. NP → N
3. NP → Det N
4. VP → V NP
5. VP → VP PP
6. PP → P NP


Figure 1. Phrase structure representation for Most fans watched the game
on television.

In these rules, a symbol appearing to the left of the arrow can be rewritten as the sequence of symbols to the right of the arrow. The
process of PS generation begins with a distinguished start symbol S and successively rewrites the symbols in the string using
the rules of the grammar until no rewritable symbols remain.
An example of this process follows, with the number at each arrow indicating the rewriting rule used:

S →(1) NP VP →(3) Det N VP →(5) Det N VP PP →(4) Det N V NP PP →(3) Det N V Det N PP →(6) Det N V Det N P NP →(2) Det N V Det N P N

The PS in Figure 1 can be understood as a history of this derivation: The children of a node correspond to the sequence
of symbols into which that node is rewritten. Some recent
approaches have maintained rewriting as part of the grammar
but have questioned the nature of the rewrite rules employed
in this system, generalizing and modifying them in a variety
of respects (see x-bar theory and minimalism). Other approaches have abandoned rewriting, taking well-formed PS representations to be those that best satisfy a set of grammatical constraints (see head-driven phrase structure grammar, lexical-functional grammar, and optimality theory).
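The rewriting procedure described above lends itself to a direct implementation. The following sketch is illustrative code (not part of the original entry); it encodes the six rules and replays the derivation, rewriting the leftmost occurrence of each rule's left-hand symbol:

```python
# The six phrase structure rules, keyed by the numbers used in the text.
RULES = {
    1: ("S",  ["NP", "VP"]),
    2: ("NP", ["N"]),
    3: ("NP", ["Det", "N"]),
    4: ("VP", ["V", "NP"]),
    5: ("VP", ["VP", "PP"]),
    6: ("PP", ["P", "NP"]),
}

def rewrite(symbols, rule_no):
    """Rewrite the leftmost occurrence of the rule's left-hand symbol."""
    lhs, rhs = RULES[rule_no]
    i = symbols.index(lhs)
    return symbols[:i] + rhs + symbols[i + 1:]

# Replay the derivation from the text, starting from the start symbol S.
derivation = [["S"]]
for rule_no in [1, 3, 5, 4, 3, 6, 2]:
    derivation.append(rewrite(derivation[-1], rule_no))
for step in derivation:
    print(" ".join(step))
# The final step is the preterminal string: Det N V Det N P N
```

Recording which rule rewrote which symbol at each step yields exactly the tree of Figure 1, with each rewritten node's children given by the rule's right-hand side.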
Robert Frank
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Baltin, Mark, and Anthony Kroch, eds. 1989. Alternative Conceptions of
Phrase Structure. Chicago: University of Chicago Press.
Chomsky, Noam. 1957. Syntactic Structures. The Hague: Mouton.

PIDGINS
Pidgins are the world's only non-native languages. They are typically acquired by adults, after the critical period for language acquisition has passed. They normally arise wherever sufficient speakers of mutually incomprehensible languages must interact with one another. Some pidgins arose through and for trade; the most plausible derivation offered for the origin of the name pidgin attributes it to the Chinese pronunciation of business

(Baker and Muhlhausler 1990). Pidgins used widely for trading purposes (but not limited to such functions) include Russenorsk,
Chinese Pidgin English, and Chinook Jargon, once widely used in
the American Northwest. Other pidgins arose where large numbers of slaves and/or indentured laborers had to work together
on colonial plantations. Such pidgins were usually short-lived,
evolving into creole languages; contrary to some claims, careful
examination reveals manifold signs of their pidgin ancestry, and
the earliest attestations of some (Baker and Corne 1982; van den
Berg 2000) show pidgin-like structures. Although the reality of this
pidgin-to-creole cycle has been denied (see entry on creoles),
massive empirical evidence exists in Hawaii, as described by S. J.
Roberts (1995, 1998); there is also evidence of prior pidginization
in other creoles, such as fossilized sequence markers and marked
vocabulary mixture (see the following examples).
As compared with natural languages (including creoles), all
pidgins are severely impoverished, with sharply reduced vocabularies, few structural consistencies, and few if any inflectional
affixes; complex sentences very seldom occur. Function words
are rare, if not completely absent; categories normally expressed
via auxiliary verbs of tense, mood and aspect are indicated, if at
all, by two adverbial forms meaning roughly soon or finish
that are attached, not adjacent to the verb as in natural languages generally, but clause-finally or clause-initially. We find,
for example, baimbai (English by-and-by) and pau (Hawaiian finished) in Hawaii, and baimbai and pinis in Tok Pisin; also a number of similar pidgin fossils in creoles (sometimes inside, sometimes still outside the verb phrase), such as fin(i), finish in French-related creoles, done in English-related creoles, or (ka)ba (Portuguese
If a pidgin persists in a relatively stable population (one not
subject to the rapid expansion and turnover that typically characterize creole societies) and is widely used over a long enough
period, it may acquire a more stable (although still limited)
structure. However, pidgins still suffer from widespread misunderstanding of the linguistic mechanisms through which they
arise. According to many writers (e.g., Bakker 1995; Manessy 1995), they are reduced or simplified versions of preexisting languages, or failed attempts by speakers with inadequate access to acquire the locally dominant language, a view reinforced by standard usage of expressions such as Pidgin English, Pidgin French, and so on.
Pidgins do not derive from processes applied to any preexisting natural language, however, but (as is clear both from historical data in Hawaii and reminiscences of older residents; see, e.g.,
Bickerton 1981, 11) arise naturally from strategies employed by
individuals of any ethnic background in a multilingual situation
where no single existing language is both viable and accessible.
Speakers seek to communicate by any means possible, using isolated words from their own language, from their interlocutors' language (if they know any), and from any third or fourth language that they may happen to have picked up.
These words are seldom assembled in the way words are
assembled in modern human languages, that is, hierarchically.
Except for occasional rote-learned phrases, words are attached
sequentially, like beads on a string. Consequently, no true grammatical relations exist, limiting utterances to brief strings of a few
words without embedding.


The degree to which pidgins (and subsequent creoles) show lexical mixture has been underestimated in the literature.
Noteworthy are Russenorsk (with roughly equal quantities of
Norwegian and Russian words, but also 14% of its vocabulary
drawn from other languages; Broch and Jahr 1984) and Chinook
Jargon (only 41% from Chinook, with at least 11 other languages,
European and non-European, contributing to the remainder;
Gibbs 1863). The baragouin that preceded the formation of the
Lesser Antillean French Creoles showed a similar mixture (Wylie
1995).
Evidence from creoles suggests that the pidgins they evolved
from had equally mixed vocabularies. Berbice Dutch draws 27
percent of its vocabulary from one African language, Ijaw (Smith,
Robertson, and Williamson 1987). Saramaccan may have as
many as 50 percent African words (Price 1976). Comparison of
Saramaccan and Sranan vocabularies shows that these creoles,
both derived from the same pidgin, differ in perhaps as many as
75 percent of their vocabulary items; contra most sources, relatively few of these differences involve a Portuguese/English contrast, strongly suggesting an antecedent macaronic pidgin that
drew on English, Dutch, and Portuguese, as well as a variety of
African and Amerindian languages, and from which Sranan and
Saramaccan each made a different selection.
Why pidgins have so often been regarded as simplifications of particular (almost invariably, European) languages is
revealed by the massive database of contemporary citations
gathered by Roberts (summarized in Roberts 1995, 1998, 2005,
but not yet published in its entirety). From this data, it is clear
that pidgin descriptions have been shaped by observer bias.
Most citations from English-language sources contain a preponderance of English words, showing why J. E. Reinecke (1969)
and others characterized the lingua franca of early Hawaii as a
predominantly English pidgin with a sprinkling of Hawaiian
words. However, the abundant Hawaiian-language sources
reverse this picture, presenting a predominantly Hawaiian
vocabulary with a sprinkling of English words, while the much
sparser Japanese-, Chinese-, and Portuguese-language sources
each contain a higher admixture of their own languages (brief
sentences containing words from three different languages are
by no means uncommon). Clearly, in a pidgin situation, observers record what they best understand and downplay or ignore
the rest.
The well-attested existence of a pidgin phase in the life cycle
of creoles also helps to explain the strong structural similarities that hold between creoles of widely different provenance.
For such similarities to arise, input had first to be reduced to an
abnormally low level of structure, forcing children to draw on
their innate language faculty for the systematic structures that a
pidgin can manage without, but that are essential for any natural
language.
Derek Bickerton
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Baker, P., and C. Corne. 1982. Isle de France Creole. Ann Arbor,
MI: Karoma.
Baker, P., and P. Muhlhausler. 1990. From business to pidgin. Journal
of Asian Pacific Communication 1: 87115.

Bakker, P. 1995. Pidgins. In Pidgins and Creoles: An Introduction, ed. J.


Arends, P. Muysken, and N. Smith, 2539. Amsterdam: Benjamins.
Bickerton, D. 1981. Roots of Language. Ann Arbor, MI: Karoma.
Broch, I., and E. H. Jahr. 1984. Russenorsk: A new look at the
Russo-Norwegian pidgin in northern Norway. In Scandinavian
Language Contacts, ed. P. S. Ureland and I. Clarkson, 2164.
Cambridge: Cambridge University Press.
Gibbs, G. 1863. A Dictionary of the Chinook Trade Jargon or Trade
Language of Oregon. New York: Cramoisy.
Manessy, G. 1995. Créoles, Pidgins, Variétés Véhiculaires. Paris: CNRS
Editions.
Price, R. 1976. The Guiana Maroons: A Historical and Bibliographical
Introduction. Baltimore: Johns Hopkins University Press.
Reinecke, J. E. 1969. Language and Dialect in Hawaii. Honolulu: University
of Hawaii Press.
Roberts, S. J. 1995. Pidgin Hawaiian: A sociohistorical study. Journal of
Pidgin and Creole Languages 10: 1–56.
. 1998. The role of diffusion in the genesis of Hawaiian Creole.
Language 74: 1–39.
. 2005. The Emergence of Hawaii Creole English in the Early 20th
Century: The Sociohistorical Context of Creole Genesis. Ph.D. diss.,
Stanford University.
Smith, N., I. Robertson, and K. Williamson. 1987. The Ijaw element in
Berbice Dutch. Language in Society 16: 49–90.
van den Berg, Margot. 2000. Mi no sal tron tongo: Early Sranan in Court
Records, 1667–1767. Unpublished Master's thesis, Radboud University
Nijmegen.
Wiley, J. 1995. The origin of Lesser Antillean French Creole: Some literary and lexical evidence. Journal of Pidgin and Creole Languages
10: 71–126.

PITCH
When an object vibrates, its movement produces changes in
air pressure that radiate like waves from the source. If the frequencies of the vibrations are roughly between 20 and 20,000
cycles per second, or Hertz (Hz), ideally they can be heard by a
young, healthy human listener. The range of hearing frequencies declines with age. The physical characteristic of the vibrating body, frequency, produces a psychological experience called
pitch. In general, a low frequency produces the sensation of a low
pitch (for example, the 60 Hz hum produced by electrical power
in a poorly grounded radio), with the pitch increasing as the frequency increases (a male voice at 100 Hz, a female voice at 200
Hz, a child's voice at 300 Hz). Because there is a close correspondence between frequency and pitch, people frequently use
the terms interchangeably. However, in addition to frequency,
the sensation of pitch is influenced by an interaction between
the amplitude of the vibration and the range of the frequency.
Pitch is also influenced by the complexity of the vibration and its
corresponding wave form.
A vibrating body oscillates as a single entity, producing a frequency referred to as the fundamental frequency. So, when the
key for A above middle C is played on a piano, a string vibrates
at 440 Hz. Vibrating bodies are not perfectly rigid, though, and
the string also vibrates in parts as if it is two strings (producing
a frequency of 880 Hz), and three strings (1320 Hz), etc. Thus, a
vibrating body produces a series of frequencies beginning with
the fundamental frequency (f0) and including its harmonics,
which are multiples of the f0. The distribution of acoustic energy
across the harmonic series contributes to the quality or timbre
of the sound. In addition to sound quality, the harmonics contribute significantly to the perception of pitch. The fundamental
frequency of a harmonic series can be artificially removed without changing the pitch, a demonstration referred to as the missing fundamental. In speech, the harmonic series is a function of
the complex wave produced by the glottal source. Formants are
bands of resonance that concentrate the acoustic energy produced by the glottal source as a function of the vocal tract configuration and have center frequencies that reflect the vocal tract,
rather than the harmonic series.
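The arithmetic of the harmonic series described above can be sketched in a few lines of Python. The 440 Hz example comes from the text; the function name `harmonic_series` is our own illustration, not a standard API.

```python
# A minimal sketch of the harmonic series: a string sounding A above
# middle C has a fundamental frequency (f0) of 440 Hz, and its
# harmonics are integer multiples of f0.

def harmonic_series(f0: float, n: int) -> list[float]:
    """Return the first n partials of a complex tone: f0, 2*f0, ..., n*f0."""
    return [k * f0 for k in range(1, n + 1)]

series = harmonic_series(440.0, 4)
print(series)  # [440.0, 880.0, 1320.0, 1760.0]

# The "missing fundamental": even with the 440 Hz component removed,
# the surviving harmonics remain spaced exactly f0 apart, which is one
# way to see why the perceived pitch does not change.
upper = series[1:]
spacing = {round(b - a, 1) for a, b in zip(upper, upper[1:])}
print(spacing)  # {440.0}
```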
Pitch can be experienced from pure tones (f0 alone), which
do not occur naturally, as well as from complex tones (f0 + harmonic series), but the pitch of complex tones is a stronger percept, allowing finer discriminations of f0 frequency differences.
The processing of pitch from a determination of the f0 versus
the pattern recognition of a harmonic series relies on different neurological systems. Simple frequency determination can
occur at multiple levels of the nervous system, but complex pitch
processing occurs in auditory areas in the right cerebral hemisphere that complement speech and language areas in the left
cerebral hemisphere (Sidtis 1980; see also right hemisphere
language processing and left hemisphere language
processing).
For practical purposes, pitch in language can be viewed as a
direct function of f0. Unlike the pitch distinctions made in music,
linguistic pitch distinctions are comparatively coarse. Whereas
a musical octave can be divided into 12 semitones, and vibrato
in experienced singers can be consistently less than a semitone,
most linguistic communicative situations only require distinctions of three semitones or more. Further, the pitch distinctions
in language are relative, allowing men, women, and children to
make the same linguistic and paralinguistic distinctions despite
different vocal f0s (see paralanguage), whereas pitch distinctions in music reference specific frequencies (e.g., a musical
scale tuned to 440 Hz).
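The semitone arithmetic behind this comparison can be made concrete: an octave is a doubling of frequency divided into 12 equal semitones, so one semitone corresponds to a frequency ratio of 2^(1/12). The sketch below is our own illustration (the function name is not from the text), using the male, female, and child f0 values mentioned above.

```python
# Semitones in equal temperament: one semitone is a frequency ratio
# of 2 ** (1 / 12), about a 6% step in frequency.

SEMITONE = 2 ** (1 / 12)  # ~1.0595

def shift_by_semitones(f: float, n: float) -> float:
    """Frequency reached by moving n semitones up (or down, if n < 0)."""
    return f * SEMITONE ** n

# The three-semitone distinction mentioned above is roughly a 19% change
# in frequency, and it is the same *relative* step for any starting f0,
# which is why different voices can make the same linguistic distinctions.
for f0 in (100.0, 200.0, 300.0):  # the male, female, and child f0s above
    print(f0, "->", round(shift_by_semitones(f0, 3), 1))
# prints: 100.0 -> 118.9, then 200.0 -> 237.8, then 300.0 -> 356.8
```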
A number of linguistic and paralinguistic phenomena are
provided by pitch. At the suprasegmental level, pitch produces
the melodic line of an utterance to convey linguistic intonation (e.g., declination effect: falling pitch anticipating the end
of a statement, rising pitch indicating a question), sociolinguistic information (e.g., uptalk, rising pitch at the end of a
statement, falling pitch as a cue for turn-taking), and paralinguistic information (e.g., emotion, attitude). Pitch can also be
used with loudness to provide syllable accent at the segmental level.
Pitch also has a lexical role in tone and pitch accent languages. Tone languages may have many tone patterns (estimates vary, but the numbers are fewer than the 12 notes in the
musical octave), and they tend to fall in relative categories like
high, medium, and low, further distinguished by rising and falling patterns. Because such distinctions are relative, the listener is
required to perform a tone normalization to identify a speaker's
lexical tones. Just as simple and complex pitch perception rely
on different brain mechanisms, the processing of pitch for linguistic and nonlinguistic purposes engages different neurological
systems, principally the temporal lobes in the left and right
cerebral hemispheres (Van Lancker and Fromkin 1973).

In sum, the perception of pitch can play linguistic and paralinguistic roles at the suprasegmental and segmental levels of
utterances. Pitch is closely related to the physical stimulus frequency, but as a psychological event, it is influenced by the complexity, frequency range, and loudness of the tone. Pitch can be
processed in a low-resolution mode at many levels in the nervous system or at high-resolution mode in specialized areas of
the cerebral cortex of the brain in the right temporal lobe. Pitch
can also be processed in linguistic and nonlinguistic modes by
the left and right temporal lobes of the brain, respectively. The
variation of pitch during fluent speech can be considered a truly
integrative process that conveys both linguistic and paralinguistic information.
John Sidtis
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Crystal, David. 1987. The Cambridge Encyclopedia of Language. New
York: Cambridge University Press. Presents multiple brief discussions
of pitch in different linguistic roles.
Sidtis, John J. 1980. On the nature of the cortical function underlying right
hemisphere auditory perception. Neuropsychologia 18.3: 321–30.
Sidtis, John J., and D. Van Lancker Sidtis. 2003. A neurobehavioral approach to dysprosody. Seminars in Speech and Language
24.2: 93–105. Describes how different aspects of prosody rely on different brain structures.
Van Lancker, Diana, and V. A. Fromkin. 1973. Hemispheric specialization for pitch and tone: Evidence from Thai. Journal of Phonetics
1: 101–9.

POETIC FORM, UNIVERSALS OF


A poetic universal is manifested by a feature that is found very
widely (for example, rhyme; see rhyme and assonance) or by
a relation between features that is found very widely (for example, rhyme is generally found in verse, not in prose). Like linguistic universals, poetic universals might be studied by comparative
work (often depending on fairly salient features) or by focused
work on the abstract forms (hypothesized abstract universals)
underlying the surface poetic forms in a particular language.
A theory of universals can be formulated in terms of universal
parameters (sets of related formal options) from which a specific
poetic tradition makes specific choices. Unlike a language, where
only one choice can be made, a literary tradition can divide into
subtraditions, each making a different choice (thus, for example,
classical Sanskrit literature includes quite different kinds of metrical verse). There is no presupposition that different modalities
will throw up significantly different universals; thus, the general assumption is that written, oral, and signed literatures will
have similar characteristics (see oral composition and sign
languages). The term literature is here used interchangeably
with verbal art and should not be used to imply a special status
for written literature.
With the exception of folklore studies (which, however, tend
to have a narrow areal range), no discipline or subdiscipline
takes as its responsibility the investigation of poetic universals.
Some researchers are actively hostile to universals in favor of
an alternative emphasis on the special characteristics of each
tradition, and some fieldworkers ignore verbal arts when they
gather information about a language; some missionary linguists
have even been known to displace indigenous verbal arts with
hymns or Bible stories, with the odd result that it is these, rather
than indigenous texts, that are gathered in grammars and other
reports. Thus, there has been relatively little work, either descriptive or theoretical, on poetic universals.
The universal of poetic form that is most widely manifested,
and may indeed be found everywhere, is the possibility of verse,
as a way of organizing language. A text that is in verse is a text
cut into a sequence of lines (= verse lines). A line is a section of
text that supports two or more generalizations, and the investigation of universals of poetic form is largely an investigation of the
generalizations formulated in terms of the line.
For example, Milton's poem Paradise Lost is in lines, and
so is verse. (This example is chosen [see Fabb 2002] because
in the eighteenth century some critics claimed that it was not
verse.) Here are five generalizations that are supported by the
line in this poem: 1) There are 10 syllables in each line; 2) the
end of the line coincides with the end of a word; 3) while there
is a tendency for stressed syllables to be in even-numbered
positions, the line-initial position is often also occupied by a
stressed syllable; 4) the word "of" is found with greater than
expected frequency as the first word in a line, as seen, for example, in the first two lines of the poem; and 5) the printed form of
the text arrays lines in a vertical sequence. The study of poetic
universals seeks to establish the distribution of each of these
kinds of form (i.e., each of these generalizations) and then to
understand whether these distributions imply anything about
poetic universals.
What would the study of poetic universals make of these
generalizations? Generalization 1 can be understood more
abstractly as "the line contains a specific number of syllables,"
and this is definitely a universal, in the sense that it is true of
many verse traditions; however, we might also ask whether the
fact that there are specifically 10 syllables in the line also constitutes a (more narrowly manifested) universal. Generalization
2 is also very widespread; while lineation does not necessarily
respect phrase or sentence boundaries, word boundaries
are usually respected (and this connects with the fact discussed later that metrical rules control for word boundaries
but not for phrase or sentence boundaries); hence, there is an
interesting potentially universal relation between the line that
is a nonlinguistic section of text and the word that is a linguistic constituent. Generalization 3 in its specific formulation is
generally true of English, but more abstractly the possibility of
relaxing a rule at the beginning of a line is found widely; for
example, Greek verse lines often begin with a syllable whose
weight is uncontrolled (anceps). Generalization 4 holds true
also of eighteenth- and nineteenth-century verse after Milton;
is it telling us something significant about "of" or about the naturalness of beginning a line with a prepositional phrase? On the
one hand, we might say that post-Miltonic verse is just imitating a kind of form that Milton may have invented, but we might
also note that much twentieth-century free verse also favors
preposition-initial lines. It is worth noting that the Greek early
elegiac poets tended to begin and end lines mainly with words
used previously by Homer, suggesting again that choice of particular words at line edges has the potential to be a universal.

This generalization also draws attention to the fact that literary


practices can have features in common because of imitation
of an admired writer or foreign tradition. Finally, generalization 5 is not true of all ways of writing verse, but we might ask
whether the wide acceptance of this practice tells us something about the cognitive status of lines (e.g., that we cognize
each line as a separate, isolated unit). These are the kinds of
questions we might ask in exploring the possibility of universals of poetic form.
Relative to the line, six categories of poetic form might be identified, which follow; there may be others (such as the tendency
to use specific words at line edges), and the grouping in this list
depends on theoretical assumptions and is not simply given to
us by the data. For each of the categories, we might explore its
status as a universal. None of these kinds of form is required in a
verse tradition. Most verse traditions are either metrical (i.e., they
involve the counting of syllables) or parallelistic, which is itself
an interesting universal. Either metrical or parallelistic verse can
also have rhyme and alliteration, though rhythm and word
boundary rules are usually found only in metrical verse. There
are some verse traditions, such as modern free verse, that do not
consistently manifest any of these categories of poetic form (but,
as noted, they may manifest other categories of poetic form, such
as the tendency to use particular words or particular syntactic
structures at line edges).
1. The counting of syllables: In all metrical verse, the
line has a specific number of syllables (Irish deibhidhe has 7,
Icelandic dróttkvætt has 6) or a defined range of possible numbers of syllables (English iambic pentameter is normatively 10
but permits 9–11, French alexandrin 12–13, Homeric dactylic
hexameter 13–17, Japanese haiku 3–5 or 4–7 in different lines,
etc.). Some theories of meter suggest that units other than syllables can also be counted, such as morae (subsyllabic units)
or larger groupings of syllables. Across literary traditions, we
find that not all syllables in the line are counted for metrical
purposes; in particular, when a vowel-final syllable precedes a
vowel-initial syllable, many traditions permit or require these
to count as a single metrical syllable. Various other generalizations can be made about the counting of syllables, which may
be the source of universals (this is the basic claim of Fabb and
Halle 2008).
2. The patterning of syllables, requiring a division of syllables into two classes for metrical purposes: Accentual rhythms
manifest this type of patterning, where syllables are distinguished into two classes as stressed versus unstressed and patterned on this basis, for example, into triplets where every third
syllable is stressed. More generally, most kinds of metrical verse
divide syllables into two classes, on the basis of stress, syllable
weight, lexical tone, or whether they alliterate (and on possibly
other characteristics yet undiscovered); the class membership
of a syllable then admits it to specific positions within the verse
line. In some cases, the distribution of the two types of syllable is
periodic (e.g., a regular recurrence as in an iambic rhythm) and
in other cases partially periodic or apparently nonperiodic (as
in the superficially aperiodic sequences of heavy and light syllables required in Classical Sanskrit verse). An interesting rhythmic universal is that syllables are divided into just two classes
for metrical purposes, even when there would be a basis in the
language for more than two classes. For example, Vietnamese
has six types of lexical tone, which are grouped into just two
tonal classes for the purposes of metrical regulation. It has been
claimed that another rhythmic universal is based on the (optimality-theoretic) phonological notion of the moraic
trochee as a basic rhythmic unit (see, for example, Golston and
Riad 2005). Robbins Burling (1966) claimed that a certain combination of meter and rhythm is found universally in children's
verse.
3. Word-boundary rules: In the metrical line, two adjacent
syllables can be required to be in separate words, that is, a
word boundary must intervene (by a caesura rule); or they can
be required to be in the same word, that is, a word boundary
must not intervene (by a bridge rule). Thus, for example, the
sixth syllable in a French 12-syllable alexandrin must be word-final. Word-boundary rules are widespread, and this suggests
an underlying universal. In particular, the word seems to
have a special status in meter: Metrical rules do not control
for phrase or sentence boundaries, and this also points to a
universal.
4. Rhyme understood as the repetition of the end of the
syllable (usually including its nucleus): Rhyme is very widespread, including in nonmetrical verse: Parallelistic verse can
have rhyme, and we may even find rhyme of a kind in prose.
Rhyme would seem to manifest a universal. Furthermore, it is
cross-linguistically true that sound sequences can be counted
as rhyme that are phonetically dissimilar but share underlying
similarities; this possibility, and perhaps the way in which dissimilar phonetic sequences are admitted as equivalent, may
manifest universals.
5. Alliteration: Understood as the repetition of the beginning
of the syllable (sometimes including its nucleus), alliteration is
much rarer than rhyme, which may itself tell us something about
poetic universals. The fact that words beginning with dissimilar
vowels are considered to alliterate in separate traditions (e.g.,
Old English and Somali) may suggest a universal. Alliteration
also appears to be subject to locality constraints that do not hold
for rhyme; thus, alliteration tends to be line-internal or between
adjacent lines and does not interlace as rhyme does in ABAB
structures (Fabb 1999).
6. Parallelism: This formal property is very widespread in the
literatures of the world, and Roman Jakobson (1960) thought of
it as a defining formal characteristic of poetry because it draws
attention to form by repeating it (he included meter, rhythm,
rhyme, and alliteration as types of parallelism). There are different kinds of parallelism, all quite widely distributed, including parallelism of sound sequences, parallelism of words, and
parallelism of syntactic structures. Universals have yet to be
established.
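The elision convention described under category 1, where a vowel-final syllable before a vowel-initial syllable counts as a single metrical syllable, amounts to a simple counting procedure. The sketch below is an illustrative toy under the assumption that syllabification is already given; it is not a rendering of Fabb and Halle's (2008) formalism, and the names are our own.

```python
# Counting "metrical syllables" in a line: a vowel-final syllable
# followed by a vowel-initial syllable merges into one metrical
# syllable, as many traditions permit or require.

VOWELS = set("aeiou")

def metrical_count(syllables: list[str]) -> int:
    """Count syllables, merging each vowel-final + vowel-initial pair."""
    count = 0
    i = 0
    while i < len(syllables):
        count += 1
        # A vowel-final syllable before a vowel-initial one counts once.
        if (i + 1 < len(syllables)
                and syllables[i][-1] in VOWELS
                and syllables[i + 1][0] in VOWELS):
            i += 2
        else:
            i += 1
    return count

print(metrical_count(["so", "and"]))     # merger applies -> 1
print(metrical_count(["dark", "ness"]))  # no merger -> 2
```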
Lines may be organized into larger units, such as stanzas. The
possibility of organizing lines into stanzas is sufficiently widespread as to count as a universal. Stanzas have characteristics,
such as having a specific number of lines, lines of the same or
varying lengths, or rhyme. While there is clearly much variety,
universals may be discovered, perhaps involving the way lines
are counted in a stanza, or the possible ways in which rhyme
patterns can be structured. For example, there may be locality
effects, such as limits on the possible distance (e.g., number of
intervening lines) between related elements in different lines.
(Bruce Hayes and Margaret MacEachern 1998 discuss universals
in stanza structure.)
Are there any kinds of poetic form that are unrelated to the
line? In part, this is a matter of definition (i.e., of whether we
intend poetic to mean verse). Clearly, various types of figure
and trope are widely found in the worlds literatures, and not only
in verse (though these may be better understood as linguistic or
pragmatic universals, rather than poetic universals; see pragmatics, universals in). And there are possible universals of
narrative form (see narrative universals) that might also be
thought of as poetic, some of which may in fact be related to
universals of verse form. It is possible that there are universals
that relate verbal art to counting (perhaps via the aesthetic,
and perhaps extending beyond verbal art). Metricality is based
on counting, as are the kinds of form closely related to metricality, such as rhythm and word-boundary placement. Parallelism
may be based on counting of a different kind (a tally or one-to-one alignment). Narratives seem to involve counting at various
levels, including Dell Hymes's (1992) suggestion that narratives
are structured around pattern numbers, with narrative units
organized in two and four or in three and five in a particular
tradition.
What is the relation between poetic form and linguistic
form? One widely held view (associated, for example, with
Jakobson) is that the forms of poetry in a particular language
are dependent on the linguistic form of that language; the fixing
of a choice from a poetic parameter is thus dependent on the
fixing of a choice from a linguistic parameter. Thus, for example, the claim might be that some languages are better suited to
quantitative meters (where the distinction between heavy and
light syllables is criterial) and others better suited to accentual
meters (where the distinction is instead between stressed and
unstressed syllables); English, for one, has successful accentual
verse, but neither nonaccentual syllable counting nor quantitative meters have taken hold in the poetic tradition, despite
attempts to introduce them. Kristin Hanson and Paul Kiparsky
(1996) propose a theory of poetic universals that has a parameter offering a range of differently sized phonological units that
can match metrical positions; in a specific tradition, a specific
size of phonological unit matches the metrical position. Thus,
for example, Chinese and Japanese verse both have five- and
seven-unit lines, whose positions are filled by syllables in the
former and subsyllabic morae in the latter. An alternative position is taken by Nigel Fabb and Morris Halle (2008), who argue
that poetic form and linguistic form have systemic subcomponents in common, including some parameters (and indeed
share subcomponents also with music); however, there is no
necessary relation between the poetic form and the linguistic
form of a particular language.
Is verbal art itself a universal? That is, is there any single way
in which it can be characterized, and distinguished from general verbal behavior? The most commonly given answer to this
question is yes: that verbal art is distinguished from verbal
behavior because it draws attention to its own form. This is the
basis of Jakobson's (1960) projection principle, of Nelson
Goodman's (1978) notion of style as exemplified by a text, or
of Richard Bauman's (1984) notion of verbal art as a text that is
fully performed. All of these answers assume that the question
is not about the work itself, which cannot categorically be said
to be either verbal art (literature) or not verbal art (not literature); instead, works are more or less verbal art to the extent that
they carry the distinguishing characteristics of verbal art (or as
Goodman would say, works may carry symptoms of verbal art).
Being verbal art is thus a matter of degree. All of these answers
also imply that verbal art should be universal and that all users of
language should be able to have a literature because literature is
just a particular and always-possible way of using language.
Nigel Fabb
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Bauman, Richard. 1984. Verbal Art as Performance. Prospect Heights,
IL: Waveland.
Burling, Robbins. 1966. The metrics of children's verse: A cross-linguistic study. American Anthropologist 68: 1418–41.
Edmonson, Murray S. 1971. Lore: An Introduction to the Science of Folklore
and Literature. New York: Holt, Rinehart, and Winston. A survey of the
worlds literatures, with an interest in universals.
Fabb, Nigel. 1997. Linguistics and Literature: Language in the Verbal Arts
of the World. Oxford: Blackwell. A survey of linguistic work on literature, drawing out potential universals.
. 1999. Verse constituency and the locality of alliteration. Lingua
108: 223–45.
. 2002. Language and Literary Structure: The Linguistic Analysis of
Form in Verse and Narrative. Cambridge: Cambridge University Press.
Fabb, Nigel, and Morris Halle. 2008. Meter in Poetry: A New Theory.
Cambridge: Cambridge University Press. The first comprehensive theory of the worlds meters.
Golston, Chris, and Tomas Riad. 2005. The phonology of Greek lyric
meter. Journal of Linguistics 41: 77–115.
Goodman, Nelson. 1978. Ways of Worldmaking. Indianapolis: Hackett.
Hanson, Kristin, and Paul Kiparsky. 1996. A parametric theory of poetic
meter. Language 72: 287–335.
Hayes, Bruce, and Margaret MacEachern. 1998. Quatrain form in English
folk verse. Language 74: 473–507.
Hymes, Dell. 1992. Use all there is to use. In On the Translation of
Native American Literatures, ed. B. Swann, 83–124. Washington,
DC: Smithsonian Institution Press.
Jakobson, Roman. 1960. Linguistics and poetics. In Style in Language,
ed. T. Sebeok, 350–77. Cambridge, MA: MIT Press.
Preminger, A., and T. V. F. Brogan, eds. 1993. The New Princeton
Encyclopedia of Poetry and Poetics. Princeton, NJ: Princeton University
Press. The best source of information about the worlds poetic
traditions.

POETIC LANGUAGE, NEUROBIOLOGY OF


Not unlike the elephant approached by a delegation of blind
men, each of whom investigated a body part seemingly unrelated to the others, the neurobiology of poetic language has
been approached from such widely varying perspectives that
the results hardly seem to share tusks and a tail. One can peer
into ancient poetry in search of evidence that consciousness has
changed over time, or explore cross-cultural poetics for clues
to common neural processing mechanisms. One can map the
regions of the brain involved in the processing of poetic devices,
or pursue the question of how poets and nonpoets may differ in
their neural functioning.

In The Origins of Consciousness in the Breakdown of the
Bicameral Mind ([1978] 1990), psychologist Julian Jaynes
claimed that Homer's Iliad and the oldest books of the Hebrew
Old Testament portrayed human beings in a twilight state of
awareness. According to Jaynes, the characters in the world's
most ancient poetry take action not as the result of personal
thought and conscious decision but because they hear the voice
of a god ordering them to do so. He hypothesized that the voices
of the gods were actually auditory hallucinations produced in the
brain's right temporal lobe. Transmitted to the left temporal
lobe, seat of left hemisphere language processing, they
were perceived as coming from outside the self. Pointing to the
metered verse spoken by Greek oracles, the language of Hebrew
prophets, and the god-dictated Vedas of India as evidence of
the link between poetry and god-speech, Jaynes asserted that
the god-voices spoke in verse. Beginning around 1000 b.c.e., he
believed, the discovery and spread of writing brought about a
breakdown in the functioning of the bicameral mind, although
the auditory hallucinations of modern-day schizophrenics furnish evidence that contemporary consciousness can revert to its
earlier state. While the book was a finalist for the National Book
Award, his theory has generated controversy.
Homer's dactylic hexameter verse line is among those
surveyed by literary scholar Frederick Turner and psychophysicist Ernst Pöppel in their essay, "The neural lyre: Poetic meter,
the brain, and time" (1989). Comparing the metrical verse line
lengths of various language cultures, Turner and Pöppel found
that almost all of the lines took two to four seconds to recite,
with distribution peaking in the range of 2.5 to 3.5 seconds. The
authors suggested that their findings might reflect a constant in
human neural processing: a human present moment or information buffer averaging about three seconds in length, subject
to variation due to cultural factors. Literary critics have targeted
the essay's biological reductionism and its underlying politics,
as the authors view free verse as an historical anomaly compatible with bureaucratic or even totalitarian modes of cognition. To
date, their thesis has not been subjected to empirical scientific
testing.
Initial data on the neurobiology of poetic language came from
studies of subjects who had sustained brain damage or undergone commissurotomy, surgical severing of the corpus callosum. Those findings suggested that comprehension of many
poetic devices involved right hemisphere language processing, even though the left hemisphere was known to control
language in most persons: Verbal intelligence tests of the isolated
left hemispheres of commissurotomy subjects fell in the normal
range, while subjects experienced aphasia after left (but rarely
right) hemisphere damage. However, over time, tests of right-hemisphere-damaged (RHD) subjects revealed subtle linguistic
deficits in comprehending poetic devices such as metaphor or
connotation, while other studies showed that the isolated right
hemisphere recognized certain concrete nouns (i.e., images),
vowel sounds (i.e., assonance), and emotional prosody in spoken or written language, all important for understanding poetry
(Kane 2004).
For example, Ellen Winner and Howard Gardner (1977) had
left-hemisphere-damaged (LHD), RHD, and control subjects
match a spoken expression such as "He has a heavy heart" to one
of four pictures, with the correct response being metaphoric. To
their surprise, RHD patients performed poorly, often selecting
the literal match (for example, an illustration of someone carrying a giant heart). Similar results were obtained from metaphoric
word-matching studies. Then G. Bottini and colleagues (1994)
used PET (positron emission tomography; see neuroimaging) to scan normal brains processing literal and metaphoric
sentences; blood flow (signaling brain activation) increased in
six regions of the RH when metaphoric but not literal sentences
were being processed. The right hemispheres role in controlling
metaphor seemed obvious. Or was it?
As advances in technology have made fMRI (functional magnetic resonance imaging) studies of normal linguistic processing
possible, the results have raised as well as answered questions.
It is now known that conventional or frozen metaphors are
processed much like ordinary denotative language, primarily
in the left hemisphere, whereas novel metaphors as well as
ironies and the literal meanings of idioms light up additional
regions of the right hemisphere (Giora et al. 2000; Mashal, Faust,
and Hendler 2005; Sotillo et al. 2005; Eviatar and Just 2006; Faust
and Mashal 2007). Thanks to fMRI, the precise brain regions
involved in novel metaphoric processing can be pinpointed: the
right homologue of wernicke's area, right and left premotor
areas, right and left insula, and broca's area (Mashal, Faust,
and Hendler 2005). Of course, novel and not conventional metaphors are the stuff of poetic language, unless one's definition
of poetry extends to greeting card verse, and so the role of the
right hemisphere remains significant. It was at first assumed that
the right hemispheres increased involvement in novel metaphoric processing corresponded to visuospatial processing of
evoked imagery, whereas conventional metaphors were unlikely
to evoke pictures in the mind. However, Rachel Giora's graded
salience hypothesis (1997; Giora et al. 2000), which assumes that
the most common or salient meaning of an expression is processed first, regardless of whether it is literal or metaphoric, and
that right hemisphere language processing regions get recruited
only when secondary meanings must be accessed, provides an
alternate explanation.
Concrete nouns are the building blocks of poetic images, and
preliminary studies of commissurotomy patients led by Michael
Gazzaniga showed that the isolated right hemisphere was capable of recognizing simple nouns. Subsequent tests of normal subjects, isolating either the right or left visual field, suggested that
the left hemisphere excelled at processing abstract nouns and
low-imagery nouns, adjectives, and verbs, while the right performed as well as the left in processing high-imagery nouns and
adjectives. Once again, neural-imaging studies have revealed a
more nuanced model than the simple association of left with
words and right with pictures (Kiehl et al. 1999). Marcel Just
and his colleagues (1996) and Jean-François Demonet, Guillaume
Thierry, and Dominique Cardebat (2005) suggest that as cognitive
processing increases in complexity, right-hemispheric regions
get recruited to handle the additional demand. That hypothesis
does not necessarily conflict with behavioral data showing the
right hemisphere to be poor at processing abstractions but good
at processing concrete nouns on its own.
Studies of connotation, another essential element of poetic
language, have followed a similar trajectory from brain-damaged to normal subjects, and from behavioral tests to technology-assisted observations. In the 1970s and 1980s, RHD subjects performed poorly on connotative word meaning tests, while LHD
subjects experienced problems with denotation (Gardner and
Denes 1973; Brownell, Potter, and Michelow 1984; Drews 1987).
One might have assumed that the left hemisphere processed
denotation and the right, connotation, but over time, a more
complex picture emerged. Christine Chiarello and others established, using visual-field testing, that primary and subordinate
word meanings are initially activated in both hemispheres, but
that subordinate meanings are quickly suppressed in the left
hemisphere, resulting in a more efficient processing time for the
dominant meaning, not unlike Giora's graded salience model,
where the most salient meaning of an expression, metaphoric or
not, gets processed first and faster than a less commonly occurring meaning (Chiarello and Maxfield 1995).
Finally, the neurobiology of poets may play a significant role
in the neurobiology of poetic language. Poets are known to suffer
from affective disorders, in particular hypomania and bipolar
illness, at rates far exceeding those of the general population
or other categories of writers (Andreasen 1987; Jamison 1989;
Ludwig 1994; Post 1996). Feeling negative emotion strongly,
being introspective, and spending time alone are traits associated with expressive writing as well as mental dysfunction, and
mentally ill persons may feel drawn to express their anguish in
writing; James Kaufman and John Baer (2002) propose these
and other behavioral explanations for the poetry/affective disorder connection. Taking a neurobiological approach, Felix
Post (1996) suggests that the intensive intellectual and emotional effort involved in writing poetry may trigger overactivation
of neural networks and, thus, cause mental illness. Julie Kane
(2004) suggests the opposite, that overactivation may precede
poetic output: Pointing to substantial evidence that handedness and dominance for language can shift temporarily from the
left to right hemisphere during manic episodes, she proposes
that abnormal mood elevation may activate right-brain regions
involved in processing poetic language. Recently, too, Dawn
Blasko and Victoria Kazmerski (2006) have shown that poets and
nonpoets differ in the brain regions that they activate while reading poems.
There is a vast amount of territory yet to be covered in exploring the elephant of poetic language, complicated by the fact
that new research findings often seem to challenge the old. But
as neuroimaging techniques become more precise and less invasive, illuminating features that could only be guessed at before,
one thing becomes increasingly clear: The neurobiology of poetic
language is not the same animal as the neurobiology of ordinary
language.
Julie Kane
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Andreasen, Nancy. 1987. Creativity and mental illness: Prevalence
rates in writers and their first-degree relatives. American Journal of
Psychiatry 144: 1288–92.
Blasko, Dawn, and Victoria Kazmerski. 2006. ERP correlates of individual
differences in the comprehension of nonliteral language. Metaphor
and Symbol 21.4: 267–84.

Bottini, G., R. Corcoran, R. Sterzi, E. Paulesu, P. Schenone, P. Scarpa, R.
Frackowiak, and C. Frith. 1994. The role of the right hemisphere in
the interpretation of figurative aspects of language: A positron emission tomography study. Brain 117: 1241–53.
Brownell, Hiram, Heather Potter, and Diane Michelow. 1984. Sensitivity
to lexical denotation and connotation in brain-damaged patients: A
double dissociation. Brain and Language 22: 253–65.
Chiarello, Christine, and Lisa Maxfield. 1995. Initial right hemispheric
activation of subordinate word meanings is not due to homotopic callosal inhibition. Psychonomic Bulletin and Review 2: 375–80.
Demonet, Jean-François, Guillaume Thierry, and Dominique Cardebat.
2005. Renewal of the neurophysiology of language: Functional neuroimaging. Physiological Reviews 85: 49–95.
Drews, Etta. 1987. Qualitatively different organizational structures of
lexical knowledge in the left and right hemispheres. Neuropsychologia
25: 419–27.
Eviatar, Zohar, and Marcel Just. 2006. Brain correlates of discourse processing: An fMRI investigation of irony and conventional metaphor
comprehension. Neuropsychologia 44.12: 2348–59.
Faust, Miriam, and Nira Mashal. 2007. The role of the right cerebral hemisphere in processing novel metaphoric expressions
taken from poetry: A divided visual field study. Neuropsychologia
45.4: 860–70.
Gardner, Howard, and Gianfranco Denes. 1973. Connotative judgements by aphasic patients on a pictorial adaptation of the semantic
differential. Cortex 9: 183–96.
Giora, Rachel. 1997. Understanding figurative and literal language: The
graded salience hypothesis. Cognitive Linguistics 7.1: 183–206.
Giora, Rachel, ed. 2007. Is metaphor unique? Neural correlates of nonliteral language. Brain and Language 100.2 (Special Issue).
Giora, Rachel, Ofer Fein, Ann Kronrod, Idit Elnatar, Noa Shuval, and
Adi Zur. 2004. Weapons of mass distraction: Optimal innovation and
pleasure ratings. Metaphor and Symbol 19: 115–41.
Giora, Rachel, Eran Zaidel, Nachum Soroker, Gila Batori, and Asa
Kasher. 2000. Differential effects of right- and left-hemisphere damage on understanding sarcasm and metaphor. Metaphor and Symbol
15: 63–83.
Jamison, Kay Redfield. 1989. Mood disorders and patterns of creativity
in British writers and artists. Psychiatry 52: 125–34.
Jaynes, Julian. [1978] 1990. The Origin of Consciousness in the Breakdown
of the Bicameral Mind. 2d ed. Boston: Houghton.
Just, Marcel, Patricia Carpenter, Timothy Keller, William Eddy, and Keith
Thulborn. 1996. Brain activation modulated by sentence comprehension. Science 274.5284: 114–16.
Kane, Julie. 2004. Poetry as right-hemispheric language. Journal of
Consciousness Studies 11.5/6: 21–59.
Katz, Albert, ed. 2006. Metaphor and Symbol 21.4. Special issue on neural
processing of nonliteral language.
Kaufman, James, and John Baer. 2002. "I bask in dreams of suicide": Mental illness, poetry, and women. Review of General Psychology
6.3: 271–86.
Kiehl, Kent, Peter Liddle, Andra Smith, Adrianna Mendrek, Bruce Forster,
and Robert Hare. 1999. Neural pathways involved in the processing of
concrete and abstract words. Human Brain Mapping 7: 225–33.
Ludwig, Arnold. 1994. Mental illness and creative activity in women
writers. American Journal of Psychiatry 151: 1650–6.
Mashal, Nira, Miriam Faust, and Talma Hendler. 2005. The role of
the right hemisphere in processing nonsalient metaphorical meanings: Application of principal components analysis to fMRI data.
Neuropsychologia 43.14: 2084–100.
Mashal, Nira, Miriam Faust, Talma Hendler, and Mark Jung-Beeman.
Processing salient and less-salient meanings of idioms: An fMRI
investigation. Cortex. In press.

Post, Felix. 1996. Verbal creativity, depression and alcoholism: An investigation of one hundred American and British writers. British Journal
of Psychiatry 168: 545–55.
Sotillo, Maria, Luis Carretié, José Hinojosa, Manuel Tapia, Francisco
Mercado, Sara López-Martín, and Jacobo Albert. 2005. Neural activity associated with metaphor comprehension: Spatial analysis.
Neuroscience Letters 373: 5–9.
Turner, Frederick, and Ernst Pöppel. 1989. The neural lyre: Poetic meter,
the brain, and time. In Expansive Poetry: Essays on the New Narrative
and the New Formalism, ed. Frederick Feirstein, 209–54. Santa Cruz,
CA: Story Line.
Winner, Ellen, and Howard Gardner. 1977. The comprehension of metaphor in brain-damaged patients. Brain 100: 717–29.

POETIC METAPHOR
Since Aristotle's first articulation of a comparative theory of
metaphor, metaphor studies in literary and ordinary language
have proceeded without interruption in philosophy, rhetoric,
linguistics, and literary criticism. Two traditions have emerged
in metaphor theory: conceptual and linguistic traditions. The
conceptual view emphasizes metaphor's fundamental role in
everyday thought and language; the linguistic tradition limits the range of metaphor to local pragmatic and aesthetic
functions (Ortony 1993). The range of accounts within both
traditions is variegated and well beyond the scope of this entry.
Nonetheless, for the language sciences, it seems that the conceptualist tradition has dominated in recent years. The present
discussion assumes a conceptual view of metaphor as understood through the frameworks of conceptual metaphor
theory (CMT) and conceptual blending theory (CBT),
where poetic metaphor is regarded as a special case of these
underlying conceptual operations. At present, poetic or literary
metaphor cannot be easily extracted from the central questions
of metaphor theory in general, namely: What is metaphor, and
what is metaphor for? The present discussion merely touches on
the first question, in favor of a more elaborate treatment of the
second question.
The question of what metaphor is and how poetic metaphor
can help language sciences understand the everyday mind and
language is addressed in the first section, where I compare and
contrast the two models of metaphor. The question of what metaphor is for is addressed in the second section.

What Is Metaphor?
CMT purports to unearth the systematic correlations of experience and meaning. Meaning arises from everyday experience.
Abstract notions such as time, causation, states, change, and
purposes depend on a rich system of metaphors. Metaphor is
the name given to the process of conceptual mappings from
source to target domains (see source and target). The latest
incarnation of CMT (Lakoff and Johnson 1999) builds on Joseph
Grady's (1997) theory of primary metaphor, in which the ontogenetically basic process of domain correlation constitutes the
experiential basis of conceptual metaphors. A primary metaphor
is a correlation of subjective experience with a more abstract
concept. For instance, MORE IS UP is a primary metaphor, based
on the tight ontological correlation between the accumulation of
the same entities and vertical height.

CBT, while not a theory of metaphor, accounts for metaphor
as a species of conceptual blending that often involves the integration of concepts that do not normally go together. CBT takes
a decidedly usage-based perspective to metaphor and other
phenomena, in which systematic correlations arise from conceptual blending itself, the process of constructing new scenes
and scenarios with specific emergent properties from multiple
mental models. The aim is to see how metaphors arise on the
fly as we think and talk. CMT has as its basic unit of cognitive
structure the conceptual domain. CBT has as its basic unit of
organization a mental space, or scenes and scenarios set up as
we think, talk, and otherwise interact. CBT models the dynamic
unfolding of a language users representations. In this respect,
CBT has developed analytic routines and modeling techniques
that capture constitutive principles and governing constraints of
blending. (See Fauconnier and Turner 2002, 30952.)

What Is Metaphor For?


This question has no straightforward answer, but Samuel Levin
provides an initial approximation by suggesting that these ontologically bizarre notions are constructed for the purpose of conceiving "what a world would have to be like were it in fact to comprise such states of affairs" (1993, 121). Levin suggests that
we construct worlds in which the metaphor is literally true, but
only in order to tease out inferences that guide reasoning about
the real world. Consider the opening line of John Milton's poem
"On Time," in which the poet commands:
Fly envious Time, till thou run out thy race,

This conceit depends on the conventional metaphoric mapping TIME IS A MOVER, creating a world in which time is literally an
intentional being running a race, the purpose of which is to focus
attention on the theological implications of speeding up the pace
at which the known world ends. In the poet's world, the notion
of time as running a race can be considered preternatural, but
the theological implication of the end of days is the great truth
to be disclosed. In a similar vein, consider now the conventional
metaphor STATES ARE SHIPS.
The text in question is the sermon "The Negro Element in
American Life: An Oration." Delivered by Reverend A. L. DeMond
on January 1, 1900, this oration illustrates the degree to which
a conventional metaphor can be extended and elaborated. The
reverend ends with a poem that makes elaborate use of the Ship-of-State metaphor, a potentially disastrous rhetorical maneuver,
given the history of the forced importation of Africans. The sermon ends thus:
As the old ship of State sails out into the ocean of the 20th century,
the Negro is on board, and he can say:

(1) Sail on, O ship of State,
(2) Sail on, O Union, strong and great;
(3) Humanity, with all its fears,
(4) With all the hope of future years
(5) Is hanging breathless on thy fate.
(6) We know what master laid thy keel,
(7) What workman wrought thy ribs of steel;
(8) Who made each mast, and sail and rope;
(9) What anvils rang, what hammers beat;
(10) In what a forge and what a heat
(11) Were shaped the anchors of thy hope.
(12) Fear not each sudden sound and shock,
(13) 'Tis of the wave, and not the rock;
(14) 'Tis but the flapping of a sail,
(15) And not a rent made by the gale.
(16) In spite of rock and tempest's roar,
(17) In spite of false lights on the shore,
(18) Sail on, nor fear to breast the sea,
(19) Our hearts, our hopes are all with thee;
(20) Our hearts, our hopes, our prayers, our tears,
(21) Our faith triumphant o'er our fears
(22) Are all with thee, are all with thee.
A CMT analysis begins by positing cross-domain mappings
between the source domain of ships and the target domain of
states or nation-states. The conventional mappings between
source and target domains include the following correspondences offered in Grady, Oakley, and Coulson (1999, 109):
Nation-State                           Ship
Leader                                 Ship's captain
National policies/actions              Ship's course
National success/improvement           Forward motion of the ship
National failures/problems             Sailing mishaps
Circumstances affecting the nation     Sea conditions

All these metaphoric mappings derive from the basic primary metaphoric couplings of ACTION-AS-SELF-MOTION,
COURSES-OF-ACTION-AS-PATHS, SOCIAL-RELATIONSHIPS-AS-DEGREES-OF-PHYSICAL-PROXIMITY, and CIRCUMSTANCES-AS-WEATHER. These experiential correlations (and perhaps others)
interact in a way that motivates the framing of a nation and its history as a ship gliding through water.
As George Lakoff and Mark Turner (1989, 67–72) argue, the
power of poetic metaphor, in particular, issues from the extension of these mappings for local expressive purposes. A conventionalized metaphor never gives you all you need, and poetic
thought is marked by its ability to stretch or extend conventional
metaphors. Notice that with lines 6–7, DeMond extends the typical range of mappings to include shipbuilding and the role of the
shipwright.
Poems also employ expressions in which the schemas and
domains underlying the metaphor can be elaborated in unusual
or novel ways. Lines 6 and 7, when understood against the context of the whole speech, take on unusual significance. The implication of line 6 is that the "master" shipwright is God, while the
"workman" is identified with the Negro, echoing a consistent
theme of the speech: the hard labor of the Negro race in building America.

A blending analysis helps account for ways in which the
NATION-AS-SHIP metaphor is not a simple and obvious mapping between two conceptual domains. While conceptual
domains name large depositories of knowledge about the physical and social world, mental spaces comprise on-line scenes and
scenarios; they are specific and sensitive to pressures from local
context.
Levin's (1993) account of poetic metaphor is more completely captured in CBT, a theoretical framework in which preternatural scenes are constructed to reveal how to reason and
draw inferences about something else. Conceptual blends are
often richly counterfactual, but rarely do they exist for their
own ends. In this case, the blended scenario extends and elaborates the conventional metaphor for local rhetorical purposes. A
basic blending analysis of DeMonds introductory sentence and
the first five lines of the poem proper would include a discourse
ground specifying the participants, the situation, and setting, a
mental space for Seafaring, a mental space for Nation, and the
initial blended space for Nation-as-Ship, each of which is set up
in the very first line of the poem.
Let us assume the analysis from the perspective of a worshiper
sitting in the Dexter Avenue Baptist Church in 1900. Under these
conditions, the ground includes the identities of the churchgoers, the speaker, and the setting. Let us further assume that the
discourse participants are African Americans and that the poetic
persona represents them. Initially, the Seafaring and Nation
spaces project conceptual structure into the blend under the
influence of the cross-space mappings as specified here. In the
blend, America is a ship, Negro citizens are among its passengers, the ocean is time, and the twentieth century is an unspecified landmark on open water. The blend allows the audience
to imagine temporally, causally, and spatially diffuse political
events as attaining, for the moment, the look and feel of primary
experience.
Once composed, the blend and the network of mental spaces
permit the addition of new information and relations. A noteworthy contribution of the blending framework here is that it offers
precise ways of accounting for the elements of the Nation-as-Ship image that have no specific counterparts in the target space
of nations and politics. Once the network is up and running,
readers can combine concepts fluidly. For instance, line 2 commands, "Sail on, O Union, strong and great," wherein the poet
fuses elements from different mental spaces into tight syntactic
units. Thus, in the blend it is perfectly natural and logical for a
union to sail. What is more, it is perfectly natural for the ship to
plot a straight course. Once the image is created, many other elements of ships become mentally accessible. For instance, ships
must be made of particular materials in order to be seaworthy.
The phrase "ribs of steel," in line 7, satisfies local formal and conceptual imperatives in (1) providing completion for the couplet
with line 6, and (2) suggesting that the nation is made of sturdy
material and (opportunistically) made from the very material
that the Negro worker has been responsible for manufacturing.
Importantly, the mapping between Shipwright and Creator makes
the Creator responsible for all aspects of the nation.
The goal of using this conventional metaphor is to construct
a view of social reality for the Negro race, focusing on communal
activities and on achieving collective goals. The ship of state has been conventionalized for just that purpose because the image
potential associated with building, operating, and navigating is
of rich social activities. In the blended space, however, the choice
to sail on is framed as an all-or-nothing proposition. If the ship of
state does not sail, it ceases to exist. In the sailing space, however,
the ship, once built, exists whether or not the crew sails; in the
sailing space, a captain and crew can choose when and when not
to sail, and the crew can still be referred to as sailors whether on
land or on sea. In the blend, a refusal to board and sail is tantamount to renouncing one's citizenship. By exploiting elements
of shipbuilding (a collective activity) and by attributing that
activity to a divine creator, DeMond's version of the ship of state
takes on the voice of a divine decree.
As suggested, DeMond takes considerable risk in quoting a
poem that makes extensive use of this metaphor, for members of
the congregation may generate a metaphoric mapping in which
the cross-domain counterpart of American Negro is not passenger but cargo, destroying the political legitimacy of the image.
DeMond, however, assiduously avoids focusing any attention
on the circumstances that brought them to America. Instead, he
picks up the story at their arrival and tells of the Negro race as
those who built the nation.
The present analysis presents CMT and CBT as complementary analytic frameworks, wherein the former focuses solely on conventionalized mappings, while the latter is much more interested
in how these mappings operate in local rhetorical contexts, and
thus can point scholars in the direction of a usage-based theory
of poetic metaphor.
Todd Oakley
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
DeMond, A. L. 1900. The Negro element in American life: An oration.
Available online at: http://lcweb2.loc.gov/cgi-bin/query/r?ammem/
murray:@field(DOCID+@lit(lcrbmrpt0e10div2).
Fauconnier, Gilles, and Mark Turner. 2002. The Way We Think. New
York: Basic Books.
Grady, Joseph. 1997. Foundations of meaning: Primary metaphors and
primary scenes. Ph.D. diss., University of California, Berkeley.
Grady, Joseph, Todd Oakley, and Seana Coulson. 1999. Blending and
metaphor. In Metaphor in Cognitive Linguistics, ed. R. Gibbs and G.
Steen, 101–24. Amsterdam: Benjamins.
Lakoff, George, and Mark Johnson. 1999. Philosophy in the Flesh. New
York: Basic Books.
Lakoff, George, and Mark Turner. 1989. More Than Cool Reason: A Field
Guide to Poetic Metaphor. Chicago: University of Chicago Press.
Levin, Samuel. 1993. Language, concepts, and worlds. In Metaphor
and Thought. 2d ed. Ed. A. Ortony, 112–23. Cambridge: Cambridge
University Press.
Ortony, Andrew. 1993. Metaphor, language, and thought. In Metaphor
and Thought. 2d ed. Ed. A. Ortony, 1–16. Cambridge: Cambridge
University Press.

POETICS
In ancient Greece, as Aristotle pointed out, there was no common name for all the different poetic genres (Poetics, 47b),
including epic, tragic drama, dialogue, elegy, and poems written in various meters. Poetry in the sense of making or creation
became the general name for literary expressions in diverse forms, and Poetics, the term used by Aristotle for his treatise on
tragedy and epic, thus represented the kind of critical and analytical treatment of poetry that would be called in later times literary criticism or literary theory.
Aristotle's Poetics offers an important model in Western literary criticism, but it was not widely known in Europe in antiquity
or in medieval times, and it did not become a classic until the
latter half of the sixteenth century. During the time that it was lost
in medieval Europe, however, the Poetics, along with some other
works by Aristotle, was being studied by Arabic scholars, notably
Ibn Rushd, known in the West as Averroës. But once it was rediscovered and commented on by such influential Renaissance critics as Lodovico Castelvetro (1505–71) and Francesco Robortello
(1516–67), the Poetics quickly became one of the most influential
works in Western literary criticism. Epic and tragedy discussed
therein became the two major classical genres before the rise
of the modern novel and, after Dante, poets of every European
nation tried to create an epic in the vernacular to mark the maturity of a modern language and the establishment of a national
literary tradition. Aristotle's philosophical treatment of plot, language, and rhetoric of the tragic drama provides a model of critical analysis, and many basic concepts used in the Poetics, such
as imitation, recognition, the reversal of fortune, tragic hubris,
and the catharsis of pity and fear, have all had a tremendous
influence on later criticism. In our own time, Aristotle's Poetics
remains a major classic and continues to be discussed and commented on by important critics and theoreticians from various
perspectives.
As the aforementioned Arabic commentaries suggest, the
systematic study of the literary art is by no means confined to
the European tradition. There are, for example, well-established
traditions of sophisticated literary criticism or poetics in South
and East Asia. The earliest treatise on dance and dramatic art in
ancient India, Bharatamuni's Nāṭyaśāstra (ca. second century b.c.),
offers a comprehensive discussion of Sanskrit drama in terms of
taste and emotions (rasa) and of language and bodily gestures
that give expression to various emotions. In the seventh century,
Sanskrit poetics was fully established by such important theorists
as Bhāmaha and Daṇḍin. In the ninth century, Ānandavardhana
made significant contributions to its further development with
discussions of the theoretical notions of rasa and dhvani, while
Abhinavagupta and Kuntaka in the tenth century explored new
areas by debating on the issue of indirect and suggestive expressions (vakrokti) in poetic language. Indeed, as an Indian scholar
remarks, "A study of Sanskrit poetics from Bharata (5th century
b.c.) to Paṇḍitarāja Jagannātha (17th century a.d.) will bear witness to the existence of a highly developed poetics in ancient
India, with a rigorous scientific method for description and analysis of literature" (Pathak 1998, 345).
In China, the Great Preface to the Mao edition of the Book
of Poetry (second century b.c.) articulated the Confucian ideas
about poetry and its functions, and laid the foundation of a poetics that acknowledges both the release of emotions as the origin
of poetry and the efficacy of moral teaching as its ultimate justification. Lu Ji (261–303), with his Rhyme-Prose on Literature, added
to the critical tradition a more focused attention on the importance of emotions (qing), and he argued for the necessity to learn
both from nature and from the ancients. Liu Xie's (465?–520?) Literary Mind and the Carving of Dragons is deservedly famous
as the most systematic study of the literary art in the Chinese
critical tradition. This substantial work of Chinese poetics relates
literature to the cosmic tao and the exemplary classics of ancient
sages, thereby elevating literature to a position of high social and
moral values. Its focus, however, is on the art of literature. The
Literary Mind first formulates some basic principles of the idea of
wen or literature, gives a survey of all the literary genres in classical Chinese literature, commenting on their origin and development, and then presents a highly developed theory of literary
creation, making contributions to the important issues of the
relationship between poetry and reality, the style and characteristics of a literary work, the effect of imagery and poetic imagination, and the regulations of metric composition. Since the eighth
century in Tang China, and particularly the eleventh century in
the Song Dynasty, there have been numerous works in a critical
genre known as remarks on poetry (shihua), which often contain
valuable insights into the nature of poetry, the techniques of the
literary art, and the principles of aesthetic appreciation. Like the
aforementioned Indian example, the Chinese critical tradition
also offers an alternative form of poetics outside the Aristotelian
and European tradition.
In a broad sense, then, poetics can be understood as a critical, theoretical, and more or less systematic treatment of poetry
or literature in general. In such an expanded usage, what the
term signals is a theoretical discourse on a subject in arts or literature, covering a considerable range of oeuvre, and offering
some philosophical insights into the nature of the subject under
discussion. Poetics, therefore, becomes a general term for a sustained argument or a long essay in literary and art criticism.
Zhang Longxi
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Aristotle. 1987. Poetics with the Tractatus Coislinianus, Reconstruction
of Poetics II, and the Fragments of the On Poets. Trans. Richard Janko.
Indianapolis: Hackett.
Averroës. 1977. Three Short Commentaries on Aristotle's "Topics,"
"Rhetoric," and "Poetics." Ed. and trans. Charles E. Butterworth.
Albany: State University of New York Press.
De, S. K. 1988. History of Sanskrit Poetics. Calcutta: Firma KLM Pvt. Ltd.
Liu, James J. Y. 1975. Chinese Theories of Literature. Chicago: University
of Chicago Press.
Liu, James J. Y. 1988. Language-Paradox-Poetics: A Chinese Perspective. Ed.
Richard John Lynn. Princeton, NJ: Princeton University Press.
Miner, Earl. 1990. Comparative Poetics: An Intercultural Essay on Theories
of Literature. Princeton, NJ: Princeton University Press.
Pathak, R. S. 1998. Comparative Poetics. New Delhi: Creative Books.
Rajendran, C. 2001. Studies in Comparative Poetics. Delhi: New Bharatiya
Book Co.

POINT OF VIEW
In narrative studies (see narratology), this term, also perspective or focalization, refers to textual strategies that provide
the reader with the illusion of seeing things through the eyes
of a character. These strategies are mostly linguistic in nature,
ranging from deictic positioning in the character's mental here
and now (see deixis) to lexical choices linking up with the character's worldview and ways of thinking and perceiving the
world. Point of view, from a linguistic perspective, is therefore an
important aspect of linguistic pragmatics.

Origins and History


The strategies of point of view narration are of fairly recent date.
They came into existence as part of the shift toward increasingly
subjective literary narratives near the end of the nineteenth century, and document authors' attempts to portray characters'
individuality not merely in the rendering of idiosyncratic dialogue (for instance, in dialect) but also in the extensive depiction of characters' minds or consciousness. Already in the 1790s,
the Gothic novel figured the female protagonist's anxious meditations, and Ann Radcliffe and Charles Maturin also portrayed
the impact that the contemplation of sublime scenery had on
their heroines. Gothic novels are, therefore, important anticipations of the point of view technique, which, in English literature,
came into its own in the work of George Eliot, Elizabeth Gaskell,
Thomas Hardy, Joseph Conrad, and Henry James and in the
stream of consciousness novels of literary modernism (James
Joyce, Virginia Woolf, Katherine Mansfield, D. H. Lawrence,
Aldous Huxley, E. M. Forster). Since then, the point of view technique has been standard in fictional narrative, especially so in
the short story, even though postmodernist texts of the radical
experimental sort do not employ it as often.
The term point of view (interchangeably with "center of vision")
was first used by Percy Lubbock in The Craft of Fiction (1921),
although Henry James in his prefaces had already analyzed the
phenomenon under the heading of "center of consciousness"
and the image of the "house of fiction" having many windows
(James [1880–1/1908] 1975, 7). Point of view in James refers to
the presentation of the story from the perspective, hence point of
view, of a character, for instance, Strether in The Ambassadors: "It
affected Strether: horrors were so little, superficially at least, in
this robust and reasoning image. But he was none the less there
to be veracious" (James [1903] 1994, 99). However, even in James,
the point of view technique, in the meaning of limited perspective (seeing the world through the naive, obsessed, or puzzled
perspective of a character), is extended from the new narrative
form of the (third person) stream of consciousness novel (following the protagonists associations in the depiction of their consciousness) to experiments with unreliable or otherwise limited
first person narrators, as in James's "The Real Thing" or Daisy
Miller. In these texts, the narrator is very naive and has a clearly
reduced intellectual capacity. For Lubbock (1921), Jean Pouillon
(1946), and Norman Friedman (1955), the term, by contrast,
comprises not one technique of focusing the narrative through a
central characters mind but a variety of three (Pouillon) to eight
(Friedman) alternative points of view that authors can choose.
Not only was point of view a vague term because it included
so many different aspects of narrative; it, moreover, was very
limiting since it focused on the visual metaphor. As a consequence, the term point of view, though still used as a general
label, became displaced in narratology by more inclusive or more
specific terms: narrative situation, perspective, and focalization.
Franz Karl Stanzel's ([1979] 1984) three narrative situations
([1] authorial, roughly omniscient; [2] first person; and [3]
figural, the presentation through a character's mind) follows

Friedman in looking at types of narrative. Boris Uspensky ([1973]
1983), too, extends the meaning of point of view under his term
perspective to include a) vision (spatio-temporal perspective); b)
language (phraseological perspective); c) knowledge and feelings
(psychological perspective); and d) ideology. Although these
four types of perspective are all determinable from the language
of the text (the spatio-temporal perspective through deictics; the
ideological through tell-tale phrases like tovarish [comrade]
for fellow man; the psychological through the syntax and
lexis of emotion), it is the phraseological level of perspective that
is most linguistic in its deployment of register and style to
signal narrators' or characters' perspectives, for instance, in the
citing of dialect words, hints at pronunciation typical of certain
social groups, or the contrast between high and low register in
heteroglossic texts (Bakhtin 1981; see dialogism and heteroglossia). For instance, Uspensky cites Tolstoy's sentence
"Anna Pavlovna had been coughing for the last few days: she had
an attack of la grippe, as she said" (1983, 33) as an example of
phraseological point of view, where la grippe registers Anna
Pavlovna's class and social snobbery. Psychological perspective can be exemplified by a sentence from Toni Morrison: "He
examined the bushes, the branches, the ground for a berry, a nut,
anything" (1977, 255; emphasis added). The sentence traces the
order of Milkman's perception and the urgency of his
quest for food. Uspensky devotes a whole chapter to the interrelation of the four types of perspective in texts.
Gérard Genette's reconceptualization of point of view as focalization (zero; external; internal) abides by the visual metaphor,
with focalization opposed to voice ("who sees?" vs. "who speaks?";
[1972] 1980, 186). Genette's typology of focalization is one of
limited perspective: either no limitation of point of view (zero
focalization) or limitation to a view on characters from outside
(external focalization) or a subjective view from inside (internal
focalization). The narrator and narrative voice are excluded from
the discussion, in contrast to Friedmans or Stanzels analyses.
More recent models of focalization are discussed by Manfred
Jahn (2005), who has himself proposed the distinction among
strict, ambient, weak, and zero focalization based on an optical
analogy.

Linguistic Signals of Point of View


The textual inscription of point of view depends on the insertion
of signals of subjectivity and individual knowledge, opinion, or
worldview in the text such that they can be aligned with a character. The same signals can also be employed to relate the subjectivity or individual stance of the speaker/narrator of a text/
utterance, and this alignment is usually discussed under the
heading of voice and not point of view. Voice and point of view
can get into conflict or overlap as in free indirect discourse, a technique for rendering speech or thought in which the language of
the reported speaker/thinker (his/her point of view) is to some
extent preserved in the report: "She had never, ever told fibs, not
for worlds." (Here, the syntax and vocabulary of the reported
speaker are integrated into the report.)
Free indirect discourse (thought representation) is one of the
most common signals of point of view in literary texts since it
introduces a character's perspective (feelings, intentions, worldview) to the reader. Moreover, the narrative can be studded with

stylistic and lexical markers relating to the character's social
position, age, gender, and so on. For instance, when in Charles
Dickens's Our Mutual Friend Mrs. Veneering remarks that
"these social mysteries make one afraid of leaving Baby" ([1864–5]
1952, 414; emphasis added), the word Baby relates to the mother-child relationship of the reported speaker and represents her
point of view. At the same time, the phrase "these mysteries" and
the pronoun "one" underline Mrs. Veneering's upper-class status.
Addressee-oriented expressions like forms of address (Ma'am,
Sir, Your Excellency, etc.) also invoke social position by linguistic
means (cf. Fillmore 1983, 1997).
Most basically, deictics serve the function of positioning
speakers and, hence, creating point of view. For example, in
Bleak House Mr. Bucket ("still grave") inquires if "to-morrow morning, now, would suit" ([1852–3] 1962, 720; emphasis added),
in which the futurity of to-morrow relates to Mr. Bucket's
moment of utterance. Among linguists, Charles Fillmores work
on deixis (1983, 1997) needs to be credited with incisive insights
into the generation of point of view by means of deixis. From a
linguistic perspective, these signals are expressivity markers,
implying a speaking or thinking consciousness, a deictic center
(Bühler's [1934] origo) from which the world is being viewed. In
the widest possible sense, such expressivity markers are indicative of ideation and emotion, the latter capable of being textually
suggested by syntactic means, such as intensifying repetitions
besides merely lexical intensifiers and emphatic vocabulary.
Evaluative point of view can be illustrated in sentences like "Do
talk to the poor dear." Incomplete sentences (indicating hesitation or derangement), sentence modifiers (in any case, sure
enough), clause-initial adjuncts (oh, well), interjections (good
grief), negative inversion (Never will he forget) or left and right
dislocation are among the most common strategies used (cf.
Fludernik 1993, 227–79). In oral discourse, moreover, expressivity shows up in intonation and the echoing of idiosyncratic
pronunciation (imitated in writing: "she sho was happy").
In medieval literature, point of view is often signaled by interjections like "alas" or by means of repetition. Such signals of point
of view occur intermittently in medieval literature and early
modern English prose but do not constitute a continuous representation of a character's perspective as in the Gothic novel and
the later stream of consciousness novel.
Like the study of narrative discourse markers, the focus on
expressivity signals can help to emphasize the specifically narrative uses of point of view for the linguist. Point of view markers not
only establish free indirect discourse; they are, moreover, crucial
to text beginnings, where they help distinguish between narratives with a prominent speaker (= narrator) function and others
in which the reader is eased into the story by means of a protagonists perspective. (Roland Harweg [1968] has contrasted these
as emic and etic text beginnings, respectively.) Peculiarities of
thought and worldview are also constitutive of M. A. K. Halliday's
"mind-style" (1971) as "the distinctive linguistic representation of
individual self" (Shen 2005, 312). Ultimately, an analysis of point
of view as expressivity links up with the linguistic enquiry into
individual style.
It should be noted that all of these signals of expressivity are
clichés and cannot directly claim mimetic relevance (Fludernik
1993, 434–64). On the contrary, they depend on typical recurrent


models of speech that are employed to create an illusion of
authenticity. Moreover, the attribution of expressivity markers
to the primary frame speaker (narrator) or reported speaker
(character) is frequently problematic. The mere presence of
expressivity markers does not convey a clear point of view;
point of view needs to be constructed interpretatively by the
listener or reader in the overall context of the utterance or text.
Thus, though point of view can be fruitfully analyzed by linguistic means, it cannot be exhaustively described within a purely
formal framework. Point of view, therefore, is a pragmatic phenomenon located on the threshold between narrative pragmatics and literary narratology (see also literary character
and character types).
Monika Fludernik
WORKS CITED AND SUGGESTED FURTHER READING
Bakhtin, Mikhail M. 1981. The Dialogic Imagination: Four Essays. Ed.
Michael Holquist. Austin: University of Texas Press.
Bühler, Karl. 1934. Sprachtheorie: Die Darstellungsfunktion der Sprache.
Jena: Gustav Fischer.
Dickens, Charles. [1852–3] 1962. Bleak House. London: Oxford University
Press.
. [1864–5] 1952. Our Mutual Friend. London: Oxford University
Press.
Fillmore, Charles. 1983. How to know whether you're coming or going.
In Essays on Deixis, ed. Gisa Rauh, 219–27. Tübingen: Narr.
. 1997. Lectures on Deixis. Stanford, CA: Center for the Study of
Language and Information.
Fludernik, Monika. 1993. The Fictions of Language and the Languages
of Fiction: The Linguistic Representation of Speech and Consciousness.
London: Routledge.
Friedman, Norman. 1955. Point of view in fiction: The development of a
critical concept. PMLA 70: 1160–84.
Genette, Gérard. [1972] 1980. Narrative Discourse: An Essay in Method.
Trans. Jane E. Lewin. Ithaca, NY: Cornell University Press.
Halliday, M. A. K. 1971. Linguistic function and literary style: An inquiry
into William Golding's The Inheritors. In Literary Style: A Symposium,
ed. Seymour Chatman, 330–65. London: Oxford University Press.
Harweg, Roland. 1968. Pronomina und Textkonstitution. Munich: Wilhelm
Fink.
Jahn, Manfred. 1999. More aspects of focalization: Refinements and
applications. In Recent Trends in Narratological Research, ed. John
Pier, 21, 85–110. Tours: Groupes de Recherches Anglo-Américaines de
Tours, University of Tours.
. 2005. Focalization. In Routledge Encyclopedia of Narrative
Theory, ed. David Herman, Manfred Jahn, and Marie-Laure Ryan,
173–7. London: Routledge.
James, Henry. [1880–1/1908] 1975. The Portrait of a Lady. Ed. Robert D.
Bamberg. New York: W. W. Norton.
. [1903] 1994. The Ambassadors. Ed. S. P. Rosenbaum. New
York: Norton.
. [1934] 1953. The Art of the Novel. Intro. R. P. Blackmur. New
York: Scribner.
Lubbock, Percy. 1921. The Craft of Fiction. London: Jonathan Cape.
Morrison, Toni. 1977. Song of Solomon. New York: Signet.
Pouillon, Jean. 1946. Temps et roman. Paris: Gallimard.
Shen, Dan. 2005. Mind-style. In Routledge Encyclopedia of Narrative
Theory, ed. David Herman, Manfred Jahn, and Marie-Laure Ryan,
311–12. London: Routledge.
Stanzel, Franz Karl. [1979] 1984. A Theory of Narrative.
Cambridge: Cambridge University Press.

Uspensky, Boris. [1973] 1983. A Poetics of Composition: The Structure of the
Artistic Text and Typology of a Compositional Form. Trans. Valentina
Zavarin and Susan Wittig. Berkeley: University of California Press.

POLITENESS
Politeness is essentially a matter of taking into account the feelings of others as to how they should be interactionally treated,
including behaving in a way that demonstrates appropriate
concern for interactors' social status and their social relationship. In this broad sense of speech oriented to an interactor's
social persona or "face," politeness is ubiquitous in language use.
Since taking account of people's feelings generally involves
saying things in a less straightforward or more elaborate manner than when one is not considering such feelings, ways of
being polite provide a major source of indirectness, reasons for
not saying exactly what one means, in how people frame their
utterances.
There are many folk notions for these kinds of attention to
feelings, captured in terms like courtesy, tact, deference, sensibility, poise, rapport, and urbanity, as well as terms for the contrasting behaviors (rudeness, gaucheness, social gaffes) and their
consequences (embarrassment or humiliation). Such terms attest
both to the pervasiveness of notions of politeness and to their
cultural framing.
People's face is invested in their social status and in their
relationships with one another, and so indexing this relationship
appropriately is necessary for maintaining face expectations. In
addition, one often has interactional goals that potentially contravene face, and the expression of such communicative intentions (e.g., requests, offers, disagreements, complaints) tends to
be mitigated by attention to face.
Politeness is crucial to the construction and maintenance
of social relationships; indeed, it is probably a precondition for
human cooperation in general. Politeness phenomena have,
therefore, attracted interest in a wide range of social sciences,
particularly linguistics, anthropology, psychology, sociology,
and communication. Work in these disparate fields can be characterized in terms of three main classes of theoretical approach.

Politeness as Social Rules


To the layperson, politeness is a concept designating proper
social conduct, rules for speech and behavior stemming generally
from high-status individuals or groups (cf. standardization).
These notions range from polite formulae like "please" and "thank
you," codified forms of greetings and farewells, honorific address
forms, and so on, to more elaborate routines, for example, for
table manners or the protocol for formal events. Politeness in
this view is conventionally attached to certain linguistic forms
and formulaic expressions, which may be very different in different languages and cultures.
Some analytical approaches to politeness are formulated in
terms of the same sorts of culture-specific rules for doing what is
socially acceptable, for example, the work by Sachiko Ide (1989)
and others on Japanese politeness as social indexing or discernment. In these approaches, politeness inheres in particular linguistic forms when used appropriately as markers of pregiven
social categories.

Politeness as Conversational Maxims


A different approach understands politeness as a set of social
conventions coordinate with Paul Grice's (1975) cooperative principle for maximally efficient information transmission ("Make your contribution such as required by the purposes
of the conversation at the moment"), with its four maxims of
quality, quantity, relevance, and manner (see conversational
implicature). Robin Lakoff (1973) argued that three rules
of rapport underlie choices of linguistic expression, rules that
can account for how speakers deviate from directly expressing
meanings. Choice among the three pragmatic rules gives rise
to three distinct communicative styles: Rule 1, "Don't impose,"
produces a distant style; Rule 2, "Give options," gives rise to a
deferent style; and Rule 3, "Be friendly," results in a style of camaraderie. Geoffrey Leech's (1983) proposal is in the same vein.
Complementary to Grice's cooperative principle, Leech postulated a politeness principle, "Minimize the expression of impolite
beliefs," with six maxims of tact, generosity, approbation, modesty, agreement, and sympathy. As with Grice's maxims, deviations from what is expected give rise to inferences. Cross-cultural
differences derive from the different importance attached to particular maxims.
The conversational maxim view shares with the social norm
view the emphasis on codified social rules for minimizing friction
between interactors and the idea that deviations from expected
levels or forms of politeness carry a message.

Politeness as Face Management


A more sociological perspective places face work at the core of
politeness. Erving Goffman (1967) considered politeness as
an aspect of interpersonal rituals, central to public order. He
defined face as an individual's publicly manifest self-esteem and
proposed that social members have two kinds of face requirements: positive face, or the want of approval from others, and
negative face, or the want not to offend others. Attention to these
face requirements is a matter of orientation to Goffman's diplomatic fiction of the "virtual offense," or worst possible reading
(1971, 138 ff.), the working assumption that face is always potentially at risk, so that any interactional act with a social-relational
dimension is inherently face threatening and needs to be modified by appropriate forms of politeness. Deference (attention
owed to the others face) can be distinguished from demeanor
(attention owed to oneself).
Building on Gricean and Goffmanian approaches, Penelope
Brown and Stephen C. Levinson ([1978] 1987) introduced a
comparative perspective by drawing attention to the detailed
parallels in the construction of polite utterances across widely
differing languages and cultures, arguing that universal principles underlie the construction of polite utterances. The parallels they noted are of two sorts: how the polite expression of
utterances is modified in relation to social characteristics of
the interlocutors and the situation, and how polite utterances
are linguistically formulated. At least three social factors are
involved in deciding how to be polite: 1) One tends to be more
polite to social superiors; 2) one tends to be more polite to people
one doesn't know. In the first case, politeness tends to be asymmetrical (the superior is less polite to an inferior); in the second,
politeness tends to be symmetrically exchanged. In addition,


3) in any culture there are norms and values affecting the degree
of imposition or unwelcomeness of an utterance, and one tends
to be more polite for more serious impositions. The linguistic
structures for conveying particular kinds of politeness are also
underlyingly similar across languages, with the politeness of
solidarity (positive politeness) characterized by expressions of
interest in the addressee, exaggerated expressions of approval,
use of in-group identity markers and address forms, seeking
of agreement, and avoidance of disagreement, whereas avoidance-based politeness (negative politeness) is characterized
by self-effacement, formality, restraint, deference, hedges, and
impersonalizing mechanisms like nominalization or passive
constructions.
To explain these kinds of detailed parallels across languages
and cultures in the minutiae of linguistic expression in socially
analogous contexts, Brown and Levinson proposed an abstract
model of politeness as strategic attention to face, deriving strategies for constructing polite utterances in different contexts on
the basis of assessments of three social factors: the relative power
(P) of speaker and addressee, their social distance (D), and the
intrinsic ranking (R) of the face-threateningness of an imposition.
In contrast with rule-based approaches, Brown and Levinson
argued that politeness inheres not in words or sentences per se;
politeness is an implicature that may be conveyed by utterances
spoken in context, by virtue of successful communication of a
polite attitude or intention.
Politeness continues to be a major focus for research in many
disciplines concerned with social interaction, and the topic
now has its own professional journal, the Journal of Politeness
Research. Over the past 30 years, empirical descriptions of particular politeness phenomena from many different parts of the
world have accumulated, with the research emphasis largely
on cross-cultural differences. There has been much theoretical
controversy over whether, indeed, there are any universal principles of politeness and if so, what form they take. The recent
trend seems to be toward emphasizing emic rather than etic
approaches (cf. Watts 2003; Eelen 2005). But the importance of
politeness goes far beyond the p's and q's of appropriate behavior and speech in a particular cultural setting. Its wider significance is in the interactional, communicative, day-to-day basis of
social life and the conduct of social relationships. Recent developments in the theory of social interaction that take account
of our common human nature (e.g., Goody 1995; Enfield and
Levinson 2005; see also universal pragmatics) offer hope
for theoretical progress in this field.
Penelope Brown
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Brown, Penelope, and Stephen C. Levinson. [1978] 1987. Politeness: Some
Universals in Language Usage. Cambridge: Cambridge University Press.
Eelen, Gino. 2005. A Critique of Politeness Theories. Manchester: St.
Jerome.
Enfield, Nick, and Stephen C. Levinson, eds. 2005. Roots of Sociality.
Cambridge: Cambridge University Press.
Goffman, Erving. 1967. The nature of deference and demeanor. In
Interaction Ritual, ed. Erving Goffman, 47–95. New York: Anchor
Books.
. 1971. Relations in Public. New York: Harper Colophon Books.


Goody, Esther. 1995. Social Intelligence and Interaction. Cambridge: Cambridge University Press.
Grice, H. Paul. 1975. Logic and conversation. In Syntax and
Semantics. Vol. 3: Speech Acts. Ed. P. Cole and J. Morgan, 41–58. New
York: Academic Press.
Hickey, Leo, and Miranda Stewart, eds. 2005. Politeness in Europe.
Clevedon, UK: Multilingual Matters.
Ide, Sachiko. 1989. Formal forms and discernment: Two neglected
aspects of linguistic politeness. Multilingua 8.2/3: 223–38.
Lakoff, Robin. 1973. The logic of politeness; or, minding your p's and q's.
In Papers from the Ninth Regional Meeting of the Chicago Linguistic
Society, 292–305.
Leech, Geoffrey. 1983. Principles of Pragmatics. London: Longman.
Placencia, María E., ed. 2006. Research on Politeness in the Spanish-Speaking World. Mahwah, NJ: Erlbaum.
Watts, Richard J. 2003. Politeness. Cambridge: Cambridge University
Press.
Watts, Richard J., Sachiko Ide, and Konrad Ehlich, eds. 1992. Politeness in
Language. Berlin: Mouton de Gruyter.

POLITICS OF LANGUAGE
Politics of language is not a domain or subdiscipline. It is an idea
that puts the study of language in a perspective: the idea that
language is a politically invested object and that people, consequently, act politically in, through, and on language. In that
sense, the term covers an enormous range of phenomena and
cuts across numerous disciplinarily organized practices. The
issue is one of function, and the politics of language suggests
that political meanings and effects are among the functions of
language. In fact, some would emphasize that there are no nonpolitical meanings.
Such political functions are metapragmatic: They operate
through meta-discourses on language, on things people say
about language in language. Thus, the politics of language is a
language-ideological phenomenon. Clear instances of it are
widespread utterances such as "English is the language of business" or "Xhosa is a language for community interaction." In
both instances, a particular language is defined as a language
that operates with a specific load, a specific set of social, cultural, economic, and political attributes, all of them implicitly
articulated: "Whenever I use English, my language use will be
framed as business, and I will speak like a businessman." The
politics of language has to do with the way in which we associate particular varieties of language (forms) with particular normative complexes, genres and topical domains, and identities
(functions). The relationship between forms and functions, thus
defined and seen as relatively stable (stable enough to generate
shared meanings), is usually defined as ideology, and authors
explicitly addressing the politics of language often focus on ideology, hegemony, and ideological naturalization.
In what follows, I first give a brief overview of some key notions
and authors, then engage in a brief survey of some recent work
and focus on language ideologies as a frame for understanding such political functions. I conclude with an appraisal of this
work.

Key Notions and Authors


Language has been defined as politically invested since Aristotle
and the Sophists; it is therefore futile to attempt a historical survey.

Rather, I would suggest we read history backwards, starting from
the current approaches to the politics of language and looking into those authors who are seen as formative now. From
that vantage point, two groups of authors stand out: authors
who developed a political view of language and authors whose
political-analytic work provides tools for scholars in the field of
language. The first category is dominated by such scholars as M.
M. Bakhtin, V. N. Voloshinov, Roland Barthes, Michel Foucault,
and Pierre Bourdieu (see habitus); in the latter, Karl Marx (see
marxism) and Antonio Gramsci stand out.
This collection of authors and insights, it must be realized,
can only be discussed in a more or less coherent way when a
number of conditions are met. In particular, two presuppositions
are required:
(i) It is clear that reflections of this kind are predicated on a
view of language as a social object (not a mental object); such
reflections belong to the realm of a social theory of language.
(ii) They also are predicated on a view in which language
displays intricate connections with social structure: Either
language mirrors social structure (especially structures of
inequality) or it can become an instrument for changing
social structure.
These presuppositions ensure that the authors mentioned can
become interlocutors for current practitioners in the field, and such
practitioners would then be clustered in applied fields, such as
discourse analysis (both linguistic and foucaultian),
sociolinguistics, and linguistic anthropology.
The work of Bakhtin and Voloshinov has been influential in
its emphasis on the social and political dimensions of a key feature of real language: its heteroglossic nature (see dialogism
and heteroglossia). Heteroglossia stands for the presence
of multiple voices in an act of communication, and such
voices are intricately related to social formations and interests.
Whenever we communicate, thus, we engage with existing complexes of social (and cultural) meaning, we insert ourselves in an
intertextual tradition in which such complexes make sense,
and we articulate interests, not only (neutral, self-contained)
meanings. In addition, the articulation of such interests is not
a unilateral and linear event. Bakhtin (1981) emphasizes the
importance of evaluative uptake in interaction, his dialogical
principle, in which every act of communication requires ratification by the other in order to be valid, that is, in order to be
meaningful. This process of ratification is evaluative: It is done
from within ordered complexes of forms-and-meanings in
which appropriateness, social roles, fluency, and other quality
attributes are specified. Thus, even if I think I produce a cogent
story, my interlocutor may judge it to be off the mark because
what I say and how I say it do not qualify as good enough in his/
her evaluative framework. And evidently, such evaluative frameworks are reflections and instruments of the social and political
order (Voloshinov 1973).
This social and political order penetrates language at a fundamental level: It shapes discourses. Discourses are complexes
of communicative forms (genres, styles) mapped onto thematic
and social domains, and what the social and political order does
is to create spaces in which particular discourses operate while it
eliminates other such spaces. This idea is central to the work of

Barthes (1957), who emphasizes the discursive routines and the
silences that are generated by the consumer-capitalist society. It
also underlies Foucault's (1984) notion of "order of discourse,"
and it is reflected in Bourdieu's (1991) notion of "legitimate language." In each case, macrosocial order manifests itself in discourse patterns, structures, both positively and negatively. The
fact that some things can only be said in some ways is an effect of
the social and political order; the fact that some things cannot be
said at all is an effect of the same thing (Blommaert 2005).
The fact is, however, that people rarely experience this shaping of discourses as an effect of social and political forces. Mostly,
we perceive these discourse routines and absences as normal,
as "just the way things are." It is at this point that we see scholars
refer to the Marxian notion of ideology, an agentive notion in
which ideational complexes such as discourses have real material effects, as well as to the Gramscian notion of hegemony.
Hegemony is ideological dominance, that is, dominance that
is not perceived as dominance but as a neutral, normal state of
affairs. Social and political forces operate in language through
hegemony, that is, through naturalized, neutralized, and normalized perceptions and forms of behavior.
These authors all provide frequently used key notions and
insights, all of which revolve around the same central node: that
language is not a neutral phenomenon but one that bears deep
traces of social and political structures and processes in society.
The use of language, consequently, is always an activity that has
social and political dimensions: It can reproduce existing structures or challenge them, it can empower or disempower people,
and it can enfranchise and disenfranchise them.

State of Affairs
The political load of language is one of the central concerns
for critical discourse analysis (CDA), an approach to discourse analysis that, especially since the 1990s, explicitly focuses
on the ways in which discourse reflects power and social structure and constructs it (Fairclough 1989, 1992; Blommaert 2005).
It is CDA's stated goal to analyze "opaque as well as transparent
structural relationships of dominance, discrimination, power
and control as manifested in language" (Wodak 1995, 204), a
paradigmatic choice that is reflected in numerous studies on racism, sexism, media, and political discourse and advertisements.
In all of these, linguistic and textual patterns are analyzed as conduits for hegemony and power abuse, and CDA has been influential in identifying registers and genres of power and control.
CDA clearly subscribes to a view of language as "loaded" (cf.
Bolinger 1980) and as invested with social and political interests
that steer discourse into particular, structural (i.e., nonarbitrary)
patterns of use and abuse. The influence of Foucault, Gramsci,
Bourdieu, and other critical theorists is explicitly acknowledged
in much CDA work.
The same paradigmatic choice underlies work in what
could be called critical sociolinguistics: an approach in which
the distribution of language in society is also seen as a reflection of power processes, often crystallized in normative ("standards") discourses and invariably entailing judgments of users
through judgments of language use (e.g., Milroy and Milroy
1985; Cameron 1995; see standardization). Variation in language speaks to variation in society, and such forms of variation are evaluated, given different value. Institutionalization, such
as, for example, in the education system (Rampton 1995) or in
bureaucracy (Sarangi and Slembrouck 1996), can stabilize and
reify such evaluative patterns and use them as normative, exclusive, and excluding instruments of power and control. Sensitive
social identities, such as gender and immigrant identities, can
be especially vulnerable to exclusion or marginalization in such
reified normative structures.
Both CDA and critical sociolinguistics seek an integration of
the linguistic or discourse-analytic method with social theory,
thus reversing the tendency toward autonomy and disciplinary
recognizability that characterized earlier phases in the development of these disciplines (e.g., Cameron 1992; Chouliaraki and
Fairclough 1999). This move is aimed at strengthening the fundamental theoretical assumption: that language and social structure stand in an intricate relationship to each other and that one
cannot be understood without an understanding of the other.
From another theoretical angle, linguistic anthropology has
significantly contributed to the study of the politics of language.
In contrast to the previous schools, linguistic anthropology has its
roots in an integrated science of human behavior. The anthropological notion of language, consequently, appears easier to integrate into a mature social-theoretical framework than notions of
language that have their feet in twentieth-century linguistic traditions. The fact that language forms and structures need to be
seen as reflective and constructive of sociocultural and political
realities was central to Edward Sapir's work (1921), and the post-World War II reemergence of the ethnography of communication
(Gumperz and Hymes 1972; Gumperz 1982; Hymes 1996) started
from the assumption that there is no occurrence of language that
is not drenched in social, cultural, historical, and political contexts and that, consequently, can be understood without attention to these contexts (Duranti 1997). It is from within linguistic
anthropology that the paradigm of language ideologies developed (Schieffelin, Woolard, and Kroskrity 1998; Blommaert 1999;
Kroskrity 2000; Bauman and Briggs 2003).

Language Ideologies
Language ideologies are beliefs, ideas, views, and perceptions
about language and communication. Such ideational complexes
pertain to every aspect of communication: about linguistic forms
and functions, as well as about the wider behavioral frames (often
called "nonlinguistic") in which they occur. Thus, in the field of
language ideologies, people are seen to perform meanings, and
language in the narrow sense of the term is seen as just one mode
of meaning production. People produce semiosis (meaningful
symbolic behavior) as performance, and they do so within a
regimented field in which language ideologies produce stability
and recognizability. Seen from that perspective, language ideologies are of course not just ideational; they are practical in the
sense of Bourdieu, referring to the Marxian praxis, rather than to
the Mannheimian or Durkheimian notion of ideology.
The study of language ideologies emerged out of the Whorfian
concern with connections between language form and world
view (Hill and Mannheim 1992). To recap Benjamin Whorf's
basic idea, he argued that grammatical categories encoded
and thus revealed aspects of collective perceptions of reality;
as such, grammatical organization was not random, logical, or autonomous but cultural and social, and it displayed coherence with other aspects of social and cultural patterning. In that sense,
grammatical form responded to collective patterns that organized social and cultural behavior, including linguistic behavior.
The full richness of Whorf's approach was established by people like Michael Silverstein (1979). Silverstein suggested that we read Whorf's argument as follows: Linguistic form is indexical; it indexes aspects of context through ideological inferences: A particular form stands for a particular social and cultural meaning (also Silverstein 2003). Thus, in French, tu and vous
share a great deal of linguistic meaning but are differentiated by
indexical meanings; tu indexes a "low" second person singular addressee, while vous indexes a "high" second person singular addressee. The one who uses tu or vous would express indexically his/her degree of respect and social distance toward the interlocutor, and the interlocutor would attribute conventional identity features, such as "polite," "proper," "well educated," "middle class," and so on, to the one using these forms. Thus, we select
linguistic (and wider semiotic) forms in relation to socially and
culturally shared ideas about what would be appropriate, good,
useful, and salient communicative behavior in a specific context,
and our use of semiotic means creates, supports, and manipulates contexts.
This reconstruction of Whorf's foundational insight has significant implications. One effect is that it creates a new, but essentially inseparable, layer to language structure: a metapragmatic
layer. Accepting that layer means that the analyst must accept
that whenever we communicate, we not only communicate in
our communication but also communicate about our communication: We always flag socially and culturally shared (ideological) indexical meanings while we talk, and these indexicals
make others perceive our talk as "serious," "arrogant," "funny," or "knowledgeable." The metapragmatics of language organizes
its pragmatics, its meaning in society. And this, then, means
that approaches solely focused on a pragmatics of language risk
buying into commonly shared metapragmatic frames; in other
words, a "normal" linguistics always risks dragging along the
widespread language ideologies that dominate its object.
Another effect is that the range of variability in language is
vastly expanded, for the metapragmatic layer also provides an
enormous potential for social and cultural differentiation ("distinction," to borrow Bourdieu's term). In a nutshell, we can say
that every possible difference in language can become a socially
and culturally salient and important difference and that linguistic differences need not be big in order to generate important
social and cultural differences.

Evaluation
The idea that language is a politically invested object and that
people act politically in, through, and on language is by now a
well-established theoretical frame, the legitimacy of which no
longer requires debate. One reason is the fact that the different
approaches discussed here all have very strong empirical inclinations and that studies documenting the politics of language
often manage to transcend the slogans of a committed social
science and bring theoretical and methodological innovation to
the field. CDA has done much to sensitize discourse analysts at
large about the fact that discourse matters to people because it is invested with power and social capital; critical sociolinguistics has likewise drawn attention to the fact that sociolinguistic
distribution is not just a horizontal phenomenon but also a vertical one: Sociolinguistic difference is complemented by sociolinguistic inequality. And from within linguistic anthropology,
we have witnessed the emergence of a powerful ethnographic
paradigm that recovers the holistic and rich agenda developed
earlier by the likes of Sapir and Whorf and applies these insights
to an expanding field of fundamental and applied topics of language in society. The language-ideological approach appears to
be the most promising one because of its compelling theoretical coherence and empirical applicability, and it would benefit
adjacent disciplines if the central insight that every pragmatics of language is accompanied by a metapragmatics were adopted.
Jan Blommaert
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Bakhtin, M. M. 1981. The Dialogic Imagination: Four Essays.
Austin: University of Texas Press.
Barthes, R. 1957. Mythologies. Paris: Seuil.
Bauman, R., and C. Briggs. 2003. Voices of Modernity: Language Ideologies
and the Politics of Inequality. Cambridge: Cambridge University Press.
Blommaert, J. 2005. Discourse: A Critical Introduction. Cambridge: Cambridge University Press.
Blommaert, J., ed. 1999. Language Ideological Debates. Berlin: Mouton de
Gruyter.
Bolinger, D. 1980. Language: The Loaded Weapon. London: Longman.
Bourdieu, P. 1991. Language and Symbolic Power. Cambridge,
UK: Polity.
Cameron, D. 1992. Feminism and Linguistic Theory. London: Macmillan.
. 1995. Verbal Hygiene. London: Routledge.
Chouliaraki, L., and N. Fairclough. 1999. Discourse in Late Modernity: Rethinking Critical Discourse Analysis. Edinburgh: Edinburgh University Press.
Duranti, A. 1997. Linguistic Anthropology. Cambridge: Cambridge
University Press.
Fairclough, N. 1989. Language and Power. London: Longman.
. 1992. Discourse and Social Change. Cambridge, UK: Polity.
Foucault, M. 1984. "The order of discourse." In Language and Politics, ed. M. Shapiro, 108–38. London: Blackwell.
Gumperz, J. 1982. Discourse Strategies. Cambridge: Cambridge University Press.
Gumperz, J., and D. Hymes, eds. 1972. Directions in Sociolinguistics: The Ethnography of Communication. New York: Holt, Rinehart and Winston.
Hill, J., and B. Mannheim. 1992. "Language and world view." Annual Review of Anthropology 21: 381–406.
Hymes, D. 1996. Ethnography, Linguistics, Narrative Inequality: Toward
an Understanding of Voice. London: Taylor and Francis.
Kroskrity, P., ed. 2000. Regimes of Language. Santa Fe, NM: SAR.
Milroy, J., and L. Milroy. 1985. Authority in Language: Investigating
Language Prescription and Standardisation. London: Routledge.
Rampton, B. 1995. Crossing: Language and Ethnicity among Adolescents.
London: Longman.
Sapir, E. 1921. Language: An Introduction to the Study of Speech. Orlando,
FL: Harcourt Brace.
Sarangi, S., and S. Slembrouck. 1996. Language, Bureaucracy and Social
Control. London: Longman.
Schieffelin, B., K. Woolard, and P. Kroskrity, eds. 1998. Language
Ideologies: Practice and Theory. New York: Oxford University Press.

Silverstein, M. 1979. "Language structure and linguistic ideology." In The Elements, ed. P. Clyne, W. Hanks, and C. Hofbauer, 193–247. Chicago: Chicago Linguistic Society.
. 2003. "Indexical order and the dialectics of sociocultural life." Language and Communication 23: 193–229.
Voloshinov, V. N. 1973. Marxism and the Philosophy of Language. Cambridge, MA: Harvard University Press.
Wodak, R. 1995. "Critical linguistics and critical discourse analysis." In Handbook of Pragmatics: Manual, ed. J. Verschueren, J.-O. Östman, and J. Blommaert, 204–10. Amsterdam: John Benjamins.

POSSIBLE WORLDS SEMANTICS


Possible worlds semantics is a family of semantic theories in
which the truth conditions of modal concepts and other intensional locutions are expressed with the help of the concept of
possible world (scenario, possible state of affairs, possible course
of events). (See modality, intension and extension.)
Human beings constantly find themselves concerned with
what could happen or might have happened. The modal notions
of possibility and necessity are used to cope with such situations.
Less directly, notions like knowledge, belief, obligation, permission, and so on serve the same purpose. Concepts behaving in
essentially the same way as necessity, knowledge, and so on are
known as intensional concepts. Modal notions have several varieties, among them logical, conceptual, metaphysical, natural,
nomic, and physical modalities. When the different possibilities
can be weighted, one can also evoke the concept of probability.
It is nevertheless only relatively late that philosophers and
logicians came to think that in order to understand modal
notions (and other related notions), we have to consider unrealized courses of events or states of affairs and, hence, merely possible worlds. Earlier philosophers usually did not think in such
terms. For one thinker, Aristotle, the only reality is the succession
of present moments outside of which there are no other possible
courses of events. The idea of many worlds began its development in the Middle Ages, encouraged by the famous condemnation of 1277 of the view that God could not create other worlds.
The notion of possible world was put to major metaphysical uses
by G. W. Leibniz for whom metaphysical truths are truths holding in all possible worlds.
In twentieth-century philosophical logic, the notion of possible world became prominent when modal logic was approached
from a model-theoretical or semantic point of view. The use of the
notion of possible world in the study of modalities is analogous
to the measure-theoretical approach to probability theory, with
probability theorists' sample-space points playing the same role as logicians' possible worlds. One of the pioneers of the semantic study of modalities was Rudolf Carnap (1947), who explicitly
acknowledged the inspiration he received from Leibniz. The
early treatments of the logic and semantics of modalities nevertheless relied heavily on syntactical concepts and arguments.
For instance, Carnap represented possible worlds by sets of
sentences he called state-descriptions. A state-description is a
complete list of atomic sentences and the negations of atomic
sentences that are true in some model.
In such semisyntactical theorizing, interpretational questions
were neglected, relatively speaking. Fortunately, this neglect did
not initially matter. For what is the cash value of assuming that possible worlds exist? According to W. V. Quine, such existence
means that we can quantify over them (see quantification).
The starting point of possible worlds semantics is the insight that
many modal and other intensional concepts can be construed as
quantifiers over suitable classes of possible worlds. If NS means
it is necessary that S, it is true if and only if S is true in all possible worlds. It is possible that S, briefly PS, is true just in case
S is true in some possible world. If KaS means a knows that S,
it is true if and only if S is true in all the possible worlds not ruled
out by what a knows, and so on. Thus, the idea of possible worlds
was involved right from the beginning in the development of the
semantics of modal logic, following the work of Alfred Tarski and
his associates. (Cf. Copeland 2002; Kanger 1957; Hintikka 1957a,
1957b; Kripke 1959.) The first to emphasize the role of possible
worlds semantics as the basis of general semantics seems to have
been Richard Montague (cf. Montague 1974).
Even though this is, for most purposes, an adequate explanation of the meaning of KaS, the characterizations of necessity
and possibility need further specification, namely, an indication
of what kind of modality we are dealing with. For instance, not
all logically possible worlds are nomically (physically, naturally)
possible.
We thus seem to obtain a semantically interpreted language
by adding to a first-order language the operator or those operators we are interested in. On the basis of this idea, we can develop
much of a viable modal logic, epistemic logic, and so on, as well
as the required methods of proof.
This procedure is not sufficient alone, however. For one thing,
the possible worlds that figure in these explanations are relative
to the world w in which NS, PS, KaS, and so on are evaluated
semantically. They will be called alternatives to w. To deal with
iterated or multiple modalities, we have to consider alternatives
to alternative worlds, and so on. The alternativeness relation
involved here is sometimes called the accessibility relation.
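The quantificational reading of NS and PS, together with the alternativeness (accessibility) relation, can be sketched over a finite frame. The worlds, the accessibility relation, and the valuation below are toy data invented for illustration, not anything given in this entry:

```python
# Sketch: evaluating necessity (N) and possibility (P) over a finite
# Kripke-style frame. Worlds, accessibility, and the valuation are toy data.

worlds = {"w1", "w2", "w3"}
# alt[w] = the alternatives (worlds accessible from w)
alt = {"w1": {"w1", "w2"}, "w2": {"w2"}, "w3": {"w1", "w2", "w3"}}
# truth value of the atomic sentence S at each world
S_true = {"w1": True, "w2": True, "w3": False}

def N(s_true, w):
    """NS is true at w iff S holds in every alternative to w."""
    return all(s_true[v] for v in alt[w])

def P(s_true, w):
    """PS is true at w iff S holds in some alternative to w."""
    return any(s_true[v] for v in alt[w])

def NN(s_true, w):
    """Iterated modality: NNS quantifies over alternatives of alternatives."""
    return all(N(s_true, v) for v in alt[w])

print(N(S_true, "w1"))   # True: S holds at both alternatives, w1 and w2
print(P(S_true, "w3"))   # True: some alternative to w3 verifies S
print(N(S_true, "w3"))   # False: w3 is its own alternative, and S fails there
print(NN(S_true, "w1"))  # True: N holds at every alternative to w1
```

Making `alt` reflexive, transitive, or symmetric yields the familiar modal systems; the evaluation clauses themselves do not change.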
This does not yet completely determine the possible worlds
semantics. To see what is missing, consider how the references
of linguistic expressions are determined. The guiding principle of
possible worlds semantics is that the application of a language,
including the reference of any expression e, in a given world w
must depend only on that world. The way in which the reference
of e in w is determined is, therefore, codified in the function f that
determines the reference of e as a function f(w, e) only. We could
call the totality of these functions the reference system of the
language. For instance, the reference of "the 44th president of the United States" is whoever wins the 2008 election.
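The reference system can be sketched as a single function f(w, e) whose value depends only on the world of evaluation and the expression; the two worlds and the candidate names below are hypothetical placeholders:

```python
# Sketch: a reference system codified as one function f(w, e) -> referent.
# The worlds "w1"/"w2" and the candidate names are invented toy data.

winner_2008 = {"w1": "candidate_A", "w2": "candidate_B"}  # toy outcomes

def f(w, e):
    """Reference of expression e in world w, depending on w and e alone."""
    if e == "the 44th president of the United States":
        return winner_2008[w]  # non-rigid: the referent varies with the world
    raise KeyError(e)

print(f("w1", "the 44th president of the United States"))  # candidate_A
print(f("w2", "the 44th president of the United States"))  # candidate_B
```

A rigid designator would simply be an expression for which f returns the same individual at every world.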
These ideas of reference and meaning are, in fact, the cornerstone of the version of possible worlds semantics most extensively used in linguistics, known as Montague semantics (see
montague grammar). It was developed by Montague (1974)
and applied in linguistics most vigorously by Barbara Partee
(1976, 1989).
There are further problems in the development of possible
worlds semantics, however. When we use a quantifier, we consider each of its values as being the same individual in different
possible worlds. But how can such identities be recognized?
They cannot be established by examining the different possible
worlds in question independently of one another. For instance,
a name-conferring (dubbing) ceremony in one world does not automatically help the identification of the same individual in another world. Nor does the rest of the reference system help us
here. There must exist principles defining what counts as a single
individual across possible worlds. Their totality can be called an
identification system.
The nature of such identification has given rise to extensive
discussion and controversies. The identification system codified
in our language is largely independent of the reference system.
Indeed, there are two different kinds of identification actually
used in our conceptual system. An identification system can
be visualized as a kind of map shared by the possible worlds
between which the identification is to take place. In the most
common cases of identification, the map can be thought of as a
kind of universal registry of the relevant population. For instance,
if the files of the Social Security System were to serve as such a
system, I would know who someone is if and only if I knew his or
her social security number. Such identification could be called "public." An idea of how the criteria of public identification could
work can be obtained by considering how we reidentify objects
over time. Continuity considerations obviously play a major role,
but questions as to how objects behave over time also come into
play.
An individual's position in someone's perceptual space or remembered role in someone's past experiences can also serve as a framework of identification. The simplest framework of this kind is someone's visual space. Such forms of identification are called "perspectival." Among other expressions of our language,
demonstratives rely on perspectival identification. Their operation is illustrated by Bertrand Russell's onetime view that the only logically proper names of English are "this," "that," and "I." The explanation is that Russell tacitly presupposed only perspectival identification.
The distinction between perspectival and public identification gains further interest and robustness from the fact that these
two systems are, in the case of visual cognition, implemented
by different parts of the human brain (Vaina 1990; Hintikka and
Symons 2003). Since quantifiers depend on identification, they
acquire a different meaning according to the kind of identification presupposed.
These observations open the doors to extensive applications of logical languages with a possible worlds semantics. For
instance, a simple wh-statement "Alonzo knows who (call him or her x) is such that F[x]" can be expressed by a sentence of the form (∃x)KAlonzoF[x], where x ranges over persons. This shows how to formalize simple "knows who" statements in general. For example, (∃x)KAlonzo(Barbara = x) says that Alonzo knows of some particular individual x that Barbara is that x. This unmistakably means that Alonzo knows who Barbara is. Such statements may be contrasted to Ka(∃x)(b = x), which merely says that a knows that b exists.
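The de re/de dicto contrast between "a knows who b is" and "a knows that b exists" can be checked mechanically over a finite set of epistemic alternatives. The model below (two alternative worlds in which the name b picks out different individuals) is a toy construction, not from the entry:

```python
# Sketch: "knows who" vs. "knows that ... exists" over the worlds compatible
# with everything a knows. Worlds, domain, and referents are toy data.

domain = {"alice", "bob"}
epistemic_alts = ["v1", "v2"]          # a's epistemic alternatives
ref_of_b = {"v1": "alice", "v2": "bob"}  # who "b" is in each alternative

def knows_that_b_exists():
    """K_a (Ex)(b = x): in every alternative, b has some referent."""
    return all(ref_of_b[v] in domain for v in epistemic_alts)

def knows_who_b_is():
    """(Ex) K_a (b = x): one individual is b's referent in all alternatives."""
    return any(all(ref_of_b[v] == d for v in epistemic_alts) for d in domain)

print(knows_that_b_exists())  # True: b exists in every alternative
print(knows_who_b_is())       # False: b's identity varies across alternatives
```

The order of the `any`/`all` loops mirrors the order of the quantifier and the epistemic operator; swapping them is exactly the de dicto reading.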
This kind of variation of operator ordering (K versus (∃x)) cannot do the whole job, at least if we want to stay on the first-order level. In order to do so, we have to resort to a recently introduced idea of operator independence (Hintikka 2003).
Since modalities are characterized by quantification over possible worlds, the same kind of independence can obtain between
modal operators and quantifiers as between quantifiers. This
independence can be expressed by a slash. Thus, we can express (∃x)Ka(b = x) equivalently as Ka(∃x/Ka)(b = x), where (∃x/Ka) means that ∃x is independent of Ka. (Notice that by so doing, we can stay closer to the structure [word order] of the corresponding English knowledge statements.) In more complicated cases, we can, for instance, express "a knows which function g(x) is" by Ka(∀x)(∃y/Ka)(g(x) = y).

We cannot stay on the first-order level here without the independence indicator.
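The knows-which-function reading Ka(∀x)(∃y/Ka)(g(x) = y) can likewise be checked in a finite model: a knows which function g is just in case, for each argument x, all the alternatives compatible with a's knowledge agree on the value g(x). The alternatives and function tables below are invented:

```python
# Sketch: "a knows which function g is," read as K_a (Ax)(Ey/K_a)(g(x) = y).
# Epistemic alternatives and function tables are toy data.

args = [0, 1]
# g's table in each world compatible with what a knows
g_in = {
    "v1": {0: 0, 1: 1},
    "v2": {0: 0, 1: 2},
}

def knows_which_function():
    """For every x there is one y that is g(x) in every alternative,
    i.e., the alternatives all agree on g(x)."""
    return all(len({g_in[v][x] for v in g_in}) == 1 for x in args)

print(knows_which_function())  # False: g(1) varies across the alternatives
```

Here the independent (slashed) quantifier corresponds to choosing the witness y uniformly across alternatives, before the epistemic operator is evaluated.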
This does not clear up all interpretational problems, however.
We can still ask: What are the relevant possible worlds in different applications? This question is connected with the question as
to what kinds of modalities and other intensional notions there
are.
The characterization of possible worlds as represented by
maximal consistent classes of sentences of a given language has
encouraged the idea that what is intended by possible worlds are
indeed worlds in the sense of entire universes. However, a comparison with probability theory shows that such grandiose interpretations are neither unavoidable nor even preferable. In most
applications of probability theory, the possible worlds (sample
space points) are not worlds in any ordinary sense of the word.
They usually are what might be called scenarios, namely, courses of
events involving a small region of space-time, for example, tosses
of a die. Some probability theorists speak of "small worlds," and
practically all realistic applications of possible worlds semantics
are to such small worlds. In some of his work, Montague, in fact,
operates with contexts of use, rather than possible worlds.
There remains the question of different modalities. Are they
all viable in the light of possible worlds semantics? There are no
major unsolved conceptual problems about epistemic or doxastic modalities or other similar intensional concepts. The class
of alternative epistemic worlds has a clear meaning, or at least as clear a meaning as our language has. Logical (conceptual)
modalities are interpretable only if we look at the structure of
the possible worlds that they involve. If we begin to speak of all
possibilities of individual existence, the alternatives to a given
world do not form a viable class any more than the set of all sets
in set theory. The idea of natural possibility has a clear sense if it
is taken to mean conformity with natural laws (nomic necessity).
But when it is claimed that there exists a metaphysical necessity
separate from nomic (physical) and conceptual (logical) necessities, it is hard to see what is being meant. It is not enough to
claim that we have intuitions about them, for the very notion of
intuition in its current philosophical use is highly suspect.
Jaakko Hintikka and Risto Hilpinen
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Carnap, Rudolf. 1947. Meaning and Necessity: A Study in Semantics of
Modal Logic. Chicago: University of Chicago Press. An enlarged edition appeared in 1956.
Copeland, B. Jack. 2002. "The genesis of possible worlds semantics." Journal of Philosophical Logic 31: 99–137.
Hintikka, Jaakko. 1957a. Quantifiers in Deontic Logic. Helsinki: Societas
Scientiarum Fennica (Commentationes Humanarum Litterarum, Vol.
13.4).
. 1957b. "Modality as referential multiplicity." Ajatus 20: 49–64.

. 2003. "A second generation epistemic logic and its general significance." In Knowledge Contributors, ed. by V. Hendricks, K. F. Jørgensen, and S. A. Pedersen, 33–56. Dordrecht, the Netherlands: Kluwer Academic.
Hintikka, Jaakko, and John Symons. 2003. "Systems of visual identification and neuroscience: Lessons from epistemic logic." Philosophy of Science 70: 89–104.
Kanger, Stig. 1957. Provability in Logic. Stockholm Studies in Philosophy.
Vol. 1. Stockholm: Almqvist and Wicksell.
Kripke, Saul A. 1959. "A completeness theorem in modal logic." Journal of Symbolic Logic 24: 1–14.
Montague, Richard. 1974. Formal Philosophy: Selected Papers by
Richard Montague, ed. by Richmond Thomason. New Haven, CT: Yale
University Press.
Partee, Barbara H. 1989. "Possible worlds in model-theoretic semantics: A linguistic perspective." In Possible Worlds in Humanities, Arts and Sciences: Proceedings of Nobel Symposium, ed. by S. Allén, 93–123. Berlin and New York: Walter de Gruyter.
Partee, Barbara H., ed. 1976. Montague Grammar. New York: Academic
Press.
Vaina, Lucia. 1990. "What and where in the human visual system: Two hierarchies of visual modules." Synthese 83: 49–91.

POSSIBLE WORLDS SEMANTICS AND FICTION


The applications of the philosophical concept of possible world
to narrative and to fiction were first developed in the late 1970s
and early 1980s as a reaction to structuralist poetics, a
movement that adhered to Ferdinand de Saussure's conception of language as a self-enclosed system of signs. As Thomas Pavel
has argued, this theoretical position led to a "moratorium on representational topics" (1986, 6) and on the notion of reference
to a world external to language. In its literary applications, possible worlds (hence, PW) semantics is an attempt to restore the
relevance of mimesis, reference, and the question of truth without reducing the fictional text to an image of reality.
The logician Jaakko Hintikka (1989) describes the conception of language to which PW semantics seeks an alternative as
"language as the universal medium." According to this view, "all that language is good for is to enable us to talk about this world" (Hintikka 1989, 54). The primary target of his description is positivist philosophies that limit reference to an objectively existing
external reality, such as those of Gottlob Frege, Bertrand Russell,
and the early Ludwig Wittgenstein. For the positivist, a statement
concerning a nonexisting entity, such as Santa Claus or Emma
Bovary, is either false or indeterminate. It is, therefore, impossible
to differentiate the validity of statements made about imaginary
beings. Structuralism and deconstruction go even further in
their interpretation of language as universal medium by regarding it as the unique reality to which it is capable of referring.
To the conception of language as universal medium, Hintikka opposes what he calls "language as calculus." In this framework, you can, so to speak, stop your language and step off. In less metaphoric terms, you can discuss the semantics of your language and even vary systematically its interpretation. The operative word highlights the thesis that language is "freely interpretable like a calculus" (1989, 54). By virtue of this reinterpretability, language can be directed toward different domains of reference and
the truth value of propositions established separately for each of these domains. A statement can consequently be false in one
domain and true in another, and it becomes possible to assign a
positive truth value to the statement "Emma Bovary committed suicide by swallowing arsenic" for the world of Gustave Flaubert's novel, even though the sentence is false in the real world (unless we prefix it with "in Flaubert's novel").
Hintikka's conception of language as calculus relies on an
ontological model made of a plurality of worlds. A common justification for the postulation of multiple worlds is the intuitive notion
that things could have been different from what they are. Saul
Kripke formalized this intuition through a model that describes reality (the sum of the thinkable) as a set of elements hierarchically structured by the opposition of one element, which can be interpreted as the actual or real world, to all the other members of
the system. Kripke envisions a relation of accessibility that links
the actual world to those worlds that are possible but not actual.
Worlds not linked to the central element of the system are considered impossible worlds, but when the relation of accessibility is
interpreted as respect for the laws of logic (noncontradiction and
excluded middle), one may debate whether they are worlds at
all, rather than incoherent collections of propositions. There are,
however, other interpretations of accessibility that preserve the
world status of the inaccessible elements: for instance, nomological (respect of the laws of nature), epistemic (distinguishing what
is known, believed, and ignored), and deontic (based on what is
allowed, obligatory, and forbidden).
A question raised by Kripke's model is what distinguishes
the actual world from all the other members of the system.
According to a widespread view that may be called absolutist, the
actual world differs in ontological status from merely possible
ones in that this world alone presents an autonomous existence;
all the other worlds are the product of a mental activity, such
as dreaming, imagining, foretelling, promising, or storytelling.
David Lewis (1986, 84–91) proposes an alternative to the absolutist view known as modal realism. For Lewis, all possible worlds
are equally real, and all possibilities are realized in some world,
independently of whether somebody thinks of them or not. But if
all possible worlds are real, how does one pick one of these worlds
as actual? Lewis answers this question through an indexical conception of actuality. The reference of the expression the actual
world varies with the speaker, like the reference of the deictics I,
you, here, and now (see deixis). All possible worlds are consequently actual from the point of view of their inhabitants.
The indexical conception of actuality is very important for
the description of the readers experience of fiction. We normally think of fictional worlds as imaginary and as nonexisting.
We know that in contrast to our world, they are produced by a
human mind, the mind of the author. But this does not explain
how we relate to them. In contrast to hypothetical and counterfactual statements, whose reference to an imaginary world is
stressed by the conditional mode or by an if-then construct,
fictions are narrated in the indicative mode and, therefore, hide
the nonactual status of their reference world. Lewis accounts for
the formal similarity between fiction and statements of facts by
characterizing fiction as a story told as true about a world other
than the one we regard as actual by a narrator situated within
that other world. A nonfictional story, by contrast, is told as true
about our world by one of its members, and a counterfactual or hypothetical statement describes another world from the point of view of the actual world.
For the duration of our immersion in a work of fiction, we
regard, or rather pretend to regard, its world as actual. This
pseudo-actuality is produced by a gesture of imaginative recentering of the reader, spectator, or player into the fictional world
(Ryan 1991, 212). The experience of fiction has been compared
by Kendall Walton (1990) to a game of make-believe, but what
exactly is it that we pretend to believe when we immerse ourselves in a work of fiction? PW theory, and more specifically the
indexical theory of actuality, spells out the rules of this game as
pretending to believe that fiction describes a world that is both
real and actual. Pretending that this world is real means pretending that it exists independently of the text, while pretending that
it is actual means transporting oneself in imagination into this
world and adopting the point of view of one of its members.
Another of Lewis's contributions to the theory of fiction is the
elaboration of an algorithm for determining the truth or falsity
of interpretive statements made by the reader about fictional
worlds (1978). This algorithm is adapted from his famous analysis
of the truth conditions of counterfactual statements (1973).
Lewis's criterion for establishing the truth value of statements
about fiction, such as "Emma Bovary was a devoted mother," can
be paraphrased as follows: A sentence of the form "In the fiction
f, p" is true when some world where f is told as known fact and p
is true differs less, on balance, from the actual world than does any
world where f is told as known fact and p is false. This formula tells us that
Emma was not a model mother because, in order to accept this
interpretation, we would have to assume that the fictional world
adheres to a set of standards of good motherhood vastly different
from the values of our cultural corner of the actual world, though
nothing in the text authorizes such an assumption.
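Lewis's paraphrased criterion can be put in schematic form. The notation below is supplied for illustration and is not part of the original entry: W_f stands for the set of worlds where f is told as known fact, and w1 <@ w2 means that w1 is closer to the actual world @ than w2 is.

```latex
% "In the fiction f, p" is true iff some f-world where p holds
% is closer to actuality than every f-world where p fails.
\[
\text{In the fiction } f,\ p \ \text{ is true} \iff
\exists w_1 \in W_f \,\bigl(\, p(w_1) \,\wedge\,
  \forall w_2 \in W_f\, \bigl(\neg p(w_2) \rightarrow w_1 <_{@} w_2\bigr) \bigr)
\]
```

On this rendering, "Emma Bovary was a devoted mother" comes out false: the worlds where the novel is told as known fact and the statement holds are more distant from our actual standards of motherhood than those where it fails.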
Lewiss algorithm entails a fundamental principle for the
phenomenology of reading. This principle, called by Marie-Laure Ryan the principle of minimal departure, states that
when readers construct fictional worlds, they fill in the gaps in
the text on the basis of their experience of the actual world, and
they will not make gratuitous changes. The principle of minimal
departure can only be overruled by the text itself. For instance, if
a work of fiction mentions an elephant, the reader will imagine
the elephant as huge and gray unless the text describes it as a
polka-dotted pet the size of a chihuahua. Even then, the reader
will imagine that the elephant has thick skin, big ears, and tusks.
The assimilation of fictional worlds (a concept often used
informally by critics) to the more technical notion of possible
worlds can claim Aristotle as its forefather. As he writes in the
Poetics, "the function of the poet is not to say what has happened,
but to say the kind of thing that would happen, i.e., what is possible in accordance with probability and necessity" (1996, 16,
par. 5.5). But even if one extends the notion of possibility beyond
what could happen in our world so as to include the logically
coherent but nomologically impossible worlds of science fiction
and fantasy, the straightforward assimilation of fictional worlds
to possible worlds encounters difficulties.
Logicians consider possible worlds to be maximal states of
affairs, meaning by this formula that every proposition is either
true or false in a given world. But fictions are created by texts,
and texts can only assert a limited number of propositions. To
take a famous example, Shakespeare's tragedy implies that
Lady Macbeth had children, but it does not specify the number
of these children. Should one regard fictional worlds as radically
incomplete, as does Lubomír Doležel (1998, 223), a view implying that Lady Macbeth is a creature who lacks the feature "having
a determinate number of children"? Or should one apply minimal departure and assume that, by presenting her as a human
being, the text invites the reader to regard the number of her
children as unavailable information (as would be the case for
a flesh-and-blood woman) rather than as an ontological lack?
Walton's concept of make-believe offers a compromise between
these two interpretations: While readers know that fictional
worlds are the product of a finite number of textual assertions,
they imagine these worlds and their inhabitants as ontologically
complete.
Another problem with regarding fictional worlds as possible
worlds is the existence of fictions that do not respect the laws of
logic and consequently fail to satisfy the broadest notion of possibility. In logic, a single contradiction in a group of propositions renders the system radically inconsistent because this
contradiction allows everything (and its opposite) to be inferred.
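The inference pattern behind this claim is the classical principle of explosion; the derivation below is supplied for illustration and is not spelled out in the original entry.

```latex
\begin{align*}
1.\ & p \wedge \neg p  && \text{premise (the contradiction)} \\
2.\ & p                && \text{from 1, conjunction elimination} \\
3.\ & p \vee q         && \text{from 2, disjunction introduction} \\
4.\ & \neg p           && \text{from 1, conjunction elimination} \\
5.\ & q                && \text{from 3 and 4, disjunctive syllogism}
\end{align*}
```

Since q is arbitrary, every proposition and its negation become derivable from a single unrestricted contradiction.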
This makes it impossible to imagine a world. But in a fictional
text, transgressions of logic are not totally incompatible with the
mental construction of worlds. Logically impossible objects or
events can be limited to certain areas, comparable to the holes in
a Swiss cheese, and the reader remains capable of drawing inferences for the solid parts of the cheese. We can still imagine the
world of a time-travel story that presents impossible causal loops
or of a fantastic tale situated in an inconsistent space. But some
fictions generalize contradiction by systematically negating what
they assert or by presenting dreamlike situations that continually
morph into other situations. The reader of these texts can only
construe fragments of worlds that do not fit together. All fictions
project a set of meanings, but if we conceive worlds as relatively
stable totalities populated by individuals whose evolution maintains some continuity, the extent to which these meanings form
a world is variable. A fiction made of incompatible world fragments blocks the experience of immersion because it does not
offer a target for the recentering of the imagination.
Marie-Laure Ryan
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Aristotle. 1996. Poetics. Trans. and intro. Malcolm Heath. New
York: Penguin.
Doležel, Lubomír. 1998. Heterocosmica: Fiction and Possible Worlds.
Baltimore, MD: Johns Hopkins University Press.
Eco, Umberto. 1984. The Role of the Reader: Explorations in the Semiotics
of Texts. Bloomington: Indiana University Press. Outlines a semantics
of narrative based on possible worlds.
Hintikka, Jaakko. 1989. Exploring possible worlds. In Possible Worlds in
Humanities, Arts and Sciences: Proceedings of Nobel Symposium 65, ed.
Sture Allén, 52–73. Berlin: de Gruyter.
Kripke, Saul. 1963. Semantical considerations on modal logic. Acta
Philosophica Fennica 16: 83–94.
Lewis, David. 1973. Counterfactuals. Cambridge: Cambridge University
Press.
Lewis, David. 1978. Truth in fiction. American Philosophical Quarterly
15: 37–46.
Lewis, David. 1986. On the Plurality of Worlds. Oxford: Blackwell.
Maître, Doreen. 1983. Literature and Possible Worlds. London: Middlesex
University Press. Proposes a typology of fictions based on types of
possibility.
Martin, Thomas. 2004. Poesis and Possible Worlds. Toronto: University
of Toronto Press. Investigates the consequences of Jaakko Hintikka's
notions of language as universal medium versus language as calculus
for literary theory.
Pavel, Thomas. 1986. Fictional Worlds. Cambridge: Harvard University
Press.
Ronen, Ruth. 1994. Possible Worlds in Literary Theory.
Cambridge: Cambridge University Press. A critique of the use of the
concept of possible world by literary theorists.
Ryan, Marie-Laure. 1991. Possible Worlds, Artificial Intelligence and
Narrative Theory. Bloomington: Indiana University Press.
Walton, Kendall. 1990. Mimesis as Make-Believe: On the Foundations of
the Representational Arts. Cambridge: Harvard University Press.

PRAGMATIC COMPETENCE, ACQUISITION OF


The acquisition of pragmatic competence involves the development by the first or second language learner of a wide range
of skills and capacities in using language to interact. Together
with linguistic competence, it forms the core of communicative
competence, as defined by D. Hymes (1967, 1992; Foster-Cohen
2001), and includes how learners develop the ability to convey
and interpret linguistic (and nonlinguistic) messages; how they
expand their repertoire of communicative acts; how they come
to modulate those acts for such features as directness, politeness, and informativeness; and how they design their messages
so that they conform to the social norms of their community. It
also includes the developing ability to construct linguistic units
larger than an utterance, including extended conversational
exchanges, stories, descriptions, explanations of procedures,
and so on, as well as the very basic capacity to enter into interactions with other people and the turn-taking, topic initiation,
topic maintenance, and exchange opening and closing behaviors upon which any interaction depends.
Work in the acquisition of pragmatics has often resulted in
extended taxonomies of the speech-acts and types of talk that
learners acquire. A. Ninio and P. Wheeler (1984), for example,
developed a taxonomy of 70 distinct types of talk interchanges
acquired by children, categorized at the exchange level as negotiations, discussions, performances, and so on, and at the utterance
level in terms of communicative acts, such as directives, declarations, statements, questions, and the responses to each of these.
The taxonomy represents "a hypothesis about the organization
of the mental representation of communicative intents in the
mind" (Ninio and Snow 1996, 39) and has its roots in speech-act
theory, pioneered by Austin and Searle (see Searle 1969).
Other work has attempted to explore what is needed for children to be able to behave in ways that others accept as natural
and effective language use in a given community. E. Andersen
(1990), for example, examined children's growing knowledge of
the language appropriate to roles with significant power differentials, such as doctor versus patient or teacher versus pupil; others
have explored issues such as how children develop gender-differentiated ways of speaking (see gender and language).
Other studies have explored what kinds of social, cognitive, and
linguistic developments are needed before children can engage in
particular acts. J. Bernicot and V. Laval (2004), for example, have
explored how children learn to make a promise or understand
one. This kind of work addresses children's growing understanding of how the interpretation of linguistic expressions depends
crucially on inferencing: on correctly processing both the language
that is spoken and a wide range of contextual characteristics that
go well beyond the words actually uttered. In fact, the exploration
of children's capacity for drawing inferences in communication
is a healthy area of research in its own right. I. Noveck (2001), for
example, has explored children's interpretations of scalar implicatures (see conversational implicature) and suggested
that in certain ways, children are more logical in their interpretations of words, such as "some," than are adults.

Pragmatic Acquisition from Birth to Adolescence


Infants are surprisingly communicative, even from birth. Perhaps
because of mirror neurons (Bråten 2007), newborns respond
in kind to the social advances of adults and are soon able to initiate social exchanges with others through eye gaze, movement,
and vocalization (see communication, prelinguistic), as
well as to engage in elementary turn-taking. This is the beginning
of the development of pragmatic competence.
Elizabeth Bates and colleagues (Bates, Camaioni, and Volterra
1975) suggested that Austin's tripartite distinction between perlocution, illocution, and locution might be harnessed to
describe what happens in the first year of life. They suggested that
when an infant communicates a message (such as "I'm hungry")
through a cry but without the intention to communicate, it can
be seen as a sort of perlocutionary event (albeit without either
illocution or locution). Then, with the emergence of gestures
and specific patterns of eye gaze, illocutionary forces, such as
requests (perhaps achieved via a reaching hand coupled with
a glance at the interlocutor and a glance at the desired object)
or making an observation (perhaps through a similar eye gaze
pattern, but this time coupled with a pointing hand shape), can
be observed while still in the prelinguistic phase and therefore
prelocutionary. Finally, the emergence of recognizable words
adds the capacity for locution. Other researchers, such as John
Dore (1975) and Michael Halliday (1975), have also added to our
understanding of just how rich very young children's capacity
for pragmatic competence is before the emergence of significant
expressive language.
As productive expressive language emerges, from around
the first birthday, the communicative resources of the child
expand and go on expanding over the next two decades and,
for some, throughout life. Some communicative acts are easily accomplished with quite simple linguistic resources (greetings, leave-takings, agreements, and labeling of objects, for
example). Others, such as giving explanations, asking questions,
providing definitions, or making hypotheses, are dependent on
a more sophisticated level of language development. So, while
yes/no questions can be asked with only a rise of intonation
("Out?", for example), open-ended questions require at least
some control of the wh-question forms, for example, "What
Daddy do?" and "How you go there?" Interestingly (and often
irritatingly for the parent), the ubiquitous "Why?" seems to be
used in the first instance for its ability to elicit a response from
the interlocutor, rather than for its ability to seek reasons and
motivations, which at that stage are beyond the child's level of
cognitive development.
Not long after children acquire the ability to produce individual communicative acts, they begin to combine them into larger
language acts and events, such as stories, explanations, and
complex observations. Narratives range from the simple statement of a problematic event in the past followed by a solution
("The baby cried; the mummy picked it up") to heavily scaffolded
productions such as the following:
Ross (aged 2;6) and his mother are sitting together eating a
snack.
R: Sometimes Ross come out bed bed come out night.
M: What are you talking about? What about that bed at night?
Sometimes you what the bed at night?
R: Mmm.
M: What did you say?
R: In the dark.
M: In the dark!
R: Ross, erm, Ross runs in the dark.
M: Run in the dark?!
R: Ross runs.
M: You get out of the bed in the night did you and ran around
in the dark. That sounds a daft thing to do! (Foster-Cohen
1990)

Here, Ross's mother helps him get his story out piece by piece
and puts it together for him.
As children develop their story skills, we start seeing complex
depictions with recognizable phases and characteristic packaging of information (Labov and Waletzky 1967). The literature on
children's narratives has been enhanced by several large-scale
studies, such as that by Ruth Berman and Dan Slobin, whose collection of stories told in response to wordless books about a small
boy and a frog (known, naturally enough, as The Frog Stories)
has provided a cross-linguistic, cross-cultural view of how children develop the ability to tell a story (Berman and Slobin 1994;
Strömqvist and Verhoeven 2004).
Stories and other large discourse units are held together via
the coherence of their informational structure and by the
markers of cohesion that link individual utterances to each other.
The presentation of new information in relation to assumed or
known information is one key aspect of coherence, and requires
children to be able to infer what their interlocutor knows and
to structure the information provided accordingly. As such, the
development of coherence in children's narratives, and in their
language use generally, is dependent on the evolving understanding of other minds (see theory of mind and language
acquisition). When different and conflicting information is held
by the child narrator and by a protagonist in the story, considerable strain is placed on the young child's pragmatic
competence. An example occurs in the Frog, Where Are You?
story. At one point, the boy in the story grabs what he believes to
story. At one point, the boy in the story grabs what he believes to
be branches but the narrator knows to be the antlers of a deer.
Children struggle with how to represent this conflicting information, as the following representative samples from Berman and
Slobin's (1994) work suggest:

He hops on the deer. (4;7: no understanding of the boy's misjudgment of the branches)
Then he got on a reindeer, because the reindeer was hiding
there. (5;2: understanding that the reindeer was not initially
visible to the boy, but no attention to the boy's state of mind)
He got picked up by a reindeer. (5;8: use of the get-passive suggests the narrator is aware that the boy was not an intentional
agent)
He's holding on to some sticks. But they aren't really sticks.
When uh something came up, and the little boy was on it.
Um it was a father deer, I'd call it. (5;10: explicit recognition of the boy's misperception, though from the point of view
of the narrator, rather than the boy)
He thought it was sticks and he got on that and the deer
came and carried him. (5;11: explicit attribution of misperception to the boy; groping for means of encoding the unintentionality of the consequences)
And then he stands up on the rock and hangs onto some
branches. Then it turns out they're a deer's antlers. So and
he gets he lands on his head. (9;11: the turns-out construction provides a means of encoding the switch in perspective,
and the interrupted "he gets" suggests a groping for a passive
construction)
And finally, here is an adult version: When he gets to the top
of the rock, he holds onto something that he app- thinks are
branches, and calls to the frog. And what the boy took to be
branches were really antlers of a deer on which he gets caught.
These samples can also be used to illustrate cohesion markers.
In particular, we can see how the deer is introduced by the youngest child with a definite article, as if its presence were already
known. Almost all the other, older children use the indefinite
article appropriate to a first mention. All the children use "he"
to refer to the boy, which works in these examples. However, in
another example, taken from elsewhere in the database ("And
he [the deer] starts running. And he [the deer] tips him off over a
cliff into the water. And he [the boy] lands"), we can see how the
use of pronouns undergoes development in order for the hearer
to keep track of the protagonists reliably. If the simple story
referred to earlier had been "The baby cried; the Mummy picked
Johnny up," we would be forgiven for wondering whether Johnny
is the baby or someone else.
As they develop, children are able to rely more and more
on their own skills in pragmatics and depend less and less on
a cooperative other to make sense of what they are trying to
say. As a result, their interactions with their peers can begin to
mature, and they can develop the skills for working and learning cooperatively with children their own age. These conversations are often much more combative than any conversation
between nurturing parent and child, and children need to, and
do, develop important skills for repairing the misunderstandings
that inevitably arise. However, unlike grammatical development,
which is largely complete by the age of five, pragmatic competence keeps on developing. Teenagers continue to develop their
skills of staying on topic, interrupting appropriately, showing
empathy, and entertaining others by telling jokes and acting out
stories and events (Nippold 2000). Moreover, as professional
orators, stand-up comics, negotiators seeking the release of hostages, and those clinching business deals know, they may hone
their pragmatic skills for the rest of their lives.

Developing Pragmatic Competence in a Second Language


Most of the work on the acquisition of pragmatics has been carried out within first language research circles. However, there is
now a thriving research stream in second language pragmatics,
pioneered most notably by Gabriele Kasper (Kasper and Rose
2002). The significance of this work for those working in developmental pragmatics with children lies in the help it provides
for teasing apart those aspects of pragmatics that are pancultural
and part of the human makeup and those that are specific to particular language and cultural groups.
Research suggests that second language pragmatics is notoriously difficult to learn. There are a number of possible reasons.
One is that, unlike a grammatical error, inappropriate pragmatics is generally perceived by the other party as another pragmatic message.
So, an inability to respond to a compliment as a native speaker
might is perceived as ungratefulness or rudeness. Overlapping
another person's speech in a way that is not native can be perceived as interrupting. As a result, learners may not receive the
kind of feedback they actually need to adjust their pragmatics
toward the native-speaker norms. Another reason is that while
learning the grammar of a second language can be perceived
by the learner as simply the learning of a code (another way of
saying something), learning the pragmatics of another language
group is learning that group's culture and, as such, is felt more
deeply and more personally. As a French learner of English
once said when I was trying to teach the pragmatics of English,
"I want to learn English; I don't want to be English." Finally, and
relatedly, a further reason that second language pragmatics is so
hard to acquire is that researchers and pedagogues are not very
good at describing it in such a way that learners can, in fact,
learn it.

Disorders of Pragmatic Competence


Because the development of pragmatic competence depends
on a number of intersecting developments, there are multiple
ways in which it can be derailed or curtailed. Difficulties with
understanding the nature of social engagement, of inferring the
knowledge and intentions of another person, or of processing
the subtle cues of verbal and nonverbal communication can all
impact on a learner's capacity to become pragmatically competent. The best-known disorder of pragmatic competence is
autistic spectrum disorder, a condition that comes in a variety
of forms and degrees of severity. It can impact all of the aforementioned prerequisites for pragmatic competence. A variety
of other developmental conditions (including Down syndrome,
Williams syndrome, and global developmental delay) also
impact on pragmatic development. More avoidable are those
disruptions to pragmatic development that come about as a
result of neglect or abuse and the consequent failure to attach
effectively to one or more key people. It is now quite clear that
the development of social relationships, and with it pragmatic
competence, is dependent on experience in responsive and
pragmatically appropriate relationships. Even an intact child
can have his or her pragmatic competence derailed by poor
experiences.


How Pragmatics Is/Are Acquired


As with any human development, pragmatic competence is
acquired as a result of both biological design and social experience. As already indicated, the acquisition of pragmatic competence depends crucially on nonlinguistic factors, such as innate
social responsiveness, the development of real-world knowledge,
and general problem-solving ability. It has also been claimed to
depend more heavily on overt instruction than grammar does.
Children are observed being taught to be polite, to follow the
rules for trick-or-treating at Halloween, or to adjust their messages to take account of the other person's knowledge, and they
seem more able to learn from correction and overt modeling
than they are when their grammatical errors are corrected.
However, given the complexity of pragmatic development, the
huge amount of unconscious inferencing it requires, and the
difficulty of understanding what has gone wrong when pragmatic
rules and expectations have been violated, it is unlikely that more
than the most codified of pragmatic skills (politeness formulae,
terms of address, fixed events such as trick-or-treat) are acquired
in this manner. Rather, pragmatic skills are "caught" through
cultural contact and spread by epidemiological principles
(Sperber 1996).
Susan Foster-Cohen
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Andersen, E. 1990. Speaking with Style: The Sociolinguistic Skills of
Children. London: Routledge and Kegan Paul.
Bates, E., L. Camaioni, and V. Volterra. 1975. The acquisition of performatives prior to speech. Merrill-Palmer Quarterly 21: 205–26.
Berman, R., and D. I. Slobin 1994. Relating Events in Narrative: A
Crosslinguistic Developmental Study. Hillsdale, NJ: Lawrence
Erlbaum.
Bernicot, J., and V. Laval. 2004. Speech acts in children: The example
of promises. In Experimental Pragmatics, ed. Ira Noveck and Dan
Sperber, 207–27. Basingstoke, UK: Palgrave Macmillan.
Bråten, S., ed. 2007. On Being Moved: From Mirror Neurons to Empathy.
Amsterdam: John Benjamins.
Dore, J. 1975. Holophrases, speech acts and language universals.
Journal of Child Language 2: 20–40.
Foster-Cohen, S. 1990. The Communicative Competence of Young
Children. Harlow, England: Longman.
Foster-Cohen, S. 2001. Communicative competence: Linguistic aspects. In
International Encyclopedia of the Social and Behavioral Sciences, ed.
N. J. Smelser and P. B. Baltes, 2319–23. Amsterdam: Elsevier Science.
Halliday, M. 1975. Learning How to Mean: Explorations in the Development
of Language. London: Arnold.
Hymes, D. 1967. Models of the interaction of language and social setting. Journal of Social Issues 23: 8–28.
Hymes, D. 1992. The concept of communicative competence revisited. In
Thirty Years of Linguistic Evolution: Studies in Honor of René Dirven,
ed. M. Pütz, 31–58. Philadelphia: Benjamins.
Kasper, G., and K. Rose. 2002. Pragmatic Development in Second
Language. Oxford: Blackwell.
Labov, W., and J. Waletzky. 1967. Narrative analysis: Oral versions of
personal experience. In Essays on the Verbal and Visual Arts, ed.
J. Helm, 12–44. Seattle: University of Washington Press.
Ninio, A., and C. Snow. 1996. Pragmatic Development. Boulder,
CO: Westview.
Ninio, A., and P. Wheeler. 1984. A manual for classifying verbal communicative acts in mother-infant interaction. Working Papers in
Developmental Psychology, no. 1, Hebrew University, Jerusalem.

Nippold, M. 2000. Language development during the adolescent
years: Aspects of pragmatics, syntax, and semantics. Topics in
Language Disorders 20.2: 15–28.
Noveck, I. 2001. When children are more logical than adults: Investigations
of scalar implicature. Cognition 78: 165–88.
Searle, J. 1969. Speech Acts. Cambridge: Cambridge University Press.
Sperber, D. 1996. Explaining Culture: A Naturalistic Approach.
Cambridge, MA: Blackwell.
Strömqvist, S., and L. Verhoeven. 2004. Relating Events in Narrative.
Vol. 2. Typological and Contextual Perspectives. Mahwah, NJ: Lawrence
Erlbaum.

PRAGMATICS
Pragmatics refers to the study of meaning in context. Consider,
for example, the following exchange between two close friends:
Harvey: Are you going to the big party tonight?
Molly: Didn't you hear that Jason would be there?

How does Harvey interpret Molly's response to his question?


Although Molly's response is itself a question, it is considered
an appropriate answer to Harvey's original question, at least
in this context, given the assumption that Harvey knows what
Molly feels about Jason. Of course, listeners who do not know
how Molly feels about Jason would be unable to infer whether
Molly implies "yes" or "no" by her response. But the information that Harvey and Molly share about Jason, and particularly
Molly's thoughts about Jason, such as that he is an ex-boyfriend
whom she wishes to avoid, should allow Harvey to easily infer
what Molly means by what she says.
People's pragmatic understanding of speakers' utterances
in context is assumed to rely on their general knowledge of the
world, the specific discourse context, and what they know about
their interlocutors. Pragmatics is seen as distinct from semantics
in referring to contextual meaning, as opposed to context-invariant word meaning or sentence meaning, and is also viewed as
being associated with what speakers imply, as opposed to what
they literally say. Philosophers interested in ordinary language use, and not more narrowly semantic meaning, launched
the study of pragmatics in the late 1950s. For instance, J. L. Austin
(1962) described the ways in which people use words to accomplish different social actions, and he demonstrated that speakers
typically intend to communicate different or additional meanings
beyond what their words literally say (see performative and
constative). Thus, when a speaker says, "I'll lend you five dollars," she communicates a promise to actually give the listener $5.
The philosopher John Searle (1975) later argued that there were
only five major types of speech-acts by which speakers perform
acts with different illocutionary force, including:
Representative or Assertive: The speaker becomes committed
to the truth of the propositional content of an utterance, such
as asserting "The sun is shining today."
Directive: The speaker tries to get the hearer to fulfill what
is represented by the propositional content of an utterance,
such as "Please stop talking."
Commissive: The speaker commits to act in the way represented by the propositional content of an utterance, such as
"I'll lend you five dollars."
Expressive: The speaker expresses an attitude toward the
propositional content of an utterance, such as "I'm sad your
wallet was stolen."
Declarative: The speaker performs an action just by representing
himself or herself as performing that action, such as "We find
the defendant guilty of murder in the first degree."
Philosophers have explored the various social and institutional facts that must hold for an utterance to be faithfully seen
as an example of any of these speech-acts, such as whether an
individual must be capable of fulfilling the act represented in
an utterance for it to be seen as a sincere promise (i.e., that the
speaker actually has $5 to loan and can give this money to the
listener).
Pragmatic theory, however, has paid greater attention in the
last 40 years to the process by which listeners infer what speakers
mean by what they say. Recall the conversation between Harvey
and Molly. Understanding that Molly's comment is meant as
a particular answer to Harvey's question requires that Harvey
go through a chain of reasoning regarding Molly's intentions,
because her answer does not logically follow from his question. The philosopher H. Paul Grice called the intended message
behind Molly's utterance a conversational implicature,
which is a natural outcome of speakers' and listeners' tacit
adherence to the cooperative principle. This states that a
speaker must "make your conversational contribution such as
is required, at the stage at which it occurs, by the accepted purpose or direction of the talk exchange in which you are engaged"
(Grice 1975, 45). The cooperative principle carries with it four
maxims:
Maxim of Quantity: Make your contribution as informative as
is required, but not more so, for the current purposes of the
exchange.
Maxim of Quality: Do not say anything you believe to be false
or for which you lack adequate evidence.
Maxim of Relation: Say only what is relevant for the current
purposes of the conversation.
Maxim of Manner: Be brief, but avoid ambiguity and obscurity of expression.
Grice noted that speakers do not always uphold these maxims. So long as speakers generally adhere to the overall cooperative principle, they can flout any of these maxims to produce
certain implicatures. For example, Molly's response to Harvey's
question flouts the maxim of manner to implicate that she is not
going to the party because of Jason. According to Grice's analysis, Harvey would not consider Molly's response to be uncooperative. Instead, Harvey would continue to assume that Molly's
rhetorical response was cooperative and would seek an interpretation given what he assumes about Molly, and what he believes
Molly assumes about him, in order to derive an acceptable and
authorized interpretation.
One place where speakers flout conversational maxims is in
their use of figurative language, such as metaphor (e.g., "Lawyers
are sharks") and irony (e.g., "A fine friend you are!"). Grice's
theory assumes that figurative language is understood in a series
of steps (1975). First, listeners analyze the literal meaning of the
entire expression. Second, they assess whether this literal interpretation is appropriate for the specific context. Third, if the literal
meaning is contextually inappropriate, as is the case for figurative
language, listeners must then derive the intended figurative (e.g.,
metaphorical, ironic) meaning via the cooperative principle. This
view suggests, then, that figurative language should be more difficult to comprehend than corresponding literal speech, because
figurative speech requires an additional processing step in which
the literal meanings are rejected and the intended figurative
meanings are subsequently inferred.
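These serial steps can be rendered as a toy procedure. This is a sketch for illustration only: the lexicon of literal glosses, the contextual-fit test, and the `figurative_gloss` slot are hypothetical placeholders, not parts of Grice's account.

```python
# Toy sketch of the Gricean serial ("standard pragmatic") model of
# figurative language comprehension. The glosses and context test
# are illustrative assumptions, not Grice's own machinery.

def literal_meaning(utterance):
    # Step 1: compose the literal meaning of the whole expression.
    glosses = {"Lawyers are sharks": "lawyers are predatory fish"}
    return glosses.get(utterance, utterance)

def fits_context(meaning, context):
    # Step 2: test the literal reading against the specific context.
    return not any(word in meaning for word in context.get("absurd_if", []))

def interpret(utterance, context):
    literal = literal_meaning(utterance)
    if fits_context(literal, context):
        return literal                    # literal reading accepted
    # Step 3: literal reading rejected; derive a figurative meaning
    # via the cooperative principle (stubbed here as a lookup).
    return context["figurative_gloss"]

context = {"absurd_if": ["fish"],
           "figurative_gloss": "lawyers are aggressive and ruthless"}
print(interpret("Lawyers are sharks", context))
# -> lawyers are aggressive and ruthless
```

The sketch makes the model's key prediction visible: the figurative route always costs an extra step after the full literal analysis, which is exactly the claim the psycholinguistic evidence discussed below calls into question.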
Many pragmatic theories, especially in philosophy, embrace
all or some of the Gricean view of conversational implicature and
his specific proposals on understanding indirect and figurative
language. Indeed, much of the focus in philosophical and linguistic studies on pragmatics is devoted to demonstrating how classic semantic phenomena, such as reference, indexicals, and
demonstratives, can be explained in terms of an understanding of
the specific facts about the speaker, time, and location of an utterance (Kaplan 1989; Stalnaker 1999). But psychological experiments
have raised important questions about Grice's theory. Although
there is considerable evidence showing that speakers generally
aim to be cooperative, with talk being primarily organized around
the recovery of speakers' pragmatic intentions (Clark 1996; Gibbs
1999), it is less clear that meaning is processed in the serial manner
that Grice and other pragmatists assume. For instance, numerous
psycholinguistic studies indicate that many kinds of figurative language, including novel metaphors, can be understood as quickly
as literal speech when these expressions are encountered in rich
linguistic contexts (Gibbs 1994). Thus, pragmatic knowledge may
be immediately accessed and applied in order to understand what
speakers imply by what they say, without listeners first having to
analyze the literal, semantic meaning of utterances.
A different proposal on the pragmatics of utterance interpretation assumes that speakers aim to be optimally relevant
in saying what they do. Optimizing relevance is a fundamental
tenet of relevance theory (Sperber and Wilson 1995). Under
this optimally relevant view, every act of ostensive behavior
communicates a presumption of its own optimal relevance,
that is, a presumption that it will be "relevant enough to warrant the addressee's attention and as relevant as compatible
with the communicator's own goals and preferences" (the communicative principle of relevance). Speakers design their utterances to maximize the number of cognitive effects that listeners
infer, while minimizing the amount of cognitive effort to do so.
Listeners understand speakers communicative intentions via
the relevance-theoretic comprehension procedure (Sperber
and Wilson 2002), by following a path of least effort in computing
cognitive effects. They do this by testing interpretive hypotheses
(e.g., disambiguations, reference resolutions, implicatures) in
order of accessibility, and then stopping when their expectations
of relevance are satisfied.
For example, consider the following exchange between two
university professors (Sperber and Wilson 2002, 19):
Peter: Can we trust John to do as we tell him and defend the
interests of the Linguistics Department in the University
Council?
Mary: John is a soldier!
Pragmatics
How does Peter understand Mary's metaphorical assertion
about John? Peter's mentally represented concept of a soldier
includes many ideas that may be attributed to John. Among
these are (a) John is devoted to his duty, (b) John willingly follows
orders, (c) John does not question authority, (d) John identifies
with the goals of his team, (e) John is a patriot, (f) John earns a
soldier's pay, and (g) John is a member of the military. Each of
these ideas may possibly be activated to some degree by Mary's
use of soldier in relation to John. However, certain of these
attributes may be particularly accessible given Peter's preceding question, where he alludes to trust, doing as one is told, and
defending interests. Following the relevance-theoretic comprehension procedure, Peter considers these implications in
order of accessibility, arrives at an interpretation that satisfies
his expectations of relevance at (d), and stops there. He does not
even consider further possible implications, such as (e)–(g), let
alone evaluate and reject them. In particular, Peter does not
consider (g), the literal interpretation of Mary's utterance, contrary to what is advanced by the Gricean view, and consistent
with the psychological evidence on inferring metaphorical
meaning.
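The comprehension procedure traced in this example can be sketched as a simple loop. The attribute list, its accessibility ordering, and the relevance test below are assumptions made for illustration; they are not claims of Sperber and Wilson's model itself.

```python
# Minimal sketch of the relevance-theoretic comprehension procedure:
# test interpretive hypotheses in order of accessibility and stop at
# the first one that satisfies the hearer's expectation of relevance.

def comprehend(hypotheses, satisfies_relevance):
    considered = []
    for h in hypotheses:                  # path of least effort:
        considered.append(h)              # most accessible first
        if satisfies_relevance(h):
            return h, considered          # stop; later hypotheses
    return None, considered               # are never evaluated

# A hypothetical accessibility ordering of the soldier attributes:
soldier_attributes = [
    "devoted to his duty",                     # (a)
    "willingly follows orders",                # (b)
    "does not question authority",             # (c)
    "identifies with the goals of his team",   # (d)
    "is a patriot",                            # (e)
    "earns a soldier's pay",                   # (f)
    "is a member of the military",             # (g) literal
]

# Peter's question concerned trust and defending the department's
# interests, so (d) is the first attribute relevant enough for him.
relevant = lambda attr: "goals of his team" in attr

interpretation, considered = comprehend(soldier_attributes, relevant)
print(interpretation)   # -> identifies with the goals of his team
print(len(considered))  # -> 4; (e)-(g), including the literal
                        #    reading, are never even considered
```

The design point of the procedure is captured in the early `return`: the literal interpretation (g) is not computed and rejected, it simply never comes up, which is what distinguishes this account from the Gricean serial model.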
Relevance theory has also advanced the idea that significant aspects of what speakers say, and not just what they totally
communicate, are deeply dependent upon enriched pragmatic
knowledge. Essentially, the same sorts of inferential processes
used to determine conversational implicatures also enter into
determining what speakers say (Carston 2002; Recanati 2004;
Sperber and Wilson 1995). Consider a case where a speaker says
to you "I haven't eaten" in response to a question about whether
she found time for breakfast that morning. Once the indexical references and the time of the utterance are fixed, the literal meaning of the sentence determines a definite proposition, with a
definite truth condition, which can be expressed as "The speaker
has not eaten prior to the time of the utterance." This paraphrase
reflects the minimal proposition expressed by "I haven't eaten."
However, a speaker of "I haven't eaten" is likely to be communicating not a minimal proposition but some pragmatic expansion
of it, such as "I haven't eaten today." This possibility suggests
that significant pragmatic knowledge plays a role in enabling
listeners to expand upon the minimal proposition expressed in
order to recover an enriched pragmatic understanding of what
a speaker says. Several experimental studies indicate that pragmatics plays a major role in people's intuitions of what speakers say (Gibbs and Moise 1997). Thus, the distinction between
what speakers say and imply may possibly be orthogonal to any
distinction between semantics and pragmatics, contrary to the
traditional Gricean view.
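The contrast between the minimal proposition and its pragmatic expansion can be illustrated schematically. The dictionary representation of propositions and the enrichment rule below are hypothetical simplifications introduced only to make the distinction concrete.

```python
# Toy illustration of pragmatic expansion (enrichment): the same
# sentence yields a minimal proposition and a contextually enriched
# one. The propositional representation is an illustrative stand-in.

def minimal_proposition(sentence, utterance_time):
    # Fix the indexical ("I") and tense against the utterance time
    # alone; this is all that semantics proper supplies.
    if sentence == "I haven't eaten":
        return {"agent": "speaker", "predicate": "not eaten",
                "interval": ("beginning of time", utterance_time)}
    raise ValueError("toy grammar covers only one sentence")

def enrich(minimal, context):
    # Pragmatic expansion: narrow the time interval to the span made
    # salient by the conversation (here, a question about breakfast).
    enriched = dict(minimal)
    enriched["interval"] = (context["salient_start"],
                            minimal["interval"][1])
    return enriched

m = minimal_proposition("I haven't eaten", "9 a.m.")
e = enrich(m, {"salient_start": "start of today"})
print(m["interval"])  # -> ('beginning of time', '9 a.m.')
print(e["interval"])  # -> ('start of today', '9 a.m.')
```

The point of the sketch is that the enriched proposition ("I haven't eaten today") differs truth-conditionally from the minimal one, yet it is the enriched version that listeners take the speaker to have said.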
The vast number of studies on a wide assortment of linguistic
and nonlinguistic phenomena conducted within the relevance
theory framework makes it the most salient model of pragmatics and utterance interpretation available today. At the very least,
part of relevance theory's significant appeal in interdisciplinary
language studies is its explicit aim to situate pragmatics within
broader concerns of human cognition and communication,
through its embrace of the principles of relevance. Not surprisingly, relevance theory has its critics, ranging from scholars,
primarily in linguistics, who assume that utterance meaning is
determined by heuristics of default or preferred interpretations
(Horn 2004; Levinson 2000) to psychologists who fault relevance
theory for the circularity in its proposed trade-off between maximizing cognitive effects and minimizing cognitive effort (Giora
1997).
Psychological studies on pragmatics have primarily examined figurative language understanding and the degree to
which speakers and listeners coordinate during conversational
exchanges. Neither the Gricean nor the relevance theory perspective assumes that speakers and listeners rely on some definitive
common ground in order for conversation to proceed smoothly.
Some psychologists, however, have demonstrated through various empirical means that speakers and listeners actively collaborate and coordinate their beliefs and knowledge to achieve
mutual understandings in different contexts (Clark 1996; Gibbs
1999).
For example, research shows that speakers take the addressee's perspective into account when designing their utterances
in naturalistic, task-oriented dialogue. One set of studies had
two people, who could not see each other, collaborate over the
arrangement of Tangram figures (geometric shapes that are
vaguely suggestive of silhouettes of people and other objects)
(Clark and Wilkes-Gibbs 1986). One person (the "director")
had an ordered array of these figures and had to explain their
arrangement to the other (the "matcher") so that the other person could reproduce the arrangement. Each director-matcher
pair did this six times. The main hypothesis was that, as common
ground is established between the director and matcher during
the conversation, it should be easier for them to mutually determine where each figure should go. As expected, the number of
words used per Tangram figure fell from around 40 in the first
trial to around 10 in the last. For instance, a speaker referred
to one figure in Trial 1 by saying "All right, the next one looks
like a person who's ice skating, except they're sticking two arms
out in front," while in Trial 6 the speaker said "The ice skater."
A similar decline was observed in the number of turns required
to complete the arrangement task, showing that the interchange
became more economical as common ground was established.
Other studies using this experimental paradigm indicate that
speakers and listeners can also coordinate to hide information
from overhearers without damaging their own understanding
of each other's communicative meanings (Clark and Schaefer
1987).
These data demonstrated that the assessment of common
ground plays an integral part in determining what speakers specifically say and in facilitating listeners' recovery of speakers'
intentions. One implication of these findings is that utterance
interpretation is a joint activity of both speakers and listeners,
and not solely the responsibility of listeners. Indeed, psychological studies also demonstrate that conversational participants
typically try to reach the mutual belief that the addressees have
understood what the speaker meant to a criterion sufficient for
current purposes. Thus, when Molly speaks, she looks for evidence from Harvey that he has understood her. Harvey, in turn,
tries to provide that evidence by saying "oh right," nodding his
head, or taking the relevant next turn. Of course, the collaboration and coordination between speakers and listeners reflects
the operation of rapid, mostly unconscious, comprehension
processes. Conversational participants are rarely aware of the
cognitive and linguistic processes that underlie their understanding of others' pragmatic intentions, unless the attempt to
coordinate fails and leads to misunderstandings.
Not all psychologists agree that speakers and listeners always
aim to be cooperative in conversation by taking the other person's perspective into account during speaking and listening.
Some experiments show, for example, that speakers and listeners can each adopt an egocentric bias as they speak and comprehend, particularly when they experience additional cognitive
load or stress (Horton and Keysar 1996). Speakers also sometimes overestimate how effectively they are communicating their
messages to listeners, with listeners also sometimes assuming
that they correctly understood speakers when in fact they did not
(Keysar and Henly 2002). These studies show how there are at
least some systematic sources of misunderstanding attributable
to what might be best characterized as an egocentric bias in communication effectiveness.
The study of pragmatics will undoubtedly continue to have a
strong interdisciplinary flavor in the future. Scholarly intuitions
about how knowledge of the world and context shape utterance
interpretation must be supplemented by experimental studies that examine fast, unconscious cognitive and linguistic processes operating when speaker meaning is understood. We need
to understand not only what pragmatic information shapes contextual meaning but also when that knowledge is recruited in the
psychology of ordinary language interpretation.
Raymond W. Gibbs, Jr. and Gregory A. Bryant
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Austin, John. 1962. How to Do Things with Words. Oxford: Clarendon.
Carston, Robyn. 2002. Thoughts and Utterances: The Pragmatics of Explicit
Communication. Oxford: Blackwell.
Clark, Herbert. 1996. Using Language. New York: Cambridge University
Press.
Clark, Herbert, and Edward Schaefer. 1987. Concealing meaning from
overhearers. Journal of Memory and Language 26: 209–25.
Clark, Herbert, and Deanna Wilkes-Gibbs. 1986. Referring as a collaborative act. Cognition 22: 1–39.
Giora, Rachel. 1997. Discourse coherence and theory of relevance: Stumbling blocks in search of a unified theory. Journal of
Pragmatics 27: 17–34.
Gibbs, Raymond. 1994. The Poetics of Mind: Figurative Thought, Language,
and Understanding. New York: Cambridge University Press.
———. 1999. Intentions in the Experience of Meaning. New York: Cambridge
University Press.
Gibbs, Raymond, and Jessica Moise. 1997. Pragmatics in understanding
what is said. Cognition 62: 51–74.
Grice, H. Paul. 1975. Logic and conversation. In Syntax and Semantics.
Vol. 3: Speech Acts. Ed. Peter Cole and Jerry Morgan, 41–58. New
York: Academic Press.
Horn, Larry. 2004. Implicature. In The Handbook of Pragmatics, ed.
Larry Horn and Gregory Ward, 3–28. Oxford: Blackwell.
Horton, William, and Boaz Keysar. 1996. When do speakers take into
account common ground? Cognition 59: 91–117.
Kaplan, David. 1989. Demonstratives. In Themes from Kaplan, ed.
Joseph Almog, John Perry, and Howard Wettstein, 481–563. New
York: Oxford University Press.
Keysar, Boaz, and Anne Henly. 2002. Speakers' overestimation of their
effectiveness. Psychological Science 13: 207–12.
Levinson, Stephen. 2000. Presumptive Meanings. Cambridge, MA: MIT
Press.
Recanati, Francois. 2004. Literal Meaning. New York: Cambridge
University Press.
Searle, John. 1975. A taxonomy of illocutionary acts. In Language, Mind,
and Knowledge, ed. Keith Gunderson, 344–69. Minneapolis: University
of Minnesota Press.
Sperber, Dan, and Deirdre Wilson. 1995. Relevance: Communication and
Cognition. 2d ed. Oxford: Blackwell.
———. 2002. Pragmatics, modularity, and mind-reading. Mind and
Language 17: 3–23.
Stalnaker, Robert. 1999. Context and Content. Oxford: Oxford University
Press.
PRAGMATICS, EVOLUTION AND
For at least 100,000 years, human beings have been talking the
way we do. In every culture, most individuals use language for
several hours each day, primarily during conversational chatter (Dunbar 1998). How did our species come to
adopt such a strange behavior in the course of its evolution? The
question has been considered in turn as obvious and baffling.
A proper approach to the reasons why we talk requires that the
biological function of language be understood, and pragmatics is
the right place to seek out that function.
If we adopt the perspective of an ethologist, then language
appears as a distinctive feature of our species that, like any other
finely designed characteristic, must have a definite function to
have been selected through the repeated effect of differential
reproduction (Pinker and Bloom 1990). For decades, language
was thought to be essentially a means for organizing, responding
to, and manipulating the behavior of others (Brown 1991, 130) or
a tool for sharing knowledge (Pinker 1994, 367), and it was considered obvious that it had been selected for these purposes. This
traditional view has now lost most of its obviousness for two reasons: 1) Its logic contradicts Darwinian principles, and 2) what
people spontaneously do with language corresponds to a quite
different picture. We examine these two issues in turn, before
considering more plausible alternatives.
Any evolutionary account of the existence of language must
make clear what biological advantage both speakers and listeners get out of speaking. In many traditional accounts, the fact that
listeners benefit from receiving information is taken as
sufficient explanation for the existence of language, but language
cannot evolve if there is no direct or indirect advantage on the
speaker's side. If language is a way of influencing others' behavior, the speaker's advantage is now obvious, but Darwinian selection should have led to resistance on the listener's side: There is
an advantage in ignoring signals that aim to bring you to serve
the interests of others.
One of the most striking and incomprehensible facts about
human language is that it relies on a positive attitude from speakers. Speakers bear all the burden of designing appropriate (Grice
1975) or even optimal (Sperber and Wilson 1986) messages to
convey intentional meaning. If they do so spontaneously and
often quite profusely, it must be because they gain some benefit
from it. Listeners, on the other hand, show much trust in what
they hear. Given that talk is cheap, the fact that listeners give credence to most of what they hear is hard to explain
in a Darwinian world in which creatures are designed to favor
their own success, not the success of others (Knight 2002). The
absence of trust is what explains the repetitiveness, the cost, and
the poverty of most animal communication (Zahavi and Zahavi
1997).
These concerns about the speaker's willingness to speak and
the listener's willingness to trust have no known solution within frameworks
in which language acts are supposed to provide immediate
benefit to either party. It has been suggested that information
exchange through language could be based on reciprocity (Pinker
2003, 28; Nowak and Sigmund 2005, 1293). The reciprocation
model, however, functions under strict limits: a good benefit-to-cost ratio and strict control of reciprocity. It is at odds with several observations about spontaneous language, such as the fact
that many conversational utterances are about futile topics, or
the fact that talkative behavior is far from being an exception: On
average, individuals typically talk to two persons simultaneously
(Dunbar, Duncan, and Nettle 1995).
The utilitarian conceptions of language that inspired most
traditional ideas about its biological role are dictated mainly by
theoretical considerations. Some theories emphasize the role
of language in performing actions; it is thus natural to imagine
language as having emerged from simple directives (Holdcroft
2004). Other theories see in language a process through which
individuals actively try to influence the beliefs of others (Sperber
and Origgi 2010). A natural strategy, to decide which aspect of
language use is most likely to have given a biological advantage
both to speakers and listeners, is to observe how current human
beings spontaneously talk.
Conversation constitutes by far and universally the main
occasion in which language is used. Conversational activity,
however, is not monolithic. When chatting, individuals show
essentially two forms of behavior: They tell stories and they
pursue argumentative discussions. Even if both are often intertwined, it is important to distinguish narration and argumentation, as they involve quite different cognitive processes and
might have arisen successively during evolution. Conversational
narrative analysis shows that narratives fill up to one-half of
our speaking time (Eggins and Slade 1997, 265) and may represent some 10 percent of our waking time. Speakers take time,
sometimes several minutes, to recount some past situation in
minute detail (Norrick 2000). Not all situations are likely to be
reported: Only those that can elicit specific emotions, especially
surprise, are recounted (Dessalles 2007). The following example, adapted from Norrick (2000, 55–6), is about an unexpected
encounter:
Brianne: It was just about two weeks ago. And then we did
some figure drawing. Everyone was kind of like, oh my God,
we can't believe it. We- y'know, Midwest College, y'know,
like a nude models and stuff. And it was really weird, because then, like, just last week, we went downtown
one night to see a movie, and we were sitting in [a restaurant],
like downtown, waiting for our movie, and we saw her in the
[restaurant], and it was like, that's our model (laughing) in
clothes
Addie: (laughs) Oh my God.
Brianne: we were like oh wow. It was really weird. But it was
her. (laughs)
Addie: Oh no. Weird.
Brianne: I mean, that's weird when you run into somebody in
Chicago.
Addie: yeah.
Stories come in chunks, the so-called "story rounds" (Tannen
1984, 100), which may last for tens of minutes. The biological
significance of this systematic and universal tendency to report
emotional and unexpected events lies quite far away from any
immediate utilitarian effect, like behavioral influence or vital
knowledge transfer.
During argumentation, in contrast with narration, individuals are not bound to mention fully instantiated states of affairs.
They may even utter quite general statements to make a point.
Argumentation can be described, at the cognitive level, as an
oscillation between problems and tentative solutions (Dessalles
2007). During conversation, any inconsistency between beliefs
or between beliefs and desires is likely to be signaled, and it triggers a collective search for solutions. In the following example
(adapted from Tannen 1984, 62), two participants wonder
how the third one came to know about the sociologist Erving
Goffman.
Deborah: But anyway. How do you happen to know his stuff?
Chad: 'Cause I read it.
Peter: What do you do?
Deborah: Are you in sociology or anything?
Chad: No.
Deborah: You just heard about it, huh?
Chad: Yeah. No. I heard about it from a friend who was a
sociologist, and he said read this book, it's a good book and I
read that book n
Deborah: I had never heard about him before I started studying linguistics.
Chad: Really?
The argumentative process is the same, with its characteristic
alternation between problems and solutions, regardless of the
social situation in which it occurs: a discussion about a famous
sociologists work, the planning of some forthcoming travel, or
a harsh dispute. The biological significance of this systematic
and universal propensity to mention inconsistencies and then
to make every attempt to solve them cannot be reduced to the
pursuit of some immediate practical benefit. Quite often, casual
discussions are about futile matters that are unlikely to change
the interlocutors' fate.
Why do human beings devote most of their speaking time
to telling stories and dealing with apparent inconsistencies? What
utilitarian models fall short of explaining is directly addressed
by models, like the grooming hypothesis, that emphasize the
role of language in the establishment of social bonds (Dunbar
1996; Dessalles 2007). Language acts would not be biologically
motivated by their immediate benefit but because they are
reliable indicators of some speaker quality that is valued in the
establishment of solidarity networks. In these models, language
is display. In the political niche of our species, individuals who
are aware of their physical and social environment make better
coalition partners. Hence, individuals demonstrate that they are
able to witness unusual situations by reporting facts that elicit
surprise and emotion. By recounting the weird encounter with
the nude model, Brianne obeys this urge to show her ability to
surprise others.
From this perspective, language is a competition for interest. On the friendship marketplace, where solidarity bonds are
established and dissolved, individuals who report the most interesting events are, all other things being equal, the most appreciated. Now, the biological role of argumentation becomes clear.
Without the ability to detect inconsistencies, individuals would
easily shine by reporting incredible events that never occurred.
Argumentation presumably emerged as an anti-liar device,
besides checking for oneself (Dessalles 1998). As it is preferable
to have nongullible members in one's coalition, argumentation
became a way to demonstrate this quality. Hence, Deborah's and
Peter's reflexes show that they could spot an apparent inconsistency during their conversation with Chad.
Recently, there have been various attempts to account for the
existence of language (Johansson 2005). The one emphasized
here highlights the political importance of talking. Language performance is indirectly vital: Those who recount in boring fashion
or who are unable to build sensible arguments are rapidly left
aside. In the world of our hominine ancestors, lonely individuals
were defenseless and likely to be exploited. Language emerged
as a way for human beings to show to their conspecifics that they
have the required qualities to be valuable friends.
Jean-Louis Dessalles
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Brown, Donald E. 1991. Human Universals. Philadelphia: Temple
University Press.
Dessalles, Jean-Louis. 1998. Altruism, status, and the origin of relevance.
In Approaches to the Evolution of Language: Social and Cognitive
Bases, ed. J. R. Hurford, M. Studdert-Kennedy, and C. Knight, 13047.
Cambridge: Cambridge University Press.
———. 2007. Why We Talk: The Evolutionary Origins of Language. Trans.
James Grieve. Oxford: Oxford University Press.
Dunbar, Robin I. M. 1996. Grooming, Gossip, and the Evolution of
Language. Cambridge, MA: Harvard University Press.
———. 1998. Theory of mind and the evolution of language. In
Approaches to the Evolution of Language: Social and Cognitive Bases,
ed. J. R. Hurford, M. Studdert-Kennedy, and C. Knight, 92–110.
Cambridge: Cambridge University Press.
Dunbar, Robin I. M., N. D. C. Duncan, and Daniel Nettle. 1995. Size and
structure of freely forming conversational groups. Human Nature
6.1: 67–78.
Eggins, Suzanne, and Diana Slade. 1997. Analysing Casual Conversation.
London: Equinox.
Grice, H. Paul. 1975. Logic and conversation. In Syntax and Semantics.
Vol. 3: Speech Acts. Ed. P. Cole and J. L. Morgan, 41–58. New
York: Academic Press.
Holdcroft, David. 2004. Pragmatics and evolution. Pragmatics and
Beyond 127: 117–27.
Johansson, Sverker. 2005. Origins of Language: Constraints on
Hypotheses. Amsterdam: John Benjamins.
Knight, Chris. 2002. Language and revolutionary consciousness. In The
Transition to Language, ed. A. Wray, 138–60. Oxford: Oxford University
Press.
Norrick, Neal R. 2000. Conversational Narrative: Storytelling in Everyday
Talk. Amsterdam: John Benjamins.
Nowak, Martin A., and Karl Sigmund. 2005. Evolution of indirect reciprocity. Nature 437.27: 1291–8.
Pinker, Steven. 1994. The Language Instinct. New York: Harper
Perennial.
———. 2003. Language as an adaptation to the cognitive niche. In
Language Evolution, ed. M. H. Christiansen and S. Kirby, 16–37.
Oxford: Oxford University Press.
Pinker, Steven, and Paul Bloom. 1990. Natural language and natural
selection. Behavioral and Brain Sciences 13.4: 707–84.
Sperber, Dan, and Gloria Origgi. 2010. A pragmatic perspective on the evolution of language. In The Evolution of Human Language: Biolinguistic
Perspectives, ed. Richard Larson, Viviane Déprez, and Hiroko Yamakido,
124–32. Cambridge: Cambridge University Press.
Sperber, Dan, and Deirdre Wilson. 1986. Relevance: Communication and
Cognition. Oxford: Blackwell.
Tannen, Deborah. 1984. Conversational Style: Analyzing Talk Among
Friends. Norwood, NJ: Ablex.
Zahavi, Amotz, and Avishag Zahavi. 1997. The Handicap Principle. New
York: Oxford University Press.
PRAGMATICS, NEUROSCIENCE OF
What Is Pragmatics and What Is the Neuroscience of
Pragmatics?
The neuroscience of pragmatics has not been extensively studied. This is understandable, inasmuch as pragmatics came
into focus fairly late as an area of study in linguistics. The field
is not dominated by one specific theoretical framework. There
are controversies concerning its delimitation with respect to,
for example, semantics and nonlinguistic behavior. Finally,
since pragmatic phenomena are crucially related to connected
discourse and communicative interaction, they do not lend
themselves easily to investigation by established experimental
approaches or to such methods as neuroimaging, electroencephalography (EEG), and so on that focus on limited, often
decontextualized, linguistic units. This entry is an attempt to
summarize some typical approaches in relating pragmatic
phenomena to neural processing, and what they have found to
date.
The term pragmatics is used here in accordance with C. W.
Morris (1938), who posited a framework in which syntax deals
with the formal relations among signs, semantics adds the relations of signs to objects, and pragmatics further adds the relations of signs to the interpreter. Pragmatics is about language use
or communication, in a broad sense and in context. It is assumed
here that there is no clear sense in which pragmatics can be separated from semantics. The focus here, however, is on phenomena that are typically considered to be part of pragmatics.
In terms of the relation of pragmatics to neuroscience, different types of approaches have to be considered, since this is far
from a uniform field of study. One set of approaches constitutes
experimental studies of phenomena that are considered to be
important in pragmatics. This may involve trying to isolate such
phenomena in an experimental setting that allows for neuroimaging techniques, such as fMRI (functional magnetic resonance
imaging), or EEG/ERP (event-related potentials) to be used
to measure brain activity. Lesion data can be handled either
by experimental group studies, comparing, for example, individuals with left hemisphere damage (LHD), right hemisphere
damage (RHD) and no brain lesion, or by case studies comparing specific phenomena in one individual or a number of
individuals with lesions and communication disorders. Lesion
studies in the area of pragmatics often also include a focus on
communicative interaction in context, involving persons with
brain damage, and use such methods as video recording of face-to-face interaction, transcription, and coding and microanalysis of sequences and patterns of interaction. These studies also
include more social constructivist empirical approaches, such
as conversation analysis. Lesion studies can also be done with
brain activity measurements, but this is more difficult, especially when homogeneous groups and many repeated events
are needed and when an interaction between two persons is
being studied.
The field of pragmatics primarily needs models of connected
speech, for example, of story structure and topic flow, and models of linguistic communicative interaction, that is, models
involving two (or more) participants and the interactive flow,
co-construction, activation, and so on between them, including
different levels of conscious control. These models can vary considerably in degree of detail and specificity. Typically in studies
of pragmatics, communicative, cognitive, and emotive factors
are included. Multimodality, such as body communication and
prosody in speech, is considered important. Data often consist
of sequences longer than words and sentences: monologue and
dialogue samples, for example. Overall structure and the course
of communication are studied, and interactive phenomena are
often in focus. Specific pragmatic phenomena that can be studied from a neuroscience perspective are listed in Table 1.

Main Topical Subdivisions of the Field


Table 1 illustrates typical actual (x) and potential combinations
of phenomena in pragmatics with methods in neurocognitive
studies. Some of the combinations have also been attempted
but, in general, there is a dividing line between phenomena
that can be studied both in monologue and dialogue conditions
and those that can only be fruitfully studied in dialogue. Studies
involving the measurement of brain activity have generally been
limited to monologue situations. It is, of course, possible to
combine more than one method and more than one pragmatic
phenomenon in a given study. The neuroscience of pragmatics
faces the challenge of unifying the fairly rich findings from naturalistic and experimental behavioral studies of monologue and
dialogue/interaction with studies of brain activity, which so far
have been related to monological and experimental tasks only.
This requires i) extensive work on models and theories, and ii)
continued development of techniques and methodologies.

A Brief History of Modern Developments in the Field


A number of important milestones can be mentioned in the
development of the neuroscience of pragmatics. The first is the acceptance of pragmatics as a discipline within linguistics,
anthropology, sociology, and communication sciences and
its introduction into clinical linguistics and medical settings
during the last 20 to 30 years. Some approaches have involved
the extension of the classical model of aphasia syndromes
and the application of cognitive neuropsychology models. But
there is also a recognition that other theoretical frameworks
must be applied in the study of pragmatics: the increasing use
of connectionist modeling, the growing community applying pragmatic theories and methods of analysis to studies of
communication involving persons with brain damage, and the
increasing interest in embodied cognition and communication. The rapid development of neuroimaging techniques, also
during the last 20 to 30 years, has coincided with the development of pragmatic approaches to neurolinguistics; until very
recently, however, the two research streams have not joined
forces. Neuroimaging studies focus on phenomena that are
easily studied in an experimental context. But the rapid development of fMRI, PET (positron emission tomography), and
MEG (magnetoencephalography) techniques, as well as EEG/
ERP, has paved the way for recent and ongoing attempts to
actually capture pragmatic phenomena as well in this type of
research.

Current State of the Field


Following are descriptions and examples of some of the dominant types of studies in this area.
MEASUREMENT OF BRAIN ACTIVITY. Some of the recurring abilities or functions attributed to brain areas are:

- inhibition
- selection and ordering of speech, behavior, and logic
- formation and execution of plans of action
- memory processes (working memory, episodic memory retrieval, emotional modulation of memory processes and executive processing, and cues for long-term memory)
- theory of mind (ToM), mental inferencing, attribution of mental states, simulation for comprehension, visuospatial imagery, abstraction
TYPICAL STUDIES I: EEG/ERP. Studies of brain activity using
EEG/ERP involve the correlation of specific temporal components of brain activity with performance. Conversation is not
easily studied with this type of method. One of the most frequently used components is the N400, a negative ERP response
to semantic anomaly.
One of the findings from studies of the N400 is that there
is rapid incremental processing all the way through; in other
words, listeners start early to respond to unfolding words as
influenced by topic, how the speech is produced, and by whom.
Listeners also use discourse information to automatically make
predictions about semantics, syntax, phonology, and referents;
discourse information can overrule local constraints. No evidence of context-free sentence-internal interpretation has been
found, and the conclusion is that the brain does not engage in
a two-step interpretation of semantics and pragmatics. What
seems to be happening is that an initial quick and superficial

Table 1. Possible and typical (x) combinations of pragmatic objects of study and neuroscientific methods

Methods (columns): lesion-based studies of behavior (empirical studies of behavior in context; experimental studies) and brain activity studies (neuroimaging, esp. fMRI; EEG/ERP).

Pragmatic phenomena (rows):
- Comprehension and production of longer spoken contributions and texts, such as narratives
- Inference
- Cognitive semantics
- Emotion in communication
- Own communication management (e.g., hesitation, self-repair)
- Interactive communication management (e.g., turn-taking, feedback)
- Speech-acts, language games
- Conversational principles
- Flexibility and adaptation; alignment, coupling, holistic patterns
- Body communication, embodiment, multimodality

interpretation is followed or partially overlapped by a more precise one.


TYPICAL STUDIES II: NEUROIMAGING (fMRI). Using fMRI
involves the comparison of brain activity in one task with
activity in another via subtraction. Naturalistic interaction
is not easily studied with fMRI. Labels used for the behavior
studied are often too broad for specific comparisons between
studies to be useful, and the method only shows activation
across trials and participants, and not individual, diffuse, and
weak signals that could also be important for modeling pragmatic processing.
One relevant fMRI study had (non-brain-damaged) subjects
listen to connected versus nonconnected sentences. In this
experiment by D. Robertson and colleagues (2000), no difference
in brain activity between the two conditions was found for the left
hemisphere (LH), whereas the right hemisphere (RH) showed
increased activation for connected discourse in the middle and
superior frontal regions. The same findings were replicated for
picture stories. In passive listening only, however, the effect disappears.
In addition, fMRI studies have shown that increased difficulty
leads to more diffuse activation of brain regions. Activation of
the temporal poles was also found only for this task. One suggested interpretation is that RH frontal lobe activation may only
show up when the subject is activating memory to create coherence in a story representation. The same areas have been linked
to abilities such as ToM, episodic memory, and integration.
As we have seen, studies of the comprehension and production of connected units of language, such as narratives, in comparison to single words and sentences, have most frequently been used to measure brain activity in studies of pragmatic phenomena. In general, many-to-many mappings of structures and
functions are found, and this points to the need to develop theories and models.
TYPICAL STUDIES III: BEHAVIORAL STUDIES OF RIGHT HEMISPHERE DAMAGE. Studies of the behavior of LHD, RHD, and
control subjects with no brain damage are perhaps the most
prototypical ones in behavioral studies, both experimental ones and studies of naturalistic conversation. A number of such studies over the last 20 or 30 years have shown that RHD subjects,
in spite of their good performance on traditional aphasia tests,
definitely perform worse on many different aspects of pragmatics than LHD or control subjects. These findings have placed
RHD at the center of the neuroscience of pragmatics. Studies of
this type have used experimental group designs, as well as case
studies and microanalysis of communicative interaction. They
have given us a picture of RH functions in lexical semantics,
the semantics of connected speech and writing, prosody, body
communication, holistic processing, spatial imagery, ToM,
topic management, sensitivity to interactive cues, inferencing
(especially about emotions), and a number of other pragmatic
abilities.
Some of the limitations to this approach are the (so far) relatively broad and uncertain mapping of specific areas in the RH
to specific functions, the fact that groups of RH subjects are
not homogeneous, and the relative lack of good instruments
for measuring pragmatic functions in an experimental context.

It should also be stressed that when fine-grained methods are
used, pragmatic deficits stemming from LH aphasia, traumatic
brain injury, and other brain damage conditions are also found.
Concerning the analysis of face-to-face interaction, the generalizability of results tends to be fairly low. Still, these types of analysis are extremely important to the neuroscience of pragmatics, as
they provide studies of important pragmatic phenomena, which
can also serve as input for further development of theories and
methods. Most of the theoretical claims made on the basis of
brain activity studies today in the area of pragmatics were already
made much earlier on the basis of empirical studies of behavior
following brain damage.
Elisabeth Ahlsén
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Brownell, H., and O. Friedman. 2001. Discourse ability in patients with unilateral left and right hemisphere brain damage. In Handbook of Neuropsychology. 2d ed. Vol. 3. Ed. R. S. Berndt, 189–203. Amsterdam: Elsevier.
Gernsbacher, M. A., and M. P. Kashak. 2003. Neuroimaging studies of language production and comprehension. Annual Review of Psychology 54: 91–114.
Mar, R. A. 2004. The neuropsychology of narrative: Story comprehension, story production and their interrelation. Neuropsychologia 42: 1414–34.
Morris, C. W. 1938. Foundations of the theory of signs. Chicago: Chicago University Press.
Robertson, D. A., M. Gernsbacher, S. Guidotti, R. Robertson, W. Irwin, B. Mock, and M. Campana. 2000. Functional neuroanatomy of the cognitive process of mapping during discourse comprehension. Psychological Science 11.3: 255–60.
van Berkum, J. J. A. 2005. The electrophysiology of discourse and conversation. In The Cambridge Handbook of Psycholinguistics, ed. M.
Spivey, M. Joanisse, and K. McRae. Cambridge: Cambridge University
Press.

PRAGMATICS, UNIVERSALS IN

Changing Prospects for Universals in Pragmatics

The term pragmatics has come to denote the study of general principles of language use. It is usually understood to contrast with semantics, the study of encoded meaning, and also, by some authors, to contrast with sociolinguistics and the ethnography of speaking, which are more concerned with local sociocultural practices. Given that pragmaticists come from disciplines as varied as philosophy, sociology, linguistics, communication studies, psychology, and anthropology, it is not surprising that definitions of pragmatics vary. Nevertheless, most authors agree on a list of topics that come under the rubric, including deixis, presupposition, implicature (see conversational implicature), speech-acts, and conversational organization (see conversational analysis). Here, we can use this extensional definition as a starting point (Levinson 1988; Huang 2007).
With the rise of generative grammar, and the insistence on universals of grammar (see universal grammar), anthropologists began to emphasize the diversity of language use, implicitly accepting the underlying uniformity of grammar (Hymes 1982). But with the growth of linguistic typology and the empirical search for language universals, it has become increasingly clear that real universals in the straightforward sense, properties that all languages have, are vanishingly rare (at least beyond the basic organizational principles outlined by Hockett 1960, and some of the architectural properties sketched by Jackendoff 2002). Instead, linguistic typologists have found that empirical generalizations are nearly always of the kind "Across all languages, if a language has property X, then it probably also has property Y." Meanwhile, generative grammarians have hoped to account for the diversity in terms of a limited set of variants (see principles and parameters theory), but such variants are not manifested in grammars in any straightforward way, and the whole attempt does not appear successful to many dispassionate observers (Newmeyer 2004). The reality is that there is an extraordinary diversity of linguistic types, in which both shared patterns and differences seem best understood historically and geographically (see, e.g., Haspelmath et al. 2005).
With the waning of hopes for straightforward grammatical universals, the case for pragmatic universals looks, in contrast, stronger and stronger. The distinct possibility now arises that while grammatical patterns are in large part a matter of historical and cultural evolution, principles of language usage constitute the foundational infrastructure for language, to which commonalities across languages can be partially attributed. This inverts the traditional view (as in Hymes 1982) that grammar is universal and language usage variable. If this inverted picture is even partially correct, then we would expect significant absolute (unconditional) universals across the subdomains of pragmatics (see absolute and statistical universals). The following sections lay out the case for pragmatic universals.

Deixis
The fundamental use of language is in face-to-face conversation, where participants take turns at speaking. Aspects of this
context are built into languages in many detailed ways. All spoken languages have a grammatical category of person, that is,
a grammatical reflection of the different roles that participants
(and nonparticipants) have in an utterance (speaker, addressee,
third party), which is likely to be reflected in personal pronouns, verbal inflections, imperatives, vocatives (as in address
forms), and so forth. Likewise, all languages have at least one demonstrative, a special form for indicating entities in the context; typically, there are contrastive forms (like this and that) associated with pointing. They also have ways to distinguish the time and place of speaking (they may not have tense, but they will have forms denoting "now," "today," "here," etc.). These
aspects of language structure are pragmatic in the sense that
they refer to aspects of the context of utterance, and their interpretation is relative to that context. The peculiarity of these systems is that as speakers alternate, the reference of these terms
also alternates (my I is your you, and my this may be your that),
a fact that children can find difficult when learning a language.
Since artificial languages (logics, programming languages) successfully purge their structures of such items, it is clear that

Pragmatics, Universals in
natural languages could be different and, thus, that deictic organization constitutes a nontrivial universal aspect of language
built for interactive use.

Presupposition
Languages have various ways to foreground and background
information, and this is crucial if the speaker's current point
is to be identified. Information that is presumed in the context
(either because it has already been mentioned or is taken for
granted) is typically not asserted but presupposed, and this is
reflected in language structure. The contrast between definite
and indefinite articles, in those languages that have them, is a
simple example: Both The ninth planet has a peculiar orbit and
The ninth planet does not have a peculiar orbit presuppose that
there is a ninth planet. This constancy under negation is often taken to be a defining property of presupposition: it shows that the presupposed content is not what is being asserted.
Note that unlike what is asserted, presuppositions are defeasible (fall away) in certain contexts, as in If there is one, the ninth
planet must have a peculiar orbit. Many structures have been
identified that signal this presuppositional property: factive
verbs like regret in he regrets publishing it (which presupposes
he did publish it), cleft-sentences like It was the police who hid
the crime (which presupposes that someone hid the crime), or
comparatives like He's a better golfer than Tiger (which presupposes that Tiger is a golfer). Although this might seem
to be purely a matter of the arbitrary conventions of a single
language, in fact structures with similar semantics also tend
to carry similar presuppositions in other unrelated languages
(Levinson and Annamalai 1992), suggesting that it is properties of the semantic representation that trigger the presuppositional inferences. It is thus possible to make an inventory of
types of structure that tend to universally signal presuppositional content.

Implicature
A conversational implicature is an inference that comes about
by virtue of background assumptions about language use,
interacting closely with the form of what has been said. H.
Paul Grice (1975, 1989) outlined a cooperative principle
instantiated in four such background maxims of use: Speak
the truth (quality), provide enough but not too much information (quantity), be relevant (relevance), and be perspicuous (manner). For example, if A says "Have you seen Henk?" and B says "His office door is open," we read B's utterance as a partial answer (by relevance), which B chooses because he hasn't seen Henk but wishes to provide information that is both true (quality) and relevant, and sufficient to be useful (quantity) and clear enough (manner). By virtue of the assumption that B is following these maxims, B's utterance can suggest, or conversationally implicate, in Grice's terminology, that Henk is somewhere close by. Despite the fact
that we often have reasons or cultural conventions for being
obscure or economical with the truth (Sacks 1975; Ochs 1976),
such indirect answers seem to be universal, suggesting that
the background assumption of cooperation holds right across
the cultures of the world.

The maxims of quantity and manner, in particular, seem to


be responsible for detailed cross-linguistic patterns of inference
(Horn 1984; Levinson 2000). For example, "the coffee is warm" suggests that it is not hot, or "Ibn Saud had 22 wives" suggests that he did not have 23, even though if coffee is hot it is certainly warm, and if you have 23 wives you certainly have 22. The
reasoning seems to be that if you know the stronger quantity
holds, you should have said so; not saying so implicates that it does not hold. In a similar cross-linguistically general way, "It's not impossible that the war will still be won" implicates greater pessimism that the war will be won than the logically equivalent "It's possible the war will still be won." The reasoning seems
to be that since the speaker has avoided the positive by using
a double negative, by the maxim of manner he must have had
some reason to do so. These cross-linguistic patterns seem to
have systematic effects on grammar and lexicon (Levinson 2000;
Sperber and Wilson 1995).

Speech-Acts
The speech acts of questioning, requesting, and stating are found
in conversation in any language, and they have grammatical
repercussions in all language systems, for example in interrogative, imperative, and declarative syntax (Sadock and Zwicky
1985). Languages differ, of course, in how, and the extent to
which, these acts are grammatically coded, but they always are
at least partially reflected in grammar. John Searle (1976) suggested that there are five major kinds of speech-acts: directives (a
class including questions and requests), representatives (including statements), commissives (promising, threatening, offering),
expressives (thanking, apologizing, congratulating, etc.), and
declarations (declaring war, christening, firing, excommunicating, etc.). The types are individuated by different preconditions
and intended effects, known as their felicity conditions.
The broad taxonomy offers plausible universal classes, while
subsuming culture-specific actions like declarations, such as
divorce by announcement in Moslem societies or magical spells
in a Melanesian society.
Despite the fact that there is an association between, for
example, interrogative form and questioning, the link between
form and action performed is often complex. In English, for
example, requests are rarely done in the imperative, but typically in the interrogative, as in "Can you help me get this suitcase down?" It has been noticed that if a distinctive felicity condition for a successful request is stated or requested, this will itself
serve as a request (the addressee being able to get the suitcase
down being a precondition to a felicitous request). This seems
to have general cross-linguistic application, suggesting that the
action performed is in fact implicated by what is said (Brown
and Levinson 1987, 136 ff). However, in many cases, less regular strategies link what is said to the actions performed, and the
mapping from utterances to actions remains a serious theoretical problem in pragmatics.

Conversation Structure
The organization of conversation seems likely to provide some
of the most robust pragmatic universals. As far as we know,
in all societies the most informal type of talk involves rapid

alternation of speaking roles (Sacks, Schegloff, and Jefferson
1974). This turn-taking, of course, motivates the deictic system
already mentioned. Such informal talk is also characterized by
the immediacy of conversational repair; that is, if addressees do not hear or understand what is said, they may query either
the whole or part, getting immediate feedback in the next turn
(Schegloff, Jefferson, and Sacks 1977). Such talk is structured
locally in terms of sequences (Schegloff 2006) in the simplest
case, adjacency pairs, that is, pairs of utterances performing actions like question-answer, offer-acceptance, requestcompliance, greeting-greeting, and so forth. Sequences can be
embedded, as in A: "Do you have Marlboros?" B: "You want 20s?" A: "Yes." B: "Ah sorry, no. We do have 10s." They can also be extended over more turns, for example by adding a presequence, as in: A: "Do you mind if I ask you something?" B: "No." A: "Why did you give up that amazing job?" B: "Burnout." Given
the general expectation for rapid turn-taking, any participant
wishing to have an extended turn at talk is likely to negotiate this, for example, through a prestory of the kind "Have you heard what happened to Bonny?" During such an extended turn
at talk, feedback of restricted types (mmhm, uhuh, etc.) may be
expected. In addition to these local levels of organization, conversations also generally have overall structures; for example,
they are likely to be initiated by greetings and ended with partings, each with its distinctive structure.
All of this detailed structure seems entirely general across
cultures and languages, although there may be constraints of
many local kinds about who can talk to whom and where in
this informal way. Ethnographic reports to the contrary do not
seem to stand the test of close examination. There are, though,
many aspects of cultural patterning that can be very distinctive.
For example, although in all cultures conversation makes use
of multimodal signals (gaze, gesture, facial expression, etc.) in
face-to-face interaction, the details can differ strikingly: whereas Tzeltal speakers avoid gaze and the signals that would be thus made available, Rossel Islanders presume mutual gaze and so can systematically signal responses like "yes," "no," "amazing!" and so on by facial expression.
In addition to these general observations about conversational universals, there seem to be very detailed generalizations
about specific actions. For instance, in a wide sample of languages, it seems that reference to persons follows a precise set
of expectations about the form of reference expressions, as well
as the procedures to follow when the expression proves inadequate (Stivers and Enfield 2007). Thus, utterances of the following kind, where specific components are added incrementally
and in order until recognition is signaled, can be expected
in any language: "John (.) Wilkins (.) The man you met at the party."

Human Ethology and Communication


Human language is unique in the animal world by virtue of its
complex internal structure, its potential displacement across
modalities (as in sign languages), and its wide range of
functions. It is also the only animal communication system that
exhibits great diversity in structure and meaning across social
groups. This diversity shows that it is heavily interdependent
with historical and cultural processes. Nevertheless, all normal children learn a language and use it in strikingly parallel ways.


The strong universals of use suggest that language, in fact, rides
on a rich, language-independent infrastructure. A crucial element is the ability to infer intentions from actions. Grice (1957)
outlined a psychological theory of non-natural meaning or
communication along the following lines: A communicator
intends to cause an effect in an addressee by producing an
action or utterance that is designed to cause that effect just
by having that intention recognized (see communicative
intention ). Consider a nonverbal signal: A mother makes as
if to smooth her own hair, thereby signaling to her daughter in a school concert that the daughter's hair is in disarray; if the child recognizes her intent, communication has succeeded. No
conventional symbols are necessarily involved. Such a mode of
communication, which can be observed in nonconventional
sign languages like home-sign (Goldin-Meadow 2003), relies
on some form of reciprocal mind-reading abilities (Levinson
2006). It plausibly forms the basis for the learning of language,
as communication is evident in infancy (e.g., through pointing) prior to language acquisition (see communication,
prelinguistic ).
If a mind-reading ability is part of the infrastructure for
language, there are also other aspects of the pragmatic infrastructure that are potentially independent of linguistic communication. For example, systematic turn-taking is discernible in infant-caretaker interaction long before verbal interchange is
possible. Similarly, the use of gesture, facial expression, gaze,
and posture in interaction appears early in child development.
All of this points to a large raft of abilities and inherited dispositions that makes language use possible in the form that we
know it. It is this infrastructure that infants use to bootstrap
themselves into language. What is now observable in ontogeny
was no doubt true also in phylogeny, for this infrastructure
no doubt preceded the evolutionary specializations in anatomy and brain that now drive language (Enfield and Levinson
2006).
There are yet other universals of language use that are reflections of a common human ethology. We are one of the few species
that shows evidence of cooperative instincts that are not based
on kin selection. This cooperation is made possible by the subtle
linguistic and paralinguistic expression of solidarity, dominance,
and the juggling for position (see paralanguage), much of
this explored by pragmaticists under the rubric of politeness
(Brown and Levinson 1987). Again, there seem to be systematic
universals here, both in the underlying dimensions expressed
(e.g., power, solidarity, degree of imposition) and in the basic
strategies used to express them (e.g., modulations of deference
or camaraderie).
In sum, then, an understanding of universals in pragmatics
promises to give us deep insights into the infrastructure that lies
behind human communication and the language that is so distinctive of it. This infrastructure is arguably what lies behind the
development of language in infancy, as well as the evolution of
language in the species. Taken as a core part of human ethology,
it also tells us much about human nature and how it came to be
the way it is.
Stephen C. Levinson

WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Brown, Penelope, and Stephen C. Levinson. 1987. Politeness: Universals
in Language Usage. Studies in Interactional Sociolinguistics 4.
Cambridge: Cambridge University Press.
Enfield, Nick J., and Stephen C. Levinson, eds. 2006. Roots of Human
Sociality: Culture, Cognition and Human Interaction. Oxford:
Berg.
Goldin-Meadow, Susan. 2003. The Resilience of Language: What Gesture
Creation in Deaf Children Can Tell Us About How All Children Learn
Language. New York: Psychology Press.
Grice, H. P. 1957. Meaning. Philosophical Review 67: 377–88.
———. 1975. Logic and conversation. In Syntax and Semantics. Vol. 3: Speech Acts. Ed. P. Cole and J. Morgan, 41–58. New York: Academic Press.
———. 1989. Studies in the Way of Words. Cambridge: Harvard University Press.
Haspelmath, Martin, Matthew Dryer, David Gil, and Bernard Comrie,
eds. 2005. The World Atlas of Language Structures. Oxford: Oxford
University Press.
Hockett, C. F. 1960. The origin of speech. Scientific American 203: 89–96.
Horn, Laurence. 1984. Toward a new taxonomy for pragmatic inference: Q- and R-based implicature. In Meaning, Form and Use in Context, ed. Deborah Schiffrin, 11–42. Washington, DC: Georgetown University Press.
Huang, Yan. 2007. Pragmatics. Oxford: Oxford University Press.
Hymes, Dell. 1982. Models of the interaction of language and social life. In Directions in Sociolinguistics, ed. John J. Gumperz and Dell Hymes, 35–71. New York: Holt, Rinehart and Winston.
Jackendoff, Ray. 2002. Foundations of Language. Oxford: Oxford
University Press.
Levinson, Stephen. 1983. Pragmatics. Cambridge: Cambridge University Press.
———. 2000. Presumptive Meanings: The Theory of Generalized Conversational Implicature. Cambridge, MA: MIT Press.
———. 2006. On the human interaction engine. In Roots of Human Sociality, ed. N. Enfield and S. Levinson, 39–69. Oxford: Berg.
Levinson, Stephen C., and E. Annamalai. 1992. Why presuppositions aren't conventional. In Language and Text: Studies in Honour of Ashok R. Kelkar, ed. R. N. Srivastava, Suresh Kumar, K. K. Goswami, and R. V. Dhongde, 227–42. Delhi: Kalinga Publications.
Newmeyer, Frederick. 2004. Typological evidence and universal grammar. Studies in Language 28: 527–48.
Ochs, Elinor. 1976. The universality of conversational postulates. Language in Society 5.1/3: 67–80.
Sacks, Harvey. 1975. Everyone has to lie. In Sociocultural Dimensions of Language Use, ed. M. Sanches and B. Blount, 57–80. New York: Academic Press.
Sacks, Harvey, Emanuel Schegloff, and Gail Jefferson. 1974. A simplest systematics for the organization of turn-taking in conversation. Language 50: 696–735.
Sadock, Jerrold, and Arnold Zwicky. 1985. Speech act distinctions in syntax. In Language Typology and Syntactic Description. Vol. 1: Clause Structure. Ed. Timothy Shopen, 155–96. Cambridge: Cambridge University Press.
Schegloff, Emanuel. 2006. Sequence Organization in Interaction: A
Primer in Conversation Analysis. Cambridge: Cambridge University
Press.
Schegloff, Emanuel, Gail Jefferson, and Harvey Sacks. 1977. The preference for self-correction in the organization of repair in conversation. Language 53: 361–82.
Searle, John. 1976. Speech Acts: An Essay in the Philosophy of Language.
Cambridge: Cambridge University Press.

Sperber, Dan, and Deirdre Wilson. 1995. Relevance: Communication and
Cognition. 2d ed. Oxford: Blackwell.
Stivers, Tanya, and Nick J. Enfield, eds. 2007. Person Reference
in Interaction: Linguistic, Cultural and Social Perspectives.
Cambridge: Cambridge University Press.

PRAGMATISM AND LANGUAGE


According to a very influential conception of language, language functions by tracing truth conditions. Individual words
denote objects, properties, and relations, and combinations of
words in sentences represent possible states of affairs (see
truth conditional semantics). The pragmatist rejects this
conception of language, arguing that we must focus not on what
is the case if a sentence is true but, instead, on what follows if it
is true.
As originally formulated by Charles Sanders Peirce in the late nineteenth century, pragmatism is motivated by a nonfoundationalist conception of scientific inquiry. Rather than taking inquiry to begin with some basic truths, deriving consequences in light of one's understanding of what follows from what, Peirce takes inquiry to begin with a hypothesis, asking what would follow on that hypothesis. If various consequences are true, then one has reason to believe that one's hypothesis is true as well; if any consequence is false, then the hypothesis must be rejected.
Because it is impossible to exhaust all the consequences of a
given claim, it follows immediately that there is no certainty, no
indubitable truth. Anything we think we know, however self-evident it may seem, can turn out to have been mistaken; nothing
is (as Wilfrid Sellars would say) given. But although nothing is
given as the firm and indubitable foundation for inquiry, there
is, at any stage in inquiry, much that one has no reason to doubt.
That is where we must start, from where we are, while at the same time recognizing that in our inquiries we do not stand upon the bedrock of fact but are instead walking upon a bog, and can only say, "this ground seems to hold for the present" (Peirce 1992, 176–7). Judgment, on such a view, is inherently provisional; it "not only corrects its conclusions, it even corrects its premises" (Peirce 1992, 165). For the pragmatist, such a nonfoundationalist
and fallibilist conception of inquiry motivates, in turn, the idea
that meaning is to be understood not by reference to truth but by
reference to consequences.
This pragmatist conception of meaning in terms of consequences is especially plausible for the case of mathematical and
natural scientific concepts. Whereas the standard foundationalist view would seem to require some special insight into the
basic truths of mathematics, the pragmatist takes mathematics
to proceed experimentally, by axiomatizing some domain,
thereby making explicit our (current) understanding of the concepts relevant to that domain and deriving theorems as a means
of testing the adequacy of that understanding. Similarly, in the
empirical sciences, we form theories, the empirical adequacy
of which is determined by reference to the observable effects
of the theory. The pragmatist conception of meaning is much
less plausible in the case of the everyday prescientific concepts
of natural language, concepts of sensory qualities such as redness, say, or even of a substance such as water as it is prescientifically understood, concepts the contents of which seem not to be exhausted by their (observable) consequences but ineliminably to involve also a particular phenomenal quality (see language, natural and symbolic). Pragmatists have
nonetheless tended to understand the contents of all concepts,
whether belonging to natural or to symbolic language, in terms
of their consequences.
The pragmatist conception of meaning in terms of consequences shifts attention away from truth as the product of
inquiry toward the process of inquiry, the striving for truth;
and it does so because (in the absence of a given foundation)
it is not settled in advance how conflicts, as they arise, are to
be adjudicated, which of the competing claims are to be jettisoned, and which retained, if only provisionally. Suppose,
for example, that we find some stuff that looks like water but,
on analysis, is shown to be not H2O but some other chemical
stuff, call it XYZ. What should we conclude? There are many
options. Perhaps the fault lies with our analytic procedure
or in the execution of it. Perhaps water is not inevitably H2O.
Perhaps the stuff is not really water. And other responses are
possible as well. At any given point in our ongoing inquiry,
some responses will seem more plausible than others; nevertheless, it is not simply given what the correct response is. It is
only the way we actually go on in the course of inquiry that will,
retrospectively and defeasibly, settle what our words mean. In
a slogan, meaning lies in use.
As originally conceived by Peirce, and defended more recently
by Sellars, this pragmatist conception of meaning enables a
fully robust notion of objective truth, a conception of scientific
inquiry as answering to things as they are. As interpreted by
William James, and defended more recently by Richard Rorty
and (more subtly) by Robert Brandom, pragmatism entails relativism, a conception of scientific inquiry as answering only to our
interests, to what the community of, say, scientists takes to be the
case. There is, on this latter view, no objective standard governing the correctness of our judgments but only a social one, no
truth but only solidarity. And it is not hard to understand how
the pragmatist conception of meaning can seem to entail such a
view. If, as the pragmatist thinks, there is no given foundation of
meaning and truth, then it can seem to follow that we have only
our takings, our subjective conceptions of things to go on. But if
so, then objectivity would seem to require the impossible: that
we step outside of language, outside of our subjective conceptions, to see how things are independent of those conceptions.
If there is no given but only taking, then our inquiries cannot be
answerable to things as they are.
In his Philosophical Investigations (1953), Ludwig
Wittgenstein argues for what is, in effect, the pragmatist conception of meaning in terms of use and against the representationalist conception of meaning. And here again, both a Peircean,
realist reading and a Jamesian, relativistic reading are possible.
Language use is essentially normative, subject to standards of
correctness that speakers in some way grasp or understand. It
is, as Wittgenstein thinks of it, a matter of rule-following.
The task is to understand how exactly this works. We begin with
an expression of the rule, a signpost, for instance, that shows
the way. (We could equally well begin with a person's utterance
showing what that person thinks or even with something like an
apple that shows itself to a perceiver as an apple.) Because there

is no given meaning to the signpost (or utterance or apple), no meaning that it has independent of the ways we actually go on
in light of it, we suppose instead that the signpost has the meaning it does because we respond to it in a certain way, because
it is taken a certain way, because, as Wittgenstein puts it, it is "interpreted." But it can then be argued that this response, too,
has no given meaning, no meaning independent of the ways
we actually go on in light of it. So in order for it to be a normatively significant response, a taking of the sign as meaning
such and such, a further response would seem to be needed.
But this clearly starts a vicious regress. Perhaps, then, we need
the notion of a response that is inherently meaningful, that just
is normatively significant. Such responses are not answerable
to things as they actually are (as this would require a given) but
only to things as they are taken to be; and they are essentially
social because otherwise whatever seems right to one will be right, and then we cannot speak of "right" (see private language argument).
According to this Jamesian reading (rehearsed, for instance,
by Brandom), the fact that nothing is given requires turning
instead to takings, socially articulated, normatively significant
responses to things such as signposts, utterances of other speakers, objects, and states of affairs, in virtue of which the things
responded to have the significances they are taken to have. A
more radical, Peircean response jettisons not merely the given
but also the whole framework relative to which we must choose
between a mythic given and a merely socially articulated taking
(see, for example, McDowell 2002). The model here is the way
animals interact with things in their environments. Grass, for
example, is food for some animals. But grass is not simply given
as food; that it serves as food depends on there having evolved
animals for whom grass is nourishing and is eaten for nourishment. Nor is grass food merely by being taken to be so, by in
fact being taken up and eaten. (An animal might on occasion
eat something that is not nourishing for it, a piece of plastic,
say; and it may at times be unable to digest food, that is, stuffs
that generally are nourishing for it.) Instead, grass has the significance of being food only relative to the kind of animal for
which it is food; and contrariwise, the animal is intelligible as
the sort of animal it is, as an instance of a particular form of
life, only in light of its environment providing opportunities
(such as food) and hazards for it. Being an animal and having
an environment are correlative notions; neither is intelligible
without the other.
Similarly, being a speaker and having the world in view as the stuff of one's talk and the standard of the correctness of one's judgments are correlative notions, neither intelligible without the other. There is no given if by that one means things revealed as meaningful to us independent of the evolution and acquisition of language, but nor is language use merely a matter of takings. Rather, through one's acculturation into natural language,
one comes to have the world in view as that about which one
speaks, much as an animal, through its development into the
kind of animal it is, comes to have an environment to which it
is perceptually sensitive and through which it moves. And it is
the world that is in view for a speaker, on this reading, precisely
because meaning and truth are not given but are instead the
fruits of inquiry.

The pragmatist critique of the representationalist conception of language in terms of unquestionable, or given, word-world
relations of denotation does not show merely that the foundation is different than we had thought, that it is social rather than
objective, but instead, and more radically, that the objectivity
of inquiry requires not a foundation but the capacity for critical
reflection: "[E]mpirical knowledge is rational, not because it has a foundation but because it is a self-correcting enterprise which can put any claim in jeopardy, though not all at once"
(Sellars 1997, 38). Not only do we revise our beliefs about things,
but we also revise our conceptions of the kinds of things there
are and can be. Indeed, we even revise our most fundamental
understanding of the nature of reality as a whole, and in so doing,
we come to ever more adequate languages with which to address
things as they are, the same for all rational beings.
Danielle Macbeth
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Brandom, Robert. 1994. Making It Explicit: Reasoning, Representing, and Discursive Commitment. Cambridge: Harvard University Press.
James, William. 1975. Pragmatism. Cambridge: Harvard University Press.
Macbeth, Danielle. 1995. Pragmatism and the philosophy of language. Philosophy and Phenomenological Research 55: 501–23.
Macbeth, Danielle. 2007. Pragmatism and objective truth. In The New Pragmatists, ed. C. Misak, 169–92. Oxford: Clarendon.
McDowell, John. 1994. Mind and World. Cambridge: Harvard University Press.
McDowell, John. 2002. How not to read Philosophical Investigations: Brandom's Wittgenstein. In Wittgenstein and the Future of Philosophy: A Reassessment after 50 Years, ed. R. Haller and K. Puhl, 251–62. Vienna: Hölder-Pichler-Tempsky.
Peirce, Charles Sanders. 1931–58. Collected Papers of Charles Sanders Peirce. 8 vols. Cambridge: Harvard University Press.
Peirce, Charles Sanders. 1992. Reasoning and the Logic of Things: The Cambridge Conference Lectures of 1898, ed. K. Ketner. Cambridge: Harvard University Press.
Rorty, Richard. 1982. Consequences of Pragmatism (Essays: 1972–1980). Minneapolis: University of Minnesota Press.
Sellars, Wilfrid. 1997. Empiricism and the Philosophy of Mind. Cambridge: Harvard University Press.
Wittgenstein, Ludwig. 1953. Philosophical Investigations. Oxford: Blackwell.

PREDICATE AND ARGUMENT

In all languages, the vocabulary consists of two basic types of words: those that denote entities, such as pronouns and proper names, and those (such as verbs, adjectives, and adverbs) that present information about entities, such as their properties, states, or transformations. In a terminology derived from logic, the relational words are called predicates, and the entities that they relate to are called their arguments. Predicates are like functions in mathematics, with their arguments serving as variables. In traditional grammar, the term predicate is used also for one of two constituent parts of a sentence, the other being the subject. (See also quantification, categorial grammar, and montague grammar.)

Anat Ninio

PREFERENCE RULES

Preference rule systems constitute a form of rule interaction related to default logic and harmonic grammar (Smolensky and Legendre 2006). They are introduced in semantic theory in Jackendoff (1983) and in generative music theory in Lerdahl and Jackendoff (1983) and argued to be ubiquitous in cognition.

A standard example is the meaning of the verb climb. A stereotypical case such as John climbed for hours is interpreted as John a) moving upwards on a surface, b) with an effortful clambering manner of motion. Both conditions are violable. John climbed down the mountain and John climbed across the cliff do not involve upward motion; the airplane climbed steadily entails upward motion but not clambering. However, both conditions cannot be violated at once: *The airplane climbed down 5,000 feet.

These examples make it impossible to analyze the meaning of climb in terms of necessary and sufficient conditions, as assumed in the philosophical and formal logic traditions. Neither condition is necessary, but either one is sufficient for an action to count as climbing. At first blush, this suggests that the conditions are simply disjunctive. However, there is a further wrinkle: Satisfying both conditions results in a more stereotypical use of climb, and in cases where there is no evidence to the contrary, both conditions are assumed by default. Thus, preference rule systems provide a formal characterization of Ludwig Wittgenstein's (1953) and E. Rosch and C. Mervis's (1975) notion of categories displaying a family resemblance: There is no single criterial condition for members of the category; stereotypical members satisfy all or most conditions, and marginal members satisfy fewer conditions.

Preference rule systems differ from optimality-theoretic rule systems in that the constraints, though violable, are not ranked: Under proper conditions, either rule can dominate the other. A classical example comes from gestalt principles of visual grouping (Wertheimer [1924] 1938), where grouping of units can be based either on their relative distance (1a) or their relative similarity (1b). Thus, either condition is sufficient for grouping:

(1) a. x x x   x x x   x x x   [identical units with variable spacing]
    b. x x x X X X x x x   [different units with identical spacing]

In displays with variable units and variable distances, alignment of the two conditions produces stronger grouping judgments (2a). If the two conditions are not aligned, a judgment can be forced by sufficient disparity either in distance (2b) or in form (2c).

(2) a. x x x   X X X   x x x   [stronger judgment]
    b. x x X X X x x x x   [distance overrules size]
    c. x x x xxx X XX   [size overrules distance]

When attempting to state the conditions on a category or rule, then, one should suspect the presence of a preference rule system when a) every condition one can think of as criterial has important counterexamples, b) there are different counterexamples to each condition, and c) satisfaction of all (or most) conditions produces stereotypical instances of the category or rule.

Ray Jackendoff
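These diagnostics lend themselves to a small computational sketch. The following Python fragment is an invented illustration, not anything given in the entry: the two climb conditions are encoded as unranked, violable predicates over a toy event representation, with membership requiring at least one satisfied condition and stereotypicality requiring all of them.

```python
def upward(event):
    # Condition a): motion directed upward on a surface
    return event.get("direction") == "up"

def clambering(event):
    # Condition b): effortful clambering manner of motion
    return event.get("manner") == "clamber"

CONDITIONS = [upward, clambering]  # violable and unranked

def classify_climb(event):
    """Classify an event relative to the category 'climbing'."""
    satisfied = sum(1 for cond in CONDITIONS if cond(event))
    if satisfied == len(CONDITIONS):
        return "stereotypical"   # John climbed the mountain
    if satisfied >= 1:
        return "marginal"        # John climbed down / the airplane climbed
    return "not climbing"        # *The airplane climbed down

print(classify_climb({"direction": "up", "manner": "clamber"}))    # stereotypical
print(classify_climb({"direction": "down", "manner": "clamber"}))  # marginal
print(classify_climb({"direction": "down", "manner": "glide"}))    # not climbing
```

On this toy encoding, either condition alone suffices for (marginal) membership, mirroring the family-resemblance structure; a ranked, optimality-theoretic system would instead let a higher-ranked constraint decide every conflict.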

WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Jackendoff, R. 1983. Semantics and Cognition. Cambridge, MA: MIT
Press.
Lerdahl, F., and R. Jackendoff. 1983. A Generative Theory of Tonal Music.
Cambridge, MA: MIT Press.
Rosch, E., and C. Mervis. 1975. Family resemblances: Studies in the
internal structure of categories. Cognitive Psychology 7: 573–605.
Smolensky, P., and G. Legendre. 2006. The Harmonic Mind. Cambridge,
MA: MIT Press.
Wertheimer, M. [1924] 1938. Laws of organization in perceptual
forms. In A Source Book of Gestalt Psychology, ed. W. D. Ellis, 71–88.
London: Routledge and Kegan Paul.
Wittgenstein, L. 1953. Philosophical Investigations. Oxford: Blackwell.

PRESTIGE

Language prestige refers to the social position of a language, especially in multilingual settings (see bilingualism and multilingualism), and the purposes it is used for, as well as people's beliefs and feelings about it. The prestige of a variety is unrelated to its structure and can only be determined in a social context. The prestige of a particular language may, therefore, differ greatly from one speech community to another and may also be subject to change. Typically, the languages of immigrant groups (e.g., Turkish in Germany) have a relatively low prestige in comparison to that in the country of origin (e.g., Turkish as the national and official language in Turkey). An example of a prestige change is the move of Ukrainian from low to high prestige in post-Soviet Ukraine, a position that had been previously held by Russian as the exclusive high-prestige language (Bilaniuk 1993).

Charles A. Ferguson (1959) used the term prestige to describe the functional distribution of two varieties of the same language, an H(igh prestige) language and an L(ow prestige) language, in diglossic situations. The term often serves as an umbrella notion encompassing status and functions of languages, on the one hand, and language attitudes, on the other. As of the 1960s, various classifications of status and function of languages had been proposed (see, for instance, Ferguson 1966; Stewart 1968), mostly for the description of national sociolinguistic profiles in multilingual societies. Some of these suggestions were later taken up and redefined in other detailed frameworks (see Ammon 1989; Mackey 1989) where prestige stands for an important sociocultural dimension. Along with features such as demographic factors, institutional support, and status (see language policy), prestige is also seen as a factor in the ethnolinguistic vitality of a linguistic group, that is, that which makes it behave as a distinctive entity within multiethnic and multilingual settings. In language attitude studies, we can distinguish between speaker evaluations in terms of overt prestige (i.e., as languages of authority) and covert prestige (i.e., as languages of solidarity). In a diglossic language situation, for instance in Guyana, the L-language (Guyanese Creole) is attributed a high-solidarity and a low-authority value, whereas the H-language (English) holds a low-covert and a high-overt prestige (Rickford 1983).

The prestige of a language is often explained in historical terms. Thus, William F. Mackey speaks of a language's record, or "what people think its record to have been" (1989, 4), not only describing the status quo but also including a diachronic dimension (see synchrony and diachrony). While prestige is mostly associated with speaker evaluations, its role as a catalyst for other characteristics, such as functional specialization, literary heritage, and standardization, has also been acknowledged. In a dynamic model of language prestige and prestige change, Susanne Mühleisen (2002) looks at the interaction among societal, institutional and interactional, and sociopsychological dimensions of prestige. The dynamics of this interaction may result in various types and directions of changes in language prestige.

Susanne Mühleisen

WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Ammon, Ulrich. 1989. Towards a descriptive framework for the status/function (social position) of a language within a country. In Status and Function of Languages and Language Varieties, ed. Ulrich Ammon, 21–106. Berlin: de Gruyter.
Bilaniuk, Laada. 1993. Diglossia in flux: Language and ethnicity in Ukraine. Texas Linguistic Forum 33: 79–88.
Ferguson, Charles A. 1959. Diglossia. Word 15.2: 325–40.
Ferguson, Charles A. 1966. National sociolinguistic profile formulas. In Sociolinguistics, ed. William Bright, 309–24. The Hague: Mouton.
Mackey, William F. 1989. Determining the status and function of languages in multinational societies. In Status and Function of Languages and Language Varieties, ed. Ulrich Ammon, 3–20. Berlin: de Gruyter.
Mühleisen, Susanne. 2002. Creole Discourse: Exploring Prestige Formation and Change Across Caribbean English-Lexicon Creoles. Amsterdam: Benjamins.
Rickford, John. 1983. Standard and nonstandard attitudes in a Creole community. Society for Caribbean Linguistics (Occasional Paper 16), University of the West Indies, St. Augustine, Trinidad.
Stewart, William A. 1968. A sociolinguistic typology for describing national multilingualism. In Readings in the Sociology of Language, ed. Joshua A. Fishman, 531–45. The Hague: Mouton.


PRESUPPOSITION
A presupposition is a precondition of a sentence such that the
sentence cannot be uttered meaningfully unless the presupposition is satisfied. The concept of a presupposition originated
with Gottlob Frege (1892), but the English term was coined by
Peter F. Strawson (1950). Presupposition theory is an area of
active research at the semantics/pragmatics interface. A
related term is conventional implicature. H. Paul Grice
(1975) distinguished between presuppositions and conventional implicatures; however, it is still under debate whether such a distinction is necessary (cf. Potts 2007 and the ensuing discussion).
definite descriptions have played a major role in the
development of presupposition theory and are still generally
analyzed as introducing a presupposition. Consider, for example, (1). No entity that satisfies the description "biggest natural number" exists. What is the status of (1)? Is it true or false?
(1)

The biggest natural number is prime.

Presupposition theory says (1) is neither true nor false: A


definite description "the NP" (noun phrase) presupposes the existence of an individual that satisfies NP; in other words,
definite descriptions carry an existence presupposition.
Presupposition failure describes the case wherein a presupposition is not fulfilled, as in (1). Presupposition failures are analyzed as being neither true nor false but as truth-value
gaps. Presupposition theory, therefore, relies on a distinction
among three possible truth values a sentence may have: true,
false, and undefined. One important argument in support of
a third truth value has been the interaction between negation
and presuppositions: A presupposition failure in many cases
remains a presupposition failure even when the sentence is
negated:
(2)

The biggest natural number is not prime.

It follows that (2), like (1), is a presupposition failure just if negation does not change the conditions under which a sentence
has a truth value. Negation can be used in this way as a presupposition test: A presupposition follows from a sentence and its
negation. The assertion, on the other hand, only follows from the
sentence itself, and not from its negation.
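The truth-value gap behind this negation test can be made concrete in a few lines of code. The sketch below is my illustration of the three-valued idea, not anything given in the entry; `None` plays the role of the undefined value, and the function names are invented.

```python
def the(domain, pred_np, pred_vp):
    """Evaluate 'The NP VPs' with an existence (and uniqueness) presupposition."""
    witnesses = [x for x in domain if pred_np(x)]
    if len(witnesses) != 1:
        return None              # presupposition failure: truth-value gap
    return pred_vp(witnesses[0])

def neg(value):
    """Presupposition-preserving negation: a gap stays a gap."""
    if value is None:
        return None
    return not value

domain = range(1, 100)
# "The even prime is odd": presupposition satisfied, assertion false.
s1 = the(domain, lambda x: x == 2, lambda x: x % 2 == 1)
# "The biggest natural number is prime": no witness, so a failure.
s2 = the(domain, lambda x: x > max(domain), lambda x: True)

print(s1, neg(s1))   # False True
print(s2, neg(s2))   # None None
```

Both the failure and its negation come out undefined, which is exactly what the negation test records: the presupposition, unlike the assertion, survives negation.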
Just as the existence presupposition of the sentences in (1)
and (2) is triggered by the definite article the, many other words
trigger presuppositions. Stephen C. Levinson's (1983) textbook lists several pages of presupposition triggers in English. A particularly interesting paradigm is that in (3) (cf. Abusch 2005): (3a)
has no relevant lexically triggered presupposition, whereas (3b)
presupposes that it is actually raining outside and asserts that
Bill thinks so, too. Finally, (3c) presupposes that Bill thinks that
it is raining outside, and asserts that it actually is raining outside.
It is particularly interesting that be right and know have the same
truth conditions, but differ on which part of them is presupposed. Paradigm (3) shows that part of our specific knowledge
about think, know, and be right is whether they trigger a presupposition and which one.
(3)

a. Bill thinks that it's raining outside.
b. Bill knows that it's raining outside.
c. Bill is right that it's raining outside.

Some presuppositions are not lexically triggered. For example, (3a) cannot be used if it is known that it really is raining
outside. This presupposition, however, has been analyzed as an
implicated presupposition (Sauerland 2008). It can be derived in
a similar way to conversational implicatures as arising from the
avoidance of a presupposition trigger and a principle of presupposition maximization (Heim 1991).
One central problem of presupposition theory is the question
of how to predict the presuppositions of complex sentences: the problem of presupposition projection. Lauri Karttunen and S. Peters (1979) show that while negation does not affect presuppositions, presupposition triggers can occur in other complex sentences without the presupposition projecting to the entire sentence. In example (4), the conditional clause blocks projection of the existence presupposition of "the biggest natural number."
(4)

If there was a biggest natural number, the biggest natural


number would be prime.

Building on work by Robert Stalnaker (1973) and Karttunen


(1974), Irene Heim (1983) has developed an influential account of presupposition projection that has given rise to dynamic semantics (see also Beaver 2001; Kadmon 2001). However, the projection problem is still subject to lively debate (see Schlenker 2007).
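The filtering effect in (4) can be mimicked with a toy dynamic model. The following sketch is an invented illustration in the spirit of such accounts, not any published formalism: a context is a finite set of "worlds," a sentence is a (presupposition, assertion) pair of world sets, and an update is defined only if the context entails the presupposition.

```python
def update(context, sentence):
    """Update a context (set of worlds) with a (presupposition, assertion) pair."""
    presup, assertion = sentence
    if not context <= presup:
        return None                 # presupposition failure: update undefined
    return context & assertion

def update_conditional(context, ant, cons):
    """Update with 'if A, B', computed as c - ((c + A) - (c + A + B))."""
    c_a = update(context, ant)      # temporary update with the antecedent
    if c_a is None:
        return None
    c_ab = update(c_a, cons)
    if c_ab is None:
        return None                 # the consequent's presupposition projects
    return context - (c_a - c_ab)

worlds = {1, 2, 3, 4}
# A = "there is a biggest natural number": trivial presupposition, true in {1, 2}
A = (worlds, {1, 2})
# B = "the biggest natural number is prime": presupposes what A asserts
B = ({1, 2}, {1})

print(update(worlds, B))                 # None: failure in an unprepared context
print(update_conditional(worlds, A, B))  # {1, 3, 4}: the antecedent filters it
```

Because the local context for the consequent already entails its presupposition, nothing projects to the global context, matching the intuition about (4); updating with B in isolation is simply undefined.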
Uli Sauerland
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Abusch, Dorit. 2005. Triggering from alternative sets and projection of pragmatic presuppositions. Unpublished manuscript, Cornell University.
Beaver, David. 2001. Presupposition and Assertion in Dynamic Semantics. Stanford, CA: CSLI Publications.
Frege, Gottlob. [1892] 1952. Über Sinn und Bedeutung. Zeitschrift für Philosophie und philosophische Kritik 100: 25–50. English translation: On sense and reference. In Translations from the Philosophical Writings of Gottlob Frege, ed. Peter T. Geach and M. Black, 56–78. Oxford: Blackwell.
Gazdar, Gerald. 1979. Pragmatics: Implicature, Presupposition, and Logical Form. New York: Academic Press.
Grice, Herbert Paul. 1975. Logic and conversation. In Syntax and Semantics. Vol. 3. Ed. Cole and Morgan, 41–58. New York: Academic Press.
Heim, Irene. 1983. On the projection problem for presuppositions. In Proceedings of WCCFL 2, ed. Dan Flickinger, 114–25. Stanford, CA: CSLI.
Heim, Irene. 1991. Artikel und Definitheit. In Semantik: Ein internationales Handbuch der zeitgenössischen Forschung, ed. Arnim von Stechow and Dieter Wunderlich, 487–535. Berlin: Mouton de Gruyter.
Kadmon, Nirit. 2001. Formal Pragmatics: Semantics, Pragmatics, Presuppositions and Focus. Malden, MA, and Oxford: Blackwell.
Karttunen, Lauri. 1974. Presuppositions and linguistic context. Theoretical Linguistics 1: 181–94.
Karttunen, Lauri, and S. Peters. 1979. Conventional implicature. In Presupposition, ed. C. Oh and D. Dineen, 1–56. New York: Academic Press.
Levinson, Stephen C. 1983. Pragmatics. Cambridge: Cambridge University Press.
Potts, Christopher. 2007. The expressive dimension. Theoretical Linguistics 33.2: 165–97. Includes commentary by other scholars.
Russell, Bertrand. 1905. On denoting. Mind, n.s., 14: 479–93.
Sauerland, Uli. 2008. Implicated presuppositions. In Sentence and Context, ed. A. Steube, 581–600. Berlin: Mouton de Gruyter.
Schlenker, Philippe. 2007. Transparency: An incremental theory of presupposition projection. In Presupposition and Implicature in Compositional Semantics, ed. Uli Sauerland and Penka Stateva, 214–42. Basingstoke, UK: Palgrave Macmillan.
Stalnaker, Robert. 1973. Presuppositions. Journal of Philosophical Logic 2: 447–57.
Strawson, Peter F. 1950. On referring. Mind 59: 320–44.

PRIMATE VOCAL COMMUNICATION


Nonhuman primate vocal communication is one of our few
links to understanding the evolution of human speech and its
underlying physiological bases. Since the vocal tract and the
brain do not fossilize, insights into the origins of human communication occur through a comparison of the vocal behavior of
extant primates with humans. This comparative approach forms
a framework upon which testable hypotheses on the evolution of
speech can be based. Also, since the brain is critical for the production and perception of vocalizations, nonhuman primates



(hereafter, primates) are the ideal model system through which
we can directly monitor neurons and neural ensembles to find
the causal links between brain activity and vocal behavior.

Vocal Perception in Nonhuman Primates


To date, this comparative approach has been most fruitful when scientists have examined how primates perceive their own vocalizations and then compared how these perceptions relate to those occurring in human speech production. In this entry, we review some of these findings.
SYNTACTIC PROCESSING OF VOCAL SEQUENCES. Many primate
species produce bouts of vocalizations that contain sequences of
similar acoustic units and/or different-sounding acoustic units.
Do these units separately code meaningful information (akin
to words in a sentence)? Or do they need to be combined to
form a meaningful utterance (akin to syllables in a word)? For
example, the chimpanzee (Pan troglodytes) "pant-hoot" consists of a series of "hoot" calls followed by a series of "screams." Since both hoots and screams are produced individually in other contexts, the pant-hoot could either be a single vocalization or a bout of several vocalizations.
Most of our insights into the units of perception come from
studies of the orderly arrangement of the sound units in primate long calls (Marler 1968; Waser 1982). Long calls serve
as localization cues for conspecifics and are produced in the
context of territorial encounters, mate attraction, and isolation/group cohesion. These long calls provide evidence of
phonological syntax in which individual acoustic units are
assembled to form a larger, more functional (meaningful) unit.
For example, male titi monkeys and gibbons produce multiunit long calls that are used to demarcate and defend their territories (Robinson 1979; Mitani and Marler 1989). When one
of these long calls is rearranged by a human experimenter and
presented to a conspecific (in the form of a playback experiment), the primate recognizes this novel vocalization and
responds as if there is a new male in the adjacent territory.
Gibbons produce significantly more squeak calls (given during intergroup encounters) when hearing these novel stimuli,
whereas titi monkeys produce significantly more moaning
responses (also given in response to interspecies and intergroup encounters). These data suggest that, at least in these
species, the global order of syllable sequences represents a cue
to individual recognition.
The vocalizing behavior of cotton-top tamarins also provides
evidence of phonological syntax (Ghazanfar et al. 2001). When
socially isolated, tamarins produce a long call that begins with one
to two chirps and ends with two to five whistles. When conspecifics hear these vocalizations, they respond with their own calls,
a behavior called antiphonal calling. Do the individual chirps
or whistles provide functional information? Or is information
provided only by chirp-whistle combinations? Playback experiments in which chirps, whistles, or the entire long call (chirp-whistle combinations) are presented to tamarins were used to
address these questions. These experiments have shown that
the entire long call is more effective in eliciting antiphonal long
calls than isolated chirps and whistles, an observation consistent with the hypothesis that in this species, the whole call is the
most meaningful unit from the perspective of socially isolated receivers.
There is also some evidence for a lexical syntax in which
different combinations of acoustic units are used to transmit different meanings to listeners (Zuberbuhler 2002; Arnold
and Zuberbuhler 2006). These data come from studies of two
African forest monkey species. First, not only do Diana monkeys
(Cercopithecus diana) perceive the leopard and eagle alarm calls of a
sympatric species, Campbell's monkeys (C. campbelli), but they
also seem to understand that if these two calls are preceded by
a Campbell's monkey's boom call, the threat is less urgent.
Importantly, if a Campbell's monkey's boom call occurs before
a Diana monkey's species-specific alarm call, it has no effect on
the Diana monkeys' behavior. Second, putty-nosed monkeys (C.
nictitans) produce two different alarm calls: pyows for leopards
and hacks for eagles. When produced, each elicits a stereotypical escape response from the listeners. However, when males
combine the calls to form pyow-hack sequences, the combination does not elicit escape responses but, instead, elicits general
group movement.
REFERENTIAL COMMUNICATION. The species-specific vocalizations of many primates, as well as many other animals, can
be used by a listener as a source of information about objects,
events, and the status of peers in their environment. These
vocalizations are important since, on the basis of acoustic structure alone, listeners can extract functional (referential) information about a vocalization's meaning (see reference and
extension).
A classic example of referential signaling is the use of predator alarm calls by vervet monkeys
(Cercopithecus aethiops) (Seyfarth, Cheney, and Marler 1980).
Vervets produce unique alarm calls for three different predators: snakes, leopards, and eagles. When an alarm call is produced, it initiates predator-appropriate behaviors in listeners.
For example, when vervets hear an eagle-alarm call, they scan
the sky for visual cues of the airborne predator, and in some
cases, run to locations that provide overhead coverage. In contrast, when they hear a snake-alarm call, they stand up and
scan the ground. Finally, a leopard-alarm call initiates a third
distinct behavior: Vervets run up the nearest tree while scanning the horizon for the leopard.
The capacity to process referential signals successfully also
allows animals to use the referential information that is transmitted by the vocalizations of other species (Zuberbuhler 2000).
For example, female Diana monkeys produce a predator alarm call
when they hear a male Diana monkey producing a leopard alarm
or when they hear the leopard-alarm call of a crested guinea fowl.
This observation is important since it suggests that Diana monkeys can form abstract categorical representations of a vocalization's functional meaning that is independent of acoustics and
the species generating the signal.
Another example of the categorization of referential information is the food-associated calls of rhesus macaques
(Macaca mulatta) (Hauser 1998; Gifford, Hauser, and Cohen
2003). When free-ranging rhesus monkeys encounter low-quality food, they produce one of two acoustically distinct vocalizations, coos or grunts. In contrast, when they encounter rare, high-quality food, they produce one of two acoustically distinct vocalizations, harmonic arches or warbles. However,
despite the fact that these four vocalizations are all acoustically
distinct, rhesus do not discriminate among the vocalizations on
the basis of differences in their acoustics but instead discriminate and categorize these vocalizations on the basis of the type
of referential information transmitted (e.g., low-quality versus
high-quality food).
TEMPORAL CUES FOR VOCAL RECOGNITION. Duration, interval, the order of acoustic features, and other temporal cues are
important components in the capacity of humans to distinguish
between different speech sounds. The difference between the
two syllables /pa/ and /ba/ is due to differences in voice-onset time. Similarly, the difference between /sa/ and /sta/ is due
to differences in the silent time between the consonants and the
vowels.
Primates can also use temporal information to distinguish
between different vocalizations. As discussed, cotton-top tamarins antiphonally call preferentially when they hear entire
long calls versus portions of the long calls (Ghazanfar et al.
2001). Further studies revealed that while tamarins did not
distinguish between normal calls and time-reversed or pitch-shifted long calls (Ghazanfar et al. 2002), normal response
rates did require the species-specific temporal structure of the
amplitude envelope (Ghazanfar et al. 2001). In addition, the number of acoustic units and the presentation rate may also influence antiphonal calling. Along similar lines, rhesus respond
differently to shrill barks and grunts than to copulation calls
when the interval between the acoustic units is expanded or
contracted beyond the normal range (Hauser, Agnetta, and
Perez 1998). Finally, at least for shrill barks and harmonic
arches, there is evidence suggesting that rhesus are sensitive
to the temporal progression of these vocalizations' amplitude
envelope: When shrill barks or harmonic arches are time-reversed, which changes their temporal structure but not their
spectral content, rhesus monkeys act as if they do not recognize these stimuli as species-specific vocalizations (Ghazanfar,
Smith-Rohrberg, and Hauser 2001).

Neural Bases of Primate Vocal Communication


Overall, the referential, syntactic, and temporal features of primate vocalizations, from many different species, suggest striking parallels with human speech processing. How are these
features represented and processed at the level of neurons
and neural assemblies? Here, we review some relevant recent
findings.
AUDITORY CORTEX. Despite the ethological importance of vocal communication in the lives of primates, we lack a
complete understanding of how biologically relevant features of
complex sounds are processed at the level of single neurons or
small populations of neurons. For many years, the squirrel monkey was the only primate model for investigating the role of auditory cortex in processing species-specific vocalizations. Studies
using this species found that many cells in the superior temporal
gyrus responded to species-specific vocalizations. However, one
of the drawbacks of these studies was that recordings were made
across the superior temporal gyrus without reference to any neuroanatomical subdivisions.
Recent anatomical and neurophysiological experiments
(Hackett, Preuss, and Kaas 2001; Rauschecker and Tian 2004;
Tian et al. 2001; Kaas and Hackett 1998) in rhesus monkeys have
identified the serial and parallel processing that occurs between
the primary auditory cortex (A1) and secondary levels of auditory processing along the superior temporal gyrus. These higher-order areas are called the caudolateral, middle lateral, and
anterolateral belt areas (CL, ML, and AL, respectively). These
studies suggest that belt neurons respond to more complex
sounds than A1 neurons. For instance, neurons in all three lateral belt areas seem to prefer vocalizations to energy-matched
pure tone stimuli.
Other studies, however, suggest that A1 neurons are also
sensitive to the complex acoustic features needed in speech
and other types of auditory-object processing. For example, in
common marmosets (Callithrix jacchus) and squirrel monkeys,
A1 neurons are selective for species-specific vocalizations and
phase-lock their firing pattern to the functional acoustic units
that comprise a vocalization, as opposed to the finer-grain
acoustic features that are not functionally meaningful (Bieser
1998; Wang et al. 1995; Lu, Liang, and Wang 2001). Also, A1
neurons, as well as those in belt and parabelt regions, are sensitive to the pitch of an auditory stimulus, which may be used to
infer a vocalization's affective content (Bendor and Wang 2005).
These studies highlight the fact that the flow of information
from primary auditory areas to belt and parabelt regions is not
strictly serial but is parallel with both feedforward and feedback
interactions.
PREFRONTAL CORTEX. The frontal lobes contain an auditory-responsive region that responds robustly to vocalizations (Romanski, Bates, and Goldman-Rakic 1999). This
region, the ventrolateral prefrontal cortex (vPFC), has been
hypothesized to play an important role in processing the
more abstract components of vocalizations. Specifically, it
has been suggested that the vPFC plays an important role
in processing the referential information transmitted by a
vocalization. Indeed, in one set of experiments, it was demonstrated that vPFC neurons were modulated more by differences between the food-related referential information (see
section on referential communication) that is transmitted
by a vocalization than by differences between their acoustic features (Gifford et al. 2005). These data suggested that,
on average, vPFC neurons are modulated preferentially by
transitions between presentations of food vocalizations that
belong to functionally meaningful and different categories.
Consistent with the proposed role of vPFC in categorical processing, vPFC neurons in a second experiment responded in
the same way to different vocalizations that transmit information about different types of food quality (i.e., high-quality and low-quality food) (Cohen, Hauser, and Russ 2006).
However, these same vPFC neurons responded differently to
different vocalizations that transmitted information about
different nonfood events.
Yale E. Cohen and Asif A. Ghazanfar



WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Arnold, K., and K. Zuberbuhler. 2006. Language evolution: Semantic combinations in primate calls. Nature 441.7091: 303.
Bendor, D., and X. Wang. 2005. The neuronal representation of pitch in primate auditory cortex. Nature 436.7054: 1161–5.
Bieser, A. 1998. Processing of twitter-call fundamental frequencies in insula and auditory cortex of squirrel monkeys. Exp Brain Res 122: 139–48.
Chapman, C. A., and D. M. Weary. 1990. Variability in spider monkeys' vocalizations may provide basis for individual recognition. American Journal of Primatology 22: 279–84.
Cleveland, J., and C. T. Snowdon. 1982. The complex vocal repertoire of the adult cotton-top tamarin, Saguinus oedipus. Zeitschrift für Tierpsychologie 58: 231–70.
Cohen, Y. E., M. D. Hauser, and B. E. Russ. 2006. Spontaneous processing of abstract categorical information in the ventrolateral prefrontal cortex. Biology Letters 2: 261–5.
Ghazanfar, A. A., J. I. Flombaum, C. T. Miller, and M. D. Hauser. 2001. The units of perception in the antiphonal calling behavior of cotton-top tamarins (Saguinus oedipus): Playback experiments with long calls. J Comp Physiol [A] 187.1: 27–35.
Ghazanfar, A. A., D. Smith-Rohrberg, and M. D. Hauser. 2001. The role of temporal cues in rhesus monkey vocal recognition: Orienting asymmetries to reversed calls. Brain Behav Evol 58: 163–72.
Ghazanfar, A. A., D. Smith-Rohrberg, A. A. Pollen, and M. D. Hauser. 2002. Temporal cues in the antiphonal long-calling behaviour of cotton-top tamarins. Animal Behaviour 64: 427–38.
Gifford, G. W., III, M. D. Hauser, and Y. E. Cohen. 2003. Discrimination of functionally referential calls by laboratory-housed rhesus macaques: Implications for neuroethological studies. Brain Behav Evol 61: 213–24.
Gifford, G. W., III, K. A. MacLean, M. D. Hauser, and Y. E. Cohen. 2005. The neurophysiology of functionally meaningful categories: Macaque ventrolateral prefrontal cortex plays a critical role in spontaneous categorization of species-specific vocalizations. J Cogn Neurosci 17: 1471–82.
Hackett, T. A., T. M. Preuss, and J. H. Kaas. 2001. Architectonic identification of the core region in auditory cortex of macaques, chimpanzees, and humans. J Comp Neurol 441.3: 197–222.
Hauser, M. D. 1998. Functional referents and acoustic similarity: Field playback experiments with rhesus monkeys. Anim Behav 55.6: 1647–58.
Hauser, M. D., B. Agnetta, and C. Perez. 1998. Orientation asymmetries in rhesus monkeys: Effect of time-domain changes on acoustic perception. Anim Behav 56: 41–7.
Kaas, J. H., and T. A. Hackett. 1998. Subdivisions of auditory cortex and levels of processing in primates. Audiology and Neuro-otology 3.2/3: 73–85.
Lu, T., L. Liang, and X. Wang. 2001. Neural representations of temporally asymmetric stimuli in the auditory cortex of awake primates. J Neurophysiol 85.6: 2364–80.
Marler, P. 1968. Aggregation and dispersal: Two functions in primate communication. In Primates: Studies in Adaptation and Variability, ed. P. C. Jay. New York: Holt, Rinehart, and Winston.
Mitani, J. C., and P. Marler. 1989. A phonological analysis of male gibbon singing behavior. Behaviour 109: 20–45.
Rauschecker, J. P., and B. Tian. 2004. Processing of band-passed noise in the lateral auditory belt cortex of the rhesus monkey. J Neurophysiol 91.6: 2578–89.
Robinson, J. G. 1979. An analysis of vocal communication in the titi monkey Callicebus moloch. Zeitschrift für Tierpsychologie 49: 46–79.
Romanski, L. M., J. F. Bates, and P. S. Goldman-Rakic. 1999. Auditory belt and parabelt projections to the prefrontal cortex in the rhesus monkey. J Comp Neurol 403.2: 141–57.
Seyfarth, R. M., D. L. Cheney, and P. Marler. 1980. Monkey responses to three different alarm calls: Evidence of predator classification and semantic communication. Science 210.4471: 801–3.
Tian, B., D. Reser, A. Durham, A. Kustov, and J. P. Rauschecker. 2001. Functional specialization in rhesus monkey auditory cortex. Science 292: 290–3.
Wang, X., M. M. Merzenich, R. E. Beitel, and C. E. Schreiner. 1995. Representation of a species-specific vocalization in the primary auditory cortex of the common marmoset: Temporal and spectral characteristics. J Neurophysiol 74.6: 2685–706.
Waser, P. M. 1977. Individual recognition, intragroup cohesion, and intergroup spacing: Evidence from sound playback to forest monkeys. Behaviour 60: 28–74.
Waser, P. M. 1982. The evolution of male loud calls among mangabeys and baboons. In Primate Communication, ed. C. T. Snowdon, C. H. Brown, and M. R. Petersen. New York: Cambridge University Press.
Zuberbuhler, K. 2000. Referential labelling in Diana monkeys. Anim Behav 59: 917–27.
Zuberbuhler, K. 2002. A syntactic rule in forest monkey communication. Anim Behav 63: 293–9.

PRIMING, SEMANTIC
priming is used to describe a situation with two words (or other
entities) that are related, whereby an encounter with the one will
either facilitate or inhibit recovery of the other, either as regards
the speed or the accuracy with which it is recovered (see also
spreading activation). There are various kinds of priming,
of which semantic priming is perhaps the most discussed in the
psycholinguistic literature. An example of semantic priming
is that the word dog will be processed more quickly if the word
bark has just been encountered. If the two words are unrelated
or an encounter with a non-word precedes the encounter with
a word, an encounter with the former will have no effect on the
speed with which the latter is processed; thus, the processing
of dog will be unaffected if either teaching or priff has just been
encountered. This is automatic and independent of any intention or task-related motivation. When processing is affected,
the affected word is known as the target; the word or entity that
affects the process is known as the prime. Primes do not have to
be words; they can be groups of words or complete sentences, or
they can be pictorial or aural. The prime is assumed to partially
activate circuits that include the target.
The assumption is that the degree of priming is proportional
to the semantic relatedness of the items (though this is not accurate for all types of priming) and that priming is an automatic
process. Another assumption, which gives rise to the assumption of proportionality, is that related words are stored closer
in semantic space to one another than are nonrelated words
(non-words are, of course, stored nowhere prior to encounter).
Indeed, one of the major values of priming studies is that they
permit investigation of the interconnections of the mental lexicon (Rumelhart and Norman 1985). Interpreting the results of
such studies, though, is not unproblematic; Kenneth I. Forster
(1976) shows that the facilitation of processing is affected in
complex ways by whether the prime and target are low frequency
or high frequency (or combinations of these).
Different kinds of priming have different kinds of effect. If the
prime is formally similar but semantically unrelated to the target
(e.g., from and frog), the prime inhibits recognition of the target,
presumably because one is momentarily mistaken for the other
in the recognition process (Colombo 1986). Where the prime
and the target are drawn from the same semantic set (e.g., dog
and cat) or are otherwise related semantically, as in the dog-bark
example, the target is retrieved more quickly. This phenomenon
was first demonstrated (though not first noticed) by D. E. Meyer
and R. W. Schvaneveldt (1971) and has become a cornerstone of
psycholinguistic methodology.
Semantic priming automatically accelerates the recognition process if the time between sight of the prime and sight
of the target is short, but if the gap is greater, other factors
may impact. In particular, even if the prime and the target
are semantically related, there may be inhibition rather than
acceleration if the informant has been led to expect something else to occur (Neely 1977), though this inhibitory effect
does not occur if there is a very short time between prime and
target display (technically known as the stimulus-onset asynchrony, or SOA). This leads one to conclude that there are two
types of priming, one of which is automatic and short-lived
and the other of which, attentional priming, is longer lasting and more available for conscious inspection. The former
can be interfered with but will persist in the speaker on subsequent occasions; the latter can be created and equally can
disappear. So in an influential experiment, Neely (1977) told
informants that whenever they were primed with body, they
should expect a word associated with buildings to be the target. He then sometimes gave a word associated with the body
as target instead. He found that with an extremely short SOA,
both heart and door were processed equally quickly after use
of body as a prime. With a slightly longer SOA, however, the
processing of the unexpected heart actually took longer than
the (in the circumstances) expected door. Outside the context
of the experiment, though, there are no reasons to suppose
that the body-heart connection would be affected, and equally
it seems unlikely that the attentional priming would persist
much beyond the end of the experiment.
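The contrast between these two time courses can be caricatured in a toy model. This is a minimal sketch with invented numbers and functional forms, not a model from the priming literature: automatic priming is treated as fast-acting but short-lived facilitation of related targets, while attentional priming is an expectancy that builds with SOA and can inhibit unexpected targets.

```python
# Toy sketch of the two proposed mechanisms (all numbers and functional
# forms here are invented for illustration; this is not a published model).

def priming_effect(soa_ms, related, expected):
    """Change in recognition time vs. a neutral baseline, in ms.
    Negative = facilitation; positive = inhibition."""
    effect = 0.0
    # Automatic priming: fast-acting, short-lived facilitation of
    # semantically related targets, independent of expectations.
    if related:
        effect -= 30.0 * max(0.0, 1.0 - soa_ms / 700.0)
    # Attentional priming: an expectancy that takes time to build; once
    # built, it speeds expected targets and slows unexpected ones.
    strength = min(1.0, max(0.0, (soa_ms - 250.0) / 500.0))
    effect += (-25.0 if expected else 25.0) * strength
    return effect

# Prime "body", with informants told to expect a building word:
# at a very short SOA the expectancy has not yet built up ...
no_effect_yet = priming_effect(100, related=False, expected=True)
# ... while at a longer SOA the unexpected (though related) target
# heart is slower than the expected (though unrelated) target door.
slow_heart = priming_effect(700, related=True, expected=False)
fast_door = priming_effect(700, related=False, expected=True)
```

The point of the sketch is only the qualitative pattern: the automatic component decays as SOA grows, while the attentional component grows, so the same prime-target pair can be facilitated or inhibited depending on timing.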
Successful processing has an effect here, too, and further
confirms the existence of attentional priming as a separate kind
of priming from automatic priming. If informants find that the
primes they are given relate regularly and reliably to the targets,
the effect of the priming gets increasingly strong. If, on the other
hand, they find little connection between the primes and the targets, the priming effect gets weaker (Den Heyer 1985).
It is important to note that priming of either the automatic
or attentional kind presupposes a cause-effect relationship: The
prime affects the target. However, it is implicit in this causal relationship that the word that is the target has some prior mental
connection with the word (or other linguistic, or indeed nonlinguistic, entity) that is chosen as prime, and it is this prior connection that the prime activates, whether the connection takes
the form of the words being stored near to each other because of
their semantic or pragmatic proximity or because they regularly
co-occur for some reason. An exploitation of the latter kind of
connection occurs when a textual co-text is used as prime and the completion or continuation of the co-text as target. So it's
time to go will accelerate recognition of home, but inhibit recognition of feet. It is interesting to note that even where the target
is semantically unpredictable, the co-text prime will still accelerate, rather than inhibit, recognition of the target so long as
it is syntactically predictable from the co-text (as in it's time to
go hang-gliding) (Wright and Garrett 1984; West and Stanovich
1986).
Fairly obviously (but importantly), if people are primed
with a particular word, then they will recognize it more quickly
when it occurs again as target; this is known as repetition priming. (So dog accelerates the processing of a second instance
of dog.) Repetition priming affects both accuracy of response
(Jacoby and Dallas 1981) and speed of response (Scarborough,
Cortese, and Scarborough 1977). The priming may be over very
short intervals (tiny fractions of a second) or long intervals
(minutes, hours, or even days), and many believe that long-term repetition priming is explicable in terms of quite different
mechanisms from short-term priming. The way that repetition priming works has been disputed (Jacoby 1983; Tulving
and Schacter 1990), but it can be assumed to be a key factor
in priming for cohesion (see following discussion), particularly
as repetition priming has been shown to last for several hours.
Long-term repetition priming is often referred to as implicit
memory.
We have seen that psycholinguists have interested themselves in the way a prime may accelerate or retard the processing
of a target. The target itself must, however, be assumed to have a
preexistent relationship with the prime in advance of the particular priming effect. Otherwise, the relationship would have to be
created at the time of processing the target and the effect would
presumably be one of retardation. This prior relationship is investigated by Michael Hoey (2005) using corpus-linguistic rather
than psycholinguistic methodology; he assumes the relationship
to have been created by a type of repetition priming, such that
repeated encounters of the same items within the same environment result in the creation of an association between them.
Hoey terms these relationships lexical primings and uses them to
account for a wide range of linguistic phenomena. Although he
does not explicitly relate his model to the claims of connectionism, there are clear points of parallel.
As noted, lexical priming draws on a different evidential base
from the semantic (and repetition) priming research described
so far, drawing on corpus-linguistic evidence to demonstrate
the probability of particular psycholinguistic associations. The
first of these types of associations, and perhaps the most fundamental, is that of collocation (Sinclair 1991). A collocation such
as dog and bark is created for a speaker whenever each word is
semantically primed by the other as a result of repetition priming
of word combinations, such as the dog barked. The implication is
that repetition priming is primary both because it is long lasting
and because its existence accounts for (some) semantic priming.
From these primings, semantic associations (or preferences)
(Hoey 2005; Sinclair 2004), such as that of bark with spaniel-Alsatian-poodle-Labrador (etc.), are created. Members of the set may
be stored close to each other, but their presence in the set is a
result initially of the repetition priming of one or more members
of the set in conjunction with bark. Thus, the mind stores several instances of the Alsatian barked and of the poodle barked, as
well, of course, as the dog barked, and from this creates an association of bark with all types of dog. This set remains sound until
conflicting evidence is encountered. So Chihuahua may never
be encountered with barked but with yelped. The conflict persists
until the speaker modifies the original priming or treats the new
priming as either an exception or an anomaly. So, for instance,
bark might be placed in a semantic set of nonverbal noises, which
would include growl as well as bark and yelp. According to Hoey,
the same processes result in the establishment for the speaker
of quasi-grammatical relations associated with the lexical item
(colligational relations; Hoey 2005; Sinclair 2004), and grammar
is argued to be an output from the primings, rather than having
an existence independent of them.
Hoey (2005) also claims that speakers are primed to associate
words with textual position and cohesive patterning. All primings are assumed to be genre and domain specific, and this is
particularly noticeable in the case of the textual primings. Thus, for example, British newspaper readers are primed by British news stories
to associate yesterday with text-initial sentences and to associate
Mr with first-word position in paragraph-initial sentences. The
words Blair and Bush are primed for most readers to be followed
cohesively by pro-forms (he, his), while Pluto is primed to be followed by co-hyponyms (Neptune, Saturn, etc.).
The term priming is used by Hoey to describe both the process and the product of that process. The association created
by priming may itself be subject to priming (which Hoey terms
nesting). Thus, we are primed to collocate Bush with George, and
then to collocate the nested pair George Bush (with or without W)
with President. As a further step, we are then primed to associate
(President) (George)(W) Bush as a combination with pronominal
cohesion. No assumptions are made, however, about the order
in which the primings may occur. Primings necessarily vary from
individual to individual.
The accumulation of collocational, colligational, and semantic relationships may explain linguistically the efficacy of semantic primings, though experimental evidence has not yet been
offered in support of this claim. The phenomenon of semantic
priming would appear to have no effect in giving rise to the lexical primings as described by Hoey, but the existence of semantic priming is confirmation of the efficacy of the prior lexical
primings.
Michael Hoey
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Colombo, Lucia. 1986. Activation and inhibition with orthographically similar words. Journal of Experimental Psychology: Human Perception and Performance 12: 226–34.
Den Heyer, Ken. 1985. On the nature of the proportion effect in semantic priming. Acta Psychologica 60: 25–38.
Forster, Kenneth I. 1976. Accessing the mental lexicon. In New Approaches to Language Mechanisms, ed. R. J. Wales and E. Walker, 257–87. Amsterdam: North-Holland.
Harley, Trevor. 2001. The Psychology of Language: From Data to Theory. 2d ed. Hove and New York: Psychology Press. Although the information on priming is spread around the text, this is an ideal and accessible introduction to the different kinds of priming.
Hoey, Michael. 2005. Lexical Priming: A New Theory of Words and Language. London: Routledge. The key work on lexical priming.
Jacoby, Larry L. 1983. Perceptual enhancement: Persistent effects of an experience. Journal of Experimental Psychology: Learning, Memory and Cognition 15: 930–40.
Jacoby, L. L., and M. Dallas. 1981. On the relationship between autobiographical memory and perceptual learning. Journal of Experimental Psychology: General 110: 306–40.
Meyer, D. E., and R. W. Schvaneveldt. 1971. Facilitation in recognising pairs of words: Evidence of a dependence between retrieval operations. Journal of Experimental Psychology 90: 227–35.
Neely, J. 1977. Semantic priming and retrieval from lexical memory: Roles of inhibitionless spreading activation and limited capacity attention. Journal of Experimental Psychology: General 106: 226–54.
Neely, J. 1991. Semantic priming effects in visual word recognition: A selective review of current findings and theories. In Basic Processes in Reading: Visual Word Recognition, ed. D. Besner and G. Humphreys, 264–336. Hillsdale, NJ: Erlbaum. An extensive review of semantic priming from a word recognition perspective.
Rumelhart, D. E., and D. A. Norman. 1985. Representations of knowledge. In Issues in Cognitive Modeling, ed. A. M. Aitkenhead and J. M. Slack, 15–62. Hillsdale, NJ: Lawrence Erlbaum.
Scarborough, D. L., C. Cortese, and H. S. Scarborough. 1977. Frequency and repetition effects in lexical memory. Journal of Experimental Psychology: Human Perception and Performance 3: 1–17.
Sinclair, John. 1991. Corpus, Concordance, Collocation. Oxford: Oxford University Press.
Sinclair, John. 2004. Trust the Text: Language, Corpus and Discourse. London: Routledge.
Tulving, E., and D. L. Schacter. 1990. Priming and human memory systems. Science 247: 301–6.
West, R. F., and K. E. Stanovich. 1986. Robust effects of syntactic structure on visual word processing. Memory and Cognition 14: 104–12.
Wright, B., and M. Garrett. 1984. Lexical decision in sentences: Effects of syntactic structure. Memory and Cognition 12: 31–45.

PRINCIPLES AND PARAMETERS THEORY


The Framework
Principles and parameters (P&P) theory has been the prevailing approach to natural language syntax within transformational grammar and generative grammar since the
beginning of the 1980s. According to the P&P theory, the initial,
innate state of the human faculty of language FL0 is characterized
as a finite set of general principles complemented by a finite set
of variable options, called parameters. These principles and
parameters together constitute universal grammar (UG),
a model of FL0. FL0 functions as a language acquisition
device: It imposes severe constraints on attainable languages,
thereby facilitating the process of language acquisition, the core
of which lies in fixing the open parameter values of FL0. On this
view, competence in a given language is the result of a particular specification of the parameters of FL0 (called parameter-setting), which determines the range of possible variation among
languages.
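The idea of parameter-setting can be caricatured in a short sketch. The parameter inventory, its names, and the example settings below are invented for illustration and are not drawn from any particular P&P analysis: a grammar is an assignment of values to a small universal inventory of binary parameters, and surface properties such as word order follow from the setting.

```python
# Toy caricature of parameter-setting: competence in a language is a
# particular assignment of values to a small universal inventory of
# binary parameters. The inventory and settings are invented for
# illustration.

def linearize(verb, obj, params):
    """Order a verb and its object according to the head parameter."""
    if params["head_initial"]:
        return f"{verb} {obj}"   # head-initial: VO order
    return f"{obj} {verb}"       # head-final: OV order

# "Acquisition" here reduces to fixing the open parameter values.
english_like = {"head_initial": True, "null_subject": False}
japanese_like = {"head_initial": False, "null_subject": True}

vo = linearize("eat", "apples", english_like)   # "eat apples"
ov = linearize("eat", "apples", japanese_like)  # "apples eat"
```

The head-directionality contrast used here is a standard textbook illustration of a parameter; the point of the sketch is only that a finite set of switches, fixed by experience, yields a grammar.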
Interpreted broadly, the P&P framework can be seen as
a general model of the interaction of nature and nurture
(genetic endowment and experience) in the development of any
module of human cognition. Accordingly, it has come to be
applied beyond syntax both inside and outside of linguistics. An
example of the former case is the theory of phonology called

Principles and Parameters Theory


Government Phonology (see Kaye 1989), and an instance of the
latter is a recently emerging principles-and-parameters-based
approach to moral psychology (see Hauser 2006, and references
therein). In the domain of natural language syntax, the P&P
framework subsumes both government and binding (GB)
theory and its more recent development, called the minimalist program, or linguistic minimalism, even though the term is
often used narrowly to refer to the former model only.

From Rules to Principles


The P&P framework crystallized by the end of the 1970s as
a way to resolve the tension between two goals of generative
grammar. One objective was to construct descriptively adequate grammars of individual languages (see descriptive,
observational, and explanatory adequacy ). Another
was to address the logical problem of language acquisition
(see innateness and innatism ) by working out a theory of
UG that constrains possible grammars to a sufficiently narrow
range, so that the determination of the grammar of the language
being acquired from the primary linguistic data can become
realistic (this is referred to as explanatory adequacy). The two
goals clearly pull in opposing directions: The former seems
to call for allowing complex rules and a considerable degree
of variation across grammars (a liberal UG), while the latter
requires that possible grammatical rules be as constrained as
possible (a restrictive UG).
The research program that culminated in P&P theory aimed
to approximate these twin goals by establishing the ways in
which grammatical rules can and should be restricted, extracting from them properties that seemed to be stable across constructions and languages, and formulating them as constraints
imposed by UG on the format of rules of individual grammars.
Uncovering, generalizing, and unifying such constraints eliminated from rules the general conditions on their operation,
which made it possible for rules themselves to be considerably
simplified. For instance, the transformational rule that forms
wh-interrogatives, the rule of relativization producing relative
clauses, the rule of topicalization, and several others, each
corresponding roughly to some construction recognized by
traditional grammars, share certain notable properties. Noam
Chomsky (1977) argued that instead of stating such properties
as part of each of these rules, some of them should be incorporated into UG, while others should be ascribed to the generalized
rule dubbed front-wh, of which each of the individual rules is an
instantiation. The furthest such a factoring out and unification
strategy can potentially lead to is a model of language where
rules (as well as the corresponding constructions of traditional
grammar) are eliminated altogether from the theory as epiphenomena deducible from the complex interaction of the general
principles of UG. This is precisely the approach that the P&P
framework has been pursuing.

Modularity

In the government and binding model of the P&P approach (Chomsky 1981), principles of UG are organized into modules, or subtheories. Such modules include x-bar theory, which constrains possible phrase structure configurations, and theta theory, which determines a bi-unique mapping between the lexically specified theta-roles of a predicate and its argument expressions in the syntactic representation. As for structures derived by transformations, movement rules are reduced to a single and maximally general operation Move α that can move anything anywhere. A representational filter that limits the application of Move α is the empty category principle (ECP), which demands that traces of movement be licensed under a local structural relation called government. Apart from the ECP and the bounding theory, which places an upper bound on how far movement can take an element, various other modules of UG, not narrowly geared to cut down the overgeneration of structures resulting from Move α, act to filter the output representations produced by movements. case theory requires that (phonetically overt) noun phrases (NPs) occupy a position at surface structure where they are assigned a case. The three principles of binding theory (which constrain the distribution of anaphors, personal pronouns, and referential NPs, respectively, relative to potential antecedents they can/cannot be coreferential with) are sensitive to the binary [±anaphoric] and [±pronominal] features of NP categories generally, including phonetically empty NPs like various types of traces and null pronouns.

The modular organization itself, that is, the dissociation of various aspects of syntactic phenomena for the purposes of the grammar, is what makes it possible to keep principles of UG maximally simple. The cohesion of each module is supplied by some notion and/or formal relation on which its principles are centered. The whole of the grammatical system is also characterized by unifying concepts, most notably the notion of government, which plays a key role in a variety of modules. The components interact in complex ways to restrict the massive overgeneration of syntactic expressions that would otherwise result from the fundamental freedom of possible basic phrase structures and transformations applied to them, which ultimately yields the actual set of well-formed expressions.

The modularity of the different (sets of) principles is due not only to the dissociation of the properties relevant to them but also to the stipulation of distinctions with regard to where in the grammar they apply. According to GB theory, each sentence corresponds to a sequence of representations, starting from D-structure (or deep structure, DS), proceeding through S-structure (or surface structure, SS) to the final representation called logical form (LF), where adjacent representations are related by transformations. The derivation from DS to SS feeds phonetic realization, in particular the mapping from SS to phonetic form (PF) (it is overt), whereas the derivation from SS to LF does not (it is covert). A principle can apply to transformations (like bounding theory), or to one or more of the three syntactic representational levels DS, SS, and LF (these constraints are filters), though not to any intermediate representation. Figure 1 depicts this so-called Y- or T-model of GB, tagged to indicate where the most prominent modules apply.

Parameters
UG, as a model of language competence, includes the principles
along with the locus of their application, as well as the primitive syntactic objects (e.g., labels distinguishing full phrases,
heads of phrases, and intermediate-level categories), relations


Figure 1. The Y/T-model of GB, tagged with where the modules apply: the lexicon feeds DS (X-bar theory, theta theory); overt transformations (bounding theory) derive SS (theta theory, case theory), which feeds PF; covert transformations derive LF (theta theory, ECP, binding theory).

(e.g., c-command, dominance, government), and operations (e.g., movement, deletion) that collectively define the syntactic
expressions. Cross-linguistic variation, according to GB theory,
is rather limited. An obvious element of variation involves the
identity and properties of lexical items (referred to collectively
as the mental lexicon). Apart from acquiring a lexicon, the
primary means of grammar acquisition and the key source of
cross-linguistic differences is the inference of underspecified
aspects of UG principles, that is, the setting of open parameters.
Parametric principles are an innovation that allows the model to furnish descriptively adequate (because suitably different) grammars for individual languages. To provide a realistic
account of language acquisition, a process that is fairly uniform
and remarkably effective both across speakers and across languages, the number of parameters to be fixed must be reasonably low, the parameter values permitted by UG must be limited
to relatively few, and the cues in the primary linguistic data that
can trigger their values must be sufficiently easy to detect. Due to
their rich deductive structure, a distinct advantage of parameterized principles over language- and construction-specific rules
is that the setting of a single parameter can potentially account
for a whole cluster of syntactic properties, thereby contributing
to a plausible explanation for the outstanding efficiency of the
process of language acquisition itself. Such parameters are often
referred to as macro-parameters.
A canonical macro-parameter of GB theory is the so-called
pro-drop (or null subject) parameter. Null subject languages
like Italian or Greek, in contrast to non-pro-drop languages like
English or Dutch, allow phonetically null pronominal subjects
(designated as pro) in tensed clauses, have no overt pleonastic
element filling the subject position of weather verbs (cf. It rained),
exhibit free subject inversion to the right of the verb, and permit
movement of a subject out of an embedded wh-clause and from
a position following a lexical complementizer (cf. *Who_i do you think that t_i will win?). The classic account of this cluster of properties ascribes them to a single parameter, namely, whether or
not the finite verbal agreement inflection syntactically governs
the preverbal subject position, which in turn is related to the
morphological richness of the relevant conjugation paradigm. It
is due to the positive setting of this (ultimately lexical) parameter
that null subject languages license a phonetically null pronoun
or a trace in the canonical subject position of tensed clauses relatively freely.
Parameters range from macro-parameters like the null subject parameter to micro-parameters whose scope is comparatively narrow, for instance, the parameter determining whether
or not the (finite) main verb raises out of the verb phrase (VP)
before S-structure to a position above VP adverbs or clausal
negation (the verb raising parameter). Another dimension along which parameters differ concerns the number of options, that is, parameter settings that are allowed. Most parameters are
binary, but proposals have been made for parameters with more
options: for instance, the choice of the local domain in which
anaphors must find an appropriate antecedent. Binary parameters include the choice of the timing of a movement transformation with respect to S-structure (either overt or covert; see
Figure 1). Finally, while some parameters are simply underspecified aspects of UG principles, others are grammatical properties
of (classes) of lexical items. The head directionality parameter
(set as head-initial for English, where verbs, nouns, adjectives,
and adpositions precede their complements, and head-final for
Japanese, where they follow them) belongs to the first of these
two types, while variation in terms of the lexical items that are lexically [+anaphoric] exemplifies the second.

The Shift to Minimalism


The P&P framework inspired a vast amount of research on similarities and differences across languages, as well as on language
acquisition (see principles and parameters theory and
language acquisition, and syntax, acquisition of),
which has produced an impressive array of novel discoveries and analyses that are both attractively elaborate in terms of
data coverage and at the same time genuinely illuminating as
regards the explanations they offer. That said, in pursuit of the
twin objectives of descriptive and explanatory adequacy, some of
the basic notions and principles became increasingly non-natural and complex (like government and the ECP, or the notion
of local domain in binding theory). This gave cause for growing
concern in the field, in no small part because the question of why
UG is the way it is became disappointingly elusive. The ultimate
source of the emergent complexities, beyond the striving for ever-improving empirical coverage, was the fact that GB lacked an
actual theory of possible principles or, for that matter, of possible
parameters. As for the latter, continued in-depth research on
cross-linguistic variation has shown many of the macro-parameters, among them the null subject parameter, to be unsustainable in the strong form in which they were originally proposed: Several of
the linguistic properties correlated by macro-parameters turned
out to be cross-linguistically dissociable. Even though the idea
of parametric linguistic variation was upheld, parameters themselves needed to be scaled down. In addition, as GB relied on
massive overgeneration resulting from the fundamental freedom of basic phrase structure and transformations, downsized
by declarative constraints imposed (mainly) on syntactic representations, the computational viability of the model was often
called into question.
The current minimalist research program (MP), initiated by
Chomsky in the early 1990s (see Chomsky 1995), while building on the achievements of GB theory, departs from it in various
important ways. It refocuses attention on the shape of UG itself
as a model of the innate faculty of language (FL), a computational-representational module of human cognition, as well as on
the way it interfaces with (articulatory-phonetic and conceptual-intentional) external systems. The MP adopts the substantive
hypothesis (called full interpretation) that representations that
the FL feeds to the external interface systems are fully interpretable by those components, with all uninterpretable aspects of
the representations eliminated internally to FL. As for the shape
of UG as a computational system, the MP puts forward the substantive hypothesis that FL is computationally efficient: It incurs
minimal operational complexity in the construction of representations fully interpretable by the interface systems. Syntactic
operations like movement apply only if they are triggered: only
if they must be carried out in order to satisfy full interpretation
by eliminating some uninterpretable property in the syntactic
expression under computation (a principle of computational
economy called last resort). If there is more than one way that a
derivation can satisfy full interpretation, the least complex (set
of) operation(s) is selected by FL (the principle of least effort).
On the methodological side, the MP proposes to apply
Ockham's razor considerations of theoretical parsimony to UG
as rigorously as possible. All syntax-internal principles constraining representations are disposed of, thereby eliminating
syntax-internal representational levels, including S-structure
and D-structure. The incremental structure-building operation
merge starts out from lexical items, combining them recursively
into successively larger syntactic units. Empirical properties formerly captured at D-structure and S-structure are accounted for
by shifting the burden of explanation to full interpretation at the
interface levels of PF and LF, and to principles of economy of
derivation, the only principles operational in UG. Economy principles have no built-in parameters: All parametric differences
across languages are confined to the domain of lexical properties, an irreducible locus of variation, to which, accordingly, the
acquisition of syntax is reduced (cf. the lexical learning
hypothesis). For instance, word order variation, previously
put down to the head directionality parameter, is typically attributed to movement operations: Movements can occur either in
overt or in covert syntax, and they can affect smaller or larger
units of structure, these choices being a function of uninterpretable lexical properties of participating elements.
Non-naturally complex notions and relations (including
government) are also eliminated from UG. A syntactic expression is taken to be a plain set (of sets of sets, etc.) of lexical items,
produced by recursive applications of merge: Nothing beyond
that is added in the course of the derivation. It follows from this
simplifying proposal (called inclusiveness) that syntactic expressions include no indices (to link a moved element to its trace, or
a binder to its bindee), no traces (but silent copies of the moved
elements themselves), no syntactic label for phrase or head
status, and perhaps no labels borne by complex syntactic units at
all. Two stipulative assumptions of the GB model that all overt
movements precede all covert movements and that transfer to
phonetic and conceptual interpretation can only take place at
a unique point in the derivation are also dropped. This yields
a model that has overt and covert movements intermingled (applying them as soon as their respective trigger is merged in) and that has multiple transfers. Derivational sequences between two transfer points are called phases. The basic architecture is shown in Figure 2.

Figure 2. The basic minimalist architecture: the lexicon feeds a derivation governed by principles of computational economy, with transfers to conceptual and phonetic interpretation, each constrained by full interpretation.
Finally, grammatical components are reduced as well. First of
all, there are no distinct phrase structure and transformational
components, as both basic phrase structure and movements are
brought about by the operation merge: While basic structure
building involves merging two distinct elements, movement
involves (re)merging an element with a constituent that contains it. In addition, the burden of description carried by modules of GB is partly reallocated to syntax-external components,
and is partly redistributed among the residual factors that can
enter syntactic explanation: the principal constraint imposed by
the interface components (full interpretation), the character of
the syntactic derivation (multiple transfers, principles of computational economy, the nature of basic syntactic operations,
etc.), and the properties of lexical items. For instance, much of
the binding theory of UG is reduced to movement operations
and rules of interpretation, case theory is recast in more general
terms and is subsumed in a broader account of triggers for movements (called checking theory), and bounding theory is deduced
from the multiple transfers nature of the derivation.
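The set-theoretic conception of merge described above lends itself to a concrete sketch. The following toy code is an illustration only, not any standard implementation: treating lexical items as strings and the function names used are assumptions. It shows external merge as plain set formation and movement as re-merge of a contained element, with no traces or indices added, in the spirit of inclusiveness.

```python
# Toy sketch: syntactic objects as plain sets, built by merge.
# Lexical items are strings; complex objects are frozensets.

def merge(a, b):
    """External merge: combine two distinct syntactic objects into a set."""
    return frozenset([a, b])

def contains(obj, part):
    """True if `part` is a term of `obj` (obj itself, or nested inside it)."""
    if obj == part:
        return True
    if isinstance(obj, frozenset):
        return any(contains(x, part) for x in obj)
    return False

def internal_merge(obj, part):
    """Movement as re-merge: merge `part` with an object containing it.
    Per inclusiveness, no trace is created -- the element simply occurs
    twice, with a silent copy remaining inside `obj`."""
    assert contains(obj, part), "can only re-merge a contained element"
    return frozenset([part, obj])

vp = merge("sees", "Mary")              # {sees, Mary}
clause = merge("John", vp)              # {John, {sees, Mary}}
moved = internal_merge(clause, "Mary")  # Mary re-merged at the root
```

Note that the result of `internal_merge` carries no label and no index linking the two occurrences of the moved element; that information, on this view, is reconstructed by the interpretive components.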

Conclusion
The fundamental question pursued by the P&P framework is
whether it is possible to construct an explanatorily adequate
theory of natural language grammar based on general principles. Two further ambitions of P&P, gaining prominence with
the advent of its minimalist research program, are to find out
whether the primitive notions and principles of such a model are
characterized by a certain degree of naturalness, simplicity, and
nonredundancy, and concurrently, whether some properties of
the language faculty can be explained in terms of design considerations pertaining to computational cognitive subsystems in
general, or even more broadly, in terms of laws of nature. Should
it turn out that these questions are answered in the affirmative (as
some initial results suggest), that would be a surprising empirical discovery about an apparently complex biological subsystem
(cf. biolinguistics): in the case at hand, the human language
faculty. The exploration of the ways in which general laws of
nature might enter linguistic explanation has barely begun.
Clearly, most of the work lies ahead.
Balázs Surányi




WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Baker, Mark C. 2001. The Atoms of Language: The Mind's Hidden Rules of Grammar. New York: Basic Books.
Chomsky, Noam. 1977. "On wh-movement." In Formal Syntax, ed. Peter Culicover, Tom Wasow, and Adrian Akmajian, 71–132. New York: Academic Press.
———. 1981. Lectures on Government and Binding. Dordrecht, the Netherlands: Foris.
———. 1995. The Minimalist Program. Cambridge, MA: MIT Press.
Hauser, Marc D. 2006. Moral Minds: How Nature Designed Our Universal Sense of Right and Wrong. New York: Ecco/Harper Collins.
Kaye, Jonathan. 1989. Phonology: A Cognitive View. Hillsdale, NJ: Lawrence Erlbaum.
Lasnik, Howard, and Juan Uriagereka, with Cedric Boeckx. 2005. A Course in Minimalist Syntax: Foundations and Prospects. Oxford: Blackwell.

PRINCIPLES AND PARAMETERS THEORY AND LANGUAGE ACQUISITION
Nativism
The basic idea in principles and parameters theory is to
distinguish the invariants of human language (the principles)
from the major points of cross-linguistic variation (the parameters). Both principles and parameters are taken to reflect innately
determined, biological characteristics of the human brain (see
universal grammar). In the course of normal child development, however, the two diverge: The principles come to operate
in much the same way in every child, with minimal sensitivity to
the child's environment, while the parameters take on distinct values as a function of the child's linguistic input.
The term parameter is normally reserved for points of narrowly restricted variation. The principles and parameters (P&P)
framework also acknowledges that languages vary in ways that
are relatively unconstrained by universal grammar, such as the
exact form of vocabulary items. These latter points of variation
are usually treated as arbitrary idiosyncrasies, to be listed in the
lexicon.
The P&P framework has its origins in the two foundational
questions of modern linguistics (Chomsky 1981): What exactly
do you know, when you know your native language? And how
did you come to know it? A satisfactory answer to these questions
must address the poverty of the stimulus, including the fact that
children are not reliably corrected when they make a grammatical error (Brown and Hanlon 1970; Marcus 1993).
Despite the poverty of the stimulus, by the age of about five
years we observe uniformity of success at language acquisition (Crain and Lillo-Martin 1999): Aside from cases of medical
abnormality, or isolation from natural-language input, every
child acquires a grammar that closely resembles the grammar of
his or her caregivers. Moreover, even when a child is younger,
and still engaged in the process of language acquisition, extraordinarily few of the logically possible errors are actually observed
in the child's spontaneous speech (Snyder 2007). Clearly, children do not acquire grammar through simple trial-and-error
learning.
Linguists working in the P&P tradition have concluded that
a great deal of grammatical information must already be present in the child's brain at birth. Of course, different languages of
the world exhibit somewhat different grammars, but the claim in P&P is that the options for grammatical variation are extremely limited. On the P&P approach, the child's task during language
acquisition is akin to ordering food in a restaurant: One need
only make selections from a menu, not give the chef a recipe.
In other words, the information required for the child to
select an appropriate grammar from among the options is far
less, both in quantity and in quality, than would be required to
build a grammar from the ground up. First, grammars that cannot be attained with the available parameter settings will never
be hypothesized by the child, even if they are compatible with
the child's linguistic input up to that point. Second, to the extent that parameters are abstract, and thus have widespread consequences, a variety of different sentence types in the linguistic
input can help the child select the correct option. The challenge
of identifying the correct grammar is still considerable, but is far
more tractable than it would be if the child had to rely on general
learning strategies alone.

Investigating Language and Its Acquisition Within a P&P Framework
The P&P framework was first clearly articulated for syntax, in the
context of government and binding theory (e.g., Chomsky
1981, 1986). Yet the framework is considerably more general.
First, the same basic architecture has been applied to phonology,
notably in the framework of government phonology (e.g., Kaye,
Lowenstamm, and Vergnaud 1990), and also (in certain work) to
semantics and morphology. Second, recent syntactic and phonological research in the minimalist program (Chomsky 1995,
2001; see minimalism) and in optimality theory (Prince
and Smolensky 2004) still crucially assumes a P&P framework, in
the broad sense that it posits universal principles and narrowly
restricted options for cross-linguistic variation. (This point is discussed further in the next section.)
Within the P&P framework, research on children's acquisition of language plays a number of important roles. First, such
research can clarify the logical problem of language acquisition, which any explanatorily adequate linguistic theory must
address: How in principle can the correct grammar be chosen
from among the proposed options, using only the types of linguistic input that children actually need for successful language acquisition? (See descriptive, observational, and
explanatory adequacy.) Acquisition research can help
determine which types of linguistic input are (and are not), in
fact, necessary for children to succeed at language acquisition.
For example, some of the most compelling evidence
for the irrelevance of corrective feedback comes from Eric
H. Lenneberg's (1967, 305–9) study of a hypolingual child. Despite
the fact that the child had been mute since birth, and therefore
had had no possibility of producing any errors to be corrected, he
performed at an age-appropriate level on comprehension tests of
English grammar. Hence, receiving corrective feedback on one's
own utterances seems to be unnecessary. Hearing the linguistic
utterances of other speakers, produced in context, can suffice. To
achieve explanatory adequacy, a linguistic theory must be able
to account for this.
A second role of acquisitional evidence within the P&P framework lies in testing the acquisitional predictions of proposed
linguistic principles. All else being equal, if one proposes that a given property of language is an innate principle of universal
grammar, then one expects the principle to be operative in children as early as we can test for it. (A notable exception is found
in the work of Hagit Borer and Ken Wexler (1992), who propose
that several specific linguistic principles undergo maturational
change during childhood.)
For example, Stephen Crain and Mineharu Nakayama (1987)
conducted an acquisitional test of structure dependence, the
proposed principle that syntactic movement is always sensitive
to hierarchical structure. Their study tested the prediction that
structure dependence, as an innate principle, should be operative very early. The study was conducted with children acquiring
English who were three to five years old (the youngest subjects
capable of performing the experimental task), and used prompts
such as the following: "Ask Jabba if [the man who is beating a donkey] is mean." Crucially, children never produced errors of the form, "Is [the man who __ beating a donkey] is mean?" Such errors might have been expected, however, if the children had been at liberty to hypothesize structure-independent rules (such as "Move the first auxiliary to the beginning of the sentence").
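The contrast between the unattested structure-independent rule and the attested structure-dependent one can be sketched in code. This is a toy illustration only, not the representations used in the study: the flat word list, the (subject, aux, predicate) split, and all names are simplifying assumptions.

```python
# Toy contrast: structure-independent vs. structure-dependent
# yes/no question formation.
AUXILIARIES = {"is", "can", "will"}

def front_first_auxiliary(words):
    """Structure-independent: front the linearly first auxiliary."""
    i = next(i for i, w in enumerate(words) if w in AUXILIARIES)
    return [words[i]] + words[:i] + words[i + 1:]

def front_main_auxiliary(subject, aux, predicate):
    """Structure-dependent: front the main-clause auxiliary, ignoring
    any auxiliary buried inside the subject phrase."""
    return [aux] + subject + predicate

declarative = "the man who is beating a donkey is mean".split()
subject, aux, predicate = declarative[:7], declarative[7], declarative[8:]

error = front_first_auxiliary(declarative)
# -> "is the man who beating a donkey is mean" (never produced by children)
correct = front_main_auxiliary(subject, aux, predicate)
# -> "is the man who is beating a donkey mean"
```

The structure-independent rule targets whichever auxiliary comes first in the string, here the one inside the relative clause, producing exactly the error type that children fail to make.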
Third, by proposing a parameter of universal grammar, one
makes predictions about the time course of child language acquisition. These predictions may involve concurrent acquisition or
ordered acquisition. To see this, suppose that two grammatical
constructions A and B are proposed to have identical prerequisites, in terms of parameter-settings and lexical information. A
and B are then predicted to become grammatically available to
any given child concurrently, that is, at the same point during
language acquisition.
A prediction of ordered acquisition results when the proposed
linguistic prerequisites for one construction (A) are a proper
subset of the prerequisites for another construction (B). In this
case A might become available to a given child earlier than B,
if the child first acquires the subset of Bs prerequisites that are
necessary for A. Alternatively, A and B might become available
to the child concurrently, if the last-acquired prerequisite for B
is also a prerequisite for A. In contrast, no child should acquire B
significantly earlier than A.
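The subset logic behind such ordering predictions can be made concrete. In this hypothetical sketch (the prerequisite labels "p1" and "p2" are invented), a construction becomes available once all of its prerequisites have been acquired:

```python
from itertools import permutations

# If A's prerequisites are a proper subset of B's, then on no order of
# acquisition can B become available before A.
PREREQS = {
    "A": {"p1"},          # e.g., compounding: [+TCP] only
    "B": {"p1", "p2"},    # e.g., verb-particle: [+TCP] plus more
}

def acquisition_point(construction, order):
    """Step at which the construction's last prerequisite is acquired."""
    return max(order.index(p) for p in PREREQS[construction])

# Check every possible order of acquiring the prerequisites:
for order in permutations(sorted(PREREQS["B"])):
    assert acquisition_point("B", order) >= acquisition_point("A", order)
```

When the last-acquired prerequisite is shared (here, acquiring "p2" before "p1"), the two acquisition points coincide, which is the concurrent-acquisition outcome described above.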
As a concrete example, consider William Snyder's (2001)
work on the compounding parameter (TCP). Theoretical research
had suggested a link (at least in Dutch and Afrikaans) between
the verb-particle construction (cf. Mary lifted the box up) and
morphological compounding (cf. banana box, for a box where
bananas are kept). Snyder observed a one-way implication in
the data from a sizable number of languages: If a language permits the verb-particle construction, then it also allows free creation of novel compounds like banana box. The implication is
unidirectional, however: There do exist languages that allow this
type of compounding, yet lack the verb-particle construction.
Snyder therefore proposed that the grammatical prerequisite for
the English type of compounding (i.e., the positive setting of TCP)
is one of several prerequisites for the verb-particle construction.
A clear acquisitional prediction followed: Any given child
acquiring English will either acquire compounding first (if [+TCP]
is acquired prior to the other prerequisites for the verb-particle
construction), or acquire compounding and the verb-particle
construction at the same time (if [+TCP] is the last-acquired
prerequisite for the verb-particle construction). In no case will a child acquire the verb-particle construction significantly earlier than compounding. This prediction received strong support
from a longitudinal study of 10 children.
This example illustrates how the investigation of language
acquisition and the investigation of mature grammars can
be mutually reinforcing activities within the P&P framework.
Another example is provided by the work of Diane Lillo-Martin
and Ronice Müller de Quadros (2005), who considered the parametric prerequisites for the different types of wh-questions in
American Sign Language (ASL), according to two competing
syntactic analyses. The two analyses yielded distinct predictions
about the time course of acquisition, which were then successfully tested against longitudinal data from children acquiring
ASL.

Areas of Debate
We mention here two areas of debate within the P&P approach
to child language acquisition, and of course there are others.
1) What types of parameters, exactly, is the child required to set?
2) What are the observable consequences of an unset or misset
parameter?
One point of disagreement in the P&P literature quite generally, including the acquisition literature, concerns the proper
conception of parameters. A classic conception, which Noam
Chomsky (1986, 146) attributes to James Higginbotham, is the
switchbox metaphor: Each parameter is like an electrical switch,
with a small number of possible settings.
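On the switchbox metaphor, a grammar is nothing more than one discrete setting per switch, and UG's grammar space is the set of all such combinations. The following toy rendering uses invented parameter names and values purely for illustration:

```python
from itertools import product

# Toy switchbox model of UG: a grammar is one value per parameter.
PARAMETERS = {
    "null_subject": (True, False),
    "head_direction": ("head-initial", "head-final"),
    "verb_raising": (True, False),
}

def possible_grammars():
    """Enumerate every grammar UG permits under this toy parameter set."""
    names = sorted(PARAMETERS)
    for values in product(*(PARAMETERS[n] for n in names)):
        yield dict(zip(names, values))

grammars = list(possible_grammars())
# Three binary switches yield only 2 ** 3 = 8 candidate grammars:
# the learner selects from this menu rather than building a grammar
# from the ground up.
```

The point of the sketch is the size of the space: with a small, finite number of switches, the learner's hypothesis space is a short menu, not an open-ended construction problem.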
Yet this is only one of many possible ways that parameters
could work. A radically different conception is found in optimality theory, which posits a universal set of violable constraints.
Instead of choosing particular settings for switches in a switchbox, the learner has to rank the constraints correctly. The result
is a narrowly restricted set of options for the target grammar,
as required by the P&P framework. (Indeed, on the mathematical equivalence of a constraint ranking to a set of switchbox-style dominance parameters, see Tesar and Smolensky 2005,
456.)
Still another approach to parameters is to connect them to the
lexicon. (See lexical learning hypothesis.) This is conceptually attractive because the lexicon is independently needed as
a repository of information that varies across languages. Exactly
what it means to connect parameters to the lexicon, however,
has been open to interpretation.
One idea is to connect points of abstract grammatical (e.g.,
syntactic) variation to the paradigms of inflectional morphology.
The idea is that paradigmatic morphology has to be stored in the
lexicon anyway, and might provide a way to encode parametric
choices. This approach can be found in Borer (1984) and LilloMartin (1991), for example. A related idea is to encode parametric
choices in the morphology of closed-class lexical items. A good
example is Pica's (1984) proposal to derive cross-linguistic variation in the binding domain of a reflexive pronoun from the pronoun's morphological shape. A variant of Pierre Pica's approach
is to encode parametric choices as abstract (rather than morphologically overt) properties of individual lexical items. This is the
lexical parameterization hypothesis of Wexler and Rita Manzini
(1987), who took this approach to cross-linguistic variation in the
binding domain for both reflexives and pronominals.



Yet another idea is to encode cross-linguistic grammatical
variation in the abstract (often phonetically null) features of
functional heads. Chomsky (1995, Chapter 2) takes this approach
to V-raising in French, for example, and its absence in English: In
French, the functional head Agr0 is strong, and causes the
verb to move up and adjoin to Agr0 before the sentence is pronounced. The result is the word order in Jean [AgrP voit [VP souvent
[VP tV Marie]]], literally John [AgrP sees [VP often [VP tV Mary]]], in
place of English John [AgrP [VP often [VP sees Mary]]].
Chomsky's approach is lexical in the sense that the morphosyntactic features of functional heads like Agr0 are taken to
be listed in the lexicon. Note, however, that the possible features of a functional head are still assumed to be quite narrowly
restricted. Thus, where earlier work might have posited a switch-like parameter of [±verb raising], for example, Chomsky instead
posits a choice between a strong feature versus a weak feature on
Agr0, and assumes that this particular lexical item will be present
above the verb phrase (VP) in most or all cases. For purposes of
language acquisition, the difference is extremely minor; the child
makes a binary choice, and it has consequences across a wide
range of sentence types. Therefore, Chomsky's approach still
falls squarely within the P&P framework.
The second and final point of disagreement that we mention
here concerns the consequences of unset or misset parameters.
For concreteness, we focus on the switchbox model: Can a switch
be placed in an intermediate, unset position? Alternatively, must
a child sometimes make temporary use of a setting that is not in
fact employed in the target language? If so, what are the consequences for the functioning of the language faculty?
One school of thought is that there is no such thing as an
unset parameter: Every parameter is always in a determinate setting, be it an arbitrary setting (cf. Gibson and Wexler
1994), or a prespecified default setting (e.g., Hyams 1986).
On this view, temporary missettings may be routine during
the period when language acquisition is still underway. (The
notion that certain parameter settings might be defaults, or
unmarked options, has its roots in the phonological concept
of markedness.)
A second school of thought maintains that parameters are
initially unset. Virginia Valian (1991) proposes that an unset
parameter permits everything that any of its potential values
would allow. Somewhat similarly, Charles D. Yang (2002) proposes that the learner begins the language acquisition process
not with a single grammar but, rather, with a multitude of different grammars, all in competition against one another. Every
grammar corresponding to a permissible array of parameter-settings is included. A consequence is that competing values of the
same parameter can be in play at the same time.
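Yang's variational proposal is, at heart, a probabilistic learning algorithm, and its flavor can be conveyed with a short sketch. Everything concrete below (the two-parameter space, the toy parsability check, the learning rate) is an illustrative assumption, not a detail of Yang's own model:

```python
import random

random.seed(1)  # for reproducibility of this illustration

GAMMA = 0.02  # learning rate (an assumed value)

def sample_grammar(weights):
    """Sample one value per parameter according to its current weight."""
    return tuple(1 if random.random() < w else 0 for w in weights)

def parses(grammar, sentence):
    """Toy parsability check: the grammar must match every parameter
    value the sentence expresses (None = not expressed)."""
    return all(g == s for g, s in zip(grammar, sentence) if s is not None)

def update(weights, grammar, parsed):
    """Linear reward-penalty: nudge each weight toward the sampled value
    after a successful parse, away from it after a failure."""
    out = []
    for w, v in zip(weights, grammar):
        target = v if parsed else 1 - v
        out.append(w + GAMMA * (1 - w) if target == 1 else w * (1 - GAMMA))
    return out

# Target language: parameter settings (1, 0); each input reveals one parameter.
inputs = [(1, None), (None, 0)]
weights = [0.5, 0.5]  # both values of both parameters initially in play
for _ in range(5000):
    grammar = sample_grammar(weights)
    weights = update(weights, grammar, parses(grammar, random.choice(inputs)))

print([round(w, 2) for w in weights])  # drifts toward the target, (1, 0)
```

On each input a whole grammar is sampled from the current weights, so competing values of the same parameter really are in play simultaneously; reward and penalty gradually shift probability mass toward the grammar of the target language.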
A cross-cutting view is that children may temporarily entertain nonadult parameter settings (whether default or not; see,
e.g., Thornton and Crain 1994). Children may then produce
utterances that use a grammatical structure found in some of
the worlds languages, but not in the target. On this view, what
is crucial is simply that the learner must eventually arrive at the
target parameter-setting, regardless of what parameter-settings
have been temporarily adopted along the way. This is the learning problem that is addressed by Edward Gibson and Wexler's
(1994) trigger learning algorithm, for example.
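The logic of Gibson and Wexler's algorithm can likewise be sketched in a few lines. The three-parameter space, the input types, and the parsability check below are hypothetical stand-ins; only the three constraints encoded in `tla_step` come from the algorithm itself:

```python
import random

random.seed(0)  # for reproducibility of this illustration

def parses(grammar, sentence):
    """Toy parsability check: the grammar must match every parameter
    value the sentence expresses (None = not expressed)."""
    return all(g == s for g, s in zip(grammar, sentence) if s is not None)

def tla_step(grammar, sentence):
    """One step of the TLA: error-driven (act only on a parse failure),
    Single Value Constraint (flip exactly one parameter), and greedy
    (keep the flip only if it makes the current input parsable)."""
    if parses(grammar, sentence):
        return grammar
    i = random.randrange(len(grammar))
    candidate = list(grammar)
    candidate[i] = 1 - candidate[i]
    candidate = tuple(candidate)
    return candidate if parses(candidate, sentence) else grammar

# Target settings (1, 0, 1); each input type reveals one parameter.
target_inputs = [(1, None, None), (None, 0, None), (None, None, 1)]
grammar = (0, 1, 0)  # fully misset starting hypothesis
for _ in range(1000):
    grammar = tla_step(grammar, random.choice(target_inputs))

print(grammar)  # converges to the target setting (1, 0, 1)
```

Because a flip is kept only when it renders the current input parsable, a correctly set parameter is never overwritten in this toy setup, and the learner walks parameter by parameter toward the target.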


An alternative view is that the child reserves judgment on any
given parameter setting until he or she has enough information
to set it with confidence. Initially the parameter is in an unset
state, but this time the consequence is that none of the grammatical options tied to a specific setting of the parameter is actually
endorsed by the child. Snyder (2007) advances this view when he
argues that children who are speaking spontaneously, in a natural setting, make astonishingly few of the logically possible grammatical errors. The vast majority of the errors that do occur either
are errors of omission or belong to a tiny subset of the logical
possibilities for commission errors (where the words are actually
pronounced in configurations that are ungrammatical in the target language).
Most of the grammatical commission errors that are found
in studies of elicited production or comprehension are absent
from children's spontaneous speech, even when the opportunities exist for the child to make them. Snyder concludes
that many of these errors result from the demands of the
experimental tasks. When left to their own devices, children
successfully avoid putting words together in ways that would
require them to make a premature commitment to a particular
parameter-setting.

Conclusion
Language acquisition is a rich source of evidence about both the
principles and the parameters of the human language faculty.
For this reason, research on language acquisition plays a central
role in the P&P framework.
William Snyder and Diane Lillo-Martin
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Borer, Hagit. 1984. Parametric Syntax: Case Studies in Semitic and
Romance Languages. Dordrecht, the Netherlands: Foris.
Borer, Hagit, and Ken Wexler. 1992. Bi-unique relations and the maturation of grammatical principles. Natural Language and Linguistic
Theory 10: 147–89.
Brown, Roger, and Camille Hanlon. 1970. Derivational complexity and
order of acquisition in child speech. In Cognition and the Development
of Language, ed. John R. Hayes, 155–207. New York: Wiley.
Chomsky, Noam. 1981. Lectures on Government and Binding. Dordrecht,
the Netherlands: Foris.
———. 1986. Knowledge of Language: Its Nature, Origin, and Use. New
York: Praeger.
———. 1995. The Minimalist Program. Cambridge, MA: MIT Press.
———. 2001. Derivation by phase. In Ken Hale: A Life in Language, ed.
Michael Kenstowicz, 1–52. Cambridge, MA: MIT Press.
Crain, Stephen, and Diane Lillo-Martin. 1999. Linguistic Theory and
Language Acquisition. Cambridge, MA: Blackwell.
Crain, Stephen, and Mineharu Nakayama. 1987. Structure dependency
in grammar formation. Language 63: 522–43.
Gibson, Edward, and Kenneth Wexler. 1994. Triggers. Linguistic Inquiry
25: 355–407.
Hyams, Nina. 1986. Language Acquisition and the Theory of Parameters.
Dordrecht, the Netherlands: Reidel.
Kaye, Jonathan D., Jean Lowenstamm, and Jean-Roger Vergnaud. 1990.
Constituent structure and government in phonology. Phonology
7: 193–231.
Lenneberg, Eric H. 1967. Biological Foundations of Language. New
York: Wiley.

Lillo-Martin, Diane C. 1991. Universal Grammar and American Sign
Language: Setting the Null Argument Parameters. Dordrecht, the
Netherlands: Kluwer.
Lillo-Martin, Diane, and Ronice Müller de Quadros. 2005. The acquisition of focus constructions in American Sign Language and Língua
de Sinais Brasileira. In BUCLD 29 Proceedings, ed. Alejna Brugos,
Manuella R. Clark-Cotton, and Seungwan Ha, 365–75. Somerville,
MA: Cascadilla Press.
Marcus, Gary F. 1993. Negative evidence in language acquisition.
Cognition 46: 53–85.
Pica, Pierre. 1984. On the distinction between argumental and nonargumental anaphors. In Sentential Complementation, ed. Wim de Geest
and Yvan Putseys, 185–94. Dordrecht, the Netherlands: Foris.
Prince, Alan, and Paul Smolensky. 2004. Optimality Theory: Constraint
Interaction in Generative Grammar. Malden, MA: Blackwell.
Snyder, William. 2001. On the nature of syntactic variation: Evidence
from complex predicates and complex word-formation. Language
77: 324–42.
———. 2007. Child Language: The Parametric Approach. Oxford: Oxford
University Press.
Tesar, Bruce, and Paul Smolensky. 2000. Learnability in Optimality
Theory. Cambridge, MA: MIT Press.
Thornton, Rosalind, and Stephen Crain. 1994. Successful cyclic movement. In Language Acquisition Studies in Generative Grammar, ed.
Teun Hoekstra and Bonnie D. Schwartz, 215–53. Amsterdam: John
Benjamins.
Valian, Virginia. 1991. Syntactic subjects in the early speech of American
and Italian children. Cognition 40: 21–81.
Wexler, Kenneth, and Rita Manzini. 1987. Parameters and learnability in
binding theory. In Parameter Setting, ed. Thomas Roeper and Edwin
Williams, 41–76. Dordrecht, the Netherlands: Reidel.
Yang, Charles D. 2002. Knowledge and Learning in Natural Language.
Oxford: Oxford University Press.

PRINT CULTURE
Print (or typographic) culture designates all the activities entailed
in producing, distributing, collecting, and reading printed materials and engraved images. As a historical construct, it usually
refers to the literary environment that first emerged in Western
Europe during the second half of the fifteenth century. Diverse
developments elsewhere (such as the use of xylography in China
and of movable type in Korea, prohibitions against Arabic printing by Ottoman rulers, and the sluggish pace of Russian printing)
lend themselves to comparative study but cannot be covered
here.
In Western Europe (unlike other areas), the printing arts, once
introduced, spread with remarkable rapidity. Between the 1460s
and 1490s, printing shops were established in all of the major
political and commercial centers. New occupations (typefounding and presswork) were introduced; bookmaking arts were
reorganized; trade networks were extended; book fairs inaugurated. By 1500, the use of movable type had become the dominant mode for duplicating texts, and xylography had replaced
hand illumination for replicating images.
Though often classified under the heading of book history,
print culture encompasses a vast variety of nonbook materials,
such as advertisements, almanacs, calendars, horoscopes, proclamations, tickets, and timetables. It also entails the provision
of visual aids (such as maps, charts, tables, graphs, and detailed
drawings) that are especially difficult to duplicate in large

quantities by hand. Print culture is defined largely in contrast to
the scribal (or chirographic) culture that had prevailed in the
West during previous millennia when handcopying was the sole
means of duplicating writing and drawing. Although handcopying persisted and indeed thrived after the introduction of printing, it did so within a changed literary environment. Handwriting
itself was taught with reference to printed manuals; copyists imitated the title pages, the punctuation, and pagination of printed
books.
Scribal culture had been characterized by an economy of
scarcity. The large collections of texts gathered in the Alexandrian
library and in some later centers of learning were exceptional
and relatively short-lived. The retrieval, copying, and recopying
of surviving texts took precedence over composing new ones.
The acquisition of literacy was confined to restricted groups
of churchmen and lay professionals. Oral interchange predominated. (As noted in the following, this has led some authorities to
contrast print, not with scribal but with oral culture.)
Print culture introduced an economy of abundance. The
continued output of handcopied books simply added to a growing supply. Wholesale production replaced a retail book trade.
Increased output was spurred by competition among printers
and booksellers, who curried favor with the authorities in order
to win the privilege of issuing primers, prayer books, official
edicts, and other works for which there was a steady demand.
Print culture gave rise to new laws governing copyright, patenting, and intellectual property. The literary diets of Latin-reading
professional groups were enriched by access to many more
books than had been available before. More abundantly stocked
bookshelves increased opportunities to compare ancient texts
with each other and with more recent work. Academic activities
were reoriented from preserving ancient wisdom to registering
new findings and venturing into uncharted fields. The expansive
character of print culture grew more pronounced over the course
of centuries. Multivolumed reference works required constant
updating; serial publication was introduced; bibliographies grew
thicker and more specialized. Concern about information overload was experienced by each generation in turn.
The drive to tap new markets encouraged popularization, translations from Latin into the vernaculars, and a general democratization of learning and letters. The church was
divided over whether to support or counter these trends, especially over whether or not to authorize vernacular translations
of Bibles and prayer books. After the Lutheran revolt, lay Bible
reading was encouraged in Protestant regions and discouraged
in regions that remained loyal to Rome. Whereas a single Index
of Prohibited Books provided guidance to all Catholic officials,
Protestant censorship was decentralized, taking diverse forms in
different realms.
In all regions, learning to read paved the way for learning
by reading. Autodidacts were urged to master various arts by
means of how-to texts (Cormack and Mazzio 2005). Authors,
artists, and craftsmen, in collaboration with printers and publishers, used self-portraits, title pages, and paratextual materials
to advertise their products and themselves. Individual initiative was rewarded; the drive for fame went into high gear. But
the preservative powers of print made it increasingly difficult
for successive generations to win notice from posterity. An ever
more strenuous effort was required to cope with the burden of
the past (Bate 1970).
There are synchronic as well as diachronic aspects to print
culture (see synchrony and diachrony). Unlike handcopied texts, printed copies get issued not consecutively but
simultaneously. The distribution of printed copies was relatively
slow before the development of modern transport systems.
Nevertheless, the age of the wooden handpress saw a marked
improvement in the coordination of diverse activities, such as
checking the path of a comet against diverse predictions, incorporating new findings and corrections in successive editions of
reference works, or mobilizing a protest movement in different
parts of a given realm. "We made the thirteen clocks strike as
one," commented an American revolutionary.
Simultaneity went together with standardization, as is illustrated in an anecdote about Napoleons minister of education,
who looked at his watch and announced that at that moment all
French schoolboys of a certain age were turning the same page
of Caesar's Gallic Wars. The output of the handpress fell short
of achieving the degree of standardization that marks modern
editions. Yet early modern readers were able to argue, in both
scholarly tomes and polemical pamphlets, about identical passages on identically numbered pages.
Simultaneity is of special significance in conjunction with
journalism. Especially after the introduction of iron presses
harnessed to steam and wire services that made use of the telegraph, the newspaper press would restructure the way readers
experienced the flow of time. Simultaneity is nicely illustrated by
the front page layout of a modern newspaper, which has been
described by Marshall McLuhan as "a mosaic of unrelated scraps
in a field unified only by a dateline" (1964, 249). Given the juxtapositions and discontinuities presented by the front page, it is
a mistake to associate print only with linear sequential modes
of thought. Although books and newspapers are now filed separately in libraries and archives, they are intertwined manifestations of print culture, beginning with early newsbooks and going
on to later serialized novels.
Even in the age of the handpress, newsprint altered the way
readers learned about affairs of state. It created a forum outside
parliaments and assembly halls and invited ordinary subjects
to participate in debates (by contributing letters to editors). It
provided ambitious journalists (from Jean-Paul Marat to Benito
Mussolini) with new pathways to political power. It served to
knit together the inhabitants of large cities for whom the daily
newspaper provided a kind of surrogate community. According
to Benedict Anderson (1983, 37–40), newsprint served a similar
function for millions of compatriots who lived within the boundaries of a given nation-state.
The reception of news via print rather than voice points to a
facet of print culture that has given rise to much debate. It centers on a contrast not with scribal but with oral culture. To an
Enlightenment philosopher such as the Marquis de Condorcet
(see Baker 1982, 268), who was impressed by advances in mathematics, the use of print held the promise of introducing rationality into political affairs. Whereas speech was ephemeral,
Condorcet argued, a printed account lent itself to rereading and
careful consideration. By means of rhetorical devices, orators
could persuade their audiences to perform ill-considered acts.


Legislators were less likely to be carried away by a treatise than
by a speaker, and were more likely to think calmly and carefully
before taking action.
The same distancing effect of print that Condorcet regarded
as beneficial was found objectionable by others. The romantic movement was in part a reaction against the mathematical
reasoning and abstract thinking that were associated with print
culture. Political romanticism took the form of lamenting the
way the age of chivalry had succumbed to that of economists
and calculators (Burke 1790). Readers were urged by romantic poets, such as William Wordsworth, to abandon dry-as-dust
books: "close up those barren leaves!" Objections to the purported distancing effects of print persisted among critics and
media analysts in the twentieth century. "Through the habit of
using print and paper," wrote Lewis Mumford (1934, 136–7),
"thought lost something of its flowing organic character and
became abstract, categorical, stereotyped, content with purely
verbal formulations and solutions." A similar position was taken
by McLuhan in his depiction of Typographical Man (1962,
367).
Both the proponents and opponents of the ostensibly impersonal character of print tended to overlook its coexistence with
a human presence and a human voice. One thinks immediately
of parents reading to children. But any text that appears in print
lends itself to being read aloud. During the early modern era,
printed broadsides and news reports were especially likely to
be transmitted by word of mouth to listeners gathered around a
few literate townsmen. Even now, public readings or lectures are
delivered to hearing publics by authors of printed best-sellers.
Print culture did not supersede oral culture but did have an
effect upon it. As was true of handwriting and handcopying, the
speech arts, far from languishing, flourished in a more regulated form. Instruction in elocution and in holding debates figured among the many how-to books that printers kept in stock.
There were exceptional preachers (such as the Marian exiles or
the Huguenot refugees) who, when deprived of their pulpits and
sent into exile, turned to printing as their only recourse. But most
preachers (like Martin Luther himself) made full use of both pulpit and press.
Other considerations cast doubt on the distancing effect
of print. As noted previously, print culture encompasses images
as well as texts. Whatever the persuasive effects of printed cartoons and caricatures, they cannot be described as distancing.
Similarly, the figure of a distant ruler became less distant when
printed (or photographed) portraits could be cut out of newspapers and enshrined in peasant huts. Even with regard to bare
texts, a skillful writer (whether distant or dead) can still move
unknown readers to tears or incite them to take action.
Before printing, powerful lungs were required for preachers
or orators who hoped to gain a popular following. But the later
political scene saw effective action taken by numerous figures
who (like John Wilkes) were notably deficient in the speech arts.
Condorcet, for one, was remarkably blind to the political passions
that could be aroused by pamphleteers and journalists. During
the revolutionary era, readers of Tom Paine or Marat were not
distanced from political contestation but were drawn into it.
The basic features of print culture remained more or less the
same after the industrialization of printing processes in the early
nineteenth century and after the adoption of other new technologies (such as lithography, photography, and the shift from hot
to cold type). Nineteenth-century observers believed that the
advent of newspapers signaled the end of the book. Late twentieth-century commentators believed that radio, television, and
other electronic media were going to supersede print. At present,
the movement of texts onto screens has persuaded some observers that supersession is finally at hand. In my view, continued
coexistence seems more likely, especially since the preservative
powers of print are still uncontested. An ever-growing shortage of space on library shelves and an unending concern about
information overload suggest that print culture is still exerting a
cumulative effect and will continue to do so for the foreseeable
future.
Elizabeth L. Eisenstein
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Anderson, Benedict. 1983. Imagined Communities. London: Verso.
Baker, Keith M. 1982. Condorcet. Chicago: University of Chicago Press.
Bate, W. Jackson. 1970. The Burden of the Past and the English Poet.
Cambridge: Harvard University Press.
Burke, Edmund. 1790. Reflections on the Revolution in France.
Available online at: http://www.constitution.org/eb/rev_fran.htm.
Chartier, Roger. 1986. Texts, printing, readings. In The New Cultural
History, ed. L. Hunt, 154–71. Berkeley: University of California Press.
———. 1987. The Culture of Print. Princeton, NJ: Princeton University
Press.
Cormack, Bradin, and Carla Mazzio. 2005. Book Use, Book Theory, 1500–
1700. Chicago: University of Chicago Press. See Part III on "The How-to
Book."
Eisenstein, Elizabeth L. 1997. From the printed word to the moving
image. Social Research 64 (Fall): 1049–66.
———. 2005. The Printing Revolution in Early Modern Europe. New
York: Cambridge University Press.
Martin, H. J., and Lucien Febvre. 1976. The Coming of the Book. Trans.
D. Gerard. London: NLB. See pages 716 for section on Chinese
precedents.
McKenzie, D. F. 2002. Speech – manuscript – print. In Making Meaning:
Printers of the Mind and Other Essays, ed. P. MacDonald and
M. Suarez, 237–59. Amherst: University of Massachusetts Press.
McLuhan, Marshall. 1962. The Gutenberg Galaxy. Toronto: University of
Toronto Press.
———. 1964. Understanding Media. New York: McGraw-Hill [Signet
Books].
Mumford, Lewis. 1934. Technics and Civilization. New York: Harcourt
Brace.
Ong, Walter. 1982. Orality and Literacy. London: Routledge.

PRIVATE LANGUAGE
Ludwig Wittgenstein (1889–1951) is considered one of the most
influential philosophers of the twentieth century. While his contributions to philosophy are wide-ranging, one of his most widely
discussed and influential contributions is taken to follow from the
sections of his Philosophical Investigations that explore the possibility of a logically private language ([1953] 1958, §§243–315).
These remarks have come to be called Wittgenstein's "private language argument." The label is not, however, without controversy;
we explore why in this entry.

Wittgenstein's remarks need to be read closely; they are
designed to work on the reader rather than proffer arguments entailing conclusions, which might be summarized. For
Wittgenstein, philosophy was an activity, and its goal ought to be
to free us of problems, formulated through our misunderstanding the logic of our language (1922, 3). The interested reader's first
port of call, therefore, should be his central text, Philosophical
Investigations (hereafter PI). This is the most complete of the
posthumously published works and the one that has had most
influence on subsequent philosophical thought. I discuss
the ways in which interpreters have read the remarks so often
referred to as the "private language argument" and the conclusions those interpreters have drawn. Wittgenstein's writings are
designed to wean one away from certain alluring, though maybe
unconscious, commitments, pictures, analogies, and prejudices.
One cannot, therefore, merely summarize his argument(s) and
conclusion(s), for there is (are) none, in the traditional sense.
In PI, Wittgenstein asks, "could we imagine a language in
which a person could write down or give vocal expression to
his inner experiences – his feelings, moods, and the rest – for
his private use?" ([1953] 1958, §243). His imaginary interlocutor
responds by remarking that we do so in our ordinary language.
Wittgenstein rejoins: "But that is not what I mean. The individual
words of this language are to refer to what can be known only
to the person speaking; to his immediate, private, sensations. So
another person cannot understand the language" (§243).
Wittgenstein is variously taken, in the 72 (or so) remarks that
follow this passage, to be doing one of two things. Some interpreters take him to be providing a refutation of the claim that
the language described in §243 is possible through a reductio
ad absurdum, in the process advancing positive philosophical claims such as an expressive theory for the meaning of first
person, present tense, psychological utterances, refuting certain
(alleged) Cartesian prejudices regarding the mind–body relationship, and availing us of a new answer to the problem of other
minds. On another reading, he is taken to bring readers to a position whereby they freely acknowledge that such a logically private language could have no significance for them, that in trying
to state what such a language could be, the philosopher fails to
make sense; furthermore, that their thought that such a language
could have significance, could be stated sensically, stemmed
from an unacknowledged, thought-constraining, attachment to
a particular nonobligatory picture of language, the mind, or
privacy. The debate, therefore, cashes out in the following way.
Those who take Wittgenstein to provide an argument in these
72 or so remarks take that argument to be something along the
lines of the following: For something – say, a set of utterances,
say, the signals of the builders in the opening remarks of PI, say,
a logically private language such as that which we are asked to
imagine in §243 – to rightly be called a language, it must fulfill
a certain set of criteria. The criteria give us the meaning of the
word "language." Something failing to fulfill these criteria, therefore, cannot meaningfully be called "language." There is a second-order debate about the nature of the criteria: Are they formal/
logical or are they social? There are, thus, those who hold that
the existence of a language entails the existence of a linguistic
community, "communitarians," and those who hold that it does
not do so. However, regardless of which position on this second-order debate such readers take, they all (if they take Wittgenstein
to have been successful in his alleged aim) hold that a logically
private language, as described in the final paragraph of PI §243,
is shown by Wittgenstein to fail to fulfill the relevant criteria for
being a language. It is this, such readers claim, that he demonstrates – argues – in the 72 or so remarks that follow it.
The alternative way of reading these remarks is as follows: Wittgenstein, in PI, asks us to imagine such a language
([1953] 1958, §243), that is, to (try to) entertain the thought of the
possibility of "a language which describes my inner experiences
and which only I can understand" (§256). The remarks that follow
work on the reader to the extent that he or she sees that however
one tries to give sense to such a (putative) language, we never
arrive at a position where our desire to see it as such is satisfied.
Read aright, the remarks serve to dispel the desire to attempt to
give sense to the locution "[a] language which describes my inner
experiences and which only I can understand" or "a logically private language." On this reading, it is not that there is something
akin to a misuse according to the rules of grammar of the
concepts "private" and "language," such that such locutions are
nonsense. It is not something that the philosopher wants to say
but cannot owing to the configuration of grammatical rules. It is,
rather, that when we try to imagine a private language, we realize that there is no determinable thing – a private language – to
imagine. The very notion of a private language dissipates as we
try to grasp it.
On the latter reading, therefore, Wittgenstein does not advance
a theory as to the nature of first person, present tense, psychological utterances but merely offers suggestions as to how it might be
possible to understand these as learned replacements for (say) a
cry of pain, rather than as, for example, a description or report of
an inner state, such as a sensation. He offers such suggestions, as
it were, as prophylactics. To accept such suggestions as possible
weans one off of the assumption that such utterances must be
descriptions of inner states (sensations, for example) and feeds
into weaning one away from the assumed need and the desire to
give sense to the locution "private language."
The debate between the two readings, therefore, hinges
on how one should understand Wittgenstein's philosophical
method (or metaphilosophy). Those who take him to offer a refutation of a logically private language and to be, in the course of
doing so, advancing positive claims as to (say) the nature of first
person, present tense, psychological utterances, do so, as their
opponents suggest, by underplaying his remarks on philosophical method (especially PI §109 and §§126 through 133), where he
lays out his therapeutic vision of philosophy. Here, the practice
of philosophy is undertaken as a therapeutic dialogue between
the Wittgensteinian philosopher and his or her interlocutor –
indeed, the therapist and interlocutor might be conflicting tendencies in oneself. The task of the philosopher-as-therapist is to
facilitate the interlocutors free realization that he or she is in the
grip of a particular picture of the way things must be that leads
him or her to be committed to certain nonobligatory philosophical positions.
What one takes to be done by Wittgenstein in these 72 (or
so) remarks has more than merely exegetical significance. If one
understands him to have refuted the possibility of a logically private language and, in so doing, to have advanced, for example,
the expressive theory for the meaning of first person, present
tense, psychological utterances, then one will be led to argue
that such utterances are not – cannot be – reports or descriptions
of inner states but are – must be – rather, expressions or avowals
of judgments or evaluations/appraisals: So, for example, many
cognitivist philosophers (e.g., Lyons 1980; Nash 1989) and
psychologists (e.g., Lazarus 1982) of emotion advance this view
(some, such as Kenny 1963, even drawing on Wittgenstein as
chief influence).
If one understands Wittgenstein as not having advanced such
views and, rather, takes his remarks to be designed to work on
one so as to facilitate the realization that there is nothing we
would wish to hold onto answering to the name "private language," we are not led to a philosophical commitment to any
view on psychological language. We might then, rather, engage
with those who claim or assume that first person, present tense,
psychological utterances must be reports or descriptions of inner
states or sensations – Jamesian accounts of emotion in general
and cognitive neuroscience in particular (e.g., Damasio 1994) –
and with those who claim or assume that they must be expressions or avowals – as both being driven by prejudice. Jamesian/
neo-Jamesian and cognitivist theories of emotion can be seen
to rest on prejudice about our use of psychological language (see
also emotion words).
Wittgenstein, read aright, can provide much help in our
attempts to dissolve such prejudice in the human sciences.
Phil Hutchinson
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Baker, Gordon. 2004. Wittgenstein's Method: Neglected Aspects.
Oxford: Blackwell. See Part I, Section B, Chapters 5, 6, and 7. The book
contains a number of excellent chapters on the (so-called) private language argument and its interpretation.
Damasio, A. R. 1994. Descartes' Error: Emotion, Reason, and the Human
Brain. New York: Grosset/Putnam.
Kenny, A. 1963. Action, Emotion and Will. London: Routledge.
Lazarus, R. S. 1982. Thoughts on the relations between emotion and cognition. American Psychologist 37: 1019–24.
Lyons, W. 1980. Emotion. Cambridge: Cambridge University Press.
Mulhall, Stephen. 2006. Wittgenstein's Private Language: Grammar,
Nonsense and Imagination in Philosophical Investigations, §§243–315.
Oxford: Oxford University Press. Nuanced reading.
Nash, R. A. 1989. Cognitive theories of emotion. Nos 23: 482504.
Wittgenstein, Ludwig. 1922. Tractatus Logico-Philosophicus. London: Routledge.
———. [1953] 1958. Philosophical Investigations. Oxford: Blackwell.
PROJECTIBILITY OF PREDICATES
The distinction between projectible and nonprojectible predicates arises in analyses of inductive inference. David Hume
([1748] 2000) showed that it is not possible to justify the belief
that past empirical regularities will continue into the future.
Nelson Goodman's (1953) "new riddle of induction" raised a further problem. Goodman showed that there are always an unlimited number of hypotheses that encompass all of the evidence,
yet conflict in their predictions. For example, if we introduce
the predicate "grue" (grue = examined before some future time t and green, or not so examined and blue), the hypotheses [H1] "All emeralds are green" and [H2] "All emeralds are grue" both conform to the observed data, though making divergent predictions. So, Goodman asks, what justifies adopting H1 rather than H2 or countless other conflicting hypotheses? Why, that is, on the basis of the very same evidence do we project some predicates (e.g., "green") and not others (e.g., "grue")? A predicate,
then, is considered projectible if it is suitable for use in inductive
inference.
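Goodman's definition can be made concrete in a short sketch (illustrative only: the cutoff year, the observations, and all names are invented for this example, not drawn from Goodman):

```python
# Toy sketch of "grue" (all specifics invented for illustration).
# Grue: examined before some future time t and green, or not so
# examined and blue.

T = 2030  # an arbitrary stand-in for the future cutoff time t

def green(color):
    return color == "green"

def grue(color, examined_year):
    # examined_year is None for emeralds never examined before t
    if examined_year is not None and examined_year < T:
        return color == "green"
    return color == "blue"

# Every emerald examined so far is green ...
observed = [("green", 2001), ("green", 2015), ("green", 2024)]
assert all(green(color) for color, _ in observed)
# ... and every one of them is equally grue: the evidence cannot decide.
assert all(grue(color, year) for color, year in observed)

# Yet H1 ("all emeralds are green") and H2 ("all emeralds are grue")
# diverge about an emerald first examined after t:
assert green("green") and not grue("green", None)   # H1 projects green
assert grue("blue", None) and not green("blue")     # H2 projects blue
```

Both predicates exhaust the same observed data, so nothing in the evidence by itself singles out H1; that is the riddle.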
The projectible/nonprojectible distinction, however, has significance for other important issues. In particular, it is appealed
to in distinguishing laws of nature (e.g., H1) from generalizations that, while true, seem true only by accident (e.g., [H3] "All the objects on table D are green"). The distinction is also central in
evaluating the truth or falsity of counterfactual conditionals and
in accounts of the basis of similarity judgments. This leads some
to identify projectible predicates with so-called natural kind predicates. "Green" and "emerald" pick out natural kinds; "grue" and "on table D" do not.
Providing criteria for distinguishing projectible from nonprojectible predicates (or natural kinds from other kinds) remains
a challenge (Scheffler 1981; Stalker 1994; Schwartz 2005). The difference between projectible and nonprojectible predicates does
not hinge on explicit temporal reference. If "grue" and "bleen" (bleen = blue and examined before time t, or not so examined and green) are taken as primitives and "green" and "blue" defined,
the latter predicates, not the former, will mention time. In fact,
temporal considerations need not play any role in setting the new
riddle. A curve drawn through data points represents a hypothesis
that projects new values from those observed. But an unbounded
number of curves that make conflicting predictions about unobserved values can be drawn through these same points.
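The curve-fitting version of the point admits an equally small sketch (the two curves here are invented for illustration):

```python
# Two hypotheses (curves) that agree on every observed data point
# but project conflicting values for the next observation.
points = [(0, 0), (1, 1), (2, 2)]

def f1(x):
    return x                           # the "straight line" hypothesis

def f2(x):
    return x + x * (x - 1) * (x - 2)   # fits the same points, then diverges

assert all(f1(x) == y and f2(x) == y for x, y in points)
assert f1(3) == 3 and f2(3) == 9      # conflicting projections at x = 3
```

Gathering more data never closes the gap: for any finite data set, a correction term that vanishes on exactly those points can always be added.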
Continuing efforts to explicate the concept "projectible predicate" in purely syntactic or semantic terms have not been
successful. Attempts to draw a projectible/nonprojectible distinction along epistemological and metaphysical lines have
also been unsatisfactory, tending either to be parochial or to
presuppose dichotomies tantamount to the distinction itself.
Solutions appealing to simplicity, similarity, and innate quality
spaces have run into like difficulties. Goodman proposes drawing the distinction along pragmatic lines. Projectible predicates
are those with a history of successful predictive use. As a result,
they become entrenched in our vocabulary and inductive
practices. Pragmatic entrenchment underlies the confidence
we have in projecting them and is a primary reason for their felt
naturalness.
Robert Schwartz
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Goodman, N. 1953. Fact, Fiction and Forecast. Cambridge: Harvard
University Press.
Hume, D. [1748] 2000. An Enquiry Concerning Human Understanding.
Oxford: Oxford University Press.
Scheffler, I. 1981. Anatomy of Inquiry. Indianapolis: Hackett.
Schwartz, R. 2005. A note on Goodman's problem. Journal of Philosophy 102: 375–9.
Stalker, D., ed. 1994. Grue! Chicago: Open Court.

PROJECTION (BLENDING THEORY)

Metaphor, analogy, categorization, and many other cognitive operations are customarily understood within cognitive
linguistics in terms of projected conceptual structures or in
terms that are compatible with conceptual projection. In the
conceptual blending framework, in particular, projection
describes the way that conceptual structures can be combined
or superimposed by copying conceptual content from one mental space to another. This term is unrelated to the use of projection to describe anticipatory utterances and gestures in
conversation (e.g., Streeck 1995) or the projection principle
of government and binding theory.
The notion of projection as a kind of mapping that serves as
a fundamental mechanism of thought has its roots in ideas from
conceptual metaphor theory and frame semantics, as
well as mental space theory. George Lakoff and Mark Johnson
(1980) explain individual instances of metaphorical language as
reflections of systematic relationships between two conceptual
domains, in which language and structure from a source domain
is projected, or mapped onto, a situation in a target domain (see
source and target). A sentence like "We're drifting apart"
depicts elements from the domain of emotional intimacy in
terms of structures mapped from the domain of physical proximity. Charles J. Fillmore (1982) observes that words like "lend"
make sense only in light of certain schematic representations
of situation types, or frames. A word is said to evoke an associated
frame or frames, prompting a language user to project the evoked
frame's structure onto an unframed assembly of elements. This
projection allows the language user to understand those elements
in terms of the relationships and roles belonging to the frame.
Projection in the conceptual blending framework differs from its
counterparts in conceptual metaphor theory and frame semantics
in two ways. First, projection in conceptual blending is a process
that takes place between mental spaces, not domains or schemas.
Second, where the former theories describe projections involving
exactly two conceptual structures, the conceptual blending framework calls for four at a minimum. Material is projected from at least
two input spaces into at least two middle spaces (Fauconnier and
Turner 1994): a generic space that reflects the roles, frames, and
schemas that the inputs have in common, and a blended space
where projected elements are integrated and develop emergent
structure through processes of composition, completion, and
elaboration (Fauconnier and Turner 1998, 2002).
Backward projection refers to circumstances whereby structure is projected from the blended space back to its inputs. This
can be a desirable outcome in which inferences developed in
the blend enhance understanding of the input material. Gilles
Fauconnier and Mark Turner (2002) illustrate this kind of desirable backward projection with the Buddhist Monk riddle: A
monk walks up a mountain one day and walks down the same
path over the same time period on another day. A solution to
the puzzle of whether there is a place on the path that the monk
occupies at the same time of day on both journeys is to imagine that the monk takes both walks on the same day: in this blended scenario, at some moment he will meet himself. Only projecting the location of the meeting place from the blended space
back to the input spaces yields the conclusion that the original
single monk, walking on two different days, will indeed cross some spot on the path at the same time on each day.
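The blended scenario can even be simulated numerically. In this sketch the two walking profiles are invented (any continuous profiles from bottom to top and top to bottom would do), so the existence of a meeting point, not its particular value, is what matters:

```python
# Run the ascent and the descent "on the same day" (the blend) and
# scan for the moment the two monks occupy the same spot.

def up(t):      # position on the ascent; t runs from 0 (dawn) to 1 (dusk)
    return t ** 2

def down(t):    # position on the descent over the same time span
    return 1.0 - t

t = 0.0
while up(t) < down(t):   # the ascending monk starts below the descending one
    t += 1e-4

assert abs(up(t) - down(t)) < 1e-3   # they meet at a single time and place
```

Projecting that meeting place back to the input spaces then yields the riddle's answer for the single monk on his two separate days.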
An important principle of projection in conceptual blending
theory is that it is always selective. Not all elements and relations in
the input spaces are projected into the blended space. Sometimes
only one of a pair of counterpart elements is projected, sometimes both, and sometimes none. The organizing structure of one
or both inputs can also be projected in part, or not at all. In some
integration networks, called single-scope networks, only one input
projects its organizing frame to the blended space, while the other
input contributes other elements but little or no organizing structure. (Conventional source–target metaphors are considered
prototype examples of this kind of integration network.) The
projection involved in other kinds of networks is also selective: "If I were you, I'd take the job," for example, prompts the listener to take
some aspects of the interlocutor into consideration and some
aspects of himself or herself, but by no means all of either.
Unconstrained projection, by contrast, would undermine the
usefulness of these inferences, and projecting too much structure in either direction constitutes an error. Excessive projection
from the inputs in a blend leads to mistakes, such as assuming
that because my computer interface includes something called
a "menu," I will need to find some waiter who can accept my
order. Inappropriate projection of emergent structure from the
blended space back to the inputs leads to other mistakes, such as
confusing actors with the characters they play on TV. The possible competing pressures governing these constraints on projection are examined in detail in Fauconnier and Turner (1998
and 2002, 309–52).
Vera Tobin
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Fauconnier, Gilles, and Mark Turner. 1994. Conceptual projection and
middle spaces. Department of Cognitive Science Technical Report
9401, University of California, San Diego.
———. 1998. Principles of conceptual integration. In Discourse and Cognition, ed. Jean-Pierre Koenig, 269–83. Stanford, CA: CSLI
Publications.
———. 2002. The Way We Think: Conceptual Blending and the Mind's
Hidden Complexities. New York: Basic Books.
Fillmore, Charles J. 1982. Frame semantics. In Linguistics in the Morning
Calm, ed. The Linguistic Society of Korea, 111–37. Seoul: Hanshin
Publishing.
Grady, Joseph, Todd Oakley, and Seana Coulson. 1999. Blending
and metaphor. In Metaphor in Cognitive Linguistics, ed. Raymond
Gibbs and Gerard Steen, 100–24. Amsterdam and Philadelphia: John
Benjamins. A useful comparison of related concepts in conceptual
metaphor and conceptual blending theory.
Lakoff, George, and Mark Johnson. 1980. Metaphors We Live By.
Chicago: University of Chicago Press.
Streeck, Jürgen. 1995. On projection. In Social Intelligence and
Interaction: Expressions and Implications of the Social Bias in Human
Intelligence, ed. Esther N. Goody, 87–110. Cambridge: Cambridge
University Press.

PROJECTION PRINCIPLE

The projection principle (Chomsky 1981) is a cornerstone of principles and parameters theory. It states that the properties of lexical items (that is, both their features and selectional requirements) must be represented at every syntactic level of representation. Its key effect is to allow structure at
every level of representation to be directly determined by lexical
properties, thereby largely eliminating the need for language-particular rules. For this reason, the projection principle has
played a key role in the transition from early rule-based transformational grammar to principles and parameters theory.
Let us briefly review two major consequences of the projection principle. The first concerns the elimination of language-particular base rules (the rules that generate deep structures; see
underlying structure and surface structure) in transformational grammar. Such rules would state, for example, that
in English a verb phrase may consist of a verb followed by either
a noun phrase or a prepositional phrase or a noun phrase and a
prepositional phrase, and so on. In addition, the lexical entries of
verbs specify the syntactic environment in which they can occur.
Thus, the entry for the verb put states that it must be followed by a
noun phrase and a prepositional phrase. This arrangement gives
rise to redundancy: In the case at hand, the fact that there are
English verbs that can be followed by a noun phrase and a prepositional phrase is expressed twice, namely, in the base rules and
in the lexical entry of put (and other verbs). Since lexical properties are idiosyncratic, elimination of this redundancy requires a
simplification of the base component, and this is precisely what
the projection principle makes possible. As it requires syntactic
representations to be projections of lexical properties, it allows
the base component to be reduced to a small universal skeleton,
known as x-bar theory, and relegates language-specific properties of deep structures to the lexicon and to a set of word-order
parameters.
The projection principle also implies that syntactic representations that are the result of movement must contain traces
that function as place holders for categories that have undergone
this operation. It is easy to see why this should be so. As stated
earlier, the lexical entry for the verb put must state that this verb is
followed by a noun phrase and a prepositional phrase. However,
in the question in (1a), the selectional requirements of this verb
appear not to be met, since the noun phrase that usually follows
it has undergone movement. Satisfaction of the projection principle, therefore, requires that movement leave behind a trace, shown as tNP in (1b), so that the selectional properties of put are
also expressed in the structure that results from movement.
(1) a. I wonder [[NP what] Jack has [VP [V put] [PP in the oven]]]
    b. I wonder [[NP what] Jack has [VP [V put] tNP [PP in the oven]]]
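The role of the trace can be sketched with a toy checker (illustrative only; the LEXICON dictionary and the satisfies helper are inventions for this example, not part of the theory's formalism):

```python
# Toy check of put's selectional requirement (an NP followed by a PP)
# against the categories that follow the verb in a structure.
LEXICON = {"put": ["NP", "PP"]}

def satisfies(verb, following):
    need = LEXICON[verb]
    return following[:len(need)] == need

# (1a) without a trace: movement of "what" leaves only the PP after put,
# so the verb's requirement appears unmet.
assert not satisfies("put", ["PP"])

# (1b) with the trace t_NP in the moved noun phrase's base position,
# the requirement is again visibly satisfied at this level.
assert satisfies("put", ["NP", "PP"])
```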

Hans van de Koot


WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Chomsky, Noam. 1981. Lectures on Government and Binding. Dordrecht,
the Netherlands: Foris.
———. 1986. Knowledge of Language: Its Nature, Origin, and Use. New
York: Praeger.

PROPOSITION

As used in the philosophy of language and logic, a proposition is what is believed and/or asserted. The prima facie case for the
existence of propositions is that they make sense of the fact that
from

Terry believes that cats are cute.

we can infer that

There is something that Terry believes.

Propositions, while often thought to be what is expressed by a sentence, are said to be distinct from sentences. One sentence (e.g., "Fred put the money in the bank") can express more than one proposition (in this case, depending on the meaning of "bank"); and one proposition can be expressed by two different sentences (e.g., "snow is white" and "la neige est blanche").
Proponents of propositions typically treat them as the primary bearers of truth and falsity, but they disagree over their
nature and structure.
Michael P. Lynch

PROPOSITIONAL ATTITUDES
Propositional attitudes are mental relations (believing, desiring, hoping, and so on) between an individual and a proposition.
The sentence "Amie believes that 1 + 1 = 2" is a propositional
attitude report, where Amie stands in the belief relation to the
proposition that 1 + 1 = 2. The proposition specifies the content of
the attitude (see intentionality). Propositional attitudes play
a central role in ordinary psychological explanations. We explain
why Elia presses the doorbell by saying that he hopes someone
will answer the door, and he believes that this will happen if he
presses the bell.

The Nature of Propositional Attitudes


One controversy about propositional attitudes concerns their
relationship to the mental representations investigated in
psychology and cognitive science. The language of thought
theory says that propositional attitudes correspond to language-like mental representations; their combinatorial syntax allows them to form propositional attitudes with new contents.
A second controversy about propositional attitudes concerns their purported social or environmental dependency. For
internalists, whether one has a certain propositional attitude
depends only on ones internal physical brain state (Searle 1983).
However, externalists (McGinn 1977; Burge 1979) argue that
mental relations to propositions are mediated through linguistic
conventions and the external environment. Two physically identical individuals can possess different propositional attitudes if
they are embedded in different environments (see meaning
externalism and internalism).

The Semantics of Propositional Attitude Reports


The analysis of propositional attitude reports is a particularly
thorny problem for semantics. There is no consensus as to the
right approach, and the resolution of the problem depends on
many other issues in semantics, such as the nature of a proposition. Under possible worlds semantics, a proposition is a
set of possible worlds. The proposition that pigs fly is the set of
worlds where pigs fly. But a serious problem is that necessarily

equivalent propositions would then have to be identical. The


necessarily true proposition that 1 + 1 = 2 is identical to the proposition that 60,375 is divisible by 3 (both identical to the set of all
possible worlds). But intuitively, one can believe the first proposition without believing the second.
Many take this to show that a proposition must be more fine-grained than simply a set of worlds. On the Russellian approach,
a proposition expressed by a that-clause is constructed from the
normal referents of the expressions contained in the clause. The
number 2 is a constituent of the proposition that 1 + 1 = 2, but it is
not part of the proposition that 60,375 is divisible by 3. Since the
propositions are distinct, we can believe one without believing
the other.
Unfortunately, the Russellian approach has difficulties
accounting for an observation by Gottlob Frege ([1892] 1948).
Frege famously pointed out that identity statements can differ
in cognitive value. Intuitively, "Amie believes that Mark Twain is Mark Twain" can be true while "Amie believes that Mark Twain is Samuel Clemens" is false. But on the Russellian approach, this
is impossible. Since Mark Twain is Samuel Clemens, the two
that-clauses refer to the same proposition. In response, Nathan
Salmon (1986) insists that the two belief reports are indeed
equivalent. Our contrary intuition reflects only a difference in
pragmatics.
A Fregean alternative is to postulate that the proper names
"Mark Twain" and "Samuel Clemens" have different senses (see sense and reference) for Amie; that is, she represents the same
person in two different ways. In addition, these names do not have
their customary referents within the context of "believe." Instead
of referring to the same famous author, the two names actually
refer to their associated senses within propositional attitude
contexts. Consequently, the that-clauses in the two belief reports
refer to distinct propositions, which is why the two reports are
not equivalent. However, one objection to the Fregean approach
is that proper names do not change their referents in propositional attitude contexts (Davidson 1968). Another objection is
that people can believe the same proposition even if they associate different senses with the same name. "These children all believe that Santa exists" can be true even if the children represent Santa in different ways.
Rudolf Carnap (1958), W. V. O. Quine (1956), and Donald
Davidson (1968) analyzed propositional attitudes as relations
to sentences, rather than language-independent propositions.
"Michelle believes that 1 + 1 = 2" is true if and only if Michelle stands in a certain relationship to the sentence "1 + 1 = 2."
This avoids the need to postulate propositions, but there are
new problems to resolve. For example, the sentence "Delman believes something profound" can be true, even if Delman does
not speak English, or there is no English sentence that captures
the content of his profound belief.
Other authors, such as Mark Richard (1990) and Richard
Larson and P. Ludlow (1993), have sought to develop hybrid
approaches, where a proposition combines language-independent entities with linguistic items or mental representations. A more radical approach is to deny that propositional
attitudes are binary relations. Mark Crimmins (1992) argues that
although a propositional attitude verb like "believe" is syntactically
a two-place predicate, it actually expresses a three-place relation

among a subject, a proposition, and a way of believing. A way
of believing is supposed to be similar to a Fregean sense, but it
is contextually specified and does not correspond to any explicit
referring item in the report.
Joe Lau
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Burge, Tyler. 1979. Individualism and the mental. In Midwest Studies in
Philosophy. Vol. 4. Ed. P. French, T. Uehling, and H. Wettstein, 73–121.
Minneapolis: University of Minnesota Press.
Carnap, Rudolf. 1958. Meaning and Necessity: A Study in Semantics and
Modal Logic. 2d ed. Chicago: University of Chicago Press.
Crimmins, Mark. 1992. Talk about Beliefs. Cambridge, MA: MIT Press.
Davidson, Donald. 1968. On saying that. Synthese 19: 130–46.
Frege, Gottlob. [1892] 1948. Sense and reference. Philosophical Review
57: 209–30.
Larson, Richard, and P. Ludlow. 1993. Interpreted logical forms.
Synthese 95: 305–55.
McGinn, Colin. 1977. Charity, interpretation, and belief. Journal of
Philosophy 74: 521–35.
McKay, Thomas, and Michael Nelson. 2008. Propositional attitude
reports. In The Stanford Encyclopedia of Philosophy (Fall), ed. Edward
N. Zalta. Available online at: http://plato.stanford.edu/archives/
fall2008/entries/prop-attitude-reports/. This is a comprehensive survey on the semantics of propositional attitude reports.
Quine, Willard Van Orman. 1956. Quantifiers and propositional attitudes. Journal of Philosophy 53: 177–87.
Richard, Mark. 1990. Propositional Attitudes: An Essay on Thoughts and
How We Ascribe Them. Cambridge: Cambridge University Press.
Salmon, Nathan. 1986. Frege's Puzzle. Cambridge, MA: MIT Press.
Searle, John. 1983. Intentionality: An Essay in the Philosophy of Mind.
Cambridge: Cambridge University Press.

PROTOTYPES
Prototypes refers to one of the ways in which psychology has
attempted to account for the nature of concepts and conceptual categories (see categorization). In the prototypes view,
conceptual categories form around and/or are represented in the
mind by salient, information-rich ideas and images that become
prototypes for the category. Other members of the conceptual
category are judged in relation to these prototypes, thus forming
gradients of category membership.
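The contrast with classical all-or-none membership can be sketched as follows (a toy illustration; the feature sets and the overlap measure are invented for this example and are not drawn from Rosch's work):

```python
# Graded membership as similarity to a prototype: the more of the
# prototype's features an item shares, the better a member it is judged.

def similarity(item, prototype):
    """Share of the prototype's features the item has (0.0 to 1.0)."""
    return len(item & prototype) / len(prototype)

bird_prototype = {"flies", "feathers", "sings", "small", "lays eggs"}
robin = {"flies", "feathers", "sings", "small", "lays eggs"}
ostrich = {"feathers", "lays eggs"}

# Classically, robin and ostrich are equally birds; on the prototype
# view, the robin is a better example and the ostrich a poorer one.
assert similarity(robin, bird_prototype) > similarity(ostrich, bird_prototype)
```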
The importance of prototype theory needs to be seen in its
historical context. Prior to this work, concepts and conceptual
categories were assumed from philosophy to be arbitrary logical
sets with defining features and clear-cut boundaries. All members of the conceptual category were considered equivalent with
respect to membership. This is now called the classical view.
Psychological research on concept learning used artificial sets
of stimuli, structured into microworlds in which these assumed
characteristics of categories were already built in. Mainstream
linguistics was constructed on similar assumptions; phonology, semantics, and syntax all sought to decompose the
subject matter of their domains (speech sounds, word meaning, and grammar) into sets of abstract binary defining features,
a procedure called componential analysis. In contrast, in the prototype view, there need be no defining attributes that all conceptual category members have in common; category boundaries need not be definite (sometimes called fuzzy boundaries), and
category membership is graded with respect to how good a
member is judged to be.
The prototype view was first proposed by Eleanor Rosch
(1973) as a general framework encompassing her cross-cultural
work on color and form categories, and then elaborated into a
programmatic body of empirical research challenging the classical view (Rosch 1978, 1999). The theory was that categories form
around perceptually, imaginally, or conceptually salient stimuli,
then, by stimulus generalization, spread to other similar stimuli;
such prototype stimuli serve as the conceptual and imaginal reference points by which the category as a whole is represented
and understood. Empirically, all categories do show gradients
of membership; that is, subjects easily, rapidly, and meaningfully rate how well a particular item fits their idea or image of
the category to which the item belongs. Note that these are not
probability judgments but judgments of degree of membership.
Gradient of membership judgments apply to diverse kinds of categories: perceptual categories such as "red," semantic categories such as "furniture," biological categories such as "woman," social categories such as occupations, formal categories that have classical definitions such as "odd number," and ad hoc, goal-derived categories such as "things to take out of the house in a fire." In contrast, subjects cannot list criterial attributes for most categories
(Rosch and Mervis 1975).
Gradients of membership must be considered psychologically important because such measures have been shown to
affect virtually every major method of study and measurement
used in psychological research (Rosch 1975, 1978, 1999):
1. Association: When asked to list members of the category,
subjects produce better examples earlier and more frequently
than poorer examples. Association is taken as the key to mental structure in many systems; for example, in approaches as
diverse as British empiricist philosophy, connectionism, and
psychoanalysis.
2. Speed of Processing: The better an example is of its category, the more rapidly subjects can judge whether or not that
item belongs to the category. Reaction time has been considered a royal road to the study of mental processes in cognitive
psychology.
3. Learning: Good examples of categories are learned by subjects in experiments and acquired naturalistically by children
earlier than are poor examples, and categories can be learned
more easily when better examples are presented first, findings
with implications for education (see Markman 1989; Mervis
1980).
4. Expectation: When subjects are presented a category name
in advance of making rapid judgment about the category, performance is helped (i.e., reaction time is faster) for good and
hindered for poor members of the category. Called priming or
set in psychology, this finding has been used to argue that the
mental representation of the category is in some ways more
like the better than the poorer exemplars.
5. Inference: Subjects infer from more to less representative members of categories more readily than the reverse, and
the representativeness of items influences judgments in formal
logic tasks, such as syllogisms (see also Smith and Medin 1981;
cf. verbal reasoning).

6. Probability Judgments: Representativeness strongly influences probability judgments (Kahneman, Slovic, and Tversky
1982), which is important because probability is thought by
many philosophers to be the basis of inductive inference and,
thus, of the way in which we learn about the world.
7. Natural Language Indicators of Graded Structure: Natural
languages themselves contain various devices that acknowledge
and point to graded structure, such as hedge words like "technically" and "really" (see also Lakoff 1987; Taylor 2003).
8. Judgment of Similarity: Less good examples of categories
are judged more similar to good examples than vice versa. This
violates the way similarity is treated in logic, where similarity
relations are symmetrical and reversible (Tversky 1977).
What determines the items that will be prototypical of categories in the first place? Some are based on statistical frequencies,
such as the means or modes (or family resemblance structures) for various attributes; others appear to be ideals made
salient by such factors as physiology (good colors, good forms),
social structure (president, teacher), culture (saints), goals
(ideal foods to eat on a diet), formal structure (multiples of 10
in the decimal system), causal theories (sequences that look
random), and individual experiences (the first learned or most
recently encountered items or items made particularly salient
because they are emotionally charged, vivid, concrete, meaningful, or interesting). Note that particular exemplars can be prototypes if they serve as reference points for the category; thus,
prototype and exemplar theories are not necessarily contradictory. Note also that it is a misapprehension to take prototypes to
mean only one kind of prototype and to critique prototype theory
on that basis.
The prototype view has spread beyond psychology to many
fields, including linguistics and narratology. Gradients of
exemplariness are ubiquitous in linguistic phenomena, even
in phonology where actual speech is less clear-cut than would
appear in an abstract componential analysis. In semantic and
syntactic analyses (particularly in cognitive grammar and
the understanding of metaphor), prototype effects, as well as
providing specific case studies, are often used as evidence that
formal analysis is insufficient of itself and that world knowledge
must be part of ones theory (Lakoff 1987; Langacker 1990; Taylor
2003).
There are societal implications of prototype theory. For example, social stereotypes are a type of prototype; it is called a
stereotype when it applies to a group of people and has social
consequences. Another example: Anglo-American case law is
based on prototypes; precedent cases provide the reference
points in arguing present cases (see legal interpretation).
Further examples: Political arguments are often about conflicting
prototypes (e.g., different images of welfare mothers) even when
goals (care for children) may be quite similar, while scholarly
debates are frequently about attempts to draw clear-cut boundaries where there are none, even when the participants actually agree
on central prototypes (e.g., agreement on clear cases of what we
mean by language or religion but disagreement about borderline
phenomena and about criteria). Many such issues could be clarified by understanding the principles of prototyping.
Although prototype effects are now acknowledged as empirically established, there have been many criticisms of prototypes

as an account of concepts and categorization. The main objections fall into two camps: In the first, prototypes and graded
structure violate the classical requirement that the real meaning
of a concept (that to which it refers) must be the identifiable necessary attributes of a classical definition. One argument for this
view is that prototype and graded structure effects can be found
for conceptual categories that have a formal classical definition, such as "odd number" (Armstrong, Gleitman, and Gleitman 1983); another is that prototypes do not form componential combinations as do the elements of classical definitions (e.g., a good
example of pet fish is neither a prototypical pet nor a prototypical
fish; Osherson and Smith 1981). Both findings are taken to indicate that prototypes are something other than and irrelevant to a
concepts meaning. One solution is a dual model in which prototypes are assigned the function of rapid recognition of conceptual
referents, whereas the true meaning is provided by a classically
defined core (Osherson and Smith 1981; Smith, Shoben, and Rips
1974). In the second camp, prototypes change with context: for
example, prototypical animals in the context of a zoo differ from
those in the context of a farm (Barsalou 1987). Such findings are
taken to indicate that it is theories that determine concept meaning (Medin 1989). (For a review of more philosophical objections
to prototypes and all other accounts of concepts, see Fodor
1998 and Laurence and Margolis 1999.)
In conclusion, prototype and graded structure effects are well
established; the debate concerns what they mean for our understanding of concepts, categorization, thinking, decision making,
the meaning of words and word combinations, and innumerable
other aspects of human functioning. The prototype view is not
necessarily incompatible with other approaches if the different
views are seen at a level deep enough that their complementarity can be appreciated. Prototype effects do not deny that under
some circumstances we explicitly think in terms of necessary
and sufficient conditions, nor does the essentialist intuition that there are nonobvious realities behind outer appearances (Gelman and Wellman 1991) exclusively require the form
of classical definitions for its implementation. Furthermore, prototypes, like any other mental or cultural process, only function
within the ever-shifting contexts of partially organized forms of
encyclopedic knowledge and belief, the sort of understanding
toward which the "theories" view points. What prototype research
uniquely points out is that there is a level of organization of the
mind in which concepts appear to be represented by information-rich, imagistic, sensory-based, often emotion-linked wholes that
are used in thinking and communicating without reference to
definitions, category boundaries, or even truth conditions.
Eleanor Rosch
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Armstrong, Sharon, Lila Gleitman, and Henry Gleitman. 1983. What
some concepts might not be. Cognition 13: 263–308.
Barsalou, Lawrence. 1987. The instability of graded structure: Implications for the nature of concepts. In Concepts and
Conceptual Development: Ecological and Intellectual Factors in
Categorization, ed. Ulric Neisser, 101–40. Cambridge: Cambridge
University Press.
Fodor, Jerry. 1998. Concepts: Where Cognitive Science Went Wrong. New
York: Oxford University Press.

Gelman, Susan, and Henry Wellman. 1991. Insides and essences: Early
understanding of the non-obvious. Cognition 38: 213–44.
Kahneman, Daniel, Paul Slovic, and Amos Tversky, eds. 1982. Judgment
under Uncertainty: Heuristics and Biases. New York: Cambridge
University Press.
Lakoff, George. 1987. Women, Fire, and Dangerous Things: What
Categories Reveal about the Mind. Chicago: University of Chicago
Press.
Langacker, Ronald. 1990. Concept, Image, and Symbol: The Cognitive
Basis of Grammar. Berlin: Mouton de Gruyter.
Laurence, Stephen, and Eric Margolis. 1999. Concepts and cognitive
science. In Concepts: Core Readings, ed. Eric Margolis and Stephen
Laurence, 3–81. Cambridge, MA: MIT Press.
Markman, Ellen. 1989. Categorization and Naming in Children.
Cambridge, MA: MIT Press.
Medin, Douglas. 1989. Concepts and conceptual structure. American
Psychologist 44: 1469–81.
Mervis, Carolyn. 1980. Category structure and the development of
categorization. In Theoretical Issues in Reading Comprehension, ed.
William Brewer, Bertram Bruce, and Rand Spiro, 279–308. Hillsdale,
NJ: Erlbaum.
Osherson, Daniel, and Edward Smith. 1981. On the adequacy of prototype theory as a theory of concepts. Cognition 9: 35–58.
Rosch, Eleanor. 1973. Natural categories. Cognitive Psychology 4: 328–50.
———. 1975. Cognitive representations of semantic categories. Journal of Experimental Psychology: General 104: 192–233.
———. 1978. Principles of categorization. In Cognition and Categorization, ed. Eleanor Rosch and Barbara Lloyd, 27–48. Hillsdale, NJ: Lawrence Erlbaum.
———. 1999. Reclaiming concepts. Journal of Consciousness Studies 6.11/12: 61–77.
Rosch, Eleanor, and Carolyn Mervis. 1975. Family resemblances: Studies
in the internal structure of categories. Cognitive Psychology
7: 573–605.
Rosch, Eleanor, Carolyn Mervis, Wayne Gray, David Johnson, and
Penelope Boyes-Braem. 1976. Basic objects in natural categories.
Cognitive Psychology 8: 382–439.
Smith, Edward, and Douglas Medin. 1981. Categories and Concepts.
Cambridge, MA: Harvard University Press.
Smith, Edward, Edward Shoben, and Lance Rips. 1974. Structure and
process in semantic memory: A featural model for semantic decisions. Psychological Review 81: 214–41.
Taylor, John. 2003. Linguistic Categorization. Oxford: Oxford University
Press.
Tversky, Amos. 1977. Features of similarity. Psychological Review
84: 327–52.

PROVERBS
Proverbs in the Humanities and Cognitive Science
The humanities treat proverbs as repositories of wisdom about
everyday life. This premise motivates religious, literary, practical,
and cultural-folklore approaches that capitalize on the fact that
proverbs can be pithy, express a moral or precept, sound authoritative, perform various pragmatic functions (e.g., exhortation),
and serve as indirect speech-acts, in which what is intended
encompasses more than what is actually said.
The cognitive science perspective examines how proverbs
both illuminate and are illuminated by knowledge about the
mind. Ultimately, the humanities and cognitive science views
are complementary. The former is represented in Mieder (1994)
and the journal Proverbium, the latter in Honeck (1997), and
both views in Mieder (2003).

The Cognitive Science of Proverbs


Proverbs are best considered not in isolation but as members of
a larger family or category that includes the proverb, pictorial
renderings of the proverb's literal meaning, verbal and pictorial interpretations of the proverb's figurative meaning, and verbal and pictorial instances of the proverb's figurative meaning.
Theoretically, the proverb's figurative meaning serves as the conceptual glue that connects the family members.
For example, the family for the proverb "Great weights hang
on small wires" might include a picture of barbells hanging on
a thin wire, an interpretation such as "The outcome of important
events often depends on seemingly minor details," and verbal
instances such as "The shortstop tripped on a pebble and the
game was lost" and "The nurse accidentally bumped the surgeon's hand and the patient died." Crucially, and except for the
proverb–literal picture connection, the family members share no
literal similarity, either of a linguistic or imagistic sort.
Psychological research with proverbs has contributed to
issues in several areas of cognitive science (see Honeck 1997),
as follows:
MENTAL REPRESENTATION. Because research shows that people
can reliably connect the family members, a case can be made for
the role of an abstract, amodal, nonlinguistic, nonimagistic mental representation in cognitive processes such as remembering and categorization.
For example, proverbs are remembered better if they are first
presented along with related, as opposed to unrelated, interpretations. And related interpretations, and even verbal instances, serve
as effective prompts for recall of a proverb. Indeed, people judge a
good interpretation to be the best way to represent the meaning of
a proverb family. Moreover, if people fixate on the literal imagery
evoked by a proverb, they are unable to recognize verbal instances
of its figurative meaning. These several results strongly suggest
that the knowledge that guides memory and categorization can be
quite abstract and theory-like, and that imagery is best construed
as the outgrowth of a particular level of understanding.
ON-LINE PROCESSING. Must the literal meaning of a trope be
deciphered before its figurative meaning can be constructed?
The answer from reaction-time research on proverbs is in the
affirmative, although the literal primacy effect diminishes with proverb familiarity. Providing advance markers (e.g., "proverbially
speaking") reduces processing time for unfamiliar proverbs, but
event-related potential (brain wave) measures indicate that the
words in proverbs used figuratively, versus literally, are harder to
integrate with the discourse context (Schwint, Ferretti, and Katz
2006). Moreover, since proverbs almost always function as indirect speech-acts, these results also indicate that such acts can be
dependent on prior literal access.
CEREBRAL ASYMMETRY. Research on brain damage indicates
that the right hemisphere is better than the left hemisphere at processing nonliteral meanings, the kind involved in
metaphors, connotation, verbal humor, sarcasm, the point
of a story, gist, and proverbs. The right brain appears to process
inputs in a more flexible, open, contextualist, and holistic way,
propensities that would facilitate proverb comprehension.
DEVELOPMENTAL TRENDS. Proverb comprehension improves
slowly over time, beginning at about age seven. Comprehension
is facilitated by supporting context (e.g., pictures), proverb concreteness and familiarity, and reduced information-processing
load, as well as by the child having a better vocabulary, metacognitive awareness of nonliteral meanings, a formal education,
and a good social knowledge base. The cognitive processes subserving improvement are unclear, although analogy has been
implicated.

Theories of Proverb Comprehension


The extended conceptual base theory (Honeck 1997) is a cognitive
psychological, laboratory-derived approach. It claims that proverbs are distinguished by form and function, are variably familiar, and are understood via several phases of problem solving
that are nonautomatic, heavily inferential, and error prone. The
great chain metaphor theory (Lakoff and Turner 1989), a cognitive linguistics–cultural approach, makes antithetical claims.
To date, almost all of the empirical research has been done in
conjunction with, and is supportive of, the conceptual base theory (Honeck and Temple 1996).
Richard P. Honeck
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Honeck, Richard. 1997. A Proverb in Mind: The Cognitive Science of
Proverbial Wit and Wisdom. Mahwah, NJ: Erlbaum.
Honeck, Richard, and Jon Temple. 1996. Proverbs and the complete
mind. Metaphor and Symbolic Activity 11: 217–22.
Lakoff, George, and Mark Turner. 1989. More Than Cool Reason: A Field
Guide for Poetic Metaphor. Chicago: University of Chicago Press.
Mieder, Wolfgang. 1994. Wise Words: Essays on the Proverb. New
York: Garland.
Mieder, Wolfgang, ed. 2003. Cognition, Comprehension, and
Communication: A Decade of North American Proverb Studies.
Baltmannsweiler, Germany: Schneider Verlag.
Schwint, Christopher A., Todd R. Ferretti, and Albert N. Katz. 2006. The
influence of explicit markers on slow cortical potentials during figurative language processing. In Proceedings of the 28th Annual Conference
of the Cognitive Science Society, 268. Mahwah, NJ: Erlbaum.

PSYCHOANALYSIS AND LANGUAGE


Psychoanalysis is, first, a talking cure, a verbal treatment for
mental illness. As such, it has been bound up with speech, meaning, and interpretation from the outset. In keeping with this
connection, psychoanalytic theorists and scholars examining
psychoanalysis have repeatedly addressed the intertwining of
psychoanalysis and language. Some scholars have explored the
place of language and/or language science in Freudian thought
and practice (see, for example, Forrester 1980 and Loewald 1978).
Others have drawn on specific theories of language in order to
alter, reorient, enhance, or reunderstand Freudian principles.
The most famous case of the latter was Jacques Lacan, who drew
on structuralist ideas to revise psychoanalytic accounts of
the unconscious, symptoms, dreams, etc. Some later writers,
such as Marshall Edelson (1975) and Bonnie Litowitz (1978),
drew on transformational grammar to rethink psychoanalytic ideas. (See also Ricoeur 1978 on both approaches.)

Function and Architecture


As the preceding cases may already begin to suggest, it is valuable to distinguish functional accounts of mental phenomena in
psychoanalysis from specific, explanatory architectures. Put very
simply, functional accounts treat what mental operations accomplish (e.g., fear causes us to flee predators); a particular architecture defines the causal organization that enables the functions
(e.g., fear is produced when certain perceptual stimuli produce
convergent arousal in the amygdala). In psychoanalysis, the
functional idea of repression is separable from any particular
account of how repression occurs and where it is located in a
set of mental structures. When Lacan drew on structuralism to
theorize the unconscious, he was preserving the basic functional
idea of Freud while altering the precise way in which the mental
structures (thus, the implementation of those functions) were
specified. The same point holds for Litowitz and the Freudian
idea of overdetermination (roughly, the multiple interpretable
meanings and causal sources of a dream image or symptom).
Unfortunately, these uses of specific linguistic theories seem
highly problematic. First, they often presuppose the fundamental principles of Freudian architecture and try to graft a very different theory onto them. It is not clear that the theories being
combined are compatible. Moreover, the initial Freudian architecture often seems ill-defined and not particularly well supported by more recent research. Second, the linguistic theories
may not apply literally to the psychoanalytic functions anyway
(e.g., the relation between "underlying structure" and the
Freudian unconscious is, at best, only analogical). Finally, when
not used in a loose, metaphorical way, the linguistic theories
have sometimes been taken up in specific forms that were subsequently bypassed in linguistics, leaving the psychoanalytic redevelopments outdated.
There may be a more productive way of revising Freudian
architecture, however, while preserving basic functional principles of psychoanalysis, including those that bear on language.
Specifically, we might begin with the relatively well-established architecture that has emerged from cognitive neuroscience in recent years. From here, we might seek to reformulate
some central functional principles of psychoanalysis in that
architecture. (This is roughly the program of neuropsychoanalysis.) In the remainder of this entry, I set out one possible account
of this sort.
Before going on to language, however, we should briefly consider the general relation between some key functional divisions
in psychoanalysis (primarily the conscious/unconscious, or
repression) and neurocognitive architecture.

Repression and the Brain


Different readers of Freud and different psychoanalysts will
characterize psychoanalysis differently. For our purposes, we
may consider a few elements of psychoanalytic theory to be central and distinctive. I would first of all list the dynamic unconscious, thus, mental contents that are not available to conscious
knowledge even though they are the type of contents that are ordinarily available to consciousness. In other words, some aspects
of the mind (for example, grammatical principles) are the
sorts of things we can only infer; we cannot know them directly.
However, other aspects of the mind (for example, memories,
beliefs, and desires) are the sorts of things that we ordinarily
seem to know directly. When we do not know memories, beliefs,
and desires, they are unconscious, as are grammatical principles.
But the reason for their unconsciousness is different. It is a matter of repression (hence, the dynamic part of dynamic unconscious). Repression begins with mental conflict that gives rise to
emotional pain. To take a simple example, one might experience
conflict between ones desires and ones moral aspirations, leading to feelings of guilt. In certain circumstances, we respond to
this pain by making one element of the conflict inaccessible to
conscious thought in this case, presumably, the desires. This
can occur at any age. However, classical psychoanalysis posits
that primary repression occurs in early childhood. In adulthood,
secondary repression takes place when a conflict is assimilated to
primary repressed material.
This simplified picture of repression is at least broadly consistent with current neurocognitive architecture. We often experience emotional conflict due to the different imperatives of our
various (largely subcortical) emotion systems. Research suggests
that the anterior cingulate cortex (ACC) monitors conflicts in task
performance (see, for example, MacDonald et al. 2000, 1835).
When the ACC recognizes painful conflict, it activates the dorsolateral prefrontal cortex, which engages inhibitory processes
(see Ito et al. 2006; Preston and de Waal 2002; Lieberman and
Eisenberger 2006). More generally, our brains operate to reduce
emotional conflicts, producing valenced outcomes from ambivalent inputs (Ito and Cacioppo 2001, 69). Psychoanalysis adds two
things to this model. First, it isolates a class of feelings and ideas
so thoroughly inhibited that they do not, then or subsequently,
enter working memory. Second, it gives special importance to
early childhood experiences, though this is broadly in keeping
with the cognitive neuroscientific idea of critical periods in
childhood when certain aspects of cognition (e.g., knowledge of
language) are shaped in crucial ways.
The problem with repression is that the motivational force of
the unconscious elements continues to have consequences in
one's behavior. One part of this fits in a general way with current
neurocognitive accounts of memory, which distinguish explicit
episodic and semantic/factual memories from implicit emotional
memories (see LeDoux 1996, 182). It is well established that we
are not consciously aware of emotional memories, though they
affect our behavior. Emotional memories are commonly associated with episodic and semantic/factual memories that allow
us to explain our feelings. One way in which the two may be
dissociated is through brain damage. Psychoanalysis suggests
that another way they may be separated is through repression
of the relevant episodic and semantic/factual memories. Such
repression would presumably allow the emotional memories to
operate motivationally without our awareness, awareness normally guided by the (now repressed) explicit memories. In any
case, the behavior affected by consciously unavailable motives
prominently includes one's verbal behavior, which brings us to
language.


Language and the Unconscious


Since language in psychoanalysis necessarily involves components, I organize my discussion (somewhat loosely) by reference
to Roman Jakobson's influential sixfold analysis of the speech
situation into speaker, hearer, context, medium of contact, message, and code.
SPEAKER. The psychoanalytically crucial idea here, developed
most famously by Lacan, is the difference between the subject
as speaker and the subject as object of speech. When I describe
myself, there is necessarily a difference between the "I" that does
the describing and the "myself" that is described. The latter is
linguistic in two senses. First, my account of myself (sometimes
called my "self-concept") is largely given in verbal descriptions. These include not only descriptions that I have formulated on my own but also those that I have taken up from others
(e.g., from student evaluations of my teaching). More generally,
language provides me with the categories by which I identify
myself as a teacher or a scholar, as boring or interesting, clear
or obscure (see meaning and belief). According to Lacan
and others, it is this necessary division between the (largely
verbal) self-object and the (speaking) subject that allows there
to be an unconscious. Put very simply, the unconscious occurs
in that motivationally consequential part of me that is excluded
from my self-concept. The very idea of such a division may seem
counterintuitive, as we ordinarily imagine our self-knowledge to
result directly from introspection. However, it is well established
outside of psychoanalysis that this is not the case. For example,
drawing on an extensive body of research, Richard Nisbett and
Lee Ross explain that "[k]nowledge of one's own emotions and
attitudes, though commonly believed to be direct and certain, has been shown to be indirect and prone to serious error.
Such knowledge is based in large part on inferences about causes
of behavior" (1980, 227).
HEARER. The crucial feature of the hearer in psychoanalysis
concerns not the hearer per se but, rather, one's implicit imagination of the hearer. In psychoanalysis, the unconscious has
both emotional and cognitive consequences. Among the most
crucial are those that define the transference. Transference is,
roughly, unconsciously basing one's response to someone in
one's current environment on repressed memories of, attitudes
toward, or ideas about some earlier figure (e.g., one's father).
It is well established in pragmatics that speakers engage
cooperatively with addressees, not only in following general
principles but also in making specific adjustments for the likely
knowledge, interests, attitudes, and so on of their addressees.
In all conversations, these adjustments will be partially guided
by cognitive structures, such as prototypes and exemplars.
For example, I may avoid making a particular sort of joke with
a colleague because I have a strong memory of such a joke
flopping in the past. That memory would be a salient exemplar, guiding my expectations about my addressee. One might
understand transference as the unconscious reliance on a particular exemplar for one's addressee. Moreover, this reliance
bears on motivationally crucial aspects of the interaction (e.g.,
trust). Finally, it operates even when incompatible with individual or situational information (e.g., about the addressee's
trustworthiness, as when my unconscious exemplar fosters
unwarranted trust).
CONTEXT. The context of a given speech event might be understood to encompass not only such matters as the social and
material situation but also the array of ideas and feelings that
accompany speech on the part of both the speaker and the
hearer. This includes what psychoanalysts refer to as fantasy,
which is bound up with transference. Specifically, my thoughts
about an addressee are embedded in a series of often fleeting
ideas and imaginations that are connected with that addressee,
the larger situation, and other matters. This fantasy structure is
likely to give rise to ephemeral emotional ambivalences in any
circumstances. When transference is involved, it is likely to generate more sustained conflict that repeatedly inflects my motivational and cognitive responses to the addressee. Such fantasy
may be understood, first, as the complex of images, memories,
feelings, and, of course, words and meanings that are continually
produced in association with speech and action through priming. These primed contents are partially activated but have not
reached a threshold where they draw attentional focus in working memory. In some cases, psychoanalysts would add, they cannot draw attentional focus. Nonetheless, they undoubtedly affect
our action, including our speech production, guiding choices
of, for example, lexical alternatives.
MEDIUM OF CONTACT. Many aspects of the "physical channel"
(Jakobson 1987, 66) of communication are psychoanalytically
consequential, including visual and aural aspects of both linguistic and paralinguistic action (see paralanguage). One
important aspect of psychoanalytic interpretation is that linguistic and paralinguistic features may contradict one another
such that the paralinguistic features may partially express the
unconscious attitudes and ideas that are inhibited in the verbal
statement. This is connected with our relatively limited control
over the expressive component of emotions. Specifically, our
prefrontal control of our expressions usually cannot convincingly replicate the expressions produced by the emotion systems
themselves, a point familiar to anyone who has tried to smile
for a photograph. This division allows us to recognize many sorts
of insincere expression in ordinary life. Psychoanalysis adds that
paralinguistic features may reveal inhibited ideas or feelings with
strong motivational force, even when we are unaware of those
ideas or feelings. Moreover, certain symptomatic acts may have
the same function. In one case, one of Freud's patients repeatedly moved her finger in and out of a small purse as she spoke.
Freud took this to be a masturbatory action ([1905] 1981, 75).
Whether or not he was right in this case, in principle it is possible for this sort of thing to occur. One form of implicit memory
involves procedural schemas, prominently motor routines for
various habitual actions. It is possible to activate a motor routine
without self-conscious monitoring. If there are repressed motivations, it may be possible for those to manifest themselves in at
least certain schemas of this sort.
MESSAGE. One aspect of expression stressed by Freud is its
determinacy. This is most famously (or notoriously) the case
in errors, which Freud explained by reference to unconscious

impulses. But the generation of ordinary speech is no less determinate, and it is not determined solely by ones self-conscious
purposes. At any given moment, there are many ways in which
I might phrase what I wish to say. However, I say things in one
way only. For example, in speaking of a particular object, I might
refer to it as a "p.c.," a "computer," a "laptop," and so on. But I choose
one. In a connectionist model, we would say that a range of
factors has given the greatest activation to a particular phrasing.
In some cases, this is just the basic level term. In other cases,
there is some contextual or other reason for the choice. It might
be a matter of repetition (I just saw a copy of PC Computing,
so I say "p.c."); alternatively, it might involve a more complex
lexical relation (someone just mentioned "desktops," so I
say "laptop"). Psychoanalysis adds a series of circuit activations
from dynamically unconscious motivational contents. For example, Ernst Lanzer (Freud's "Rat Man") tells Freud about someone who flattered and deceived him. He identifies the person as
a medical student. There are many ways in which Lanzer could
have introduced this person. What caused the specifically medical detail to be activated? One possibility is that Lanzer had a
transferential relation to Freud in which he feared that Freud
would flatter and deceive him in the treatment. Circuits connecting Freud with this student (and presumably with parental
figures from the period of primary repression; see Hogan 1996,
148–50) would then have shaped this speech (specifically, the
choice of medicine as an identifying characteristic) separately
from Lanzer's self-conscious intentions.
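The connectionist account of lexical choice invoked above can be illustrated with a toy spreading-activation sketch. This is purely a hypothetical illustration, not any specific psycholinguistic model: the function name `choose_word`, the contextual sources, and all the link weights are invented for the example. The idea is only that several contextual factors each pass activation to associated candidate words, and the most activated candidate is the one produced.

```python
# Toy spreading-activation sketch of lexical choice (illustrative only).
# Contextual sources each pass some activation to associated candidate
# words; the candidate with the highest summed activation is produced.

def choose_word(candidates, sources, associations, decay=0.5):
    """Return the candidate word with the highest summed activation.

    candidates: list of roughly synonymous candidate words.
    sources: dict mapping an active contextual source to its strength.
    associations: dict mapping (source, candidate) pairs to link weights.
    decay: global damping factor on transmitted activation.
    """
    activation = {word: 0.0 for word in candidates}
    for source, strength in sources.items():
        for word in candidates:
            weight = associations.get((source, word), 0.0)
            activation[word] += strength * weight * decay
    return max(activation, key=activation.get)

# Recent exposure to "PC Computing" boosts "p.c." over its alternatives,
# mirroring the repetition-priming example in the text.
candidates = ["p.c.", "computer", "laptop"]
sources = {"saw_PC_Computing_magazine": 1.0, "object_in_view": 0.8}
associations = {
    ("saw_PC_Computing_magazine", "p.c."): 0.9,
    ("object_in_view", "computer"): 0.5,
    ("object_in_view", "laptop"): 0.4,
}
print(choose_word(candidates, sources, associations))  # prints "p.c."
```

On this sketch, the psychoanalytic addition described in the text would amount to extra sources of activation (unconscious motivational contents) contributing to the sum without being available to report.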
CODE. Various features of a linguistic code (particularities of the
lexicon, morphology, grammar) may allow the expression of
unconscious ideas and impulses beyond the straightforward
semantics of lexical selection. This may occur most obviously
through ambiguity or puns, a point stressed particularly by
Lacan. For example, James Gorney (1990) reports a case in which
someone reported conscious anxiety over a "bill" and thereby
expressed unconscious anxiety over someone named "Bill." This
is most readily understood in terms of left hemisphere versus
right hemisphere processes. Neurocognitive research shows
us that the right hemisphere generates multiple meanings, some
of which are inhibited by the left hemisphere, which selects the
contextually relevant meanings (see Chiarello 1998, 145; Faust
1998, 180). For example, in the semantic context of references to
a restaurant, and the linguistic context of following a determiner,
left hemisphere interpretive processes will limit the meanings of
"bill" to a list of charges. However, right hemisphere interpretive processes will briefly generate "beak" and "William" also.
Psychoanalysis once again adds to this well-established architecture a dynamic unconscious component. Specifically, it adds
the idea that in some cases, a particular word (e.g., "bill" rather
than "check") may be produced due to unconscious activations
(see spreading activation) that are motivational, though
inaccessible to the working memory of the speaker.

Conclusion
The relation between psychoanalysis and language has been
explored by many writers using many approaches. In the preceding sections, I have tried to show that central functional
principles of psychoanalysis as they relate to language may be
integrated with current cognitive neuroscientific architecture.
Some psychoanalytic principles seem plausible in this architecture, and they enrich our understanding of human psychology
and language use in that context, though they do not alter the
architecture per se. The question that remains open is whether
there is a function of repression and, if so, precisely how it operates
in neurocognitive architecture. For example, as the Nobel laureate and prominent neuroscientist Gerald Edelman explained,
referring to the Freudian unconscious and the notion of repression, "it is conceivable that the modulation of value systems
in the brain could provide a basis for the selective inhibition of
pathways related to particular memories" (2004, 95; Edelman
suggests that subcortical involvement, particularly in relation to
focal attention, is crucial to this inhibition). If Freud was right to
posit such a functional principle, then psychoanalysis may alter
our understanding of cognitive neuroscientific architecture by
introducing new processes or structures, or at least new systemic consequences resulting from previously unrecognized
interactions of already posited processes and structures. If he
was not right about this functional principle, it remains the case
that many principles of psychoanalysis may be accommodated,
in altered form, in cognitive neuroscience, including many principles of psychoanalysis and language.
Patrick Colm Hogan

WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Beeman, Mark, and Christine Chiarello, eds. 1998. Right Hemisphere Language Comprehension: Perspectives from Cognitive Neuroscience. Mahwah, NJ: Lawrence Erlbaum.
Cacioppo, John, Penny Visser, and Cynthia Pickett, eds. 2006. Social Neuroscience: People Thinking About Thinking People. Cambridge, MA: MIT Press.
Chiarello, Christine. 1998. On codes of meaning and the meaning of codes: Semantic access and retrieval within and between hemispheres. In Beeman and Chiarello 1998, 141–60.
Edelman, Gerald. 2004. Wider Than the Sky: The Phenomenal Gift of Consciousness. New Haven, CT: Yale University Press.
Edelson, Marshall. 1975. Language and Interpretation in Psychoanalysis. New Haven, CT: Yale University Press.
Faust, Miriam. 1998. Obtaining evidence of language comprehension from sentence priming. In Beeman and Chiarello 1998, 161–86.
Fiumara, Gemma. 1992. The Symbolic Function: Psychoanalysis and the Philosophy of Language. Oxford: Blackwell.
Forrester, John. 1980. Language and the Origins of Psychoanalysis. New York: Columbia University Press.
Freud, Sigmund. [1905] 1981. Bruchstück einer Hysterie-Analyse. Frankfurt am Main: Fischer Taschenbuch.
———. [1909] 1982. Bemerkungen über einen Fall von Zwangsneurose. In Zwei Falldarstellungen. Frankfurt: Fischer Taschenbuch.
Gorney, James. 1990. Reflections on impasse. In Criticism and Lacan: Essays and Dialogue on Language, Structure, and the Unconscious, ed. Patrick Colm Hogan and Lalita Pandit, 147–51. Athens: University of Georgia Press.
Hogan, Patrick Colm. 1993. Dora: Desire and ambiguity in the fragment of a psychoanalysis. American Journal of Psychoanalysis 53.3: 205–18.
———. 1996. On Interpretation: Meaning and Inference in Law, Psychoanalysis, and Literature. Athens: University of Georgia Press.
Ito, Tiffany, and John Cacioppo. 2001. Affect and attitudes: A social neuroscience approach. In Handbook of Affect and Social Cognition, ed. Joseph Forgas, 50–74. Mahwah, NJ: Lawrence Erlbaum.
Ito, Tiffany, Geoffrey Urland, Eve Willadsen-Jensen, and Joshua Correll. 2006. The social neuroscience of stereotyping and prejudice: Using event-related brain potentials to study social perception. In Cacioppo, Visser, and Pickett 2006, 189–208.
Jakobson, Roman. 1987. Language in Literature. Ed. Krystyna Pomorska and Stephen Rudy. Cambridge, MA: Belknap.
Lacan, Jacques. 1966. Écrits. Paris: Éditions du Seuil.
LeDoux, Joseph. 1996. The Emotional Brain: The Mysterious Underpinnings of Emotional Life. New York: Touchstone.
Lieberman, Matthew, and Naomi Eisenberger. 2006. A pain by any other name (rejection, exclusion, ostracism) still hurts the same: The role of dorsal anterior cingulate cortex in social and physical pain. In Cacioppo, Visser, and Pickett 2006, 167–87.
Litowitz, Bonnie. 1978. On overdetermination. In Smith 1978, 355–94.
Loewald, Hans. 1978. Primary process, secondary process, and language. In Smith 1978, 235–70.
MacDonald, Angus, Jonathan Cohen, Andrew Stenger, and Cameron Carter. 2000. Dissociating the role of the dorsolateral prefrontal and anterior cingulate cortex in cognitive control. Science 288: 1835–8.
Nisbett, Richard, and Lee Ross. 1980. Human Inference: Strategies and Shortcomings of Social Judgment. Englewood Cliffs, NJ: Prentice-Hall.
Preston, Stephanie, and Frans de Waal. 2002. Empathy: Its ultimate and proximate bases. Behavioral and Brain Sciences 25: 1–72.
Ricoeur, Paul. 1978. Image and language in psychoanalysis. In Smith 1978, 293–324.
Smith, Joseph, ed. 1978. Psychoanalysis and Language. New Haven, CT: Yale University Press.

686

PSYCHOLINGUISTICS
Psycholinguistics studies the relationship between the mind
and language. Its concern is with the cognitive processes that
underlie the acquisition, storage, and use of language and their
correlates in observable neurological processes in the brain. The
field is thus heavily reliant upon principles and research methods adopted from cognitive psychology. However, it is essentially
multidisciplinary, drawing also upon linguistics, speech science,
phonetics, computer modeling, neurolinguistics, discourse
analysis (linguistic), and semantics.
Psycholinguistic inquiry is driven by the premise that language has developed in ways that reflect the structure of the
human mind. Certain shared cognitive routines are assumed
to underpin the production and reception of language, despite
variations between individuals in terms of level of vocabulary and powers of self-expression. Research thus tends to be
normative in direction, though due allowance is made for the
characteristics of the language being examined, the task being
undertaken, and the population under investigation. A favored
approach is to investigate small-scale components of a processing system in order to gradually build up a picture of the system
as a whole.
An interest in the psychology of adult language emerged
during the nineteenth century, with initiatives such as Pierre
Paul Broca's work (1863) on the location of language in the
brain and Francis Galton's (1883) on word association. A parallel interest in child language acquisition had developed earlier
from the Enlightenment debate between rationalist followers
of Descartes, who believed that much human knowledge was
innate, and empiricists such as Hume and Locke, who asserted
that it was entirely acquired.

The term psycholinguistics probably dates from the 1930s,
but progress during the first half of the twentieth century was
discouraged by the dominant behaviorist view that the human
mind is unknowable. The field did not emerge as a discipline
in its own right until the mid-1950s. An impetus was given by
a series of essays in which George Miller mapped out possible
areas of inquiry. A major landmark was then Noam Chomsky's
1959 rebuttal of the behaviorist assumptions expressed by B.
F. Skinner in his Verbal Behavior. Citing the speed with which
an infant masters a linguistic system and the poverty of the evidence available, Chomsky concluded that language is a genetically acquired faculty. His nativist stance (see innateness and
innatism) stimulated modern studies of child language and
began a controversy that continues to the present day.
Much early psycholinguistic inquiry was closely allied to linguistic theory. It explored aspects of Chomsky's early generative grammar on the assumption that the rules represented
psychological reality: that is, that they provided a model of the
operations of the mind, as well as a linguistic account of grammatical structure. The findings were mainly negative or inconclusive, and the paths of linguistics and psycholinguistics began
to diverge, although a body of psychological linguists continues today to work within a framework of Chomskyan theory.
In its current form, psycholinguistics falls into a number of
distinct areas, but considerable overlaps between them give the
discipline coherence.

Language Processing
Processing research seeks to characterize the four language
skills of reading, writing, speaking (speech production)
and listening (auditory processing, speech perception)
in terms of the cognitive operations that underlie them. It also
investigates how vocabulary is retrieved from the mind (word
recognition) and how syntactic structures are assembled
or interpreted. Its findings have important applications for the
teaching of literacy (see teaching reading and teaching
writing), for language therapy, and for second language learning (see second language acquisition).
Drawing upon an information-processing precedent, accounts
often depict language users as taking linguistic material through
a series of levels of representation. For example, a listener
might be seen as building sounds into syllables, syllables into
words, words into clauses, and clauses into abstract meanings.
In fact, the process is not necessarily a sequential one, as the
language user is capable of operating at all these levels simultaneously. What is more, there is evidence that higher-level
knowledge influences processing at a lower level (knowledge of
the existence of a particular word might assist the recognition of
a string of sounds in a top-down way). Opinion continues to be
divided as to whether the relationship between the levels is fully
interactive or whether each level operates with a degree of independence (in a modular way). The benefit of the former is that
all sources of information become available at once; the benefit
of the latter is that rapid decision making is enabled.
The receptive skills of listening and reading entail two distinct operations. In the first, visual or auditory information is
decoded, a process that entails mapping from strings of sounds
or letters to known words. The user relies partly upon perceptual evidence but is also influenced by the expectations created by context. Groups of words are parsed into syntactic structures
(see parsing, human), and a proposition is extracted from
the utterance. The second phase, meaning construction, requires
the user to elaborate on the literal meaning of the utterance by
inferring details that have been left unexpressed by the speaker/
writer. The user then adds the new information to the meaning
representation or mental model built up so far in the discourse
and checks for consistency.
The productive skills of speaking and writing proceed in the
opposite direction: from idea to language. An abstract representation of the planned sentence is created, and linguistic form is
then conferred upon it. While the sentence is being produced, it
has to be stored in a mental buffer, in the form of a set of instructions to the articulators or the fingers. The user monitors performance and self-corrects if necessary. An important difference
between most writing events and most speech events lies in the
time available for planning, self-monitoring, and review.
Accounts of language processing rely upon constructs from
cognitive psychology, such as attention, memory, and automaticity. An important principle is that human working memory
is limited in capacity. The language user minimizes the demands
made upon memory by transferring information into a more permanent store (long-term memory) or by establishing highly automatic routines for speech assembly or word recognition that do
not require focused attention. In addition, schema theory helps
to explain how a listener/reader enriches a message by the addition of external knowledge and how a speaker/writer is able to
abridge a message, relying upon the recipient to fill in missing
details.
The methods used to study language processing include observation and verbal report, but an experimental approach tends to
predominate. The tasks employed are often small in scale, tapping into on-line processes; those that enable the researcher to
measure reaction time (e.g., time taken to recognize a word) are
especially favored. Increasingly, researchers also draw upon
findings from brain imaging (see neuroimaging) that map the
neurological correlates of different types of language activity.

Language Representation
A closely allied area of enquiry considers how linguistic knowledge is stored in the mind in a way that enables rapid matches to
be made. Words are said to be stored as lexical entries, containing
information about the item's orthographic form and pronunciation, word class, inflections, and combinatorial possibilities.
There has been discussion of whether there are separate entries
for affixes such as un- in the word unhappy or whether words of
this type have their own entries. Also much discussed has been the
question of category membership: how a language user manages
to classify an item of crockery as a cup rather than a mug or bowl.
The entries in a user's mental lexicon are massively interconnected,
enabling a process known as spreading activation, in which it
becomes easier to locate (e.g.) the words nurse, hospital, or patient
after recently hearing or seeing the word doctor. The connections
vary in strength, favoring words of high frequency and words with
a high probability of occurring together.
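The mechanism can be conveyed with a minimal sketch; the words and connection strengths below are invented for illustration, not measured values.

```python
# Toy mental lexicon: each entry lists its neighbors with a
# connection strength reflecting frequency and co-occurrence.
lexicon = {
    "doctor": {"nurse": 0.8, "hospital": 0.7, "patient": 0.6, "bread": 0.05},
}

def activate(prime):
    """The prime becomes fully active; activation then spreads to its
    neighbors in proportion to connection strength, making strongly
    linked words easier to locate."""
    activation = {prime: 1.0}
    for word, strength in lexicon.get(prime, {}).items():
        activation[word] = strength
    return activation

act = activate("doctor")
```

After "doctor" is heard, the activation of "nurse" far exceeds that of an unrelated word such as "bread", mirroring the finding that related words are recognized more quickly.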
Traditional linguistic accounts assume that syntax is represented in the mind as a set of rules, but recent psycholinguistic thinking has swung toward the notion that speakers can only produce language as rapidly and accurately as they do if they make
extensive use of stored chunks of language already assembled
syntactically. The human mind appears better adapted for the
storage of enormous amounts of information than for rapid processing. exemplar theory suggests that linguistic knowledge
derives from multiple memory traces of individual language episodes. The attraction of the theory is that it accounts not only for
the speaker's ability to retrieve chunks but also for the listener's ability to deal with a range of voices and accents and to recognize the many forms that a single word can take in connected
speech. Instead of mapping back to one idealized template for
a phoneme or word, the listener draws upon vast numbers of
traces laid down by earlier encounters.
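Exemplar matching can be sketched as a nearest-neighbor search over stored traces; the feature vectors here are invented stand-ins for acoustic detail such as formant values.

```python
import math

# Many stored traces per word, laid down by different voices and
# speaking rates (the values are hypothetical).
traces = [
    ("cat", (1.0, 2.0)), ("cat", (1.2, 1.9)), ("cat", (0.9, 2.2)),
    ("cut", (3.0, 1.0)), ("cut", (3.1, 1.1)),
]

def recognize(token):
    # Compare the incoming token against every stored trace rather
    # than a single idealized template; return the closest trace's word.
    return min(traces, key=lambda trace: math.dist(trace[1], token))[0]
```

A novel token such as (1.1, 2.05) is recognized as "cat" because it lies near several stored "cat" traces, even though it matches none of them exactly.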

Language Acquisition (Developmental Psycholinguistics/Child Language Development)
Those who adopt a Chomskyan line examine the productions
of infants for evidence of universal linguistic principles or of
the setting of parameters to incorporate features specific to the
target language (see principles and parameters theory
and language acquisition ). Others prefer a data-driven
approach that analyzes the speech of the developing child
with no pretheoretical assumptions. Some 30 years of research
has produced evidence that input to infants (child-directed
speech) is more informative than Chomsky once asserted.
Developmental patterns have been traced in the child's phonology (phonology, acquisition of) and vocabulary (lexical
acquisition). There has been research into how an infant
builds up lexical categories such as dog or flower, overextending them in some instances and underextending them in
others. Issues in morphology and syntax include whether
there is a fixed order for the acquisition of inflections and how
infants manage to derive semantic relationships from cues such
as word order.
It has long been recognized that the timing and rate of language acquisition vary enormously from one child to another. It is customary, therefore, to mark development not by the child's age but by the average number of morphemes in a child's utterances, a measure known as mean length of utterance (MLU). Gradual increases in MLU reflect changes in the child's cognitive capacity and in its ability to articulate groups of words.
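The calculation itself is simple once utterances have been segmented into morphemes; the segmentation is the hard analytical step and is taken as given in this sketch, which uses a hypothetical transcript.

```python
# Hypothetical transcript: each utterance is a list of morphemes.
utterances = [
    ["mommy", "go"],              # 2 morphemes
    ["doggie", "run", "-ing"],    # 3 morphemes (run + progressive -ing)
    ["want", "cookie", "-s"],     # 3 morphemes (cookie + plural -s)
]

def mean_length_of_utterance(utterances):
    # MLU = total morphemes / number of utterances
    return sum(len(u) for u in utterances) / len(utterances)
```

Here eight morphemes over three utterances give an MLU of roughly 2.67.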
Alongside longitudinal observation in the form of recordings
and diaries, language acquisition is also studied experimentally.
Children are asked to perform linguistic or perceptual tasks
designed, for example, to elicit plural forms or to demonstrate
phonological awareness. Methods have also been devised that
provide insights into the mental processes of preverbal children
by tracking shifts in the direction of their attention (see communication, prelinguistic). Computer simulation also plays a
part. A recent challenge to nativism comes from connectionist modeling, which has shown, on a limited scale, that a computer program is capable of acquiring language rules by exposure
to repeated examples.
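The flavor of such modeling can be conveyed by a deliberately tiny sketch, far simpler than the networks actually used: a single perceptron-style unit that, from repeated examples alone, induces the rule that a voiced final segment takes the plural /z/ while a voiceless one takes /s/.

```python
# Training data: (final segment voiced? 1/0, target plural: 1 = /z/, 0 = /s/)
examples = [(1, 1), (0, 0), (1, 1), (0, 0), (1, 1), (0, 0)]

weight, bias, rate = 0.0, 0.0, 0.5
for _ in range(20):                                  # repeated exposure
    for voiced, target in examples:
        output = 1 if weight * voiced + bias > 0 else 0
        weight += rate * (target - output) * voiced  # error-driven update
        bias += rate * (target - output)

def plural(voiced):
    return "/z/" if weight * voiced + bias > 0 else "/s/"
```

The rule is never stated explicitly; the unit converges on it through error-driven weight adjustments, which is the core claim of the connectionist challenge to nativism.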
Further insights into language acquisition are achieved by
studying children who grow up in exceptional circumstances.
The most extreme cases are those of children who have been
deprived of human company through being abandoned in the wild or confined to their rooms for long periods. There are also
studies of the course of language development in twins, blind
children, the children of deaf parents, and children acquiring a
pidgin language.

Brain and Language


Interest in the location of language in the brain was stimulated
by the discoveries of Broca and Carl Wernicke, which showed
that damage to two small areas of the left hemisphere was
associated with the loss of speech. Later research supported the
view that the left hemisphere was the dominant one for language
in nearly all right-handers and the majority of left-handers. A
number of children who suffered damage or surgical intervention to the left hemisphere before the age of five appeared to
relocate their language operations in the right one (see right
hemisphere), an effect that was less evident in adults. From
these findings grew a theory that the brain was plastic in the
early years of life, and that language became lateralized to the left
hemisphere around the age of five. The conclusion was that brain
lateralization corresponded to a critical period for acquiring
ones first language.
Views have now changed in the face of evidence that the brain
continues to manifest flexibility into adulthood. Research suggests that some areas are more heavily involved than others in
particular aspects of language use but that language processing
as a whole cannot be localized. It is widely distributed across the
brain and supported by massive neural interconnections that
enable information to be transmitted extremely rapidly. It has
also become apparent that the right hemisphere of the brain has
an important role in language, dealing broadly with larger units,
such as prosody, discourse structure, and so on.
Neuroscience contributes increasingly to psycholinguistics,
with brain-imaging technology enabling researchers to locate
and track the electrical impulses or blood flows associated with
different types of linguistic behavior.

Language as a Human Faculty


Here, two central concerns dominate. The first is the extent to
which language can be regarded as restricted to the human race.
Current evidence suggests that it is, though pygmy chimpanzees
have shown themselves capable of acquiring large sets of symbols
and using them in quite elaborate ways (see animal communication and human language and primate vocalizations). The second issue concerns how and when language first
evolved (see origins of language). It is suggested that its
emergence was dependent both upon the evolution of an appropriate vocal apparatus (see speech anatomy, evolution
of) and upon changes in the human brain. A problematic issue
for nativists is to explain how language first entered the human
gene pool; a catastrophic account has been proposed, in which a
sudden change in brain infrastructure coincided with increased
communicative demands. A weakness of the theory is that the
brain evolves very slowly, whereas language changes quickly.
A number of leading commentators have, therefore, concluded
that language initially developed in a way that accorded with the
processing capabilities of the brain, rather than vice versa, and
that the functions of the brain have since adapted to accommodate it through coevolution.


Language Impairment
Psycholinguistics also studies the psychological factors that contribute to language disorders. One type of difficulty investigated
is developmental: It includes problems of speech production,
such as stammering, as well as disorders of reading and
writing, such as dyslexia and dysgraphia. Another type is
acquired as a result of accident, stroke, or surgery and includes
aphasia, an impairment of the ability to produce and/or understand speech. Also investigated are the effects of aging upon
language. In all of these areas, an important distinction has to be
made between loss of language from the mental store and disruption of the processes by which language is accessed.
Besides supporting the work of clinicians and therapists in
treating disorders, the study of language impairment feeds back
into other areas of psycholinguistics. Firstly, it sheds light on the
skills and cognitive processes that are essential to normal language operations but are absent or obstructed in the patients
studied. Secondly, it affords possible insights into the process of
language acquisition. Research into cases of specific language
impairment (SLI) has sought evidence as to whether language is
or is not genetically transmitted. Research into the linguistic competence of sufferers from syndromes such as Down's, Williams,
and autism has sought evidence as to whether language forms
part of general cognition or is independent of it.

Second Language Acquisition (SLA)


As with first language acquisition, there are two distinct branches
of inquiry. One group of researchers takes linguistic theory as a
point of departure, while the other draws its guiding principles
from cognitive psychology. The linguistic group employs models of language (predominantly Chomskyan) on the premise that
they correspond to internalized rules in the mind of the user.
Their research seeks evidence of the resetting of Chomskyan
parameters or compares the components of the second language (L2) grammar with those of a native speaker. Research
in this tradition concerns itself with competence, rather than
performance; thus, there is considerable reliance upon grammaticality judgments as a source of evidence.
A more cognitive approach views the ability to perform in a
second language as an acquired skill and draws upon cognitive
theory relating to problem solving and to the development of
expertise. On this analysis, language learners begin with declarative rule-based knowledge, which gradually becomes proceduralized. Separate steps in the process become combined, and
access to the rules becomes less and less subject to conscious
control and increasingly automatic. Recent thinking in SLA has
also been influenced by emergentism, leading to a theory that
L2 proficiency derives not from internalized rules but from multiple stored exemplars of linguistic encounters.
Researchers in SLA draw increasingly upon psycholinguistic
constructs such as memory, attention, and automaticity when
considering the cognitive demands imposed upon the L2 user.
Another approach employs psycholinguistic models of language
processing as descriptors of skilled L1 behavior and, thus, as targets for those teaching language skills.
There has been extensive psycholinguistic study of bilingualism, focusing especially upon how a bilingual's two languages are stored: whether they are kept entirely apart, subserved

by a single semantic system, or completely integrated. Other


issues concern the extent to which one language influences the
use of the other and whether there are cognitive costs and benefits in growing up bilingual.

Language and Thought


The relationship between language and thought was much
discussed in the early days of psycholinguistics. Major issues
were the extent to which one is dependent upon the other and
the extent to which language shapes our perceptions of reality.
Developmental studies include accounts by Jean Piaget of the
impact of cognitive development upon language acquisition and
by Lev Vygotsky of how egocentric speech becomes internalized
as thought. There have been attempts by psycholinguists to put
the Sapir-Whorf hypothesis to the test, especially in the area of
color terms.

Conclusion
Research is active in all the areas of psycholinguistics that have
been profiled. The methodologies employed are diverse and (as
noted) include experimentation, observation, elicited response,
verbal report, and grammaticality judgments. However, the
future direction of research will be greatly influenced by technological advances. The most important recent innovation has
been the use of brain imaging to validate psycholinguistic theories about the operations underlying language performance.
Researchers are able to identify different types of brain event,
distinguishing, for example, one associated with syntactic functions from one associated with semantic ones. A second development lies in the increased use of eye-tracking equipment not only to study reading but also to record the way in which a language user's attention is attracted to an object by semantic cues.
Computer science also makes a contribution. Psycholinguistics
employs computer modeling of linguistic performance, including neural networks that simulate spreading activation, connectionist word-recognition models based upon competition
between items, and connectionist learning models. It also draws
upon ongoing research into artificial intelligence: for example,
studies of expertise, artificial speech recognition, or automatic
translation.
John Field
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Aitchison, Jean. 2008. The Articulate Mammal. 5th ed. London: Routledge.
Readable general introduction to many areas.
———. 2004. Words in the Mind. 3rd ed. Oxford: Blackwell. Accessible
overview of lexical storage and retrieval.
Clark, Eve V. 2003. First Language Acquisition. Cambridge: Cambridge
University Press. Extensive account of the field.
Deacon, Terrence. 1997. The Symbolic Species. New York: W. W. Norton.
In-depth account of neurolinguistics and language evolution.
Field, John. 2004. Psycholinguistics: The Key Concepts. London: Routledge.
Comprehensive reference guide for non-psycholinguists.
Gathercole, Susan E., and A. D. Baddeley. 1993. Working Memory and
Language. Hove, UK: Erlbaum. Key account of the part played by working memory.
Gleason, Jean Berko, and N. B. Ratner, eds. 1998. Psycholinguistics. Fort
Worth, TX: Harcourt Brace. Overviews of nine major areas.

Levelt, Willem J. M. 1989. Speaking. Cambridge, MA: MIT Press. Standard
comprehensive account.
Obler, Lorraine K., and K. Gjerlow. 1999. Language and the Brain.
Cambridge: Cambridge University Press. Short but wide-ranging introduction for non-specialists.
Perfetti, C. 1985. Reading Ability. New York: Oxford University Press.
Influential account of skilled reading.

PSYCHONARRATOLOGY
Definition
This term was first coined by P. Dixon and M. Bortolussi in 2001
in their chapter "Prolegomena for a science of psychonarratology" and developed further by Bortolussi and Dixon in their
book Psychonarratology: Foundations for the Empirical Study
of Literary Response (2003). It designates an interdisciplinary,
empirical approach to the study of literary response (see literature, empirical study of) and the processing of narrative. Generally, as the term suggests, it brings together two very
diverse fields: psychology and literary studies. However, while in
literary studies "psychology" typically refers to psychoanalysis,
psychonarratology draws on cognitive psychology and discourse
processing for its methodology. Within literary studies, it looks to
narratology for its conceptual and theoretical insights about
narrative prose but also draws on related fields, such as reception and reader response theories.

Theoretical Background

The conjoining of cognitive psychology and narratology has two theoretical motivations. On the one hand, literary studies in general, and narratology in particular, seemed to have reached an impasse from which the traditional stock of methods provided no means of escape. Numerous scholars had understood that texts are just collections of letters, words, sentences, and paragraphs, that meanings emerge in the interaction between texts and readers, and that these interactions could be varied and complex. For example, from within the phenomenological tradition, Roman Ingarden conceived of literary texts as schematic structures that needed to be concretized or completed by the reader, and W. Iser, following in his footsteps, regarded literary texts as indeterminate objects that elicit gap-filling strategies on the part of the reader. However, without appropriate methodologies for assessing the cognitive processes of real readers, literary scholars were restricted to making intuitive and purely speculative assumptions about the effects of particular aspects of literature on its readers. This gave rise to notions of an ideal reader, a timeless, homogeneous entity possessing the ideal competence required to process texts in some ideal fashion. Variants of this ideal entity were expressed through a host of related terms: the model reader, the super reader, and so on. Various branches of literary studies, regardless of their specific goals or scope, inevitably came up against the same roadblock, namely, the need to account for the effect(s) of texts on readers, without the required tools for advancing beyond the purely hypothetical.

On the other hand, empirical research in discourse processing generally neglected to address the processing of complex, real-world narratives and literary narratives in particular. Moreover, to the extent that cognitive psychology has investigated such questions, that research has typically not been informed by the body of scholarship in literary studies on narrative structure and form. Thus, psychonarratology is an attempt to bring the empirical methods and analytical style of cognitive psychology to the problems and domain of narratology.

Psychonarratology builds on previous research in the empirical study of literature, a field first designated by Siegfried Schmidt (1981), who developed some of its theoretical and philosophical foundations. Pioneering empirical work in this effort included research by A. C. Graesser et al. (1999), Willie van Peer and Max Louwerse (2002), D. S. Miall and D. Kuiken (1995), R. J. Gerrig (1992), and D. Vipond and R. A. Hunt (1984). Current research in the general area includes not only psychonarratology but also research on a wide range of other topics, such as genre, emotion, aesthetics and aesthetic appreciation, and film and media, as well as evolutionary and intercultural approaches.

Main Tenets of Psychonarratology

As developed in Psychonarratology by Bortolussi and Dixon, four cascading proposals are made for the empirical study of reader response to narratives. The first is a conceptual distinction between textual features and reader constructions that allows a suitable framing of empirical questions. The second is the application of statistical models for conceptualizing the nature of the reader. The third is a preference for the methodology of the textual experiment. And the fourth is a hypothesis concerning readers' representation of a conversational narrator.

FEATURE/CONSTRUCTION DISTINCTION. Psychonarratology makes a crucial distinction between textual features, that is, those properties of the text that can be objectively identified, and reader constructions, or the mental representations of the reader.
Features are objective and can, in principle, be reliably identified
by trained observers. Ideally, a textual feature could be specified
by an explicit algorithm operating on the text. As a minimum,
there should be some consensus about an explicit definition of
a feature and clear examples. In contrast, constructions are subjective and can vary across readers as a function of specific goals,
reading context, expectations, literary experience, and individual
reader characteristics. Logical and conceptual analysis is sufficient for understanding the former, but in order to draw inferences about the nature of constructions, one requires empirical
evidence on real readers reading. According to the analysis of
the authors, many of the difficult controversies in literary studies arise from a failure to carefully distinguish features and constructions and to apply appropriate methodologies to each.
THE NATURE OF THE READER. One of the greatest challenges facing scholars interested in literary reception, regardless of their
particular orientation, is the reconciliation of the need to make
general claims about readers (such as those entailed by the
concept of an ideal reader) with the fact that reader response is
seemingly variable and idiosyncratic. In psychonarratology, this
resolution is accomplished by applying the statistical concepts
of population and measurement distributions, an approach that
the authors refer to as the statistical reader.
A population is a particular group of people about which
one hopes to draw interesting conclusions, for example, undergraduate students in Alberta and California or all graduates of a community college who have taken no more than one
literature course. One of the main tenets of psychonarratology is
that scientific claims can only be made about a clearly delimited
population of readers. However, any given group of readers need
not be homogeneous and uniform; rather, each group may consist of any number of overlapping subgroups that may or may not
differ with respect to interesting aspects of narrative response.
Because populations are potentially heterogeneous, there
is no necessity that general statements about reader response
apply equally to all individuals. Nevertheless, this does not mean
that it is impossible to describe reader response. The solution
is to describe measurement distributions. With respect to any
given empirical measurement, one can describe the central tendency (that is, what the measurements have in common) and the variability (that is, how the measurements vary over individuals). There are standard statistical and analytical techniques
for developing these kinds of descriptions, not only for numerical measurements (such as reading time or rating scales) but also
for more open-ended responses. The scientific goal is, thus, to
generate theories that relate the description of measurement distributions to the observed characteristics of readers and texts.
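In code, describing a measurement distribution for a sample of readers amounts to reporting both statistics side by side; the reading times below are invented illustration data.

```python
import statistics

# Hypothetical reading times (ms) for one sentence, one per reader.
reading_times = [412, 388, 455, 401, 430, 397, 468, 420]

central_tendency = statistics.mean(reading_times)   # what readers share
variability = statistics.stdev(reading_times)       # how readers differ
```

Claims about the population are then statements about this distribution, not about any single ideal reader.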
THE TEXTUAL EXPERIMENT. The textual experiment provides a
methodology for understanding the relationship between textual features and reader constructions by systematically varying
those textual features. The essence of this approach is to measure reader constructions using, for example, questionnaires
and rating scales, and to assess how those constructions vary as
a function of the features of the text. When a particular reader
construction (e.g., the perceived justification of a character's actions) covaries with a particular textual feature (e.g., the use of free indirect speech style, a mode of representing some of the form of a character's enunciated speech or thought without direct quotation), then one can conclude that the feature
causally contributes to the reader construction. An alternative
to textual experiments (in which the nature of the text is manipulated) is to assess reader constructions of different texts sampled from extant materials. However, any two sampled texts will
differ in a wide range of characteristics. As a consequence, it
is difficult to draw causal inferences about which features are
related to particular reader constructions. Such inferences are
much more sound in a textual experiment in which features are
manipulated. In order to apply this method properly, though, it
is essential that only a single feature be varied and that other,
potentially causal, features not be inadvertently changed at the
same time.
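A textual experiment of this single-feature kind reduces, analytically, to comparing reader constructions across versions. The sketch below uses invented rating data for two versions of a story that differ in exactly one manipulated feature, and reports the mean difference and a standardized effect size (Cohen's d computed with a simple pooled standard deviation).

```python
# Hedged sketch of a textual-experiment analysis: perceived-justification
# ratings (invented data) for two versions of a text differing in one feature.
from statistics import mean, stdev
from math import sqrt

version_a = [5.2, 6.1, 5.8, 6.4, 5.9, 6.0]   # e.g., free indirect speech
version_b = [4.1, 4.8, 4.5, 5.0, 4.3, 4.6]   # e.g., tagged direct speech

def cohens_d(a, b):
    """Standardized mean difference between the two versions."""
    pooled = sqrt((stdev(a) ** 2 + stdev(b) ** 2) / 2)
    return (mean(a) - mean(b)) / pooled

print(f"mean difference = {mean(version_a) - mean(version_b):.2f}")
print(f"Cohen's d = {cohens_d(version_a, version_b):.2f}")
```

Because only one feature differs between versions, a reliable difference in the ratings licenses a causal inference about that feature.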
THE CONVERSATIONAL NARRATOR. A theoretical hypothesis that
unifies many of the authors' research results is the idea that readers represent the narrator as a conversational participant. This
follows from the theoretical advances of B. Bruce and others in
narratology in which the narrative is conceptualized as a communicative transaction between a narrator and a narratee. In the
psychonarratological version of this idea, readers are hypothesized to develop a mental representation of an individual that
could have produced the words of the narrative and, in many
circumstances, treat that representation much as they would the representation of a conversational participant. In particular,
readers may draw inferences about the nature and mental state
of the narrator that are licensed by the assumption that the narrator is rational and cooperative. Such inferences are analogous
to the conversational implicatures of H. Paul Grice and
may be referred to as narratorial implicatures by extension.

Research Findings of Psychonarratology


Although psychonarratology is intended primarily as a framework in which to develop a scientific understanding of literary response, the extant work provides some evidence for the
hypothesis of the conversational narrator. Two examples are
briefly described here.
In one study, Dixon and Bortolussi manipulated the narrative
technique used in the story "Rope" by Katherine Anne Porter. In
this story, virtually all of the narrative consists of an argument
between a husband and wife related in free indirect speech.
Different versions of the story were created in which the speech
of the husband or the wife was changed to tagged direct speech. For
example, the passage "Had he brought the coffee?
She had been waiting all day long for coffee. They had forgot it
when they ordered at the store the first day" was changed to: "She
asked him, 'Did you bring the coffee? I've been waiting all day
long for coffee. We forgot it when we ordered at the store the first
day.'" When groups of readers read these different versions, they
tended to assume that the gender of the (absent) narrator was
the same as the character whose speech was related in the free
indirect style. Moreover, that character was seen as more reasonable and rational in his or her arguments. The results were
interpreted as demonstrating that the free indirect speech style
closely associates a character with the narrator and that the narrator is, by default, assumed to be rational and cooperative, just
as a conversational participant would be.
In another study, different versions of the story "The Office"
by Alice Munro were constructed. This story begins with several
paragraphs describing the first-person narrator's attitudes and
sensibilities, prior to any description of the story's plot events. In
Dixon and Bortolussi's analysis, this material appeared to have
a number of narratorial implicatures that might lead readers to
make assumptions or inferences regarding the nature of the narrator. Moreover, because readers make such inferences by drawing on their own experiences and knowledge, they may come
to see the narrator as more like themselves, in effect identifying
with the narrator. When material was added to the story in order
to make such inferences unnecessary, readers were less likely
to identify with the narrator and less likely to see the narrator as
reasonable and justified in his or her actions. These results were
seen as supporting the view that readers treat narrators much as
they would a conversational participant.
Marisa Bortolussi and Peter Dixon
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Bortolussi, M., and P. Dixon. 2003. Psychonarratology: Foundations
for the Empirical Study of Literary Response. Cambridge and New
York: Cambridge University Press.
Bruce, B. 1981. A social interaction model of reading. Discourse
Processes 4: 273–311. A description of the levels of communication in
narrative discourse.


Dixon, P., and M. Bortolussi. 2001. Prolegomena for a science of psychonarratology. In New Perspectives on Narrative Perspective, ed. Willie van
Peer and Seymour Chatman, 275–88. Albany: State University of New
York Press. A brief introduction to the psychonarratology framework.
Gerrig, R. J. 1992. Experiencing Narrative Worlds. New Haven, CT: Yale
University Press. An interdisciplinary attempt to examine what it means
to be transported mentally and emotionally by literary narratives.
Graesser, A. C., C. A. Bowers, B. Olde, K. White, and N. K. Person. 1999.
Who knows what? Propagation of knowledge among agents in a literary storyworld. Poetics 26: 143–75. A good example of the application
of discourse-processing methodology to a problem in literary reading.
Holub, R. C. 1989. Reception Theory: A Critical Introduction.
London: Routledge. A very good, lucid, accessible introduction to theories of literary reception.
Iser, W. 1974. The Implied Reader: Patterns of Communication in Prose
Fiction from Bunyan to Beckett. Baltimore: Johns Hopkins University
Press. A seminal work describing the mechanics of reader–text interactions; includes some key concepts, such as the "ideal reader" and
"gaps."
Miall, D. S., and D. Kuiken. 1995. Aspects of literary response: A new
questionnaire. Research in the Teaching of English 29: 37–58. A seminal effort to develop quantifiable indices of literary expertise.
Peer, Willie van, and Max Louwerse, eds. 2002. Interdisciplinary Studies
in Thematics. Amsterdam and Philadelphia: John Benjamins. Brings
together research on themes from a variety of fields in an attempt to
define the concept of theme.
Rimmon-Kenan, S. 1983. Narrative Fiction: Contemporary Poetics. New
York: Routledge. An accessible, solid introduction to narratological
basics.
Schmidt, Siegfried. 1981. Empirical studies in literature: Introductory
remarks. Poetics 10: 317–36. A pioneering study that lays the theoretical groundwork for the empirical study of literature.
Singer, M. 1990. Psychology of Language: An Introduction to Sentence and
Discourse Processes. Hillsdale, NJ: Lawrence Erlbaum. An introduction
to discourse processing and cognitive psychology.
Vipond, D., and R. A. Hunt. 1984. Point-driven understanding: Pragmatic
and cognitive dimensions of literary reading. Poetics 13: 261–77. A
pioneering investigation into how literary reading varies with reading
strategy.

PSYCHOPHYSICS OF SPEECH
The psychophysics of speech describes an interdisciplinary
approach to the understanding of speech perception. The
approach considers speech as a complex acoustic signal sharing
much in common with other complex perceptual events and posits that, as such, speech may be studied in the broader context
of general perceptual, cognitive, and sensorineural systems. This
approach is distinguished from those that consider speech to be
a special signal processed in a manner distinct from non-speech
sounds. The essence of a psychophysical approach is to determine
the extent to which speech perception makes use of general cognitive and perceptual processes before postulating mechanisms
specialized to the speech signal. Thus, understanding the psychophysics of speech may include the utilization of animal models of auditory behavior and physiology to examine how much of
speech perception may be accounted for by general rather than
specialized mechanisms, and the relation of speech perception to
neural coding at peripheral and central levels of processing.
Psychophysics often connotes bottom-up or peripheral processing, and, in fact, a great deal of research on the psychophysics of speech can be characterized this way. However, the term is perhaps an unfortunate moniker for describing this approach
because auditory memory, attention, object recognition, cross-modal processing, learning, plasticity, and development all play
important roles in processing complex auditory signals, and these
processes relate to speech processing as well. The psychophysics
of speech might be more broadly described as an auditory cognitive neuroscience approach to speech perception that considers
the richness of the acoustic (and, in fact, cross-modal) perceptual environment, the influence of long-term experience, and the
effects of higher-order knowledge and processing.
Adherence to a general cognitive/perceptual account of
speech perception has meant that the psychophysical approach
to speech perception has played a central role in the theoretical debate about whether speech is perceived in a mode distinct from other acoustic signals. A major contribution of the
approach, apart from this theoretical debate, has been its insistence on attention to the precise spectrotemporal characteristics
of the speech signal and to the neural mechanisms of auditory
processing involved in representing these signals.
An application of the psychophysical approach is observed
in the study of phonetic context effects. A great deal of early
research in acoustic phonetics documented the considerable variability inherent in the acoustics of speech. To summarize this broad literature, there do not appear to be acoustic
signatures that uniquely specify phonetic categories. Thus, listeners are faced with the perceptual challenge of mapping highly
variable acoustic signals onto speech in a many-to-one manner.
Behavioral studies demonstrate that listeners appear to meet this
variability in speech acoustics by perceiving speech in a wholly
context-dependent manner. Many studies have reported phonetic context effects in which physically identical acoustic signals
are judged by listeners to be different speech sounds as a function of the phonetic context in which they are presented.
Phonetic context effects are ubiquitous in speech perception and have been documented for many speech segments.
In a finding of particular interest for understanding the mechanisms that give rise to
such effects, Japanese quail (Coturnix coturnix japonica) that
were trained to peck a lighted key in response to presentation
of /g/ endpoints of a /ga/ to /da/ stimulus series pecked more
vigorously to novel ambiguous midseries speech stimuli when
they were preceded by /l/. A second set of birds trained to peck
in response to /d/ responded more robustly to the same novel
stimuli when they were preceded by /r/. The directionality of the
effect is the same as for human listeners. The extension of phonetic context effects to a nonhuman species suggests that general
auditory processing may assist in accommodating the complex
variability present in speech.
To pose the results in a general way, context sounds with
higher-frequency acoustic energy (like /l/) shift perception
of the following syllable toward the category alternative with
greater low-frequency energy, /g/. Thus, contrastive processes
by which the auditory system exaggerates change in the acoustic
signal may be sufficient to explain phonetic context effects. This
conclusion is supported by research demonstrating that adult
human listeners shift phonetic categorization responses not
only as a function of neighboring speech contexts but also as a
function of non-speech tones, chirps, and noises that precede or

follow speech. In the case of human and non-human perception
of speech and non-speech contexts, speech perception appears
to be relative to and contrastive with the acoustics of context
sounds, whether speech or non-speech.
This portfolio of research findings is indicative of a psychophysical approach to speech perception in that it pays careful
attention to the spectrotemporal information available to listeners, it makes use of nonhuman animals as a means of examining
the generality of the mechanisms available to speech processing, and it examines the extent to which complex non-speech
signals may give rise to some of the same patterns of perception
as speech. Research relating the context-dependent coding of
acoustic signals to neural response (see phonetics and phonology, neurobiology of) adds to the understanding of how
phonetic context effects may arise from general characteristics of
the perceptual system. The constellation of available results suggests that general perceptual mechanisms play a role in phonetic
context effects.
In other domains, the psychophysical approach has contributed to the understanding of auditory representation, auditory
learning, and cross-modal processing as they relate to speech
processing. There remains much potential for understanding the
perceptual, cognitive, and neural underpinnings of speech communication from a general perceptual/cognitive perspective.
Lori L. Holt
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Diehl, R. L., and K. R. Kluender. 1989. On the objects of speech perception. Ecological Psychology 1: 121–44. Discussion of a general
approach to speech perception.
Diehl, R. L., A. J. Lotto, and L. L. Holt. 2004. Speech perception. Annual
Review of Psychology 55: 149–79. This review contrasts theoretical
approaches to speech perception.
Liberman, A. M. 1996. Speech: A special code. Cambridge, MA: MIT Press.
Discussion of the motor theory of speech perception, an alternative
account to a general approach to speech perception.
Lotto, A. J., and L. L. Holt. 2006. Putting phonetic context effects into context: A commentary on Fowler (2006). Perception and Psychophysics
68: 178–83. Briefly reviews phonetic context effects in the literature.
Schouten, M. E. H., ed. 2003. The nature of speech perception. Speech
Communication 41: 1–271. A collection of 20 papers on the psychophysics of speech perception.

Q
QUALIA ROLES
Qualia structure is a system of relations that characterizes the
semantics of a lexical item or phrase. The notion of qualia structure is derived in part from the Aristotelian theory of explanation
(Moravcsik 1975). An important semantic concept within generative lexicon theory (GL), qualia roles are the major building
blocks for constructing word and phrasal meaning in a language
compositionally.
GL (Pustejovsky 1995) is a theory of linguistic semantics,
which focuses on the distributed nature of compositionality in natural language. On this view, there are four computational resources available to a lexical item as part of its linguistic encoding: lexical typing structure, argument structure, event structure, and qualia structure. There are four possible qualia roles associated with a word:
(a) Formal: the basic category distinguishing the meaning of the word within a larger domain;
(b) Constitutive: the relation between an object and its constituent parts;
(c) Agentive: the factors involved in the object's origins or coming into being;
(d) Telic: the purpose or function of the object, if there is one.
There are two general points that should be made concerning
qualia roles: 1) Every category expresses a qualia structure, and
2) not all lexical items carry a value for each qualia role. The first
point is important for the way a generative lexicon provides a
uniform semantic representation compositionally from all elements of a phrase. The second point allows us to view qualia as
applicable or specifiable relative to particular semantic classes.
In effect, the qualia structure of a noun determines its meaning in much the same way as the typing of arguments to a verb
determines its meaning. The elements that make up a qualia
structure include such familiar notions as container, space, surface, figure, or artifact. One way to model the qualia structure is
as a set of constraints on types (cf. Copestake and Briscoe 1992;
Pustejovsky and Boguraev 1993). The operations in the compositional semantics make reference to the types within this system. The qualia structure, along with the other representational
devices (event structure and argument structure), can be seen as
providing the building blocks for possible object types.
Consider, for example, the qualia structure for the nouns beer and sandwich, with formal (F), agentive (A), telic (T), and constitutive (C) roles:

(a) beer: x:[F = liquid, A = brew, T = drink]
(b) sandwich: x:[F = physical, A = make, T = eat, C = bread, …]

From qualia structures such as these, it now becomes clear how a sentence such as "Mary enjoyed her sandwich" receives the default interpretation it does, namely, that of Mary enjoying eating the sandwich. Similarly, for "Mary finished her beer," the composition of the event-selecting aspectual verb finish and its object involves a rule that retrieves a possible event interpretation of drinking the beer. These are examples of type coercion, where the compositional rules in the grammar make reference to values such as qualia structure, if such interpretations are to be constructed on-line and dynamically.
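The mechanics of telic-role coercion can be sketched in a toy fragment. The data structures and function names below are invented for illustration and simplify Pustejovsky's formalism considerably: an event-selecting verb retrieves the event stored in the noun's telic quale.

```python
# Toy sketch of qualia-based type coercion (structures invented for
# illustration; not Pustejovsky's formalism verbatim).
QUALIA = {
    "beer":     {"formal": "liquid",   "agentive": "brew", "telic": "drink"},
    "sandwich": {"formal": "physical", "agentive": "make", "telic": "eat",
                 "constitutive": "bread"},
}

def coerce_to_event(verb, noun):
    """An event-selecting verb like 'enjoy' or 'finish' retrieves an
    event reading from the noun's telic quale."""
    telic = QUALIA[noun]["telic"]
    return f"{verb} {telic}ing the {noun}"

print(coerce_to_event("enjoy", "sandwich"))  # enjoy eating the sandwich
print(coerce_to_event("finish", "beer"))     # finish drinking the beer
```

The point of the sketch is only that the default event interpretation is recoverable compositionally from lexically stored qualia values rather than listed sense by sense.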
The qualia structure of verbs characterizes the general role of the subpredicates of a verb's event structure (as in Dowty 1979). It also interacts with the aspectual category of the predicate. For example, run and bake are process verbs, where the process predicate is assigned to the agentive role, as in "John ran" and "Mary baked the potato":
(a) run(x) P:[A = run_act(x)]
(b) bake(x) P:[A = bake_act(x)]


They can both, however, be coerced to accomplishments
(transitions) by specifying a termination predicate, assigned to
the formal role (cf. Pustejovsky 1995), for example, "John ran to the store" and "Mary baked a cake."
Recently, researchers in computational linguistics and lexicography have adopted the notion of qualia roles as one organizing principle in the process of building resources for lexical
knowledge bases.
James Pustejovsky
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Copestake, A., and T. Briscoe. 1992. Lexical operations in a unification-based framework. In Lexical Semantics and Knowledge Representation,
ed. J. Pustejovsky and S. Bergler. Berlin: Springer-Verlag.
Dowty, D. 1979. Word Meaning and Montague Grammar. Dordrecht, the
Netherlands: Kluwer.
Moravcsik, J. M. 1975. Aitia as generative factor in Aristotle's philosophy. Dialogue 14: 622–36.
Pustejovsky, J. 1995. The Generative Lexicon. Cambridge, MA: MIT Press.
Pustejovsky, J., and B. Boguraev. 1993. Lexical knowledge representation
and natural language processing. Artificial Intelligence 63: 193–223.

QUANTIFICATION
Quantification has been a central concern in Logic and
Language at least since Aristotle, who systematized all valid and
invalid syllogisms involving the forms All/some/no/not all As are
Bs (see Kneale and Kneale 1962). In linguistics, quantificational
phenomena played a role in upsetting the architecture of standard
transformational grammar (Chomsky 1965) in which deep
structure determines semantic interpretation. Many transformations that were meaning-preserving on sentences involving referential terms were not so when applied to quantifiers:
(1) John wanted [John win] → John wanted to win
(2) Everyone wanted [everyone win] → (?) Everyone wanted to win

The semantic inappropriateness of derivations such as (2) helped to ignite the so-called "linguistic wars" (Newmeyer 1980).

Quantification and the Syntax-Semantics Interface


Quantification raises issues for the syntax-semantics interface
concerning scope ambiguity, binding, and anaphora.
Theories differ sharply in the treatment of scope-ambiguous sentences like (3), which challenge the otherwise plausible assumption that every ambiguity involves a lexical ambiguity or an
ambiguity of syntactic structure; (3) on the face of it has neither.
(3) At least two students read every book.
i. Wider scope for at least two: There are at least two who read the whole lot.
ii. Wider scope for every: Every book got at least two readings.
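The two readings can be made concrete by evaluating each quantifier order in a small model. The model below (students, books, and a reading relation) is invented for illustration; it is chosen so that the two scopings come apart.

```python
# Scope readings of "At least two students read every book",
# evaluated in a tiny invented model.
students = {"s1", "s2", "s3"}
books = {"b1", "b2"}
read = {("s1", "b1"), ("s1", "b2"), ("s2", "b1"), ("s3", "b2")}

def reading_i():
    """Wide scope for 'at least two': at least two students each read every book."""
    return sum(1 for s in students if all((s, b) in read for b in books)) >= 2

def reading_ii():
    """Wide scope for 'every': every book was read by at least two students."""
    return all(sum(1 for s in students if (s, b) in read) >= 2 for b in books)

print(reading_i(), reading_ii())  # → False True
```

In this model only s1 read both books, so reading (i) is false, while each book has two readers, so reading (ii) is true; the sentence is thus genuinely ambiguous despite its single surface form.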

The problem illustrated in transformation (2) is a problem of binding and anaphora. It was soon recognized that pronouns
and null anaphors whose antecedent is a quantifier behave
like logical variables, as in (4), and not like repetitions of the
antecedent noun phrase (NP).


(4) Everyone wanted (Ø) to win: For every person x, x wanted [x win]

The same phenomenon appears in (5a) and (5b).


(5) a. John rescued himself = John rescued John
b. Someone rescued himself = For some x, x rescued x
(≠ Someone rescued someone)

Thus, whereas anaphora with referential antecedents may involve coreference, anaphora with quantificational antecedents
involves binding; anaphora and binding remain a major topic in
syntax and semantics.

The Semantics of Quantification


The rise of formal semantics brought investigations into
the model-theoretic semantics of NPs and determiners. In
montague grammar, all English NPs, even proper names,
are generalized quantifiers (Montague 1973), denoting sets of
properties of individuals. This uniform treatment launched
the study of the semantic properties of NPs and determiners
(Barwise and Cooper 1981; Keenan and Stavi 1986), leading to
progress on semantic universals of determiner meanings (see
semantics, universals of ), the semantics of existential
sentences and weak NPs (those that can occur in existential
sentences: a, some, three, no, many, but not the, every, both,
most), the semantics of determiners like any that can occur in
negative and certain other contexts but not in simple affirmative sentences (the negative polarity phenomenon), and other
topics in quantification.
In the early 1980s, Irene Heim ([1982] 1989) and Hans Kamp
(1981) independently argued against Richard Montague's uniform
treatment of NPs, distinguishing definite and indefinite NPs (with
determiners such as a, the, three, the three, some, several) from essentially quantificational NPs (every, all, most). On their approaches,
an indefinite introduces a discourse referent into the context, bringing context into semantics proper (see semantics-pragmatics
interaction); only the essentially quantificational NPs are
treated as generalized quantifiers. Barbara H. Partee (1986) reconciled Montague's uniform semantics with the Kamp-Heim theory
through type-shifting mechanisms such that all NPs can have
generalized quantifier-type meanings, but many NPs have referential and/or predicative meanings as well. "The king," for instance, may have a quantificational meaning (roughly, whoever is the one and only king, with no presuppositions), a referential meaning (referring to the unique king if there is one, failing to refer if existence and uniqueness presuppositions are not satisfied), or a predicative meaning in "is the king," asserting of its subject that he is the one and only king.
Other topics in the semantics of quantification include the
semantics of distributive, collective, and cumulative quantification; the semantics of the mass-count distinction; event quantification and tense logic; generic sentences; implicit quantification;
and the binding of implicit variables. There is also active research
on children's acquisition and adult processing of the syntax and
semantics of quantification.
Logicians have continued to make progress on the logic of
quantification, including work in game-theoretical semantics
(Hintikka and Sandu 1997; Clark 2007), where the foundations of notions of scope, variable binding, and variable dependence are being reexamined.

Typology of Quantification
Jon Barwise and Robin Cooper (1981) hypothesized that all
languages use NPs interpreted as generalized quantifiers.
Research reported in Bach et al. (1995) identified several languages that falsify that hypothesis. Other ways of expressing
quantification include adverbs of quantification like usually,
mostly (Lewis 1975), adjectives (numerous, Russian mnogie 'many'), nouns (majority, lot, dozen), and verbal prefixes (Evans 1995).
Cross-linguistic studies show that languages differ with
respect to syntactic positions and strategies for expressing different quantificational notions (Szabolcsi 1997), with respect to the
degree to which surface structure constrains semantic quantifier
scope, with respect to the range of meanings a bare NP like
horses can have, in the variety and interpretation of indefinites
(Haspelmath 1997; Chung and Ladusaw 2003), in interactions
between nominal quantification and verbal aspect, and in other
ways that are still being explored.
Barbara H. Partee
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Bach, Emmon, et al., eds. 1995. Quantification in Natural Languages.
Dordrecht, the Netherlands: Kluwer Academic.
Barwise, Jon, and Robin Cooper. 1981. Generalized quantifiers and natural language. Linguistics and Philosophy 4: 159–219.
Chomsky, Noam. 1965. Aspects of the Theory of Syntax. Cambridge,
MA: MIT Press.
Chung, Sandra, and William A. Ladusaw. 2003. Restriction and Saturation.
Cambridge, MA: MIT Press.
Clark, Robin. 2007. Games, quantifiers, and pronouns. In Game Theory
and Linguistic Meaning, ed. A.-V. Pietarinen, 207–28. Amsterdam:
Elsevier.
Evans, Nick. 1995. A-quantifiers and scope in Mayali. In Quantification
in Natural Languages, ed. E. Bach et al., 207–70. Dordrecht, the
Netherlands: Kluwer.
Haspelmath, Martin. 1997. Indefinite Pronouns. Oxford: Oxford University
Press.
Heim, Irene. [1982] 1989. The Semantics of Definite and Indefinite Noun
Phrases. New York: Garland.
Hintikka, J., and G. Sandu. 1997. Game-theoretical semantics. In
Handbook of Logic and Language, ed. J. van Benthem and A. ter
Meulen, 361–410. Cambridge, MA: MIT Press.
Kamp, Hans. [1981] 1984. A theory of truth and semantic representation. In Truth, Interpretation, Information, ed. J. Groenendijk, Th.
Janssen, and M. Stokhof, 1–41. Dordrecht, the Netherlands: Foris.
Keenan, Edward. 1996. The semantics of determiners. In The
Handbook of Contemporary Semantic Theory, ed. Shalom Lappin.
Oxford: Blackwell. A survey of the semantic properties of quantifier
expressions by a leader in the field.
Keenan, Edward, and Jonathan Stavi. 1986. A semantic characterization of natural language determiners. Linguistics and Philosophy
9: 253–326.
Kneale, William, and Martha Kneale. 1962. The Development of Logic.
Oxford: Oxford University Press.
Lewis, David. 1975. Adverbs of quantification. In Formal Semantics
of Natural Language, ed. E. L. Keenan, 3–15. Cambridge: Cambridge
University Press.

Montague, Richard. 1973. The proper treatment of quantification in ordinary English. In Approaches to Natural Language, ed. K. J. J. Hintikka
et al., 221–42. Dordrecht, the Netherlands: Reidel.
Newmeyer, Frederick. 1980. Linguistic Theory in America: The First
Quarter-Century of Transformational Generative Grammar. New
York: Academic Press.
Partee, Barbara H. 1986. Noun phrase interpretation and type-shifting
principles. In Studies in Discourse Representation Theory and the
Theory of Generalized Quantifiers, ed. J. Groenendijk et al., 115–43.
Dordrecht, the Netherlands: Foris.
———. 1995. Quantificational structures and compositionality. In
Quantification in Natural Languages, ed. Emmon Bach et al., 541–602.
Dordrecht, the Netherlands: Kluwer. A formal semanticist's approach
to semantic typology of quantificational expressions.
Szabolcsi, Anna, ed. 1997. Ways of Scope Taking. Dordrecht, the
Netherlands: Kluwer.

QUANTITATIVE LINGUISTICS
While the formal branches of linguistics use qualitative
mathematical means (algebra, set theory) and logic to model
structural properties of language, quantitative linguistics (QL)
studies the multitude of quantitative properties as the essential
basis for the description and understanding of the development
and functioning of linguistic systems and their components. The
objects of QL research do not differ from those of other linguistic
disciplines. The difference lies, rather, in the ontological points
of view (do we consider a language as a set of sentences with
their structures assigned to them, or do we see it as a system
that is subject to evolutionary processes in analogy to biological
organisms, etc.) and, consequently, in the concepts that form the
basis of the disciplines.
Differences of this kind enable researchers to perceive new
phenomena in their area of study. A linguist accustomed to
thinking in terms of set-theoretical constructs is not likely to find
the study of such properties as length, frequency, age, degree
of polysemy, and so on interesting or even necessary. Zipf's Law is the only quantitative relation that almost every linguist has heard about, but for those who are not familiar with QL, it appears to be more a curiosity than a central linguistic law, one connected with a large number of properties and processes in language. From a quantitative point of view, however,
it is quite natural to detect features and interrelations that can
be expressed only by numbers. There are, for example, dependences of the length (or complexity) of syntactic constructions on their frequency and ambiguity; of the homonymy of grammatical morphemes on their dispersion in their paradigm; of the length of expressions on their age; of the dynamics of the flow of information in a text on its size; of the probability of change of a sound on its articulatory difficulty; and so on.
In short, in every field and on each level of linguistic analysis,
phenomena of this kind are significant. They are observed in
every language in the world and at all times. Moreover, it can
be shown that these properties of linguistic elements and their
interrelations abide by universal laws of language, which
can be formulated in a strict mathematical way in analogy to
the laws of the natural sciences. Emphasis has to be put on the
fact that these laws are stochastic; they do not capture single
cases (this would be neither expected nor possible) but, rather, predict the probabilities of certain events or certain conditions
in a whole. It is easy to find counterexamples with respect to any
of the examples cited above. However, this does not mean that
they contradict the corresponding laws. Divergences from a statistical average are not only admissible but even lawful: they
are themselves determined with quantitative exactness. This
situation is, in principle, not different from that in the natural
sciences, where the old deterministic ideas have been replaced
by modern statistical/probabilistic models.
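The best-known such stochastic law can be illustrated directly. The sketch below (toy corpus invented for illustration; real tests use large texts) tabulates a rank-frequency distribution and compares it with the simplest Zipfian prediction, f(r) ≈ f(1)/r, which holds only approximately and only statistically.

```python
# Rank-frequency distribution of a toy corpus versus the simplest
# Zipfian prediction f(r) = f(1) / r.
from collections import Counter

tokens = ("the cat sat on the mat and the dog sat on the rug "
          "the cat and the dog ran").split()

freqs = sorted(Counter(tokens).values(), reverse=True)
ranks = range(1, len(freqs) + 1)
predicted = [freqs[0] / r for r in ranks]

for r, f, pred in zip(ranks, freqs, predicted):
    print(r, f, round(pred, 2))
```

Individual ranks diverge from the prediction, as the stochastic character of the law leads one to expect; what the law constrains is the overall shape of the distribution, not single cases.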
The role of QL is to unveil corresponding phenomena, to systematically describe them, and to find and formulate the laws
that explain the observed and described facts. Quantitative interrelations have an enormous value for fundamental research and
can also be used and applied in many fields, such as computational linguistics and natural language processing, teaching language, optimization of texts, and so on.

Historical Background
The first scientific counts of units of language or text were published in the nineteenth century. The first theoretical insight, after
many years of merely descriptive counts of various kinds, was
offered by the Russian mathematician A. A. Markov, who created
the base of the theory of Markov chains in 1913. This mathematical model of the sequential (syntagmatic) dependence among
units in linear concatenation in the form of transition probabilities was, despite its mathematical significance, of little
use for linguistics. In modern natural language processing, however, (hidden) Markov models are a central component of many
methods in language technology.
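The syntagmatic dependence Markov modeled can be sketched as transition probabilities estimated from bigram counts; the toy text below is invented for illustration, and a first-order chain is of course a drastic simplification of linguistic structure.

```python
# Estimating word-to-word transition probabilities (a first-order
# Markov model) from bigram counts over a toy text.
from collections import Counter

text = "the dog saw the cat and the cat saw the dog".split()

bigrams = Counter(zip(text, text[1:]))   # counts of adjacent word pairs
totals = Counter(text[:-1])              # counts of first elements

def p(next_word, given):
    """P(next_word | given) as a relative bigram frequency."""
    return bigrams[(given, next_word)] / totals[given]

print(p("dog", "the"), p("cat", "the"))  # → 0.5 0.5
```

Hidden Markov models used in language technology elaborate exactly this idea by letting the chain run over unobserved states that emit the observed words.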
Later, quantitative studies of linguistic material were, in
the first place, a consequence of practical demands: Efforts to
improve second language training (see bilingual education)
and the optimization of stenographic systems are examples. The
interrelations thus unveiled between the frequency of words and the
ranks of the frequency classes (alternatively: between frequency
and the number of words in a given frequency class) were systematically investigated by the aforementioned George Kingsley
Zipf. He was the first to set up a model in order to explain the
observations and to find a mathematical formula for the corresponding function: the famous Zipf's Law. Among his publications, his books ([1935] 1968 and 1949) are considered the
most important. Zipf formulated (in different terms) innovative
thoughts on self-organization, the principle of language economy, and fundamental properties of linguistic laws long before
modern systems theory arose. His ideas, such as the principle
of least effort and the forces of unification and diversification,
are still important today (even if they suffer from certain shortcomings). Later, his model was conceptually and mathematically
improved by Benoît Mandelbrot (cf. Rapoport 1982), the originator of fractal geometry. Zipf's body of thought has inspired various
scientific disciplines and is again enjoying increasing attention.
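The rank-frequency relation and Mandelbrot's refinement of it can be sketched as follows (illustrative code; the function and parameter names are our own):

```python
from collections import Counter

def rank_frequency(tokens):
    """Empirical (rank, frequency) pairs, most frequent word type first."""
    freqs = sorted(Counter(tokens).values(), reverse=True)
    return list(enumerate(freqs, start=1))

def zipf_mandelbrot(rank, C=1.0, a=1.0, b=0.0):
    """Zipf-Mandelbrot law f(r) = C / (r + b)**a; with b = 0 and a = 1
    it reduces to Zipf's original hyperbolic law f(r) = C / r."""
    return C / (rank + b) ** a

pairs = rank_frequency("the cat and the dog and the bird".split())
# pairs[0] == (1, 3): the most frequent type ("the") occurs three times.
```

Fitting the parameters C, a, and b to the empirical (rank, frequency) pairs of a real text is then an ordinary curve-fitting problem.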
C. E. Shannon and W. Weaver (1949) applied information theory to linguistics without much success. Physicist Wilhelm Fucks
(1955) was responsible for a turn toward theoretical considerations in German QL. He studied, among others, word-length
distributions and various phenomena of language, literature,
and music. In France, Charles Muller (1973, 1979) created a novel
approach for studying the vocabulary of a text. In Russia, Zipfian
linguistics was conducted in particular by Michail V. Arapov
(Arapov 1988; Arapov and Cherc 1974), who based his models
of text dynamics and of language development on the analysis of
rank order. In Georgia, a group around Jurij K. Orlov established
a tradition of studies into the statistical structure of texts based
on the Zipf-Mandelbrot Law. The Estonian researcher Juhan
Tuldava (1995, 1998) is famous for his mathematical methods of
analysis of numerous text phenomena.

Objectives and Methods


The fact that language cannot adequately be analyzed without
quantitative methods follows from a number of principal considerations (cf. also Altmann and Lehfeldt 1980, 1 ff).
EPISTEMOLOGICAL REASONS. The possibilities of deriving statements about language(s) are seriously limited. Direct observation of language is impossible. Introspection cannot provide
more than heuristic contributions and does not possess the status of empirical evidence (even if the contrary is often claimed in
linguistics). As a source of scientific data, only linguistic behavior
is available in the form of oral or written text, in the form of
psycholinguistic experiments, and so on.
Furthermore, the situation is aggravated by the fact that we
never dispose of complete information on the object under
study. On the one hand, only a limited part or aspect of the object
is accessible (because it is in principle infinite, such as the set of
all texts or all sentences, or because it cannot be described in full
for practical reasons). On the other hand, very often we lack the
complete information about the number and kinds of all factors
that might be relevant for a given problem and are, therefore,
unable to give a full description.
Only mathematical statistics enables us to draw valid conclusions in spite of incomplete information and, indeed, to do so with an objectively specifiable degree of reliability.
HEURISTIC REASONS. One of the most elementary tasks of any
science is to create some order within the mass of manifold,
diverse, and unmanageable data. Classification and correlation
methods can point to phenomena and interrelations
not known before. A typical example of a domain where such
inductive methods are very common is corpus linguistics,
where huge amounts of linguistic data are collected that could
not even be inspected with the naked eye.
METHODOLOGICAL REASONS. Any science begins with categorical, qualitative concepts, which divide the field of interest into
delimited classes as clearly as possible. A linguistic example of this
kind of concept is the classical category parts of speech (see word
classes). It is possible to decide whether a word should be considered as, for example, a noun or not. Every statement based on categories can be reduced to dichotomies (having exactly two values,
such as {true, false}, {1, 0}, {yes, no}). This kind of concept is fundamental and indispensable but insufficient for a deeper insight.
The possibility of gradual statements is provided by comparative (ordinal-scale) concepts. They allow us to determine that an
object possesses more or less of a given property than another
one or the same amount of it. A linguistic example is the grammatical acceptability of sentences.


The highest degree of order is achieved with the help of metrical concepts, which are needed if the difference between the
amounts of a given property possessed by two objects plays a
role. In this case, the values of the property are mapped to the elements of a set of numbers in which the relations between these
numbers correspond to the relations between the values of the
properties of the objects. In this way, specific operations, such
as subtraction, correspond to specific differences or distances in
the properties between the objects. This enables the researcher to
establish an arbitrarily fine conceptual grid within his or her field
of study. Concepts of this kind are called interval-scale concepts.
If a fixed point of reference is added (e.g., zero), ratio-scaled concepts are obtained that allow the operation of multiplication and
division. Only the latter scale permits statements of how many
times more of a property object A possesses than object B.
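The difference between the scale types can be made concrete with invented figures (a toy illustration; all numbers are hypothetical):

```python
# Ratio scale: word frequencies have a true zero, so ratios are meaningful.
freq_a, freq_b = 120, 40
assert freq_a / freq_b == 3.0          # "A occurs 3 times as often as B"

# Ordinal scale: acceptability ranks license only more/less comparisons.
rank_a, rank_b = 3, 1
assert rank_a > rank_b                 # "A is more acceptable than B"
# rank_a / rank_b carries no meaning: rank numerals encode order, not ratios.

# Interval scale: differences are meaningful even where ratios are not,
# as with dates of attestation (no natural zero point).
year_a, year_b = 1850, 1750
assert year_a - year_b == 100          # a meaningful distance of 100 years
```

Each scale thus licenses a strictly larger set of arithmetic operations than the one below it, which is why statistical methods must be matched to the scale of the data.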

The Objectives of Quantitative Linguistics


In contrast to other branches of linguistics, QL emphasizes the
introduction and application of additional, advanced scientific tools. Principally, linguistics tries, in the same way as other
empirical sciences do in their fields, to find explanations for the
properties, mechanisms, functions, development, and so on of
language(s). Due to the stochastic properties of language, quantification and probabilistic models play a crucial role in this process. In the framework of this general aim, QL has a special status
only because it makes special efforts to provide the methods
necessary for this purpose. We can characterize this endeavor by
two complementary aspects:
1. On the one hand, the development and application of
quantitative models and methods are indispensable in all cases
where purely formal (algebraic, set-theoretical, and logical)
methods fail, that is, where the variability and vagueness of
natural languages (see language, natural and symbolic)
cannot be neglected, where gradual changes debar the application of static/structural models. Briefly, quantitative approaches
must be applied whenever the dramatic simplification caused
by the qualitative yes/no scale is inappropriate for a given
investigation.
2. On the other hand, quantitative concepts and methods
are superior to the qualitative ones on principled grounds. The
quantitative ones enable a more adequate description of reality by providing an arbitrarily fine resolution. Between the two
extreme poles (yes/no, true/false, 1/0), as many grades as are
needed can be distinguished, up to the infinitely many grades
of the continuum.
Generally speaking, the development of quantitative methods aims at improving the exactness and precision of the possible
statements on the properties of linguistic and textual objects. They
help us derive new insights that would not be possible without
them: Subjective criteria can be made objective and operationalized (e.g., in stylistics); interrelations between units and properties that remain invisible to qualitative methods can be detected;
and workable methods for technical and other fields of application
can be found where traditional linguistic methods fail or produce
inappropriate results due to the stochastic properties of the data or
to the sheer mass of them (e.g., in natural language processing).
Reinhard Köhler and Gabriel Altmann

WORKS CITED AND SUGGESTIONS FOR FURTHER READING


Altmann, Gabriel, and Werner Lehfeldt. 1980. Einführung in die quantitative Phonologie. Bochum, Germany: Brockmeyer.
Arapov, Michail V. 1988. Kvantitativnaja lingvistika. Moscow: Nauka.
Arapov, Michail V., and Maja M. Cherc. 1974. Matematičeskie metody v
istoričeskoj lingvistike. Moscow: Nauka.
Fucks, Wilhelm. 1955. Mathematische Analyse von Sprachelementen,
Sprachstil und Sprachen. Cologne: Westdeutscher Verlag.
Hřebíček, Luděk. 1997. Lectures on Text Theory. Prague: Oriental
Institute.
Köhler, Reinhard, Gabriel Altmann, and Rajmund G. Piotrowski. 2005.
Quantitative Linguistik. Ein internationales Handbuch. Berlin: Walter
de Gruyter. Quantitative Linguistics: An International Handbook. New
York: Walter de Gruyter.
Muller, Charles. 1973. Éléments de statistique linguistique. In
Linguistica, Matematica e Calcolatori. Atti del convegno e della prima
scuola internazionale Pisa 1970, 349-78. Florence: Zampolli.
———. 1979. Langue française et linguistique quantitative. Geneva: Slatkine.
Rapoport, A. 1982. Zipf's Law revisited. In Studies on Zipf's Law,
ed. Henri Guiter and Michail V. Arapov, 1-28. Bochum, Germany:
Brockmeyer.
Shannon, C. E., and W. Weaver. 1949. The Mathematical Theory of
Communication. Urbana: University of Illinois Press.
Tuldava, Juhan. 1995. Methods in Quantitative Linguistics. Trier,
Germany: WVT.
———. 1998. Probleme und Methoden der quantitativ-systemischen
Lexikologie. Trier, Germany: WVT.
Zipf, George Kingsley. [1935] 1968. The Psycho-Biology of Language: An
Introduction to Dynamic Philology. Cambridge, MA: MIT Press.
———. 1949. Human Behavior and the Principle of Least Effort. Reading,
MA: Addison-Wesley.

R
RADICAL INTERPRETATION
Radical interpretation is one of the central concepts in the work
of the American philosopher Donald Davidson (1917-2003). For
Davidson, to interpret speakers means to understand their linguistic utterances (cf. [1975] 1984, 157), to assign meanings to
their words and, in a slightly extended usage, to assign contents to
their propositional attitudes. Radical interpretation takes place
in a specific scenario in which a person encounters the speakers
of a completely unknown language L. The radical interpreter has
the task of devising a formal semantic theory (see semantics)
for L on the basis of data of a very specific kind: His or her evidence consists entirely of (all available) data about the linguistic
and nonlinguistic behavior of the speakers of L in observable
circumstances. According to Davidson's method, radically interpreting a language automatically includes systematic ascriptions
of belief to its speakers.

Background
The basic scenario of a field linguist trying to understand a
radically foreign language L on the basis of purely behavioral
data is introduced in W. V. O. Quine's seminal work Word and
Object (1960). Here, the task is to construct a translation manual
for L, and Quine uses radical translation to consider "how much
of language can be made sense of in terms of its stimulus conditions, and what scope this leaves for empirically unconditioned
variation in one's conceptual scheme" (Quine 1960, 26). Among
other things, he uses radical translation to argue for the indeterminacy of translation. This is the claim that on the
basis of the evidence available in radical translation, different
but equally correct translation manuals can be set up between
two languages L1 and L2, manuals diverging in a number of
places by translating a sentence of L1 into sentences of L2 which
"stand to each other in no plausible sort of equivalence however
loose" (p. 27). Davidson subscribes to indeterminacy for analogous reasons, even though in a more limited form (cf. Davidson
[1973] 1984, [1974] 1984).
The radical interpretation scenario derives its significance
for the theory of meaning, both in Quine and in Davidson, from
the foundational claim that meaning is entirely determined by
observable behavior (cf. Quine 1960, ix; Davidson 2005, 56).
According to this weak semantic behaviorism, the data available in radical interpretation are the data ultimately and entirely
determining the meanings of the expressions of the language
interpreted. In contrast to Quine, this determination remains
nonreductive in Davidson, but it is nevertheless both epistemic
and metaphysical in nature: The data available in radical interpretation are not only the ultimate evidence on the basis of
which meanings can be known; they are what (metaphysically)
determines or constitutes linguistic meaning. According to
Davidson, this is an individualistic affair not essentially involving social convention or a shared language; even though there
cannot be such a thing as a solitary speaker, one who never had
contact with other speakers, what a speaker means by his or her
words on an occasion of utterance is determined solely by his or
her own (dispositions to) behavior (cf. [1975] 1984, [1982] 1984,
[1992] 2001).
According to semantic behaviorism, meaning is determined
on a nonsemantic basis by data that can be described without
using semantic concepts, such as meaning or propositional
content. Davidson motivates his particular version of semantic
behaviorism partly by appeal to the essential publicness of language: "The semantic features of language are public features.
What no one can, in the nature of the case, figure out from the
totality of the relevant evidence cannot be part of meaning"
([1973] 1984, 135). According to Davidson, the relevant evidence
is evidence "plainly available to an observer unaided by instruments" (1994, 127). This restriction on the evidence stems from
the claim that terms like meaning and language are theoretical
terms deriving their significance from occasions of successful
linguistic communication (which do not, typically, involve the
use of any instruments).
Davidson thus transforms the basic question of the philosophical theory of meaning, the question "What is it for words to
mean what they do?" (1984, xiii), into two others: "Given that we
can interpret the linguistic utterances of a speaker, what could
we know that would enable us to do this? How could we come
to know it?" ([1973] 1984, 125). Radical interpretation addresses
the second of these questions and is supposed to show that there
is a method by which we can know meaning on the basis of the
nonsemantic evidence that, according to Davidson, determines
it (cf. 1994, 127).


Task, Method, and Procedure


Davidson's answer to the first question, the question of what
we could know that would allow us to interpret a language, is a
Tarski-style theory of truth (T-theory). Such a theory, Davidson
argues, can be used as a formal semantic theory for a natural language L ([1967] 1984). He is one of the main advocates of truth
conditional semantics, and a correct T-theory T for L compositionally specifies the meanings of the sentences of L: For
every sentence S of L, a T-sentence specifying its truth conditions
can be derived from the axioms and rules of T. For instance, a
correct T-theory for German would allow derivation of the following T-sentence from the axioms for the simple expressions
"Schnee" and "ist weiss":

(S) "Schnee ist weiss" is true-in-German if and only if snow is white.

T-theories can be constructed for significant parts of natural language (for a list of problems such as belief sentences or
counterfactuals, see Davidson [1973] 1984, 132). They are supposed to theoretically model semantic competence, but need not
be objects of knowledge for the speakers in any psychologically
realistic sense.
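The compositional derivation of T-sentences can be caricatured in code (a deliberately toy sketch; the dictionaries and function are our invention, not Davidson's formalism):

```python
# Axioms for simple expressions (toy stand-ins for the axioms of a T-theory).
REFERENCE_AXIOMS = {"Schnee": "snow"}          # 'Schnee' refers to snow
PREDICATE_AXIOMS = {"ist weiss": "is white"}   # 'ist weiss' is true of white things

def derive_t_sentence(subject, predicate):
    """Derive the T-sentence for a simple subject-predicate sentence
    compositionally, from the axioms for its constituent expressions."""
    sentence = f"{subject} {predicate}"
    truth_condition = f"{REFERENCE_AXIOMS[subject]} {PREDICATE_AXIOMS[predicate]}"
    return f"'{sentence}' is true-in-German if and only if {truth_condition}"

# derive_t_sentence("Schnee", "ist weiss")
# → "'Schnee ist weiss' is true-in-German if and only if snow is white"
```

The point of the sketch is only the compositionality: the T-sentence for the whole is computed from finitely many axioms for the parts, which is what lets a finite theory cover infinitely many sentences.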
According to Davidson, the radical interpreter can devise a
T-theory for an unknown language L in two steps. On the basis
of their behavior, the interpreter can determine the sentences
of L that the speakers hold true in particular circumstances. This
amounts to detecting a propositional attitude: Holding a sentence true is having a belief, but so long as the sentence held
true remains uninterpreted, no meaning-theoretical question
is begged. In the second step, the radical interpreter uses data
about the circumstances under which speakers hold sentences
true to determine their truth conditions.
Holding a sentence true, however, is a product of two factors: what the sentence means and what the speaker believes to
be the case (cf. Davidson [1973] 1984, 134). Assigning a meaning to a sentence held true is ascribing a belief to the speaker.
Because of this interdependence of belief and meaning (p.
134), the ascription of belief needs to be restricted in relevant
ways if there is to be any evidence relation between holding true
and T-theory; so long as there is no such restriction, so long,
that is, as beliefs can be as absurd as the interpreter pleases, any
meaning can be assigned to any sentence. To establish such an
evidence relation is one of the main functions of the principle of
charity (see charity, principle of) (cf. Glüer 2006, 340). It
tells the radical interpreter to assign truth conditions to the sentences of L such that the speakers of L have true and coherent
beliefs so far as that is plausibly possible. Since this can be done
only according to the interpreters own view of what is true,
application of charity amounts to agreement maximization or, as Davidson prefers, agreement optimization between
speaker and interpreter. The idea here is that some mistakes are
more destructive for understanding than others; an interpretation that avoids ascribing flagrant logical errors or very basic,
inexplicable perceptual mistakes is prima facie better than one
that does not. Ultimately, charity tells the interpreter to pick that
T-theory that stands in the relation of best fit (Davidson [1973]
1984, 136) to the totality of his or her data. In this way, the principle of charity fulfills its second main function: It allows for a
ranking of T-theories in terms of how well they fit the totality
of the data, a ranking such that the best theory is the correct
one (cf. Glüer 2006, 342). This amounts to a form of semantic
holism; the principle of charity determines all the meanings of
the expressions of L together, and on the basis of the totality of
the evidence (cf. Pagin 1997, 13, 18). indeterminacy, then, is
the claim that there can be more than one best T-theory for
any given natural language L.
Davidson provides what he calls a "crude outline" ([1973]
1984, 136) for the process of devising a T-theory on the basis of
data about holding true attitudes. It has three steps: First, the
logical form of the sentences of L is determined. Use of a
T-theory as a formal semantic theory requires paraphrasing the
expressions of L in the language of first-order quantified logic
(plus identity). The relevant evidence for this first step consists
of sentences that are held true (or false) under all circumstances
(candidates for logical truth or falsity) and of patterns of inference, that is, sentences held true on the basis of other sentences
held true.
The second step focuses mainly on sentences containing
indexicals, expressions whose interpretation depends on features of the context, such as "I" or "here." Take the sentence "It
is raining" or its German equivalent "Es regnet." Their truth value
varies with easily observable circumstances in the environment
of the speaker. The idea (according to Davidson [1973] 1984, 135)
is to take data of the form
(E) Kurt belongs to the German speech community and Kurt
holds true "Es regnet" on Saturday at noon and it is raining near
Kurt on Saturday at noon
as evidence for a T-sentence of the form
(R) "Es regnet" is true-in-German when spoken by x at time t if
and only if it is raining near x at t.

Together, these two steps significantly limit the possibilities for
interpreting the predicates of L. The third step deals with the
remaining sentences of L.

Questions and Criticism


Over the years, there has been extensive discussion of radical interpretation and the underlying Davidsonian philosophy of language. Davidson's views on meaning determination
have been criticized as verificationist (see also verifiability
criterion) or idealist, charges he was keen on refuting. His
semantic individualism and holism have been issues of debate.
With respect to the principle of charity, such questions as
whether it overrationalizes empirical speakers or illegitimately
imposes our logic on alien speakers have been raised. The most
important philosophical issues here concern the epistemic
and metaphysical status of the principle, and the questions
of whether and how it can be justified. Whether radical interpretation is possible, what kind of argument is required for its
possibility, and the precise role it plays in Davidson's philosophy of language, as well as its wider significance, are topics
on which there is no general consensus among the commentators. Today, many philosophers of mind and language are
of the opinion that the basic semantic behaviorism characterizing both Davidson's and Quine's philosophy of language
has been superseded by the (social and physical) meaning
externalism currently dominating the theory of meaning and
content. However, foundational issues such as these remain
insufficiently explored; because of the role the shared environment plays in radical interpretation, Davidson thought of himself as a social and physical externalist, though clearly not of the
mainstream kind (cf. 2001). So long as a systematic comparison of these competing accounts of meaning determination
is lacking, it remains premature to simply write off semantic
behaviorism; prima facie, it is not even clear that Davidsonian
semantic behaviorism and mainstream (physical) externalism
are incompatible.
Kathrin Glüer
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Davidson, Donald. [1967] 1984. Truth and meaning. In Davidson 1984,
17-36.
———. [1973] 1984. Radical interpretation. In Davidson 1984, 125-39.
———. [1974] 1984. Belief and the basis of meaning. In Davidson 1984,
141-54.
———. [1975] 1984. Thought and talk. In Davidson 1984, 155-70.
———. [1982] 1984. Communication and convention. In Davidson
1984, 265-80.
———. 1984. Inquiries into Truth and Interpretation. Oxford: Clarendon
Press.
———. [1992] 2001. The second person. In Subjective, Intersubjective,
Objective, 107-22. Oxford: Clarendon.
———. 1994. Radical interpretation interpreted. Philosophical
Perspectives 8: 121-28.
———. 2001. Externalisms. In Interpreting Davidson, ed. P. Kotatko,
P. Pagin, and G. Segal, 1-16. Stanford, CA: CSLI.
———. 2005. Truth and Predication. Cambridge, MA: Belknap.
Fodor, Jerry, and E. Lepore. 1994. Is radical interpretation possible?
Philosophical Perspectives 8: 101-19.
Glüer, Kathrin. 2006. The status of charity: I. Conceptual truth or a
posteriori necessity? International Journal of Philosophical Studies
14: 337-60.
Lepore, Ernest, and K. Ludwig. 2005. Donald Davidson: Meaning, Truth,
Language, and Reality. Oxford: Clarendon.
Lewis, David. 1974. Radical interpretation. Synthese 27: 331-44.
Pagin, Peter. 1997. Is compositionality compatible with holism? Mind
and Language 12: 11-33.
Quine, Willard V. O. 1960. Word and Object. Cambridge, MA: MIT Press.
Ramberg, Bjørn. 1989. Donald Davidson's Philosophy of Language.
Oxford: Blackwell.
Rawling, Piers. 2003. Radical interpretation. In Donald Davidson, ed.
K. Ludwig, 85-112. Cambridge: Cambridge University Press.

READING
Reading is the process of decoding and comprehending written
language. Decoding, the conversion of written forms into linguistic messages, is central to this definition to the extent that
the comprehension of written language shares its features with
the comprehension of spoken language.
Reading connects printed information conveyed in a writing
system with the reader's knowledge of the language encoded by
that system. Writing systems vary in their mapping principles, in
their implementation in a particular language (the orthography),
and in their visual appearance (the script). Alphabetic writing
systems map graphic units to phonemes. Syllabary systems,
represented by Japanese Kana, map graphic units to spoken
language syllables. Chinese is usually classified as logographic
because its graphic units (characters) correspond primarily to
morphemes. However, the fact that characters have components that provide syllable-level pronunciation information justifies an alternative designation of "morpho-syllabic" (DeFrancis
1989).

Written Word Identification


Visual processing of a letter string results in the activation of the
grapheme units (individual and multiple letters) of words. In representational (or symbolic) models of reading, words are represented in the reader's mental lexicon. Successful word reading is
a match between the graphic input and the corresponding word
representation. Phonological units are also activated and play an
important role in securing identification.
In dual route models of reading, identification occurs along
two pathways, a direct route to the word identity and an indirect route through phonological units (Coltheart et al. 1993).
The direct pathway must be used for exception words (e.g.,
"iron") for which an indirect phonological route would fail and
may also function for any word that becomes highly familiar.
The phonological route must be used to read pseudowords
(e.g., "nufe") for which there is no lexical representation and
may be used for words with regular grapheme-to-phoneme
patterns. Single-route connectionist models simulate
these two pathways with a single mechanism that learns how
to read letter strings on the basis of experience (Plaut et al.
1996). Alternative models use dynamic resonance mechanisms to capture interactions between existing states and new
inputs (Van Orden and Goldinger 1994). In a resonance model,
the patterns of graphic-phonological activation stabilize more
rapidly than do patterns of graphic-semantic activation, simulating a word-identification process that brings rapid convergence of orthography and phonology, with meaning slower to
exert an influence.
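The division of labor in dual-route models can be illustrated with a toy sketch (the lexicon, rules, and pronunciation strings here are invented placeholders, not any published model's):

```python
# Direct route: stored word-specific pronunciations (needed for exception words).
LEXICON = {"iron": "AI-ern"}
# Indirect route: toy grapheme-to-phoneme rules (needed for pseudowords).
GP_RULES = {"n": "n", "u": "uh", "f": "f", "e": "",
            "i": "ih", "r": "r", "o": "o"}

def read_word(letters):
    """Try the direct lexical route first; otherwise assemble a
    pronunciation rule by rule along the indirect phonological route."""
    if letters in LEXICON:
        return LEXICON[letters]
    return "-".join(GP_RULES[ch] for ch in letters if GP_RULES[ch])

# read_word("iron") → "AI-ern"  (direct route; the rules would misread it)
# read_word("nufe") → "n-uh-f"  (indirect route; no lexical entry exists)
```

Connectionist single-route models replace these two hand-coded mechanisms with one network that learns both the rule-like and the exceptional mappings from exposure.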
In studies of nonalphabetic reading, research has overturned the idea that reading Chinese involves only meaning
and not phonology (Perfetti, Liu, and Tan 2005). Even when
single-character words are read silently for meaning, the pronunciation of the character appears to be activated. This role
of phonology, where the writing system does not require it,
may reflect a universal phonological principle (see phonology, universals of) that is grounded in spoken language.
Nevertheless, neuroimaging studies of the brain's implementation of word reading show differences as well as similarities between alphabetic and nonalphabetic reading (Siok et al.
2004; see also writing and reading, neurobiology of).
It is interesting to note that English-speaking adults learning to
read Chinese show brain activation patterns that partly overlap
those shown by native Chinese speakers, suggesting that properties of the writing system recruit specific brain areas (Perfetti
et al. 2007).
Individuals with word-identification problems are said to
have a specific reading disability, or dyslexia, provided they
also show a discrepancy between reading and achievements in
other domains. However, the processes that go wrong in a specific disability may not be much different from those that go
wrong for an individual who also has a problem in some other
area (Stanovich and Siegel 1994).
Dual route models allow two different sources of word-reading difficulties: Either the direct route or the indirect phonological route can be impaired (Coltheart et al. 1993). Surface dyslexics
have trouble with exception words, explained as selective damage to the direct route. Phonological dyslexics have trouble with
regular words and pseudowords, explained by selective damage
to the indirect phonological route. A different view from single
mechanism models is that only phonological dyslexia is the
result of a processing defect. Surface, or orthographic, dyslexia is
a delay in the acquisition of word-specific knowledge (Harm and
Seidenberg 1999), which comes through experience.

Reading Comprehension
Reading comprehension shares linguistic and cognitive processes with spoken language and correlates highly with it among
adults (Gernsbacher 1990). Because this correlation is based on
the use of equivalent texts across listening and reading, it may
miss the differences between ordinary spoken language and typical written texts that arise from their divergent syntactic structures, lexicons, and other aspects of their different registers. While
reading comprehension strongly depends on listening comprehension, reading and speech each place specific demands on
comprehension processes.
Reading comprehension processes begin with word identification and include context-relevant selection of word meanings and parsing (see parsing [human]), the basic process of
extracting grammatical relations among words in a sentence.
Beyond these word- and sentence-level basics, higher-level comprehension involves readers' constructions of mental models
of text information. One mental model is based closely on the
language of the text, and another, the situation model, reflects
what the text is about (van Dijk and Kintsch 1983). The reader
builds a situation model from the linguistically based model (the
text base) by combining knowledge sources through additional
inference processes. A situation model may contain nonlinguistic representations, including spatial imagery (Glenberg, Kruley,
and Langston 1994) and the temporal organization of events
(Zwaan 1996), among others. Reading multiple texts that refer to
the same situation challenges the construction of a single situation model (Perfetti, Rouet, and Britt 1999) and requires additional comprehension skills in document use and evaluation
(Rouet 2006).
Because texts are never fully explicit, comprehension
research has had an enduring interest in inferences. Inferences
that link anaphora (e.g., pronouns) with their antecedents to
establish coreference are a routine part of comprehension. The
extent of elaborative and predictive inferences (Graesser, Singer,
and Trabasso 1994) is more in doubt (McKoon and Ratcliff 1992).
For example, the sentence "The American tour group went to
London for its annual holiday" may evoke an inference that the
group traveled by airplane, but whether a reader actually makes
this inference appears to be highly variable. Inferences about
cause-effect relations may be more likely than other kinds of
elaborative inferences (Trabasso and Suh 1993).
Comprehension skill is highly variable. Some children
appear to have a comprehension-specific problem (i.e., without

Reading
a decoding problem) that is general across reading and spoken
language (Nation and Snowling 1999; see also disorders of
reading and writing). The potential causes for comprehension problems include failures to make inferences during reading (Oakhill and Garnham 1988) and limitations in working
memory functions, among other factors (Nation 2005). Unstable
knowledge of word form and meaning (low lexical quality) also
contributes substantially to comprehension problems (Perfetti
and Hart 2001).

Learning to Read
In an alphabetic writing system, a child learns that letters and
strings of letters correspond to speech segments. For English,
this process is complicated by inconsistent orthography at the
letter-phoneme level, for example, the contrasts between "choir"
and "chore" and between "head" and "bead." Most European languages tend to
be coded by orthographies that more consistently map graphemes to phonemes, and learning to read reflects this fact; for
example, children's errors reflect letter-to-phoneme decoding
procedures more than in English (Wimmer and Goswami 1994;
see also children's grammatical errors).
Important for the alphabetic principle is phonemic awareness (see phonological awareness), the explicit understanding that the speech stream can be segmented into a set of
meaningless units (phonemes). Children's phonemic awareness
correlates with early reading success, and phoneme segmentation instruction produces gains in reading. However, alphabetic
literacy experience itself affects awareness of phonemes, as
shown by studies of adults without exposure to alphabetic writing (Morais et al. 1979) and of Chinese who learned to read prior
to the introduction of the alphabetic Pinyin system (Read et al.
1986) as well as by longitudinal results that show a bidirectional
relation between phonological sensitivity and literacy (Perfetti
1992).
Theories of learning to read have usually referred to a series
of stages (Ehri 1991, 2005; Frith 1985; Gough and Hillinger 1980).
Alternative theoretical accounts emphasize the incremental
acquisition of decodable lexical representations and the role of
phonology to establish word-specific orthographic representations (Perfetti 1992; Share 1995; see also writing and reading, acquisition of).
Charles Perfetti
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Coltheart, Max, B. Curtis, P. Atkins, and M. Haller. 1993. Models of reading aloud: Dual-route and parallel-distributed-processing approaches. Psychological Review 100: 589–608.
Coltheart, Max, K. Rastle, C. Perry, R. Langdon, and J. Ziegler. 2001. The DRC model: A model of visual word recognition and reading aloud. Psychological Review 108: 204–25. Explains and defends the dual route theory and its evidence.
DeFrancis, John. 1989. Visible Speech: The Diverse Oneness of Writing Systems. Honolulu: University of Hawaii Press.
Ehri, Linnea C. 1991. Learning to read and spell words. In Learning to Read: Basic Research and Its Implications, ed. L. Rieben and C. A. Perfetti, 57–73. Hillsdale, NJ: Erlbaum.
Ehri, Linnea C. 2005. Development of sight word reading: Phases and findings. In Snowling and Hulme 2005, 135–55.
Frith, Uta. 1985. Beneath the surface of developmental dyslexia. In Surface Dyslexia: Neuropsychological and Cognitive Studies of Phonological Reading, ed. K. E. Patterson, J. C. Marshall, and M. Coltheart, 301–30. London: Erlbaum.
Gernsbacher, Morton A. 1990. Language Comprehension as Structure Building. Hillsdale, NJ: Erlbaum.
Glenberg, Arthur M., P. Kruley, and W. E. Langston. 1994. Analogical processes in comprehension: Simulation of a mental model. In Handbook of Psycholinguistics, ed. M. A. Gernsbacher, 609–40. San Diego, CA: Academic Press.
Gough, Philip B., and M. L. Hillinger. 1980. Learning to read: An unnatural act. Bulletin of the Orton Society 20: 179–96.
Graesser, Arthur C., M. Singer, and T. Trabasso. 1994. Constructing inferences during narrative comprehension. Psychological Review 101: 371–95.
Harm, Michael, and M. Seidenberg. 1999. Phonology, reading acquisition, and dyslexia: Insights from connectionist models. Psychological Review 106: 491–528.
McKoon, Gail, and R. Ratcliff. 1992. Inference during reading. Psychological Review 99: 440–66.
Morais, Jose, L. Cary, J. Alegria, and P. Bertelson. 1979. Does awareness of speech as a sequence of phones arise spontaneously? Cognition 7: 323–31.
Nation, Kate. 2005. Reading comprehension difficulties. In Snowling and Hulme 2005, 248–65.
Nation, Kate, and M. Snowling. 1999. Developmental differences in sensitivity to semantic relations among good and poor comprehenders: Evidence from semantic priming. Cognition 70: B1–13.
Oakhill, Jane V., and A. Garnham. 1988. Becoming a Skilled Reader. Oxford: Blackwell.
Perfetti, Charles A. 1992. The representation problem in reading acquisition. In Reading Acquisition, ed. P. B. Gough, L. Ehri, and R. Treiman, 145–74. Hillsdale, NJ: Erlbaum.
Perfetti, Charles A., and L. Hart. 2001. The lexical basis of comprehension skill. In On the Consequences of Meaning Selection, ed. D. Gorfein, 67–86. Washington, DC: American Psychological Association.
Perfetti, Charles A., Y. Liu, J. Fiez, J. Nelson, and D. J. Bolger. 2007. Reading in two writing systems: Accommodation and assimilation in the brain's reading network. Bilingualism: Language and Cognition 10.2: 131–46. Special issue on Neurocognitive approaches to bilingualism: Asian languages, edited by P. Li.
Perfetti, Charles A., Y. Liu, and L. H. Tan. 2005. The lexical constituency model: Some implications of research on Chinese for general theories of reading. Psychological Review 112.1: 43–59.
Perfetti, Charles A., J.-F. Rouet, and M. A. Britt. 1999. Toward a theory of documents representation. In The Construction of Mental Representations during Reading, ed. H. Van Oostendorp and S. Goldman, 99–122. Mahwah, NJ: Erlbaum.
Plaut, David C., J. L. McClelland, M. S. Seidenberg, and K. Patterson. 1996. Understanding normal and impaired word reading: Computational principles in quasi-regular domains. Psychological Review 103: 56–115.
Rayner, Keith, B. R. Foorman, C. A. Perfetti, D. Pesetsky, and M. S. Seidenberg. 2001. How psychological science informs the teaching of reading. Psychological Science in the Public Interest 2.2: 31–74. A supplement to Psychological Science. Reviews research on reading and the acquisition of reading skills, connecting it to reading education.
Read, Charles, Y. Zhang, H. Nie, and B. Ding. 1986. The ability to manipulate speech sounds depends on knowing alphabetic reading. Cognition 24: 31–44.
Rouet, Jean-Francois. 2006. The Skills of Document Use. Mahwah, NJ: Erlbaum.
Share, David L. 1995. Phonological recoding and self-teaching: Sine qua non of reading acquisition. Cognition 55: 151–218.
Siok, Wai T., C. A. Perfetti, Z. Jin, and L. H. Tan. 2004. Biological abnormality of impaired reading constrained by culture: Evidence from Chinese. Nature 431 (September 1): 71–6.
Snowling, Margaret J., and C. Hulme, eds. 2005. The Science of Reading: A Handbook. Oxford: Blackwell. A compendium of reviews across the entire spectrum of reading-related research.
Stanovich, Keith E., and L. S. Siegel. 1994. Phenotypic performance profile of children with reading disabilities: A regression-based test of the phonological-core variable-difference model. Journal of Educational Psychology 86.1: 24–53.
Trabasso, Tom, and S. Suh. 1993. Understanding text: Achieving explanatory coherence through online inferences and mental operations in working memory. Discourse Processes 16: 3–34.
Van Dijk, Teun A., and W. Kintsch. 1983. Strategies of Discourse Comprehension. New York: Academic Press.
Van Orden, Guy C., and S. D. Goldinger. 1994. The interdependence of form and function in cognitive systems explains perception of printed words. Journal of Experimental Psychology: Human Perception and Performance 20: 1269–91.
Wimmer, Heinz, and U. Goswami. 1994. The influence of orthographic consistency on reading development: Word recognition in English and German children. Cognition 51: 91–103.
Zwaan, Rolf. 1996. Processing narrative time shifts. Journal of Experimental Psychology: Learning, Memory and Cognition 22.5: 1196–1207.

REALIZATION STRUCTURE
This term was coined by Keith Oatley (2002) to indicate how
one experiences a piece of literature. One does not just receive
or interpret it but realizes it, bringing it into being. The idea that
fiction involves such a realization or mental performance of the
piece by the reader or audience member was discussed in philosophy by Wolfgang Iser (1974) and in psychology by Richard J. Gerrig (1993). Russian Formalists proposed that a literary work has aspects of fabula and siuzhet, often translated as "story" and "plot." The fabula is a story structure: time-ordered events
in the story world. William Brewer and Ed Lichtenstein (1981)
suggested that the siuzhet may best be called the discourse structure: the ordered set of speech-acts of the writer to the general
reader or listener to prompt the story mentally into being. Oatley
suggested that two further aspects are necessary: One is the suggestion structure, the associations set off by the story idiosyncratically in individuals. The other is the realization structure,
the complete mental performance as realized in the mind of the
reader or audience member. The matter was well put by Virginia
Woolf (1957, 174):
Jane Austen is thus a mistress of much deeper emotion than
appears upon the surface. She stimulates us to supply what is
not there. What she offers is, apparently, a trifle, yet is composed
of something that expands in the reader's mind and endows
with the most enduring form of life scenes which are outwardly
trivial.

The relationship among the four aspects of a piece of literary prose or poetry can be illustrated by the diagram in Figure 1
(from Oatley 2002, 45). The implication of the layout of the diagram is that the event structure starts off a story, usually by means
of a setting, that the discourse structure and suggestion structure
occur simultaneously, and that the realization structure is a
resultant of the other processes.
Keith Oatley
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Brewer, William, and Ed Lichtenstein. 1981. Event schemas, story schemas and story grammars. In Attention and Performance. Vol. 9. Ed. J. Long and A. Baddeley, 363–79. Hillsdale, NJ: Erlbaum.

Event Structure: The events of the story world. A creation of the author.

Discourse Structure: The text as written by the author, or the drama as performed. Much of this structure is in the form of instructions to the reader or audience as to how to construct the story.

Suggestion Structure: Nonliteral aspects, suggested by the text, based on the reader's or watcher's share of knowledge, experience, emotions, and ideas.

Realization Structure: The enactment in the mind of the reader or watcher, which results from the structure process and suggestion structure being applied to the discourse structure.

Figure 1.



Gerrig, Richard J. 1993. Experiencing Narrative Worlds: On the
Psychological Activities of Reading. New Haven, CT: Yale University
Press.
Iser, Wolfgang. 1974. The Implied Reader: Patterns of Communication in Prose Fiction from Bunyan to Beckett. Baltimore: Johns Hopkins University Press.
Oatley, Keith. 2002. Emotions and the story worlds of fiction. In Narrative Impact: Social and Cognitive Foundations, ed. M. Green, J. Strange, and T. Brock, 39–69. Mahwah, NJ: Erlbaum.
Woolf, Virginia. 1957. The Common Reader: First Series. London: Hogarth
Press.

RECTIFICATION OF NAMES (ZHENG MING)


The rectification of names is the adjustment of language to fit the
world, specifically insofar as this bears on action. It is an important concept in traditional Chinese philosophy. In the Analects,
Confucius established the place of this concept in political
thought in particular. Tzu-lu asked Confucius what he would put
first if he took over state administration. Confucius responded
that it would be the rectification of names, stressing the ill consequences if names are not correct (xiii.3). (This is not to say
that all schools of thought shared this view. Some downplayed
the importance of language; see Zhang 2002, 47–58).
There has been considerable disagreement among writers in
the Chinese tradition as to just what the rectification of names
involves (see Zhang 2002, 461–74). Approaching the topic from
the Western philosophical tradition, the obvious interpretation
is that meanings should be in accord with essences (see essentialism and meaning). For example, water should be used to
refer to H2O. There is an element of this in the various Chinese
schools that have debated the topic. Specifically, a range of writers suggest that a rectified term is in accord with li, or principle.
Unfortunately, there is no greater agreement on the meaning of
li than on the meaning of zheng ming. Abstracting somewhat
from the debates, however, we may infer that a principle is what
underlies and unifies a set of otherwise apparently diverse phenomena. Put differently, it is what manifests itself in the phenomenal patterns. This is related to the Western notion of essence.
However, it is not identical. For example, the rule complex governing plural formation in English should count as a principle
in this sense. It produces a patterned set of apparently diverse
phenomena ("cats," "dogs," "bushes," and so on), defining the
unity of those phenomena. Moreover, understood in this way, a
principle may apply not only to real objects but also to ideals.
When a principle underlies an ideal, it defines a norm.
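The plural "rule complex" just mentioned can be sketched as a toy program (an invented simplification for illustration, not part of the entry): a single principle generates the apparently diverse surface forms "cats," "dogs," and "bushes."

```python
# A sketch of the English plural rule complex as a "principle":
# one rule system underlying a patterned set of diverse forms.
# (Simplified illustration; real English morphophonology has many
# more conditions and exceptions.)
def pluralize(noun):
    # Sibilant-final nouns take "-es" (bush -> bushes).
    if noun.endswith(("s", "sh", "ch", "x", "z")):
        return noun + "es"
    # All other nouns in this sketch take "-s" (cat -> cats, dog -> dogs).
    return noun + "s"

print([pluralize(n) for n in ["cat", "dog", "bush"]])  # ['cats', 'dogs', 'bushes']
```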
This normative part is crucial. Consider, for example, the
presence of U.S. troops in Iraq. Some speakers, such as representatives of the U.S. government, characterize this as "liberation" (Chomsky 2006, 131). Others, including most Iraqis, characterize it as "occupation" (Chomsky 2006, 163). These terms differ not only
descriptively but also evaluatively. They differ in the way they
apply norms to the situation, and thus in their implications for
appropriate response to that situation. In a case such as this, the
use or misuse of a term is significant because it has practical consequences. Specifically, it bears on our acceptance or rejection of
a particular structure of authority and of particular representatives of authority.

Such a relation to authority is often the sort of case that writers on zheng ming had in mind. In his famous treatment of the
rectification of names, Hsün Tzu (Xunzi) lamented the result of
verbal confusions wherein the distinction between the noble
and the humble is not clear and similarities and differences are
not discriminated. When this occurs, ideas will be misunderstood and work will encounter difficulty or be neglected (1963,
125). A similar practical concern is found in Mozi (Mo Tzu), but
with different political consequences. Thus, Mozi wrote against
those who distinguish names in the world in such a way as to
promote distinctions and discriminations. He favored those
who distinguish names in the world in such a way as to love
others and benefit others by advocating no discrimination
(quoted in Zhang 2002, 327).
The practical, political consequences of the rectification of
names are connected with the eight steps set out in the classic
Great Learning. The steps explain how to manifest one's virtue by bringing order to a state. One brings order to one's state by bringing order to one's family. One accomplishes this by developing oneself. That, in turn, results from rectifying one's mind/heart, which derives from integrating one's thoughts. To do the latter, one must extend one's knowledge, which is itself effected
by research. (On the eight steps, see Great Learning and Zhang
2002, 452. Note that the list of steps inverts their order in practice, with research, therefore, being the first step.) Knowledge is
at least in part a matter of knowing the right words and applying them properly. Thus, research is itself in part a matter of the
rectification of names. In research, one should seek the relevant
principles and connect them with names so that they will be
understood. Moreover, these principles, and our knowledge of
these principles, are not solely a matter of conceptualization.
As Zhang writes, "names should match the reality, and reality includes the actual use of the object" (2002, 424). Zhang is referring to a Mohist idea. But the link between principle and practice
is much more widespread.
The eight steps also suggest the manner in which zheng
ming has consequences. In a given situation, our application of
a particular name affects our knowledge of the situation (Step
2). Moreover, it brings our understanding of the situation into
a complex of relations with other ideas and names (or words)
in such a way as to change the way our thoughts are integrated
(Step 3) and our heart or feeling is oriented (Step 4). We may
return here to the previous example. If I follow the lead of the
U.S. government and characterize the U.S. presence in Iraq as
"liberation," then I understand its consequences in certain ways.
I also integrate this idea with other aspects of my thought
about the insurgents, about U.S. foreign policy, and so on. This,
in turn, has consequences for my emotional response to the
U.S. presence in Iraq, to the U.S. government, and so on ultimately with consequences for my support of that government.
The consequences are very different if I follow the lead of most
Iraqis and categorize the presence as "occupation." Determining
which is correct, and using that term consistently in public discussion, is an instance of research leading to the rectification
of names, then to the integration of thought and orientation of
feeling.
Patrick Colm Hogan



WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Chan, Wing-tsit. 1963. A Source Book in Chinese Philosophy. Princeton,
NJ: Princeton University Press.
Chomsky, Noam. 2006. Failed States: The Abuse of Power and the Assault
on Democracy. New York: Metropolitan Books.
Confucius. 1979. The Analects. Trans. D. C. Lau. New York: Penguin.
Great Learning. 1963. In Chan 1963, 84–94.
Hsün Tzu. 1963. The Hsün Tzu [selections]. In Chan 1963, 115–35.
Zhang Dainian. 2002. Key Concepts in Chinese Philosophy. Trans. and ed.
Edmund Ryden. New Haven, CT: Yale University Press.

RECURSION, ITERATION, AND METAREPRESENTATION


Some researchers pinpoint recursion as our species' key computational ability, making humans cognitively unique (e.g., Hauser,
Chomsky, and Fitch 2002; Corballis 2003). It may give us many
abilities hypothesized to be uniquely human: language, theory of
mind, complex problem solving, mathematics, and mental time
travel (episodic memory/future planning) (Hauser, Chomsky,
and Fitch 2002; Corballis 2003; Parker 2006; Stone and Gerrans
2006).
Within psychology and linguistics, recursion is understood as
a property of certain types of representations. Whether internal
to the mind or external, representations that can contain other
representations of the same type are recursive. Language, mental states, mathematical formulas, and spatial representations all
have this property. One can have a thought about someone else's belief about another person's thoughts, or one can have a picture of a picture of a picture: Both are recursive representations.
Recursive processing requires that recursive representations be
unpacked in a systematic way, from the highest to lowest level, in
order to produce some output.
Recursion is distinct from the related concept, iteration, but
the two are often confused. Both involve repetition. In programming, iteration is the repetition of a process within a computation, with a top-level control structure that sees all the steps
involved (Anderson 2007). In recursion, however, the number of
steps is unknown to the highest level of the function; all that is
known to that level is whether an end condition has been satisfied
or whether the problem needs further breaking down (Anderson
2007; Suh 2007). In language, we can construct infinitely long sentences by iterating elements, for example, "I have lived in the U.S. and England and Australia and …" Each iterative phrase is independent, not requiring reference to the other phrases, only to the top-level clause containing the phrases (Parker 2006). We can also construct infinitely long sentences by using recursively embedded elements, for example, "The blogger said that Bush thought that Cheney thought that Libby believed that the reporter did not know that Plame was a spy." These elements
are not independent, requiring full unpacking of each embedded level to understand the full meaning of the sentence. Each
level of embedding refers to another level: One cannot know the semantic value of "Cheney thought that" without knowing the
semantic value of the clauses it includes.
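The contrast can be sketched in code (a hypothetical illustration; the sentence-building functions are invented for this example). Iteration walks a flat list with a top-level loop that sees every step; recursion must descend into each embedded clause, knowing at each level only whether the base condition has been reached.

```python
# Iteration: a flat list of independent phrases; the top-level loop
# sees every step, and no phrase depends on another.
def iterate_conjuncts(places):
    return "I have lived in " + " and ".join(places)

# Recursion: each embedded clause contains the next; the top level
# cannot know how deep the structure goes, only whether the end
# condition (a bare proposition) has been reached.
def unpack_embedded(clause):
    # Base case: a plain string is a fully unpacked proposition.
    if isinstance(clause, str):
        return clause
    # Recursive case: (subject, verb, embedded_clause).
    subject, verb, embedded = clause
    return f"{subject} {verb} that {unpack_embedded(embedded)}"

print(iterate_conjuncts(["the U.S.", "England", "Australia"]))
nested = ("The blogger", "said",
          ("Bush", "thought",
           ("Cheney", "thought", "Plame was a spy")))
print(unpack_embedded(nested))
```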
Recursion should also be distinguished from the related
concept metarepresentation. Some use the terms interchangeably, using metarepresentation to mean a representation of a
representation (e.g., Corballis 2003). Metarepresentation means being able to represent the relationship between the representation and what it refers to: to understand that a picture of
Niagara Falls stands for that visual scene, or that someone's belief that Santa Claus exists represents that potential state of the world, or that "rocks," the noun, refers to a set of stone objects.
Metarepresentation requires recursive embedding of representational relationships, but it is not identical to recursive embedding (Stone and Gerrans 2006). Metarepresentation may also be
uniquely human (Suddendorf 1999).
Marc Hauser, Noam Chomsky, and W. T. Fitch (2002) have
offered the hypothesis that recursion is the defining feature of
language, making it uniquely human. Other features of language,
however, do not follow directly from recursion and also seem to
be uniquely human, such as words, fine phonemic discriminations, and motor control of mouth, larynx, and so on (enumerated in Pinker and Jackendoff 2005; Parker 2006, Chapter 5).
Whether recursion is the single defining feature of language
or not, it might be uniquely human. Testing for recursive capacity directly is difficult. Instead, researchers rely on demonstrations of animals' ability to do tasks dependent on explicit
recursion. Some claim that animals do implicit recursion in
certain tasks, for example, ants doing dead reckoning, but this is
difficult to substantiate. Although recursion is an efficient solution to many problems, unless one can test for the explicit content of the recursive steps in a computation, it is always possible
that animal brains solve problems using some other, nonrecursive computational technique. Thus, comparative research uses
tasks believed to depend on explicit recursion: mathematics,
theory of mind, problem solving involving interdependent steps,
mental time travel, or certain kinds of syntax (Corballis 2003;
Parker 2006). So far, no study has demonstrated that our closest
relatives, great apes, can do any of these tasks with the range and
flexibility of humans (Corballis 2003; Hauser 2005; Suddendorf
2006). For now at least, recursion can join a set of possibly unique
human cognitive capacities: other aspects of language, flexible
control of attention and inhibition, expanded working memory
capacity, and metarepresentation (Suddendorf 1999; Kawai
and Matsuzawa 2000; Hauser 2005; Pinker and Jackendoff 2005;
Stone and Gerrans 2006). Recursion may not be the key to unique
human cognition, but it is no less worthy of study for being one
of many keys.
Valerie Stone
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Anderson, A. K. 2007. Recursion in programming. Available online at: http://www.usableresults.com. Retrieved January 15, 2007.
Corballis, M. C. 2003. Recursion as the key to the human mind. In From Mating to Mentality, ed. K. Sterelny and J. Fitness, 155–71. New York: Psychology Press.
Hauser, M. 2005. Our chimpanzee mind. Nature 437.7055: 60–3.
Hauser, M., N. Chomsky, and W. T. Fitch. 2002. The faculty of language: What is it, who has it, and how did it evolve? Science 298.5598: 1569–79.
Kawai, N., and T. Matsuzawa. 2000. Numerical memory span in a chimpanzee. Nature 403.6765: 39–40.
Parker, A. P. 2006. Evolution as a constraint on theories of syntax. Ph.D. thesis, University of Edinburgh. Available online at: http://www.ling.ed.ac.uk/~annarp. Retrieved February 4, 2007.
Pinker, S., and R. Jackendoff. 2005. The faculty of language: What's special about it? Cognition 95.2: 201–36.
Stone, V. E., and P. Gerrans. 2006. What's domain-specific about theory of mind? Social Neuroscience 1.3/4: 309–19.
Suddendorf, T. 1999. The rise of the metamind. In The Descent of Mind, ed. M. C. Corballis and S. E. G. Lea, 218–60. Oxford: Oxford University Press.
Suddendorf, T. 2006. Foresight and evolution of the human mind. Science 312.5776: 1006–7.
Suh, E. 2007. Recursion and iteration. Available online at: http://www.aihorizon.com/essays/basiccs/general/recit.html. Retrieved from AI Horizon February 5, 2007.

REFERENCE AND EXTENSION


Extension and reference are technical terms in the philosophy of
language, formal semantics, and pragmatics. We outline their
roles in three types of theoretical effort: compositional semantic theories (which make use of both terms), various theories of
reference, which purport to tell us what it is for a word to have a
certain referent, and views that understand reference as something people do with words. The first two are semantic accounts;
the last conceives of reference as a matter of language use, so of
pragmatics.

Reference and Extension in Compositional Semantics


We begin with the use of reference and extension in compositional semantic theories. In this domain, a referent is generally a thing that a proper noun refers to or names, and an extension a set of objects to which a predicate applies (the term "denotation" is sometimes used interchangeably with both "reference" and "extension"). However, compositional semanticists often generalize one or the other notion so that almost any kind of expression, including a sentence, can be said to have a referent or an
extension.
With few exceptions, compositional semantic accounts are
versions of truth conditional semantics: attempts to specify the meanings of sentences in terms of their truth conditions. Since natural languages allow for infinitely many sentences, the truth conditions of sentences must be specified
recursively in terms of the semantic values of their parts, and
referents and extensions are semantic values that enable us to
do just this. For example, we can specify the truth condition of the sentence "John smokes" in terms of the referents and extensions of its parts as follows: "John smokes" is true if and only if the referent of "John" (namely, John himself) is a member of the extension of "smokes" (the set of things that smoke).
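As a minimal sketch (the tiny model and lexicon here are invented for illustration), such a truth-conditional clause can be modeled directly: names are mapped to individuals, predicates to sets, and a simple sentence is true just in case the name's referent belongs to the predicate's extension.

```python
# A toy extensional semantics (illustrative sketch; the model is
# invented for this example, not drawn from the article).
referents = {"John": "john", "Mary": "mary"}          # proper nouns -> individuals
extensions = {"smokes": {"john"}, "runs": {"mary"}}   # predicates -> sets of individuals

def truth_value(name, predicate):
    """A sentence 'Name Predicate' is true iff the referent of the
    name is a member of the extension of the predicate."""
    return referents[name] in extensions[predicate]

print(truth_value("John", "smokes"))  # True
print(truth_value("Mary", "smokes"))  # False
```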
The primary historical source for compositional semantics
along these lines is Gottlob Frege's ([1892] 1997) account of Bedeutung, often translated as "reference" (also as "denotation"). In it, a referent is assigned to every meaningful expression. Frege assumed that each complex expression is the result
of combining a functional expression (such as a predicate) with
one or more arguments (such as names) (see predicate and
argument). Further, he assumed that the referent of a functional expression F is always a function f, and that the referent
of any expression X that F accepts as an argument is the sort
of object that is among the arguments of f. Specifically, if F is a
functional expression and X an expression that F accepts as an

argument, the referent of F is the function that maps the referent of X onto the referent of F(X). Thus, the referent of a complex
expression is always the result of applying the referent of one of its
constituents, as a function, to the referents of its other constituents, taken as arguments. The referent of a sentence as a whole is
identified with its truth value. Thus, the referent of "Chomsky" is Chomsky, the referent of "is clever" the function that maps each object x onto truth if x is clever and onto falsehood otherwise, and the referent (truth value) of "Chomsky is clever" is truth if and only if Chomsky is clever.
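Fregean functional application can be sketched the same way (an illustrative toy, not a claim about Frege's own formalism; the individuals and predicate are invented): the referent of the predicate is a function from objects to truth values, and the sentence's referent results from applying that function to the name's referent.

```python
# Fregean functional application (illustrative sketch).
chomsky = {"name": "Chomsky", "clever": True}

# The referent of "is clever": a function mapping each object x onto
# truth (True) if x is clever, and onto falsehood (False) otherwise.
def is_clever(x):
    return x["clever"]

# The referent (truth value) of "Chomsky is clever" is obtained by
# applying the function to the referent of the argument.
sentence_referent = is_clever(chomsky)
print(sentence_referent)  # True
```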
It may seem surprising that the referent of a sentence is its
truth value, but it should be kept in mind that reference is used as
a technical concept within compositional semantics. Given the
use to which the concept is put, this is not an unnatural assumption: Frege was interested in a compositional semantics that
would tell us how the truth values of sentences are determined
by the referents of their parts, and all natural languages have
fragments in which, when a sentence has other sentences as parts,
the truth value of the whole depends only on the truth values of
the constituent sentences. Fragments of languages in which this
is the case, and in which the referent of a complex expression
in general depends only on the referents of its parts, are called
extensional. Thus, in an extensional fragment, expressions having the same referent can be substituted in any sentence without altering its truth value (contexts in which such substitutions
preserve truth value are also called extensional). Frege was primarily interested in constructing a semantics for the language
of mathematics, which is extensional, and so choosing truth
values as referents of sentences was natural. However, natural
languages as wholes are not extensional. In contexts involving
propositional attitudes, modality, and counterfactuals,
the substitution of clauses having the same truth value may alter
the truth value of the whole sentence. To account for such contexts, Frege held that each sentence or other expression has, in
addition to a referent, another kind of semantic value, which he
called the expressions sense (Sinn). The sense of a sentence is
what he called a thought, or, in contemporary terms, a proposition. In order to maintain a version of the principle of compositionality, he held that the truth values of nonextensional
sentences are determined in part by the senses of their constituents (see sense and reference).
For various reasons, Frege's approach is now considered antiquated. Most recent work in formal semantics for natural languages is inspired by Alfred Tarski's work on the definability of truth for formal languages. Richard Montague (1974) was the first to apply Tarski's work productively to (fragments of) natural
languages. Here, extension is the preferred term. The extension of
a predicate is, again, the set of things to which it applies. Although
terminology varies, in this framework, too, one can speak of the
extension of almost any expression, including a sentence, so that
one identifies a sentence's extension with its truth value. Applying Tarski's approach, the aim is to recursively characterize not only
the truth conditions of sentences but also the entailments (logical consequences) for a language using the notion of extension: A
sentence S1 is said to entail a sentence S2 in language L if and only
if there is no assignment of extensions to the semantically simple
expressions of L (no model of L) under which S1 is true and S2
false. On this approach, the logical constants differ from other expressions in that they are not assigned extensions/referents
(see logic and language).
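This model-theoretic definition of entailment can be illustrated with a toy checker (a propositional simplification with invented sentences, where the extensions assigned to the simple expressions are just truth values): S1 entails S2 iff no assignment makes S1 true and S2 false.

```python
from itertools import product

def entails(s1, s2, atoms):
    """S1 entails S2 iff there is no assignment of truth values
    (sentence extensions) to the atomic sentences under which
    S1 is true and S2 is false."""
    for values in product([True, False], repeat=len(atoms)):
        model = dict(zip(atoms, values))
        if s1(model) and not s2(model):
            return False
    return True

# "It rains and it snows" entails "it rains," but not conversely.
rains_and_snows = lambda m: m["rains"] and m["snows"]
rains = lambda m: m["rains"]
print(entails(rains_and_snows, rains, ["rains", "snows"]))  # True
print(entails(rains, rains_and_snows, ["rains", "snows"]))  # False
```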
Montague's approach differs notably from Frege's: It does not assign senses to expressions to account for nonextensional contexts. Instead, it employs the tools of possible worlds semantics to this end. One can, however, within Montague's framework define objects corresponding roughly to Frege's senses: The sense of an expression could be thought to correspond to the function that maps each possible world onto the
extension that the expression has in that world. Such functions
are often called intensions (see intension and extension).

Theories of Reference, New and Old


The second set of theories (often called "theories of reference") in which the terms reference and extension are found appears in the works of philosophers of language who aim to describe and explain the word-world relations that compositional semantic theories of the sort discussed previously take for granted.
In this area, too, a classical source is Frege. According to
Frege, a word has a specific referent because its users associate it with a particular sense, something like a conceptual representation of its referent. Applied to proper names, his view was that a name, say, "George W. Bush," is associated by its users with a certain descriptive condition, say, being the 43rd president of the United States, and that its referent is that object (if any) which uniquely satisfies this condition.
Another view, the so-called new theory of reference (in vogue
since the 1970s), maintains that at least some expressions do
not have senses, but simply refer. Proper names are paradigm
examples. According to the approach, what cements the relation
between a name and its referent is not a mediating conceptual representation in the speaker's mind but a causal and historical relationship between the name's user and its referent. The idea,
articulated by Kripke (1972), is that a name is introduced by an
initial baptism, which involves a causal interaction between a
speaker and the referent itself, and reference for all other speakers
is preserved in chains of communication in which each speaker
intends to use the name to refer to the same object as those from
which he or she acquired the name. Extending Kripke's view in ways suggested by Kripke himself, Hilary Putnam (1975) proposed baptism + history as an account of how natural kind terms come to have and maintain their extensions (see essentialism and meaning).

Reference as Action
The third view we discuss maintains that reference depends
essentially on individual speakers (and possibly interpreters)
with variable interests: An appropriate slogan might be "Words don't refer; people do." One root of this view is found in the work of the later Ludwig Wittgenstein, another in Descartes. Those who defend it point out that it is difficult to find cases of uniform word-world relationships in the use of natural languages. They grant that the practices of mathematicians display uniformity, but these practices aside, reference varies with time, context, speakers' interests, and so on. They also grant that some who offer theories of reference, such as Kripke
(1972), acknowledge a role for speaker intentions. But Kripke and others incorrectly assume that ordinary speakers desire to maintain uniformity to ensure rigidity in reference, at least with names. In fact, speakers, paraphrasing Wittgenstein, play all sorts of games with language.
In taking reference as a form of action and treating "refer" as a verb, we come closest to the commonsense idea of a person referring to, or talking about, an object. Critical work on Bertrand Russell's analysis of definite descriptions by P. F. Strawson (1950) and Keith Donnellan (1966), as well as H. Paul Grice's work on the semantics/pragmatics distinction, inspired a distinction between speaker's reference and semantic reference, the latter being central to both compositional semantics and theories of reference of the sorts considered in the previous section. The speaker's reference of an expression, on an occasion, is whatever object that speaker uses the expression to pick out, typically in order to assert (query, etc.) something about that object. You may use the phrase "the man drinking a martini" to refer to a certain person, although the person you have in mind is, unbeknownst to you, drinking water: He is not, then, the semantic referent of the phrase.
Some writers hold that semantic reference either does not exist (Strawson 1950 can be read this way) or, if it is to sustain theoretical investigation, must be reconceived (Chomsky). Chomsky points out that natural language use (not in-house use of the symbols of mathematics or natural science, where practitioners constrain their actions) displays creativity, where this is thought of in terms not only of the uncaused production of novel expressions but also of their free use for any number of purposes (appropriateness). Because referring is a form of free action and cannot sustain naturalistic study, Chomsky proposes the elimination of the semantic study of natural languages as usually conceived (offering theories of word-world relationships), placing the study of reference in a part of pragmatics that resists theoretical investigation, and placing the study of what he calls meaning (a psycholinguistic version of Fregean senses) in syntax, broadly conceived as the study of the intrinsic properties of the mind/brain. The study of meaning (semantics reconceived) becomes a psycholinguistic enterprise focusing on the natures of mind-internal elements, such as lexical items, their semantic features, and the computations in which they figure. Chomsky (2000, 38f) points out that this kind of study might employ a theoretical device called relation R, construed as a postulated relationship between theoretically defined expressions and objects in some introduced, stipulated domain. Relation R is not reference outside the head, which is not apt for naturalistic study. Relation R and the domain D are, rather, construed to be part of syntax: theoretical devices aiding the naturalistic study of syntax conceived as language in the head. The members of D could be stipulated to be semantic values. This might allow Chomsky to absorb the insights of Montague and other developing theories within formal semantics into syntax. It would also emphasize a view Chomsky maintains for other reasons: Semantic compositionality is syntactic computation. Whether absorbing formal semantic accounts of compositionality in this way suits the intuitions and aims of those who want their semantic efforts to provide explications of truth conditions is another matter.
James McGilvray and Juhani Yli-Vakkuri

WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Chomsky, Noam. 2000. New Horizons in the Study of Language and Mind.
Cambridge: Cambridge University Press.
Donnellan, Keith. 1966. Reference and definite descriptions. Philosophical Review 75.3: 281–304.
Frege, Gottlob. [1892] 1997. On Sinn and Bedeutung. In The Frege Reader, ed. M. Beaney. Oxford: Blackwell.
Kripke, Saul. 1972. Naming and necessity. In Semantics of Natural Language, ed. Donald Davidson and Gilbert Harman, 253–355, 763–69. Dordrecht, the Netherlands: Reidel.
Kripke, Saul. 1976. Speaker's reference and semantic reference. Midwest Studies in the Philosophy of Language 2: 255–76.
Kripke, Saul. 1980. Naming and Necessity. Cambridge: Harvard University Press.
Montague, Richard. 1974. Formal Philosophy. New Haven and London: Yale University Press.
Putnam, Hilary. 1975. The meaning of "meaning." In Language, Mind, and Knowledge, ed. Keith Gunderson, 131–93. Minneapolis: University of Minnesota Press.
Strawson, P. F. 1950. On referring. Mind 59.235: 320–44.

REFERENCE TRACKING
Reference tracking, or anaphora resolution, concerns how language users track who or what the speaker is referring to in discourse. Because everyday language use generally concerns who does what to whom, reference tracking is important in studying human language and cognition.
Anaphora devices include noun phrases (NPs), pronouns, and zero anaphora, whose identities depend on their antecedents in discourse. In the example Isabel went to China, and this volunteer/she helped with midwifery training, this volunteer is an NP that refers back to its antecedent Isabel. It could be replaced by the pronoun she or by an empty slot (zero anaphora), as in the sentence Isabel went to China and ____ helped with midwifery training.
Pronouns and zero anaphora give less explicit information than full NPs. Still, the reader/hearer benefits from the efficiency of these devices in conveying information that has been introduced/given in the prior discourse or can be accessed/inferred from the context. These devices are crucial for global cohesion and local coherence in discourse. Experimental studies find that without a specific need, replacing a pronoun with an NP for given information may hinder understanding.
A discourse topic provides a basic means for tracking the identity of a pronoun or zero anaphora because the topic tends to recur as given information continuously. Cross-linguistic studies find that people can track the identity of a pronoun or zero anaphora even when its referent is not in the immediately preceding clause but in the prior context. Therefore, although language production may be linear due to human physical limitations, language processing and reference tracking are hierarchical cognitive processes.
Reference tracking requires the hearer to make inferences from world knowledge about likely events, especially for languages that have no morphological markings (Chinese) yet allow abundant zero anaphora, as in He grew only one plant, but ___ blossomed well. Many languages (e.g., French, German, and Turkish) make reference tracking easier with specific grammatical markings, such as agreement morphology: subject-verb agreement; person, number, and gender (see gender marking) agreement; the switch-reference system (Amele); or topic/subject markers (Japanese). Discourse analysis (see discourse analysis [linguistic]) finds that reference usage follows the constraints of information flow: The grammatical subject of a transitive clause tends to be coded with a pronoun in English, or zero anaphora in Chinese or Japanese, to present given or accessible information (the light subject constraint), for instance, He in the example, whereas the grammatical object tends to be an NP carrying new information (e.g., one plant). This allows easy processing of accessible information early in an utterance, while the rest of the utterance introduces the new referent, thus facilitating reference tracking and discourse processing.
Experimental studies find that the discourse pattern of a language engenders specific reference tracking strategies in its native speakers. Therefore, speakers of different languages may develop different cognitive strategies to track reference during discourse processes.
Liang Tao

WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Chafe, Wallace. 1994. Discourse, Consciousness, and Time: The Flow and Displacement of Conscious Experience in Speaking and Writing. Chicago: University of Chicago Press.
Kintsch, Walter. 1988. The role of knowledge in discourse comprehension: A construction-integration model. Psychological Review 95: 163–82.
Tao, Liang, and Alice Healy. 2005. Zero anaphora: Transfer of reference tracking strategies from Chinese to English. Journal of Psycholinguistic Research 34: 99–131.

REGISTER
Speakers of a language use different words and grammatical
structures in different communicative situations. For example,
we do not use the same words and structures to write an academic term paper that we would use when talking to a close
friend about weekend plans.
Researchers study the language used in a particular situation
under the rubric of register: a language variety defined by its situational characteristics, including the setting, interactiveness, the
channel (or mode) of communication, the production and processing circumstances, the purposes of communication, and the
topic.
Although registers are defined in situational terms, they can
also be described in terms of their typical linguistic characteristics; most linguistic features are functional and, therefore, they
tend to occur in registers with certain situational characteristics.
For example, first and second person pronouns (I and you) are
especially common in conversation. Speakers in conversation
talk a lot about themselves, and so they commonly use the pronoun I. These speakers also interact directly with another person,
often using the pronoun you.
There are many studies that describe the characteristics of
a particular register, such as sports announcer talk (Ferguson
1983), note taking (Janda 1985), classified advertising (Bruthiaux
1996), and scientific writing (Halliday 1988). Other researchers

take a comparative approach, studying the patterns of register
variation, which seems to be inherent in human language.
corpus linguistics has been an especially productive
analytical approach for studying register variation. For example,
the Longman Grammar of Spoken and Written English (Biber et al. 1999) applies corpus-based analyses to show how any grammatical feature can be described for both its structural characteristics and its patterns of use across spoken and written registers.
In the multidimensional approach to register variation, corpus-based analysis is combined with sophisticated statistical analysis to analyze the patterns of linguistic variation that distinguish
among registers (see, e.g., Biber 1988, 2006; Conrad and Biber
2001).
Some studies have distinguished between registers and
genres. In these studies, the term register refers to a general kind
of language associated with a domain of use, such as a legal register, scientific register, or bureaucratic register. Register studies
have usually focused on lexico-grammatical features, showing
how the use of particular words and grammatical features varies
with the situation of use. In contrast, the term genre has been
used to refer to a culturally recognized message type with a
conventional internal structure, such as an affidavit, a biology
research article, or a business memo. Genre studies have usually
focused on the conventional discourse structure of texts or the
expected sociocultural actions of a discourse community.
Douglas Biber
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Biber, Douglas. 1988. Variation across Speech and Writing.
Cambridge: Cambridge University Press.
. 2006. University Language: A Corpus-Based Study of Spoken and
Written Registers. Amsterdam: John Benjamins.
Biber, Douglas, Stig Johansson, Geoffrey Leech, Susan Conrad, and
Edward Finegan. 1999. Longman Grammar of Spoken and Written
English. London: Longman.
Bruthiaux, Paul. 1996. The Discourse of Classified Advertising. New
York: Oxford University Press.
Conrad, Susan, and Douglas Biber, eds. 2001. Variation in English: MultiDimensional Studies. London: Longman.
Ferguson, Charles A. 1983. Sports announcer talk: Syntactic aspects of register variation. Language in Society 12: 153–72.
Ghadessy, M., ed. 1988. Registers of Written English: Situational Factors
and Linguistic Features. London: Pinter.
Halliday, M. A. K. 1988. On the language of physical science. In Ghadessy 1988, 162–78.
Janda, R. 1985. Note-taking English as a simplified register. Discourse Processes 8: 437–54.

REGULARIZATION
Regularization is the process of becoming more regular or
rule governed. In a general sense, it implies that events in the
world (or in individuals) become more predictable and orderly.
Regularization in language is one of many kinds of language
change. It occurs when forms in the language are under pressure
to be consistent with a common pattern. The patterns that speakers use in the regularization of the existing language are typically
the same as those that are used with new forms that enter the language and are referred to as productive. Thus, the past tense -ed (with its allomorphs) is productive in English, not just because it is frequent and ordered, but because it is the internalized pattern that speakers automatically employ when a new verb, such as supersize, enters the language. We all agree that the past is supersized and not, for instance, *supersoze. Regularization is studied
diachronically (see synchrony and diachrony) as well as in
contemporary settings; and psycholinguistic research investigates and attempts to account for regularization processes,
particularly for paradigmatic regularization (the regularization
of inflectional paradigms), in the language of individual speakers. It is easier to agree on what has happened in the language
than on the underlying psycholinguistic processes in the minds
of speakers.
There is controversy among linguists as to whether speakers rely on abstract morphophonological algorithms or on more
general analogical processes when they inflect or derive words.
Differing models of mental representation and cognitive functioning can explain the productivity that ultimately makes regularization both possible and likely.

Diachronic Evidence of Regularization


historical linguistics reveals ways that regularization
occurs over time. Modern English reflects regularization in the
inflectional system that started hundreds of years ago. Over time,
English changed from a highly inflected language to a more analytic language, and the inflections that remained were largely
regularized, with some notable exceptions in the verb system.
Old English had different inflectional paradigms for strong verbs
(verbs that changed their vowels to form a past) and weak verbs
(verbs that maintained their stem and added a past tense suffix).
A typical kind of paradigmatic regularization occurred in Middle
English when many strong verbs like helpan "to help," which had various past forms (healp, hulpe, hulpon, hulpen), began to follow the pattern of the more regular weak verbs, leading to just one past form, helped. Much of this took place in Middle English
between the years 1150 and 1500 (Baugh 1957, 189). In early
Middle English, plurals formed with -s were common, but there
were also many plurals formed with -en. During the fourteenth
century, the -s pattern became the regular plural, and today,
of the tens of thousands of nouns in Modern English, we can
point to very few (children, oxen, brethren) that form the plural
with -en. Many irregular words became regularized during this
period or fell into disuse, thus making English a more transparent and consistent language in general. In particular, many Old
English strong verbs simply disappeared while some others were
changed into weak or regular verbs. Thus, over a period of several hundred years, various kinds of regularization took place in
English. Regularization was not omnipresent, however: The most
frequent verbs in English (such as to be, to have, and to do) have
roots in Old English and are irregular. Highly frequent words tend
to resist regularization in other languages as well as in English.

Social Forces That Contribute to Regularization


The historical and societal forces that contribute to change in
spoken language are usually out of the awareness of speakers,
except when those in power exert control through standardization or some other kind of language policy. A standardized language is usually one particular variety designated

Regularization
the official or preferred language, and it does not follow that
the chosen dialect will be any more regularized than less valued varieties. Standardized spelling, by contrast, is typically
a move to simplify and regularize spelling conventions (Wijk
1977). There are cases, however, where official bodies are delegated to choose or change words or expressions in order to
provide conformity with the rules of a language: For instance,
the Academy of the Hebrew Language in Israel is charged with
replacing words borrowed from other languages with words
based on Hebrew semantics and morphology, in this way
providing a kind of regularization of the lexicon of contemporary Hebrew. The French Academy (l'Académie française) compiles a dictionary of the French language and makes recommendations as to the admissibility of loan words, usually in favor of words that are more French, for instance, courrier électronique instead of e-mail. These attempts to control language
by decree are often not successful, especially in keeping out loan
words that are pervasive and international. In any language,
loan words themselves are subject to regularization that brings
them into conformity with the morphology and phonotactics of
the target language. For example, the English word baseball has
been incorporated into Japanese as beisuboru, and the umpire's call to "play ball" is rendered as purei boru.
A powerful regularizing force in a language is the communication pressure that results from becoming more cosmopolitan,
whereas an insular language is more likely to remain irregular
and complex. When a society becomes more multiethnic, there
is often language contact. In addition, a rapidly changing
society may have new technologies and terminologies, a mobile
population carrying language to new communities, and many
speakers who do not know the language well. All of these factors lead to linguistic economy, simplification, and regularization. When new words are coined, their inflections are regular.
Loan words from contact languages are regularized. And, like
first language learners, second language learners (see second
language acquisition) of whatever age tend to regularize
the language they are acquiring. It is very common for learners
of French, for instance, to regularize the irregular second person plural of dire (to say), producing vous disez instead of vous
dites. This regularized form is not permissible in standard French,
but it occurs in various nonstandard dialects and is the norm in
Louisiana Cajun French. When a language is spoken in a homogeneous, isolated community with few non-native speakers and
little outside contact, there is apparently a greater tendency for
complexity to remain in the inflectional system. A comparison
of modern Icelandic and Norwegian provides a good example of
this tendency: Both have Old Norse beginnings, but the relatively
isolated Icelandic has preserved complex morphology, including
many of the strong verbs of Old Norse, whereas comparatively
cosmopolitan Norwegian shows more regularization, with simpler inflectional morphology and many fewer Old Norse strong
verbs (Kusters 2003).

Psycholinguistic Studies
Some of the best evidence we have for the processes of paradigmatic regularization has come from psycholinguistic studies of
both children and adults. We know that adult speakers of English,
for instance, are able to produce appropriate inflectional endings for new words, following a complex set of morphophonological rules. In order to form a plural, one must observe the ending of
the stem of the word, and then add /-s/ or /-z/ or /-iz/ depending
on whether the stem ends in a voiceless sound, a voiced sound,
or a sibilant (backs, bugs, busses). When new words such as hardscape, abdominoplasty, and riffage recently entered English, they
were all treated as regular plurals. At some point, adults appear to
have acquired the rules for making plurals, making it possible to
apply them to new words or to regularize existing words. Adults
are also able to produce appropriate irregular forms in English,
for instance, the past tenses of words like sing, keep, and catch.
An adequate psycholinguistic model must account for both regular and irregular inflections.
Studies of children's acquisition of morphology provide some
insights into the development of these mental representations
(see morphology, acquisition of). It has also long been
observed that young children acquiring language make typical
errors of overregularization: They regularize words that are
irregular and say things like my foots, two mouses, and My
teacher holded the baby rabbits and we patted them, clearly not
as imitations of adult models (Gleason 1966). These errors are
overgeneralizations that children make as they acquire the linguistic system. Overregularized words are evidence that the child
has knowledge about language that goes beyond the memorization of heard forms. Studies of spoken language have shown a
U-shaped curve in children's acquisition of irregular words. In
the first stage, young children produce correct irregular forms
along with uninflected regular words. Somewhat later, at about
the same time that children begin to produce correct regular
pasts and plurals, the overregularized forms appear, and finally
both regular and irregular inflections are used (Marcus et al.
1992). One explanation is that initially, they produce the irregulars correctly as rote unanalyzed forms, but when they begin to
acquire the rules for the regular inflections, they overapply those
rules, unaware of the exceptions. Only later are they able to handle both the regular forms and the irregular ones.
Experimental studies have shown that children as young as
four have systematic knowledge about the inflectional system.
When presented with pictures of creatures with novel names and
told This is a wug. Now there is another one. There are two ___?
they accurately produced the plural wugs (Berko 1958). When
given novel verb forms (This is a man who knows how to spow.
Yesterday he ___?), the children uniformly produced the regular
form spowed. Children were remarkably consistent in producing
the most frequent, productive, and uncomplicated inflections of
nouns and verbs.
There are several theoretical explanations for these phenomena. Some researchers contend that speakers operate with
abstract higher-order rules when inflecting regular forms (Pinker
1991). According to this view, the inflectional algorithm allows
speakers to deal with any regular word, whether it is familiar or
not. Irregular words, however, must be committed to memory.
Other scholars disagree in the interpretation of how it is that
speakers are able to handle new words, and whether activation of internalized rules adequately characterizes the process.
connectionist approaches to language do not rely on rules,
nor do they divide the lexicon into two groups of words, regular and irregular. In connectionist models, the mental processor

relies on exemplars or analogies: Speakers hear a variety of
words over time, some more than others, and their mental representation reflects the weight of these frequencies, the features
of the words, and the circumstances of their use. Frequently
encountered features are recognized and associated, and ultimately the learner produces language that matches the language
that has been heard.
Regardless of the theoretical model, it is clear that individuals
are sensitive to the characteristics of the language around them
and are able to generalize those characteristics to new instances.
By the time children are of preschool age, they have sufficient
knowledge of the most regular features of language to be able to
extend them to words they have never heard before. This kind of
knowledge underlies both productivity and regularization, and
it has implications for language change in general. Languages
everywhere are moved to become more regular in response to
communicative pressure. Speakers carry within themselves the
linguistic tools of regularization, which is a process that reveals
a fundamental characteristic of the way humans organize
information.
Jean Berko Gleason
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Baugh, Albert C. 1957. A History of the English Language. New
York: Appleton-Century-Crofts.
Berko, Jean. 1958. The child's learning of English morphology. Word 14: 150–77.
Gleason, Jean Berko. 1966. Do children imitate? Proceedings of the International Conference on Oral Education of the Deaf 2: 441–48.
Washington, DC: Alexander Graham Bell Association for the Deaf.
Kusters, Wouter. 2003. Linguistic complexity: The influence of social
change on verbal inflection. Ph.D. diss., University of Leiden.
Marcus, Gary, S. Pinker, M. Ullman, M. Hollander, T. J. Rosen, and Fei Xu.
1992. Overregularization in language acquisition. Monographs of the
Society for Research in Child Development 57: 1–182.
Pinker, Steven. 1991. Rules of language. Science 253: 530–35.
Wijk, Axel. 1977. Regularized English: A Proposal for an Effective
Solution of the Reading Problem in the English-Speaking Countries.
Stockholm: Almqvist and Wiksell.

RELEVANCE THEORY
Relevance theory (RT) is best known for its account of verbal
communication and comprehension, which is employed by
people working in pragmatics. It also sets out a broad picture of the principles driving the human cognitive system as a
whole, and this plays a crucial role in underpinning the particular claims made about communication (Sperber and Wilson
[1986] 1995a; Wilson and Sperber 2004).

Relevance and Cognition


According to the RT framework, human cognitive processing quite
generally is geared toward achieving as many improvements to
its representational contents and their organization as possible,
while ensuring that the cost to its energy resources is kept as low
as reasonably possible. At the center of the theory is a technically defined notion of relevance, where relevance is a potential
property of any input to any perceptual or cognitive process. An

input may deliver a variety of different types of cognitive effects to
the system: It may, for instance, combine inferentially with existing assumptions to yield new conclusions (known as contextual
implications), it may provide evidence that strengthens existing
beliefs, it may contradict and eliminate already held information,
or it may rearrange the way information is stored. Such effects
may or may not be beneficial to an individual; that is, they may
increase or decrease the accuracy of the cognitive system's information about the world and may make useful information easier
or harder to access. An input is relevant to a cognitive system only
if it benefits that system, that is, only if it has positive cognitive
effects. The other crucial factor affecting the degree of relevance
of an input (whether an external stimulus or an internal mental representation) is the processing effort it requires: Deriving
effects from any given input requires a mobilization of cognitive
resources, including attention, memory, and various processing
algorithms and heuristics. Thus, the relevance of any input is a
trade-off between the positive cognitive effects it yields and the
processing effort it requires: The greater the ratio of effects to
effort, the greater the relevance of the input.
The basic claim of the framework is that human cognition is
oriented toward maximizing relevance (known as the cognitive
principle of relevance). The evolutionary idea underlying this
claim is that as a result of constant selection pressure toward
increasing cognitive efficiency, humans have evolved procedures to pick out potentially relevant inputs and to process them
in the most cost-effective way (Sperber and Wilson [1986] 1995b).
Human communicative behavior, including verbal communication and comprehension, exploits this prevailing cognitive drive
for relevance in a particular way.
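The effects-to-effort trade-off can be made concrete in a toy comparison. RT itself treats relevance as a comparative property, not a numeric formula, so the numbers below are invented purely to illustrate the claim that greater effects at lower effort mean greater relevance.

```python
# Illustrative only: RT does not quantify relevance. This toy ranks
# candidate inputs by the ratio of positive cognitive effects to
# processing effort, just to make the comparative claim concrete.
# All inputs and values are invented.

inputs = {
    "utterance A": {"effects": 6, "effort": 2},  # many effects, cheap
    "utterance B": {"effects": 6, "effort": 6},  # same effects, costly
    "utterance C": {"effects": 1, "effort": 1},  # few effects, cheap
}

def relevance(stimulus):
    """Greater ratio of effects to effort -> greater relevance."""
    d = inputs[stimulus]
    return d["effects"] / d["effort"]

ranked = sorted(inputs, key=relevance, reverse=True)
print(ranked)  # utterance A outranks B and C
```

The comparison shows why either factor alone is insufficient: B matches A's effects but demands more effort, and C matches A's cheapness but delivers fewer effects.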

Relevance and Linguistic Communication


The starting point for a pragmatic theory (that is, an account
of how speakers and their addressees manage to converge on
a shared interpretation) is the question concerning how hearers are able to bridge the gap between the encoded linguistic
meaning and the speakers intended meaning. The most obvious manifestations of this gap are nonliteral uses of language,
such as metaphor, metonymy, or irony, and cases where the speaker communicates, in addition to the proposition explicitly expressed, a further proposition known as a conversational implicature, exemplified by speaker Y's utterance in (1).
(1) X: We need your written report now.
    Y: I've been very busy recently.
    Implicating: Y hasn't written the report.

There is also a range of tasks involved in determining the proposition explicitly expressed, including disambiguation,
assignment of referents to indexicals, and filling in missing constituents, as in (2), and various other enrichments or
adjustments of encoded content, as indicated in the examples
in (3): (3b) involves a narrowing down of "take time," and (3c) a loosening of the concept encoded by "boiling." In each case, of
course, the particular proposition explicitly expressed is just one
of indefinitely many other possibilities:
(2) He has taken enough from her.
    Expressing: Jim has endured enough abusive treatment from Mary.

Relevance Theory
(3) a. I've eaten.
       Expressing: I've eaten dinner tonight.
    b. Your knee will take time to heal.
       Expressing: The knee will take a substantial amount of time to heal.
    c. The water is boiling.
       Expressing: The water is very hot [not necessarily strictly at boiling point].

How, then, is an addressee able to infer the intended meaning from the encoded linguistic meaning, which is just a schematic
guide or set of clues? According to RT, the answer lies with a special property of overtly communicative acts, which is that they
raise certain expectations of relevance in their addressees, that
is, expectations about the cognitive effects they will yield and the
mental effort they will cost. Quite generally, an utterance comes
with a presumption of its own optimal relevance; that is, there
is an implicit guarantee that the utterance is the most relevant
one the speaker could have produced, given his or her competence and goals, and that it is at least relevant enough to be worth
processing. This is known as the communicative principle of relevance and it follows from the cognitive principle of relevance in
conjunction with the overtness of the intention that accompanies
an utterance: The speaker openly requests effort (attention) from
the addressee, who is thereby entitled to expect a certain quality
of information requiring no gratuitous expenditure of effort.
That utterances carry this presumption licenses a particular
comprehension procedure, which, in successful communication, reduces the number of possible interpretations to one.
According to the relevance-theoretic comprehension procedure:
(a) Follow a path of least effort in computing cognitive
effects: Test interpretive hypotheses (disambiguations, reference resolution, lexical adjustments, implicatures, etc.) in
order of accessibility.
(b) Stop when your expectations of relevance are satisfied.
This procedure is automatically applied in the on-line processing of verbal utterances: Taking the schematic decoded linguistic meaning as input, processes of pragmatic enrichment at the
explicit level occur in parallel with the derivation of the implications of the utterance.
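The two clauses of the comprehension procedure can be read as a simple generate-and-test loop. The sketch below is purely illustrative: the candidate interpretations and the relevance check are hypothetical placeholders, and RT itself makes no claim about such an implementation.

```python
# Illustrative sketch of the relevance-theoretic comprehension
# procedure: test interpretations in order of accessibility and
# stop at the first that satisfies the hearer's expectations of
# relevance. Candidates and the test predicate are placeholders.

def comprehend(candidates, satisfies_expectations):
    """candidates: interpretations ordered by accessibility,
    most accessible first; satisfies_expectations: stand-in for
    the hearer's expectations of relevance."""
    for interpretation in candidates:   # (a) path of least effort
        if satisfies_expectations(interpretation):
            return interpretation       # (b) stop when satisfied
    return None  # no interpretation accepted: communication fails

# Toy run for Y's utterance "I've been very busy recently" in (1),
# where X's expectation concerns the report:
ordered = [
    "Y is making small talk",
    "Y hasn't written the report",
]
result = comprehend(ordered, lambda i: "report" in i)
print(result)  # -> Y hasn't written the report
```

The design point is only that interpretation is serial and satisficing: the hearer does not compare all interpretations and pick the best, but accepts the first accessible one that meets the expected level of relevance.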
Central to the working of the procedure is a subprocess of
mutual adjustment of explicit content and contextual implications, a process guided and constrained by expectations of relevance. Here is a brief example involving the adjustment of explicit
content in response to expected implications and where the outcome is a narrowing down of a lexically encoded meaning:
(4) X (to Y): Be careful. The path is uneven.
Given that the first part of X's utterance warns Y to take care, Y is very likely to expect the second part of the utterance to achieve relevance by explaining or elaborating on why, or in what way, he should take care. Now, virtually every path is, strictly speaking, uneven to some degree or other (i.e., not a perfect plane), but given that Y is looking for a particular kind of implication, he or she will enrich the very general encoded concept UNEVEN so that the proposition explicitly communicated provides appropriate inferential warrant for such implications of the utterance as: Y might trip over, Y should take small steps, Y should keep an eye on the path, and so on. The result is a concept, which we can label UNEVEN*, whose denotation is a proper subset of the denotation of the lexical concept UNEVEN.
A distinctive RT claim in this context is that metaphorical
and hyperbolic uses of words involve a kind of concept broadening (loose use), and so fall within this single process of lexical
meaning adjustment. For instance, an utterance of the sentence
in (5) could be taken as an ordinary broadening (if, say, the run
referred to was a little less than 26 miles) or as hyperbolic (if it
was considerably less than the length of a marathon) or as metaphorical for a long, arduous, exhausting experience, whether
physical or mental:
(5) It was a marathon.
(For much more detailed exemplification of the RT-based account of lexical adjustment, resulting in concept broadening, or narrowing, or a combination of the two, see Carston 2002; Wilson and Sperber 2002; Wilson and Carston 2007.)
As these examples indicate, on the RT view, the contribution of
pragmatics to the proposition expressed by an utterance (i.e., its
truth-conditional content) is very wide-ranging, going well
beyond the role of simply providing contextual values for indexicals. So, unlike the standard view of semantics and pragmatics
in the philosophy of language, the RT position on the semantics/
pragmatics distinction is that it does not coincide with the distinction between explicit utterance content and implicature but,
rather, with the distinction between context-free linguistically
encoded meaning and what is communicated.

Relevance Theory in a Broader Perspective


The output of pragmatic interpretation is a conclusion about speakers' communicative intentions, and the inferential process may rely on assumptions about other mental states of speakers (their beliefs and desires). So there is a close relationship between utterance comprehension and the theory of mind capacity, both of which depend on a metarepresentational ability, that is, an ability to represent representations and attribute them to others. Dan Sperber and Deirdre Wilson (2002) argue that the pragmatic capacity is a subsystem of the mental system responsible for interpreting people's behavior in terms of the underlying mental states that cause it. They further argue, in line with current views in evolutionary psychology on cognitive architecture, that it is an autonomous domain-specific system, operating just on ostensive stimuli (in particular, utterances) and having its own dedicated procedures, as outlined here. Thus, it is a modular system, a submodule of the more general theory-of-mind module.
There is currently a strong emphasis on spelling out the
empirical predictions of relevance theory, comparing them with
other theories of cognition and/or pragmatics and subjecting
them to experimental testing. There are three broad areas of
such empirical work. The first involves using familiar psycholinguistic techniques (e.g., measuring participants' reaction
times in a range of on-line tasks) to investigate the time course
of processing and the relative allocation of time/effort to different aspects of pragmatic interpretation (see, for instance, Noveck
and Sperber 2004). The second is research into the development of communicative competence in children and its relation to their linguistic maturation, on the one hand, and to their developing theory-of-mind capacity, on the other. There is evidence
of a close relation between mature pragmatic competence (e.g.,
ability to interpret nonliteral uses of language, such as metaphor
and irony) and mature theory of mind (e.g., ability to attribute
to others beliefs that one knows to be false). But it is also clear
that children can communicate ostensively from around the age of two years, and so this early ability does not depend on a full-fledged metarepresentational capacity, which comes some years later, but perhaps on some earlier-emerging component of theory of mind, such as a capacity for joint attention. Finally, there is
empirical investigation of people with atypical or impaired communicative capacities, including those with autism, Williams
syndrome, or schizophrenia. The results so far provide tentative
support for RT predictions about the differences in the processing of literal, metaphorical, and ironical uses of language (see
Wilson 2005; Carston and Powell 2006).
Robyn Carston
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Carston, Robyn. 2002. Thoughts and Utterances: The Pragmatics of Explicit
Communication. Oxford: Blackwell.
Carston, Robyn, and George Powell. 2006. Relevance theory: New directions and developments. In The Oxford Handbook of Philosophy of Language, ed. Ernie Lepore and Barry C. Smith, 341–60. Oxford: Oxford University Press.
Noveck, Ira, and Dan Sperber, eds. 2004. Experimental Pragmatics. Basingstoke, UK: Palgrave.
Sperber, Dan, and Deirdre Wilson. [1986] 1995a. Relevance: Communication and Cognition. 2d ed. Oxford: Blackwell.
———. [1986] 1995b. Postface. In Sperber and Wilson 1995a, 255–79.
———. 2002. Pragmatics, modularity and mind-reading. Mind and Language 17: 3–23.
Wilson, Deirdre. 2005. New directions for research on pragmatics and modularity. Lingua 115: 1129–46.
Wilson, Deirdre, and Robyn Carston. 2007. A unitary approach to lexical pragmatics: Relevance, inference and ad hoc concepts. In Advances in Pragmatics, ed. Noël Burton-Roberts. Basingstoke, UK: Palgrave.
Wilson, Deirdre, and Dan Sperber. 2002. Truthfulness and relevance. Mind 111: 583–632.
———. 2004. Relevance theory. In The Handbook of Pragmatics, ed. Laurence Horn and Gregory Ward, 607–32. Oxford: Blackwell.

RELIGION AND LANGUAGE


The topic of religion and language could potentially fill a book.
In order to make it more manageable, this entry concentrates
mainly on theistic, especially Judeo-Christian, usage.

Descriptive Religious Language


In speaking and writing of God, or however they conceptualize
the ultimate, religious people attempt to portray a mystery that
transcends empirical reality by using language that is normally
applied to things or (particularly) human beings. Islam in particular emphasizes the transcendence of God.
Figurative forms of language draw on the associations of language in its home in the mundane world to highlight similarities
with this mysterious divine world. Using metaphors, people speak about God in language suggestive of (Soskice 1985, 51) a father, mother, master, king, shepherd, rock, fire, or wind, but without intending these references to be taken literally. To assist reflection and argument, religions sometimes develop more stable, extended, systematic (and often complementary) linguistic models from such illuminating metaphors, for example, God as personal; the death of Christ as a victory or sacrifice.
Traditional accounts argue that such models can truly represent,
although they do not literally describe, divine reality, but more
radical thinkers treat them as no more than imaginative fictions
whose purpose is to motivate and direct people's spiritual and
moral lives and experiences.
Religious analogies are sometimes regarded as different
only in degree from religious metaphors, but many insist that
they apply to God literally and function as legitimate extensions
of their normal application. So God is really wise, merciful, and
living, but in a stretched sense appropriate to God's very different nature when compared with humans. (The same may be said of the meaning of "clever" when applied to dogs or computers.) It is possible partly to specify and thus partly replace analogical religious language, providing that we know the ways in which (say) God's love is similar to that of a loving person and the ways
it is different. It is only when analogies are specified that they
can be reliably used in argument; metaphors are frequently too
vague to allow inferences to be drawn from them.
Extended metaphors in narrative form are used to speak of
the actions of divinities. In myths, God or the gods are pictured
as interacting with this world; in parables, this activity is said to
be like a wholly human situation (many are explicit similes). In
both cases, the stories are often self-involving and possess considerable emotional power and salience.

Performative Religious Language


In addition to this cognitive (fact-asserting) function, religious language is also used to perform many noncognitive
tasks through speech-acts in which attitudes or feelings are
expressed; commitments, vows, and requests are made; or obligations prescribed. J. L. Austin distinguished this illocutionary force from an utterance's consequences (perlocution). In worship, religious persons express their trust, gratitude, awe, and longing and make prayerful requests and promises.
This language is intended to affect God, but it often also evokes
and deepens the faith of other worshippers as well as producing
a reinforcing effect on the language users themselves. In many
religions, especially nontheistic faiths, the language of meditation (including the repetition of mantras) serves to evoke mystical experiences or spiritual illumination of various kinds: for
example, the realization of the oneness of the self with absolute
Brahman in Hinduism or the enlightenment experience that
releases from dukkha (the unsatisfactoriness or suffering of
life) in Buddhism.

Religious Language and Religious Life


According to Ludwig Wittgenstein, all language is rooted within
human activity, as part of a form of life. Some argue that religion itself is a form of life, or at least that some of its distinctive activities (for example, its rituals) are examples of this category.
Much of the language used in these contexts may be thought of as religious "language-games," such as praying or confessing, that possess a particular set of rules or "logical grammar" very different from that operating in some other language-games.
(Thus, predestination is said to be less a theory than "a sigh, or a cry" [Wittgenstein 1980, 30e].) For Wittgenstein, the forms of life are given, for this is what people do, and the justification of a language-game lies in our acting. The meaning of religious
belief, therefore, may be thought of as thoroughly grounded in
religious behavior.
Christian theologians influenced by Wittgenstein sometimes
treat Christianity as a learned cultural-linguistic system, with
doctrine as the rules of its grammar; others have developed his
reflections into a wholly noncognitive view, in which belief in
God is nothing more than allegiance to a set of spiritual values.

Religious Readers and Religious Writers


Hermeneutics (the theory of interpretation; see philology and
hermeneutics) is key to understanding how sacred texts are
read. Friedrich Schleiermacher saw the reader's task as that of uncovering the author's intended meaning, but many insist that writing distances the author from the text and that the reader has his or her own part to play. Certainly, readers are always interpreters of the text and never approach it with "innocent eyes" (without their own preconceptions and worldview). For Hans-Georg Gadamer, the text's horizon of meaning does not swamp that of the reader; rather, the two interact and fuse, resulting in a new interpretation that goes beyond them both.
While religious conservatives seek to preserve a single understanding of the meaning of a text (and sometimes insist on its
literal interpretation), more liberal and radical scholars regard
it as perfectly legitimate for religious readers to find other meanings there that go beyond what the original author consciously
intended, even if that can be recovered. This second perspective
works well for some sacred poetry, and it fits the claim that religious responses are at their most authentic when they are most
personal. It is more difficult to sustain, however, for religious history and biography and wherever readers care about what the
author meant to say in using this language. Such considerations
raise theological questions about the divine inspiration of the
text and its role as a medium of revelation, in which God speaks
through sacred scripture (whether by infallible dictation or more
generally perhaps by superintending its writing, or merely by
authorizing or appropriating the fallible authors language).
Jeff Astley
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Astley, Jeff. 2004. Exploring God-Talk: Using Language in Religion.
London: Darton, Longman and Todd.
Caird, George B. 1980. The Language and Imagery of the Bible. London:
Duckworth.
Dhavamony, Mariasusai. 1973. Phenomenology of Religion. Rome:
Gregorian University Press.
Smart, Ninian. 1996. Dimensions of the Sacred: An Anatomy of the World's Beliefs. London: HarperCollins.
Soskice, Janet Martin. 1985. Metaphor and Religious Language. Oxford: Clarendon.
Stiver, Dan R. 1996. The Philosophy of Religious Language: Sign, Symbol,
and Story. Oxford: Blackwell.

Thiselton, Anthony C. 1992. New Horizons in Hermeneutics. London: HarperCollins.
Wittgenstein, Ludwig. 1980. Culture and Value. Ed. G. H. von Wright,
trans. Peter Winch. Oxford: Blackwell.

REPRESENTATIONS
Definition
The core concept of a representation is made up of two distinguishable ideas.
(1) Any representation, r, must be about, or mean, or have as
its content, some distinct item s;
(2) r must be employable, in the absence of s, to guide the behaviors of the consumers of r (organisms or systems that use r) with respect to s.
To say that these ideas are distinguishable is not to say that they
are separable. On some accounts of representation, (1) is a function of (2). That is, a representation can have a distinct item as its
content only because of the way it is utilized by its consumers.
The relation between (1) and (2) is, however, disputed.

Original and Derived Representations


It is common to distinguish original and derived representations.
The distinction is intuitively clear, but its application less so. The
distinction pertains to how representations acquire their content. Some seem to acquire their content in virtue of intentional,
hence representational, states of subjects. Many people think
this is true of language: A form of words has content only because
of the intentions and understanding of its users (see intentionality). When persons use a form of words in a context
that is not metaphorical, ironic, or sarcastic, they intend their
words to be taken in a certain way, and hearers/readers of their
words understand that they intend their words to be taken in this
way, and so on (Grice 1957; see communicative intention).
Language, then, would consist in derived representations: representations whose content derives from the intentional states of
its users. Intentional states, on the other hand, would be original
representations: representations whose content is not similarly
derived.

The Problem of Representation


This conception of the relation between original and derived
representation has exerted an immense influence on the way the
concept of representation has been developed in recent decades.
It entails that if we want to understand representations, then our
focus must be on intentional states. If we understand the content
of derived representations as issuing from original representations, then we must understand how the latter can have their
content. All representation ultimately reduces to mental representation. The difficulties inherent in this project were identified
by Ludwig Wittgenstein (1953).
Wittgenstein argued that the content of an intentional state cannot be determined by any fact, conscious or unconscious, about the state's subject. Consider, first, an imagistic conception of content. One might suppose, for example, that when I use the word "dog" an image of a dog must somehow flash before my mind. However, as Wittgenstein showed, such an image
is neither necessary nor sufficient for content. It is not necessary because, typically, when I use the word "dog," no such image appears to me. It is not sufficient because the image is, itself, just another symbol, and its content needs to be interpreted no less than that of the word. An image of a dog can stand for many things (dog, furry creature, creature with four legs, and so on), and so the image can have content only if there is a further act of interpretation on the part of its subject: an act that serves
to disambiguate the image. However, an act of interpretation is
itself an intentional state. Therefore, we have made no progress
in understanding how an intentional state has content; we have
simply replaced one intentional state with another.
These problems remain when we shift to more sophisticated
accounts of content. In one such account, content is understood
in terms of the following of rules: In using a sign, I consciously
or unconsciously follow a rule, a rule that determines how the
sign is to be applied in particular cases. Wittgenstein argued that
there is no fact about an individual that determines that he or
she is following one rule rather than another. Any behavior in
which I engage is compatible with an indefinite number of rules. In continuing the mathematical sequence 2, 4, 6, 8, I could be following the "n + 2" rule or the "n + 2 if and only if n is less than 32; if not, n + 4" rule. And so on for an infinite number of n's. Similarly, in applying the word "dog," I could be following the "Apply 'dog' to all and only dogs" rule, or the "Apply 'dog' to all and only dogs unless the dog is first seen after 2020, in which case apply 'cat'" rule, and so on for an infinite number of permutations (see projectibility of predicates). Crucially, this point also extends
to mental rehearsals of a rule, themselves just more subtle forms of behavior. It would be implausible to suppose that whenever I apply a rule I must mentally rehearse all the possible situations in which the rule might apply, for this would entail that whenever I followed a rule, I must be simultaneously thinking an infinite number of thoughts.
These examples are outlandish but are merely ways of making
graphic a simple point: Any rule is, logically, no different from
a word. The rule is just another symbol and, as such, stands in
need of interpretation if it is to have content. But interpreting is
a representational state. And so, in our attempt to understand
original representation, we have merely substituted the problem
of understanding the content of one representation with that of
understanding the content of another.
Wittgenstein's response to this "rule-following paradox" turns on the appeal to practice. This response is, however, deeply problematic, at least if understood as a constructive attempt to explain the foundations of meaning. A practice is, as he put it, what we do. But doing seems to be a form of acting. And actions are essentially connected to, and presuppose, prior intentional states of a subject. That is, both the status of an event as an action and its identity as the particular action that it is depend on its relation to a subject's intentional states. Therefore, the appeal to action seems to presuppose representational states and so cannot explain what representations are (McDowell 1992; Hurley 1998; Rowlands 2006).

Naturalizing Representation
Wittgenstein's legacy, therefore, is a convincing account of what representation is not, coupled with a highly questionable account of what representation is. One prominent response to this legacy is the attempt to naturalize representation: to explain what makes something a representation by appealing only to states that are nonintentional or less than fully intentional. These attempts can be divided, roughly, into informational and teleological approaches.
Fred Dretske (1981) argued that representation can be
explained in terms of information: An item qualifies as representational only if it carries information about some item
extrinsic to it. Information reduces to relations of conditional
probability though precisely which relations is a matter of dispute. According to the stringent version, defended by Dretske,
information requires a conditional probability of 1. That is, for r
to carry the information that s, the probability of s given r must
be 1 (i.e., given r, s must be certain). Other versions (e.g., Lloyd
1989) identify information only with an increase in conditional
probability, though not necessarily to the level of 1. On this view,
r will carry information about s if the probability of s given r is
greater than the probability of s given not r.
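The two criteria can be stated over simple co-occurrence counts. The sketch below is an illustration with hypothetical numbers, not part of either account.

```python
# Sketch of the two informational criteria, computed from
# hypothetical co-occurrence counts of a signal r and a state s
# over 100 observations (illustrative numbers only):
# s-and-r 40 times, r-without-s 10, s-without-r 20, neither 30.

total = 100
n_r = 50            # r occurs 50 times (40 with s, 10 without)
n_s_and_r = 40
n_s_and_not_r = 20

p_s_given_r = n_s_and_r / n_r                    # P(s | r)     = 0.8
p_s_given_not_r = n_s_and_not_r / (total - n_r)  # P(s | not-r) = 0.4

# Dretske's stringent criterion: r carries the information that s
# only if P(s | r) = 1. Not met here.
dretske_info = (p_s_given_r == 1.0)

# Weaker criterion (e.g., Lloyd 1989): r carries information about s
# if P(s | r) > P(s | not-r). Met here.
weak_info = (p_s_given_r > p_s_given_not_r)

print(dretske_info, weak_info)  # -> False True
```

The gap between the two verdicts is the substance of the dispute: on the stringent reading almost no natural signal carries information, while on the weak reading nearly any reliable correlation does.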
The primary drawback with the informational account is its well-documented problems accommodating an essential feature of representations: normativity (Dretske 1986; Fodor 1990). Consider the mental representation of a dog, DOG. If DOG occurs, then the world should, in an appropriate way, contain a dog. The representation, therefore, makes a normative claim: a claim about the way the world should be, given that the representation is instantiated. However, it is likely that DOG can be caused by things that are not dogs (foxes in the distance or cats on a dark night, for example). So, what DOG is most reliably correlated with is not dogs but a disjunction of dogs or foxes-in-the-distance or cats-on-a-dark-night. But relations of conditional probability (the core of the concept of information) are a function of reliable correlation. So, the pure informational account of representation cannot distinguish between the way the world should be when DOG is instantiated and the way the world in fact is. Representation is normative in a way that information is not.
Central to teleological approaches is the concept of proper
function (Millikan 1984, 1993; Papineau 1984). This is a normative concept: The proper function of a mechanism, trait, or state
is what it is supposed to do, what it has been designed to do, what
it ought to do. It is not what that mechanism generally does or is
disposed to do. What something does or is disposed to do is not
always what it is supposed to do. First, any mechanism, trait, or
process will do many things, not all of which are part of its proper
function. Second, a mechanism, trait, or process can have a
proper function even if it never, or hardly ever, performs it. Third,
a mechanism, trait, or process may have a proper function and
yet not be able to perform it properly.
The normativity of proper functions is grounded in their history. The proper function of a heart is to pump blood because it is
their doing this in the past that explains why hearts proliferated
and exist today. They did not proliferate because of their ability to make noise or produce wiggly lines on an electrocardiogram. Thus, the proper function of an item is determined not by
the present characteristics or dispositions of that item but by its
history: Proper function is essentially historical.
The core idea of the teleological approaches to representation
is that the mechanisms responsible for mental representation are evolutionary products, and that we can therefore understand representation in terms of the concept of proper function. Suppose
we have a representational mechanism M capable of going into
a variety of states or configurations. The direct proper function of
mechanism M is, let us suppose, to enable the organism to track
various environmental contingencies, for example, the presence of predators. In the event that a predator is present, M goes into a particular configuration, F. This state F of the mechanism has the derived proper function (deriving from the proper function of the mechanism M) of indicating the presence of a predator. That is, it has the content, roughly, "predator, there!"
One strength of the teleological approach is the elegant manner in which it satisfies the normativity constraint. Another is
that it does not rely on the questionable assumption that all representation derives from mental representation. The historical-normative account of representation can be applied directly to linguistic forms independently of their connection to the intentional states of subjects. However, all attempts to naturalize
representation are controversial, and their success or failure is
a matter of continuing debate. The worries surrounding these
attempts can be divided into technical and foundational. With
respect to teleological approaches, for example, one technical
worry would be whether the approach can capture the fineness of
grain of certain content attributions, for example, whether it can
distinguish between the content of predator! and tiger! and
thing that will eat me! There is every reason to think that with
sufficient ingenuity, answers to these sorts of technical worries
will be (or, indeed, already have been) forthcoming.
Foundational worries are more serious. These concern
whether the sorts of natural relations invoked by these accounts
are the right sorts of things to explain the nature of representation.
Even if we were to identify items that satisfied ones preferred
naturalistic model of representation, it is argued, it would still be
an open question whether these items were, in fact, representations. Thus, some (e.g., McGinn 1991) argue that naturalistic
accounts provide only a criterion of representational content: a criterion that allows us to determine when one thing is about
another, and what thing it is about. But they fail to explain what
representation actually is. More positively, if we are to properly
understand representation, we also need to understand consciousness. And the prospects for a naturalistic interpretation
of the former are, therefore, tied to a naturalistic interpretation
of the latter.
Mark Rowlands
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Dretske, Fred. 1981. Knowledge and the Flow of Information.
Oxford: Blackwell.
———. 1986. Misrepresentation. In Belief, ed. R. Bogdan, 17–36. Oxford: Oxford University Press.
Fodor, Jerry. 1990. A Theory of Content and Other Essays. Cambridge,
MA: MIT Press.
Grice, Paul. 1957. Meaning. Philosophical Review 66: 377–88.
Hurley, Susan. 1998. Consciousness in Action. Cambridge: Harvard
University Press.
Lloyd, Dan. 1989. Simple Minds. Cambridge, MA: MIT Press.
McDowell, John. 1992. Meaning and intentionality in Wittgenstein's later philosophy. Midwest Studies in Philosophy 17: 30–42.

McGinn, Colin. 1991. The Problem of Consciousness. Oxford: Blackwell.
Millikan, Ruth Garrett. 1984. Language, Thought and Other Biological
Categories. Cambridge, MA: MIT Press.
———. 1993. White Queen Psychology and Other Essays for Alice.
Cambridge, MA: MIT Press.
Papineau, David. 1984. Representation and Reality. Oxford: Blackwell.
Rowlands, Mark. 2006. Body Language: Representation in Action.
Cambridge, MA: MIT Press.
Wittgenstein, Ludwig. 1953. Philosophical Investigations. Oxford:
Blackwell.

RHETORIC AND PERSUASION


Rhetoric is traditionally a discursive skill, either written or oral,
used to produce a desired effect on an audience. Alternatively,
it is a study that focuses on the techniques a person should optimally employ in given contexts to produce such an effect. This
second sense echoes Aristotle's emphasis, set out in his Art of Rhetoric, pertaining to the detection of "the persuasive aspects of each matter" (1991, 69–70).
From an etymological perspective, the English noun rhetoric is derived from the Greek word rhēma (a word), which is linked to rhētōr (a teacher of oratory). Both are ultimately derived from the Greek verb eirō (I say). The notion of rhetoric, therefore, is grounded in language. The English noun persuasion is as much grounded in cognition as it is in language. Even though the -suade element of the word goes back to the Latin suadēre (to advise), the Greek word for persuasion derives from the cognitive verb "to believe."
The formal codification of rhetoric as a heuristic-like system
is something that was first written down circa 475 B.C. by
Corax of Syracuse and involved a simple but effective four-part
structure: introduction, background, arguments, and conclusion. Rhetoric can only operate when individuals are given the
opportunity to speak and persuade in a public forum. In ancient
societies that had tyrannical, royal, aristocratic, or oligarchic
systems, freedom of speech was either restricted or nonexistent, and as a result, rhetorical practice was very limited. It was
only with the advent of democracy, championed by Cleisthenes
in the city-state of Athens around 510 B.C., that individuals
were given the opportunity to practice persuasive oratory for
their own ends. Three distinct genres developed from circa 500
B.C. through to the formal end of Athenian democracy around 320 B.C. These were forensic oratory (the rhetoric of the law
courts), deliberative oratory (the rhetoric of the political arena),
and epideictic oratory (the rhetoric of praise or blame). Each
of these genres had its own separate focus expressed in their
means and topics. Some of the great orators of the time were
Gorgias (an epideictic, sophist orator), Lysias and Isocrates
(two forensic orators), and Demosthenes (the great deliberative orator). Famous Roman orators and writers of rhetorical
handbooks were Cicero (de Oratore and de Inventione) and
Quintilian (Institutio Oratoria). Both Isocrates and Quintilian
were also great educators.
Rhetoric is about structure and strategy. Structure refers
to both the arrangement of the whole process and that of the
speech itself. The first is expressed by means of the five canons of
rhetoric, pertaining to the discovery (heuresis/inventio), arrangement (taxis/dispositio), stylization (lexis/elocutio), memorizing (mnēmē/memoria), and delivery (hypokrisis/pronunciatio) of
persuasive arguments in a speech or essay.
Once a proposition has been decided on, a speaker or writer
needs to go about gathering, discovering, and generating arguments in support of it. A main strategy is to turn to the topics
(topoi/loci). These are places where arguments can be found.
The fact that they are already out there in the world and only
need to be used led Aristotle to refer to them as nonartistic
proofs. Topics can be internal, external, or special. Internal topics are textual strategies for generating arguments, such as definition, comparison, analogy, cause and effect, and testimony.
External topics are literal places and objects where people can
go to find arguments, like reference books in libraries, or search
engines on the Internet, to give a modern example. Special topics are systems that are particular to the three genres of rhetoric.
Deliberative oratory uses special topics like the worthy and
the advantageous in order to persuade people to act or think
in a certain way. Forensic oratory uses special topics appropriate for either defense or prosecution in the law courts. One such
strategy is stasis, a Roman judicial heuristic, which addresses
the three questions of whether something happened (evidence),
what something is (definition) and the quality of what happened
(motives). Epideictic oratory uses strategies relevant for either
amplifying or playing down the virtues and vices of individuals
or institutions.
A second mode of persuasion in this first canon of rhetoric is
what Aristotle termed the artistic proofs, incorporating the three
appeals: logos, pathos, and ethos. Logos centers on whether arguments are deductive or inductive, fallacious or nonfallacious,
syllogistic or enthymemic. Pathos deals with the psychology of
persuasion, focusing on the ways that emotions are triggered
and then channeled in an audience for the speaker's own ends.
Ethos is concerned with the moral character of the speaker, the
trust or admiration an audience has for a speaker or writer. It
also includes ethics in the strategies employed in the language
use itself, like dealing with counterarguments, opponents, and
minorities in a fair and balanced way.
The second canon is concerned with arrangement. This occurs
at two levels: that of the entire speech and of the arguments
themselves. One famous system is that set out in the anonymous
first century b.c.e. Roman manual Rhetorica ad Herennium. This
work, which also deals with the last three canons, stipulates six
parts to a speech: introduction (exordium), background/scene
setting (narratio), a brief list of arguments (divisio), the arguments in favor (confirmatio), the counterarguments (confutatio),
and conclusion (peroratio). The second level of arrangement
concerns sections four and five: which arguments a speaker puts
forward first, which last, and which are placed in the middle and,
most importantly, why.
The third canon deals with style and is realized by the use of
style figures (tropes and schemes) to produce differing linguistic and cognitive levels of parallelism and deviation in order to
draw in, delight, and ultimately persuade listeners and readers.
It also suggests different styles that are suitable for specific occasions: high, middle, or low. The fourth and fifth canons set out the
performance aspects of rhetoric and involve oral, rather than
written, production. These are the memorizing and delivery of
a speech. The latter puts much focus on intonation, prosody,
voice, rhythm, and gesture, something the Roman orators made into an art.
Throughout the latter half of the twentieth century, rhetoric
has often been viewed in university language departments as an
archaic system. One area where rhetoric did flourish was argumentation analysis. In the 1950s, influential work was conducted
by Chaim Perelman and Lucie Olbrechts-Tyteca. What they
called the "new rhetoric" challenged the validity of dialectic and
logico-rational approaches to argumentation, claiming instead
that real arguments are not grounded in absolute or perfect
contexts but, rather, in situations that are characterized by the
possible, plausible, and probable. Hence, a much more pragmatic linguistic approach to evaluating arguments is required.
Other important philosophical and theoretical scholars in this
period were I. A. Richards, with his work in literary criticism,
and Kenneth Burke, with his rhetoric of motives, based on his
notions of identification and appeal, which focuses more on
the reception and context of the persuasive discourse than on
the actual production.
Today, rhetoric and persuasion have dissipated into a
number of associative language-based domains that include
composition, word and image studies, philosophy, psychology, communication studies, argumentation analysis, and
stylistics. Rhetoric is crucial to freshman composition courses
in universities. This has traditionally been the case in the
United States but is becoming increasingly so in the rest of the
world. The structural and systematic nature of rhetoric lends
itself perfectly to teaching writing and oral composition. It is here that the didactic and pedagogical quality of
rhetoric as a skill, in the true Aristotelian sense of technē, to be
applied to other subjects rather than a subject in itself, comes
to the fore.
In communication studies, the term rhetoric has all but disappeared, even if what happens there is essentially still rhetoric as
the ancients described it. This social scientific approach to modern rhetorical theory tends to focus more on cognition and emotion than on language, more on reception than production. This
can be summed up in Herbert S. Simons's claim that persuasion
is about "winning beliefs, not arguments" (2001, xxii). This highlights the need of a persuader to position himself/herself closer
to the persuadee. This attempt at creation of common ground,
which Simons terms a "coactive" approach, must ultimately
draw on the classical notions of ethos and pathos.
Rhetoric is also very much evident in the domain of critical
discourse analysis, where the ordering, presentation, and
omission of information in news discourse is crucial for persuading and manipulating mass audiences. The same can be said in
the word and image domain of advertising discourse, arguably
the most fecund example of rhetoric at work in society in the
modern age. Rhetoric also continues to flourish in argumentation analysis, as can be seen in the work of Frans H. van Eemeren
and R. Grootendorst (2004), whose pragma-dialectic model
of argumentation provides a means of resolving differences of
opinion based on what they term critical discussion.
One current approach that warrants further elaboration is
STYLISTICS: a field which in the twentieth century replaced
and expanded on the earlier study of elocutio in classical rhetoric (Wales 2001, 372). Its object of analysis is primarily literary
discourse, under the premise that literary discourse is either
consciously or subconsciously stylized in order to induce certain persuasive emotive and aesthetic effects in readers. The
rhetorical notion of foregrounding (commonly through
such techniques as parallelism, repetition, and deviation) is
central to this approach. Stylistics is the analytic side of the
literary rhetorical coin, the other being production, which
currently finds form in university creative writing programs.
Stylistics has also made a journey similar to that of rhetoric
and persuasion: from a focus on production and form, to one of
interaction and then reception, and finally on to the domain of
the mind, beliefs, and emotion. This can be seen in recent work
being conducted on cognitive stylistics, also often referred to
as cognitive poetics. This augmentation shows how linguistic and cognitive approaches to stylistics are "not contrary
but complementary" (M. Burke 2005, 194). Cognitive stylistics
thus explicitly seeks to integrate the language-oriented nature
of mainstream stylistic analysis with the mind-oriented nature
of emotion and cognitive linguistics. This is a line of
development that is at the very heart of twenty-first century
rhetoric and persuasion.
Michael Burke
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Aristotle. 1991. The Art of Rhetoric. Ed. Hugh Lawson-Tancred.
London: Penguin.
Burke, Kenneth. 1969. A Rhetoric of Motives. Berkeley and Los
Angeles: University of California Press.
Burke, Michael. 2005. How cognition can augment stylistic analysis.
European Journal of English Studies 9.2: 185–95.
Cockcroft, Robert, and S. Cockcroft. 2005. Persuading People: An
Introduction to Rhetoric. London: Palgrave. This practical book
explores persuasive techniques in English and provides clear links to
stylistics.
Eemeren, Frans H. van, and R. Grootendorst. 2004. A Systematic
Theory of Argumentation: The Pragma-Dialectic Approach.
Cambridge: Cambridge University Press.
Kennedy, George A. 1994. A New History of Classical Rhetoric. Princeton,
NJ: Princeton University Press. This book brings together several of
Kennedy's previous works to provide an engaging and informative
account of the history of rhetoric and persuasion.
Perelman, Chaim, and L. Olbrechts-Tyteca. 1969. The New Rhetoric: A
Treatise on Argumentation. Notre Dame, IN: University of Notre Dame
Press.
Simons, Herbert S. 2001. Persuasion in Society. London: Sage. Taking a
social sciences approach to communication, this book looks at rhetoric and persuasion in a number of everyday domains, such as advertising, politics, law, and psychology.
Wales, Katie. 2001. A Dictionary of Stylistics. 2d ed. London: Longman.

RHYME AND ASSONANCE


Rhyme and assonance are patterns of phonetic similarity that
function primarily as devices of poetic euphony. In phonology,
rhyme (sometimes distinguished by the spelling rime) denotes
the unit comprising a vowel (the nucleus) and optional following consonants (the coda) that follows an onset in a syllable.
In poetics, rhyme refers to the repetition of similar segments
beginning with a nucleus or to a word linked perceptually to
others by such repetition. Rhymes are often classified by the
kind of segment repeated, the exactness of the repetition, and
the location of the repetition within surrounding segments.
Masculine rhyme pairs words on final stressed syllables, like
kind/blind. Feminine rhyme pairs words on stressed syllables
with one or more identical unstressed syllables following, like
litter/bitter or authority/majority. In grammatical rhyme, the
word-final syllable(s) are similar inflectional suffixes. In exact
or full rhyme, the repeated segments are highly similar while
the syllable onsets are not. Slant, half, or near rhyme allows various deviations from strict similarity: in the syllable coda, such
as time/nine or light/rights; in the nucleus, like want/flaunt; or
even in syllabic structure, as in lying/mine. End rhymes occur at
the ends of poetic lines (see verse line), while internal rhymes
occur irregularly within a group of lines.
Assonance is the repetition of like vowels in neighboring syllables. Nigel Fabb (1997) observes that some instances of assonance could be reclassified as rhyme in traditions that allow
sets of phonologically similar consonants in codas. Rhyme
and assonance usually, but not necessarily, involve stressed
syllables.
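The classifications above lend themselves to simple formalization. The sketch below is illustrative only, not from this entry: words are assumed to be pre-transcribed as hypothetical (onset, nucleus, coda) triples of phoneme strings for their final stressed syllable.

```python
def full_rhyme(a, b):
    """Full (exact) rhyme: nucleus and coda match while the onsets differ.

    Each argument is a hypothetical (onset, nucleus, coda) triple of
    phoneme strings for a word's final stressed syllable.
    """
    return a[1:] == b[1:] and a[0] != b[0]


def assonance(a, b):
    """Assonance: the vowels (nuclei) match, whatever the onset and coda."""
    return a[1] == b[1]


# "kind" / "blind": full rhyme (shared nucleus and coda, different onsets)
print(full_rhyme(("k", "ay", "nd"), ("bl", "ay", "nd")))  # True
# "time" / "nine": a slant rhyme; by these strict predicates it
# registers only as assonance
print(full_rhyme(("t", "ay", "m"), ("n", "ay", "n")))     # False
print(assonance(("t", "ay", "m"), ("n", "ay", "n")))      # True
```

Extending the predicate to feminine rhyme would mean comparing the final stressed syllable together with any following unstressed syllables, and slant rhyme would relax the strict equality of coda or nucleus to a phonological similarity measure.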
Rhyme and assonance are essentially perceptual phenomena, and classifications like those here typically serve to facilitate descriptions of their perception by listeners and readers.
Masculine rhymes register more vigorously with listeners and
feminine rhymes less so, and grammatical rhymes register very
little or not at all. Slant rhymes are less vigorous than full rhymes,
while internal rhymes are ordinarily less noticeable than end
rhymes. The important question of effective distance between
repeated segments has not been closely researched. With assonance, as with alliteration, repetition outside of a rather narrow
structural window, such as a line or two adjacent lines, is not
likely to be perceived as echoic. Rhyme that demarcates poetic
lines has a larger effective window because the segmentation is
overdetermined by other operating schemes.
Outside of poetry and verbal play, rhyme and assonance
figure little in linguistic organization. They sometimes help
motivate word coinage, as, for example, in the so-called
rhyme-motivated reduplications of English, like mumbo-jumbo, teeny-weeny, or claptrap. Nevertheless, the human
facility for rhyme and rhyme recognition serves as an important tool for linguistic investigation. psycholinguistics has
revealed that rhyme words are associated in memory, so that a
word prompts recall of its rhyme cohorts, and the presence of
a rhyming competitor of a target item delays visual identification of that word's referent. Phonologists have used the poetic
classification of half rhymes to investigate the perceptual basis
for phonological similarity judgments. Eve Sweetser (2006)
suggests that rhyme relations between words can prompt
conceptual blending.
Claiborne Rice
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Fabb, Nigel. 1997. Linguistics and Literature: Language in the Verbal Arts
of the World. Oxford: Blackwell.
Sweetser, Eve. 2006. Whose rhyme is whose reason? Sound and sense in
Cyrano de Bergerac. Language and Literature 15.1: 29–54.

RHYTHM
The notion of rhythm is widely present in language sciences and
an abundant literature ranging from acoustics to phonological
theory and neuropsychology is available, leading to several and
sometimes conflicting definitions. Nonetheless, most would
agree that rhythm involves the temporal organization of speech
and results from a threefold complex interaction among
the nature of the rhythmic atomic constituents;
the use of alternations between more and less prominent constituents;
the pattern of regularity for the grouping of the constituents into longer units.
According to this definition, rhythm is fundamental to languages
(it seems that no language may be defined as arrhythmic, even if
the last two proposed dimensions may be irrelevant in specific
languages).
In the twentieth century, phonetics searched mainly for the
acoustic correlates of rhythm units, while phonology with the
exception of metrical phonology usually considered rhythm as
a mere sequence of timing slots on which linguistic properties
are cast. In addition, cognitive science addressed distinct questions, namely, why languages are rhythmic and whether rhythm
plays a role in the cognitive processing of language.

Current State of the Field


RHYTHMIC TYPOLOGY. The long-lasting view that speech rhythm
would consist in isochronous occurrences of some acoustic
event or unit along the speech stream, popularized by Kenneth
L. Pike (1945) and David Abercrombie (1967), is now widely
rejected. Explaining how rhythm is perceptually salient despite
the absence of objective regularity and why it seems nevertheless possible to gather languages into a few rhythmic categories
is consequently a challenging issue. These categories, initially
known as stress-, syllable- and mora-timed, have been renamed
stress-, syllable- and mora-based in Laver (1994, 528–9). This
distinction is enlightening about the change from a discrete
to a continuous approach to rhythm variation across languages: Rhythmic typology has to cope with languages that
do not strictly match categorical prototypes, and there is now
general agreement that this typology better reflects tendencies,
rather than mutually exclusive categories (Roach 1982; Dauer
1983). According to Rebecca M. Dauer, the difference between
stress-timed and syllable-timed languages has to do with "differences in syllable structure, vowel reduction and the phonetic
realization of stress and its influence on the linguistic system"
(1983, 51). In other words, she states that typological differences
in rhythm are side effects of the phonological characteristics of
languages.
Over the last decade, the durational correlates of rhythm
types have been thoroughly investigated, highlighting in particular distributional properties of vocalic and intervocalic segment durations (Ramus, Nespor, and Mehler 1999) and the
pairwise variability of these segment durations (Grabe and Low
2002). For example, British English exhibits both vowel reduction and fairly complex syllable structure, yielding a low proportion of vocalic intervals and a high variation in the duration of
consonantal intervals. In contrast, European Spanish lacks


vowel reduction, and its syllabic structure is simpler, resulting in
a reversed pattern with a relatively higher proportion of vocalic
intervals and a lower duration variability in consonant intervals.
Several experimental studies have emphasized the salience of
these indices in language discrimination tasks performed by
human subjects (e.g., Ramus, Dupoux, and Mehler 2003), and
an abundant literature has followed. However, further investigation is still needed for understanding some dynamical aspects
of rhythm (metric patterns, speech rate, etc.) and the possible
interaction among intensity, pitch, and duration, explicitly in
terms of rhythm.
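The durational indices mentioned above can be stated concretely. The following is a minimal sketch, not code from the cited studies: %V, the proportion of utterance duration that is vocalic (as in Ramus, Nespor, and Mehler 1999), and a normalized pairwise variability index over successive interval durations (in the spirit of Grabe and Low 2002). All durations are invented for illustration.

```python
def percent_v(vocalic, consonantal):
    """%V: share of total utterance duration occupied by vocalic intervals."""
    total = sum(vocalic) + sum(consonantal)
    return 100.0 * sum(vocalic) / total


def npvi(durations):
    """Normalized Pairwise Variability Index over successive durations.

    Mean absolute difference of each adjacent pair of interval durations,
    normalized by the pair's mean and scaled by 100; higher values go with
    the greater durational variability typical of "stress-based" languages.
    """
    pairs = list(zip(durations, durations[1:]))
    return 100.0 * sum(abs(a - b) / ((a + b) / 2.0) for a, b in pairs) / len(pairs)


# Invented vowel durations (ms): a reduction-heavy alternating pattern
# versus a near-uniform one.
uneven = [40, 120, 35, 110]
even = [80, 85, 78, 82]
print(npvi(uneven) > npvi(even))  # True: the uneven series is more variable
```

A language with vowel reduction and complex syllable structure would, on these measures, show a lower %V and a higher variability index than one without, matching the English/Spanish contrast described above.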
RELATION AMONG RHYTHM, METRICS, AND STRESS. As Anthony
Fox pointed out, "[rhythm] is rarely taken into account in a formal way in phonological theory and description" (2000, 86).
However, nonlinear approaches, and especially metrical phonology (see meter), take rhythm into consideration by investigating both the structure and the weight of rhythmic constituents
(e.g., Hyman 1985; Blevins 1995) and their relation to metric and
stress patterns (Hayes 1995).
WHY ARE LANGUAGES RHYTHMIC? There is extensive evidence
in support of speech rhythmicity as fundamental for speech
communication. From the production side, no uncontroversial position has yet emerged. For instance, Peter MacNeilage
(1998) proposed an evolutionary scenario deriving speech
rhythmicity from cycles of mandibular oscillation during
ingestion; Robert F. Port (2003) proposed that neurocognitive
oscillators could synchronize the production of prominent
events with perceptual attention, renewing approaches initiated in psychology (for a review, see Evans and Clynes 1986).
Furthermore, quite a few experiments have assessed human
awareness of rhythm differences for neonates, young infants,
or adults, and several studies suggest that rhythm plays an
important role in segmenting the speech stream and thus for
language acquisition (see, among others, Morgan and Demuth
1996 and Mehler and Nespor 2004; see also speech perception in infants ).
François Pellegrino
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Abercrombie, David. 1967. Elements of General Phonetics.
Edinburgh: Edinburgh University Press.
Blevins, Juliette. 1995. The syllable in phonological theory. In
Handbook of Phonological Theory, ed. John A. Goldsmith, 206–44.
Oxford: Blackwell.
Dauer, Rebecca M. 1983. Stress-timing and syllable-timing revisited.
Journal of Phonetics 11: 51–62.
Evans, James R., and M. Clynes, eds. 1986. Rhythm in Psychological,
Linguistic and Musical Processes. Springfield, IL: Charles C. Thomas.
Fox, Anthony. 2000. Prosodic Features and Prosodic Structure: The
Phonology of Suprasegmentals. Oxford: Oxford University Press.
Grabe, Esther, and E. L. Low. 2002. Durational variability in speech and
the rhythm class hypothesis. In Papers in Laboratory Phonology 7, ed.
Carlos Gussenhoven and Natasha Warner, 51546. Berlin: Mouton de
Gruyter.
Hayes, Bruce. 1995. Metrical Stress Theory. Chicago: University of Chicago
Press.
Hyman, Larry M. 1985. A Theory of Phonological Weight. Dordrecht, the
Netherlands: Foris.
Laver, John. 1994. Principles of Phonetics. Cambridge: Cambridge
University Press.
MacNeilage, Peter. 1998. The frame/content theory of evolution of
speech production. Behavioral and Brain Sciences 21: 499–546.
Mehler, Jacques, and M. Nespor. 2004. Linguistic rhythm and the development of language. In Structures and Beyond: The Cartography of
Syntactic Structures, ed. A. Belletti, 213–21. Oxford: Oxford University
Press.
Morgan, James L., and K. Demuth, eds. 1996. Signal to Syntax: Bootstrapping from Speech to Grammar in Early Acquisition. Mahwah,
NJ: Lawrence Erlbaum.
Pike, Kenneth L. 1945. The Intonation of American English. Ann
Arbor: University of Michigan Press.
Port, Robert F. 2003. Meter and speech. Journal of Phonetics
31: 599–611.
Ramus, Franck, E. Dupoux, and J. Mehler. 2003. The psychological reality of rhythm classes: Perceptual studies. In Proceedings of the 15th
International Congress of Phonetic Sciences, ed. Maria-Josep Solé,
Daniel Recasens, and Joaquín Romero, 337–42. Barcelona: Causal
Productions.
Ramus, Franck, M. Nespor, and J. Mehler. 1999. Correlates of linguistic
rhythm in the speech signal. Cognition 73: 265–92.
Roach, Peter. 1982. On the distinction between stress-timed and syllable-timed languages. In Linguistic Controversies, ed. David Crystal,
73–9. London: Edward Arnold.

RIGHT HEMISPHERE LANGUAGE PROCESSING


The human brain is composed of two cerebral hemispheres,
which are anatomically very similar but have well-known
asymmetries in terms of their functional capacities. One of
the best-known facts about the interface between the brain
and cognition is that the left hemisphere (LH) is dominant
for language. (This description applies to the vast majority of
right-handed people, and probably at least two-thirds of left-handers.) However, the nature of this left hemisphere language
dominance, and of right hemisphere (RH) contributions to language, remains far from clear, and several schools of thought
persevere. What is known:
aphasia, a marked deficit in language function following
brain damage, is almost always associated with damage to
the LH, whereas similar damage to the RH is not associated
with overt aphasia.
Yet, people with aphasia following extensive LH brain damage
often recover at least some linguistic function, and in some
(but not all) cases, homologous areas of the RH (those areas
mirroring the damaged LH language areas) seem responsible
for recovery (Blasi et al. 2002; Leff et al. 2002). Indeed, children who have their entire LH removed as a last resort to treat
a crippling form of epilepsy often go on to develop language
abilities within a normal range (Vargha-Khadem et al. 1997).
Neurologically intact individuals can perform most language
tasks better (faster and more accurately) when processing linguistic stimuli received via the right visual field or the right
ear, both of which project most directly to the LH, compared
to stimuli received via the left visual field or left ear, which
project to the RH.

neuroimaging in various modalities often shows stronger
brain activity in the LH than in the RH during most standard language tasks. However, there usually is some activity
in homologous regions of the RH, even if it is weaker (e.g.,
Mazoyer et al. 1993).
All of this information (derived from lesions, divided inputs
to LH and RH, and neuroimaging) reveals that the RH seems
important for processing many aspects of prosody and the
melodic contour of language used to stress some elements,
convey emotion, or (in many languages) demarcate
questions and sentences (Joanette, Goulet, and Hannequin
1990; cf. Schirmer and Kotz 2006).
Likewise, a variety of evidence demonstrates that the RH
strongly contributes to some language tasks, particularly in
higher-level language comprehension.
This final point concerns the vast majority of work on RH language processing, and it demands elaboration. Lesion studies,
neuroimaging, and behavioral evidence all indicate that RH language processing is important for drawing connective inferences
(Beeman 1993; Brownell et al. 1986; Mason and Just 2004; Virtue
et al. 2006); understanding jokes (Coulson and Wu 2005; Goel
and Dolan 2001) and at least certain types of metaphors (Brain
and Language special issue in 2007; see metaphor, neural
substrates of); integrating new input with previously comprehended text (Robertson et al. 2000); and comprehending the
gist of stories or conversations (St. George et al. 1999). In other
words, many of the truly communicative aspects of language
seem to rely in some good measure on RH function. Moreover,
paradigmatic tasks known to elicit primarily LH brain activity
can be slightly adjusted to elicit increases in RH activity, if more
distant semantic associations are involved. For instance, generating the first verb that comes to mind when reading a noun
elicits predominantly LH frontal activity, but generating a novel
or unusually associated verb predominately increases activity in
the RH, compared to generating the first verb (Abduallaev and
Posner 1997; Seger et al. 2000).
Any successful theory of language asymmetries must account
for all of these facts. One long-favored view is that only the LH
possesses the requisite neural machinery for language processing. A common explanation is that evolution favored a single
hemisphere to control speech, and that other aspects of language processing could interact with speech more efficiently if
organized within the same hemisphere. By evolutionary chance
(or perhaps for reasons associated with right-hand tool use), the
LH gained control of speech, and thus the brain evolved to have
language centers (performing various component processes that
continue to be enumerated) only within the LH. On this view, all
contributions of the RH for understanding language are merely
paralinguistic, that is, outside the realm of pure language processing. It is easy to see how this applies to prosody; moreover,
it is often also applied to the higher-level language processes for
which the RH seems important, described previously.
This LH-only view has the advantage of easily explaining
the association of classic aphasia with LH lesions. However,
it suffers from several disadvantages. First, it is merely
descriptive, failing to explain how the LH, but not the RH, is
able to process language. It may be true that the left planum
temporale (near Wernicke's area) or the left Broca's area, or
both, are slightly larger than their RH counterparts, but these
asymmetries seem imperfectly correlated with language laterality (e.g., Gannon et al. 1998; Moffat, Hampson, and Lee
1998). Second, the LH-only view has difficulty parsimoniously
accounting for RH involvement in language recovery from
aphasia following LH brain damage (or entire hemispherectomy in children). Although the brain is famously plastic, there
are limits to plasticity, and it seems prima facie unlikely that a
brain area having nothing to do with language prior to injury
(e.g., one that processes visuospatial information) could, after
LH injury, rewire itself to suddenly perform language processing. Furthermore, the LH-only view does not incorporate language comprehension subprocesses to which the RH is known
to contribute.
The LH superiority view is a slightly less extreme theory,
maintaining that the LH is greatly superior to the RH at most
or all language processing, although there may be a range of
RH linguistic ability (or bilaterality) among different individuals. Some theorists hold a hybrid view, that the RH is inferior at
some processes but completely lacking in others, such as syntactic processing or speech production. Such LH superior views
describe fairly well, without explaining, the aforementioned
phenomena regarding language asymmetries, and may better
account for RH involvement in language recovery. Specifically,
patients who suffer extensive damage to the LH early in life
may recover language very well because developmental plasticity permits a rewiring of the rudimentary language systems
in the RH to take over that function. This is more likely than,
say, rewiring a visuospatial processing system to suddenly perform language function. However, while plausible, it seems
uneconomical for large cortical regions to be poorly functioning backup systems that only retune to serve useful purposes
in the event that the primary language systems, all in the LH,
are severely impaired.
Finally, a more recent class of theories is that the two hemispheres both process all (or nearly all) aspects of language but
compute the information in slightly different manners, each type
of computation conveying advantages for different sets of tasks.
Such computational asymmetry views do not posit equality in
language ability, but rather assert that when cooperating with
an intact LH, the RH makes unique and significant contributions
to language processing. Furthermore, they provide, to varying
degrees, some causal explanation for the observed asymmetries
of language function.
For instance, one theory (Jung-Beeman 2005) asserts that relatively fine coding of information by LH language areas permits
rapid categorization and selection of information (Thompson-Schill et al. 1997) that is fundamental to many language functions. When speaking a word, for instance, people must select
one lexical and phonological representation to articulate; for
example, they must choose between cup and mug, but the precise duration, stress, and accent are less important. During
comprehension, upon hearing the word foot, LH fine coding
will strongly activate a small set of semantic features strongly
related to the context (or to the dominant meaning, in absence
of context). This is clearly beneficial, as comprehenders would
quickly get bogged down if they fail to discern whether foot
means a part of the body or a unit of measurement. Also, LH
relatively fine semantic coding is unlikely to maintain activation of semantic features peripherally related to foot, such as
the fact that it is susceptible to injury.
In contrast, the theory asserts that relatively coarse coding of information by RH language areas poorly serves such
selection, because multiple lexical and phonological representations remain active. This is clearly disadvantageous for
speech output. Coarser semantic coding explains why the
RH shows more sensitivity to distant semantic associations in
behavioral priming (Chiarello et al. 2003) and electrophysiological studies (Federmeier and Kutas 1999). Thus, coarser
coding allows for multiple word meanings, shades of meaning, and seemingly distant associations to remain active and
contribute to the aspects of language comprehension that
are relatively dependent on RH processing. For example, RH
semantic processing is conducive to recognizing semantic
overlap from multiple words that are each distantly related
to a primed concept (Beeman et al. 1994). Thus, when people
hear "Joan walked barefoot near some glass, then felt a stab
of pain," coarser coding may help detect the semantic overlap
that helps people infer that she cut her foot; as noted, multiple
lines of evidence suggest the RH contributes to such connective inferences in important ways.
Some theorists also claim that computational asymmetries
in language processing arise from documented asymmetries in
neural microcircuitry of language areas, such as broader dendritic branching that expands input fields to pyramidal neurons
(which constitute the majority of excitatory neurons in the cortex), thus proposing a causal mechanism (Jung-Beeman 2005;
see also Hutsler and Galuske 2003). Furthermore, recovery from
aphasia is readily explained by small changes in previously
asymmetric microcircuitry, such as a slight retuning of dendritic
branching, rather than wholesale rewiring of areas that were previously completely nonlinguistic, or merely paralinguistic.
Mark Beeman
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Abduallaev, Yalchin, and Michael I. Posner. 1997. Time course of activating brain areas in generating verbal associations. Psychol. Sci.
8: 56–9.
Beeman, Mark. 1993. Semantic processing in the right hemisphere may
contribute to drawing inferences during comprehension. Brain Lang.
44: 80–120.
Beeman, Mark, and Christine Chiarello, eds. 1998. Right Hemisphere
Language Comprehension: Perspectives from Cognitive Neuroscience.
Mahwah NJ: Lawrence Erlbaum. This book contains chapters on hemispheric differences in many different language functions, focusing on
right hemisphere contributions.
Beeman, Mark, et al. 1994. Summation priming and coarse coding in the right hemisphere. J. Cog. Neurosci. 6: 26–45.
Blasi, Valeria, et al. 2002. Word retrieval learning modulates right frontal cortex in patients with frontal damage. Neuron 36: 157–70.
Brain and Language 100.2. 2007. Special issue on metaphor processing.
Brownell, Hiram H., et al. 1986. Inference deficits in right brain-damaged patients. Brain Lang. 29: 310–21.
Chiarello, Christine, et al. 2003. Priming of strong semantic relations in the left and right visual fields: A time course investigation. Neuropsychologia 41: 721–32.


ROLE AND REFERENCE GRAMMAR


Role and reference grammar (RRG) is a theory of the syntax-semantics-pragmatics interface (Van Valin 2005); its

[Figure 1 diagram: SYNTACTIC REPRESENTATION linked to SEMANTIC REPRESENTATION by the Linking Algorithm, with Discourse-Pragmatics informing the linking]
Coulson, Seana, and Y. C. Wu. 2005. Right hemisphere activation of joke-related information: An event-related brain potential study. J. Cog. Neurosci. 17: 494–506.
Federmeier, Kara D., and Marta Kutas. 1999. Right words and left words: Electrophysiological evidence for hemispheric differences in meaning processing. Cogn. Brain Res. 8: 373–92.
Ferstl, Evelyn C., et al. 2005. Emotional and temporal aspects of situation model processing during text comprehension: An event-related fMRI study. J. Cog. Neurosci. 17: 724–39.
Gannon, Patrick J., et al. 1998. Asymmetry of chimpanzee planum temporale: Humanlike pattern of Wernicke's brain language area homolog. Science 279: 220–2.
Goel, Vinod, and Raymond J. Dolan. 2001. The functional anatomy of humor: Segregating cognitive and affective components. Nature Neurosci. 4: 237–8.
Hutsler, Jeffrey, and R. A. W. Galuske. 2003. Hemispheric asymmetries in cerebral cortical networks. Trends Neurosci. 26: 429–36.
Joanette, Yves, Pierre Goulet, and Didier Hannequin. 1990. Right
Hemisphere and Verbal Communication. New York: Springer-Verlag.
Jung-Beeman, Mark. 2005. Bilateral brain processes for comprehending natural language. Trends in Cognitive Sciences 9: 512–18. This paper proposes a neural microcircuitry explanation of computational asymmetries in language processing.
Leff, Alexander, et al. 2002. A physiological change in the homotopic
cortex following left posterior temporal lobe infarction. Ann. Neurol.
51: 443558.
Mason, Robert A., and Marcel Just. 2004. How the brain processes causal inferences in text: A theoretical account of generation and integration component processes utilizing both cerebral hemispheres. Psychol. Sci. 15: 1–7.
Mazoyer, B. M., et al. 1993. The cortical representation of speech. J. Cog. Neurosci. 5: 467–79.
Moffat, Scott D., Elizabeth Hampson, and Donald H. Lee. 1998. Morphology of the planum temporale and corpus callosum in left handers with evidence of left and right hemisphere speech representation. Brain 121: 2369–79.
Robertson, David A., et al. 2000. Functional neuroanatomy of the cognitive process of mapping during discourse comprehension. Psychol. Sci. 11: 255–60.
Schirmer, Annett, and Sonja A. Kotz. 2006. Beyond the right hemisphere: Brain mechanisms mediating vocal emotional processing. Trends in Cognitive Sciences 10: 24–30.
Seger, Carol A., et al. 2000. Functional magnetic resonance imaging evidence for right hemisphere involvement in processing unusual semantic relationships. Neuropsychology 14: 361–9.
St. George, Marie, et al. 1999. Semantic integration in reading: Engagement of the right hemisphere during discourse processing. Brain 122: 1317–25.
Thompson-Schill, Sharon L., et al. 1997. Role of left inferior prefrontal cortex in retrieval of semantic knowledge: A re-evaluation. Proc. Nat. Acad. Sci. 94: 14792–7.
Vargha-Khadem, Faraneh, et al. 1997. Onset of speech after left hemispherectomy in a nine-year-old boy. Brain 120: 159–82.
Virtue, S., J. Haberman, Z. Clancy, T. Parrish, and M. Jung-Beeman. 2006. Neural activity of inferences during story comprehension. Brain Research 1084: 104–14.
Xu, J., et al. 2005. Language in context: Emergent features of word, sentence, and narrative comprehension. NeuroImage 25: 1002–15.

Figure 1. The organization of RRG.

theoretical and descriptive constructs derive from the investigation of many non-Indo-European languages, yielding a theory
with rather different approaches to the analysis and explanation
of morphosyntactic phenomena. RRG posits only a single level
of syntactic representation, corresponding to the actual form of
the sentence; there are no abstract levels of representations or
derivations, as in Chomskyan generative grammar. Moreover,
no phonologically null elements are allowed in syntactic representations. The name of the theory comes from its view that
much of grammar is an interaction between semantics (role) and
discourse-pragmatics (reference). The organization of the theory
is in Figure 1. There is a mapping between the syntactic and
semantic representations, mediated by the linking algorithm.
Discourse-pragmatics plays a role in this mapping, but the role
varies across languages, leading to important typological differences among languages. It is represented in part by discourse
representation structures (DRSs) taken from discourse representation theory.
The syntactic representation is called the layered structure of
the clause and is composed of three interlocking projections: the
constituent projection (predicate, arguments, adjuncts), the
operator projection (grammatical categories like aspect, negation, tense), and the focus structure projection (information
structure of the sentence, that is, the potential focus domain
of the clause [where focus can potentially fall], the actual focus
domain, and the presupposed, non-focus part). The semantic representation is an Aktionsart-based decompositional representation. Thematic relations are defined in terms of positions in the decompositions. More important are the semantic
macroroles of actor and undergoer, which correspond to the
two primary arguments in a transitive predication, either one of
which can be the single argument of an intransitive predicate.
Based on extensive typological evidence, RRG rejects traditional
grammatical relations and instead posits a construction-specific
concept, the privileged syntactic argument (PSA) of a particular
construction.
The linking between syntax and semantics is bidirectional, reflecting production and comprehension. In semantics-to-syntax linking, the elements in the semantic representation are
mapped into positions in a syntactic template, following both
universal and language-specific principles. An example from
English is given in Figure 2. In syntax-to-semantics linking,
information about the syntactic-semantic function of arguments
is derived from the overt morphosyntactic markings, and this is
linked to the argument positions in the lexical representation of
the predicate; a simple example is in Figure 3.
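As a rough illustration of the default pattern only (this sketch is not RRG's formal linking algorithm, and the function and argument names are invented for the example), the actor can be taken as the highest-ranking (leftmost) argument in the decomposition and the undergoer as the lowest-ranking (rightmost):

```python
def assign_macroroles(arguments):
    """Default macrorole assignment for a toy logical structure.

    `arguments` is ordered from highest-ranking (leftmost in the
    decomposition) to lowest-ranking (rightmost). Real RRG linking
    is considerably richer than this default.
    """
    if not arguments:
        return {}
    if len(arguments) == 1:
        # Intransitives: whether the single argument is actor or
        # undergoer depends on the verb's Aktionsart; default to actor.
        return {"actor": arguments[0]}
    return {"actor": arguments[0], "undergoer": arguments[-1]}

# For [do'(Sandy, ...)] CAUSE [BECOME have'(Chris, flowers)]:
roles = assign_macroroles(["Sandy", "Chris", "flowers"])
# Sandy is actor; the flowers, as lowest-ranking argument, is undergoer,
# leaving Chris as a non-macrorole argument.
```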



[Figure 2 diagram: for "Sandy presented the flowers to Chris at the party," presupposition and assertion discourse representation structures (predicate focus), the layered clause structure, and the logical structure be-at′(party, [[do′(Sandy, …)] CAUSE [BECOME have′(Chris, flowers)]]), with Sandy as actor and the flowers as undergoer]

Figure 2. Linking from semantics to syntax. Constituent and focus structure projections shown, along with discourse representation structures.

WORKS CITED AND SUGGESTIONS FOR FURTHER READING

[Figure 3 diagram: for "What did Mary give to John yesterday," the parser identifies active voice, so PSA = actor; the clause structure (with what in the precore slot and yesterday in the periphery) is linked via the lexicon to [do′(x, …)] CAUSE [BECOME have′(y, z)]]

Figure 3. Linking from syntax to semantics.

Not only has work been done using RRG on a wide variety of languages, but there has also been psycholinguistic and neurolinguistic work. RRG takes a cognitivist approach to language acquisition (see Van Valin 1998, 2001, as well as references on the RRG Web site). The application
of RRG to the study of sentence processing is investigated in
Bornkessel, Schlesewsky, and Van Valin (2004) and Van Valin
(2006).
Robert D. Van Valin, Jr.


Bornkessel, Ina, Matthias Schlesewsky, and Robert D. Van Valin, Jr. 2004.
Syntactic templates and linking mechanisms: A new approach to
grammatical function asymmetries. Available online at: http://linguistics.buffalo.edu/people/faculty/vanvalin/rrg/vanvalin_papers/
rrg_adm_CUNY04.pdf.
Van Valin, Robert D., Jr. 1998. The acquisition of wh-questions and
the mechanisms of language acquisition. In The New Psychology of
Language: Cognitive and Functional Approaches to Language Structure,
ed. Michael Tomasello, 221–49. Mahwah, NJ: Lawrence Erlbaum.
Available online at: http://linguistics.buffalo.edu/people/faculty/vanvalin/rrg/vanvalin_papers/acqwh.pdf.
———. 2001. The acquisition of complex sentences: A case study in the role of theory in the study of language development. Chicago Linguistic Society Parasession Papers 36: 511–31. Available online at: http://linguistics.buffalo.edu/people/faculty/vanvalin/rrg/vanvalin_papers/
Acq_of_complex_sent.pdf.
———. 2005. Exploring the Syntax-Semantics Interface. Cambridge: Cambridge University Press.
———. 2006. Semantic macroroles and language processing. In Semantic Role Universals and Argument Linking: Theoretical, Typological and Psycho-/Neurolinguistic Perspectives, ed. Ina Bornkessel, Matthias Schlesewsky, Bernard Comrie, and Angela Friederici, 263–302.
Berlin: Mouton de Gruyter. Available online at: http://linguistics.
buffalo.edu/people/faculty/vanvalin/rrg/vanvalin_papers/Sem_
Macroroles_&_Lang_Processing.pdf.
The best source for work and references relating to role and reference grammar is the RRG Web site. Available online at:
http://linguistics.buffalo.edu/research/rrg.html.


RULE-FOLLOWING
In Ludwig Wittgenstein's late writings, one finds numerous interconnected remarks having to do with meaning, understanding, and rule-following, remarks in which Wittgenstein can seem to be continually circling around these topics without ever arriving at any definite conclusion. An exceptionally clear and compelling way into this material was provided by Saul Kripke in his influential 1982 book, Wittgenstein on Rules and Private Language. By presenting Wittgenstein's concerns in the form of a single extended argument, Kripke brought many readers to see that Wittgenstein's discussions of rule-following bear significantly on central issues in (among other things) contemporary philosophy of language. As Kripke recasts Wittgenstein's remarks, they amount to an argument for a radical form of skepticism: a skepticism according to which there are no facts about what we mean by any of our utterances or inscriptions. In what follows, a sketch is provided of the skeptical argument that Kripke finds in Wittgenstein's remarks on rule-following. That done, an effort is made to distinguish the actual, historical Wittgenstein from Kripke's reconstruction of him.

The Skeptic's Challenge


Suppose that I'm asked to add two numbers that I happen never before to have added together: I'm asked (let's say) "What's 68 plus 57?" I think for a moment and reply "125." Kripke imagines a "bizarre sceptic" (1982, 8) who contests this answer. The skeptic suggests that, given what I've always meant by the word "plus" in the past, the correct answer to "What's 68 plus 57?" is not "125." For in the past, he says, what I meant by "plus" might have been a function, call it the quus function, that yields 5, rather than 125, given the arguments 68 and 57. By hypothesis, the quus function is consistent with all of the sums that I have computed before today. (Imagine that the only difference between the quus function and the plus function is that 68 quus 57, that is, the quum of 68 and 57, is 5.) Kripke's skeptic challenges me to show that, in the past, when I used the word "plus" (or "addition" or "sum"), what I meant was plus (addition, sum) rather than quus (quaddition, quum). I'm to show this by citing a fact or set of facts about my former self.
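The skeptic's hypothesis can be made concrete in a few lines of code. This sketch follows the passage's simplified stipulation that 68 quus 57 is the only case where quus and plus diverge; Kripke's own definition is slightly more general:

```python
def plus(x, y):
    """Ordinary addition."""
    return x + y

def quus(x, y):
    """The skeptic's rival hypothesis about what I meant by "plus":
    a function agreeing with addition on every sum I have computed
    before, but yielding 5 for the one new case."""
    if (x, y) == (68, 57):  # the single divergent case stipulated in the text
        return 5
    return x + y
```

Every computation in my past history is consistent with both functions, which is precisely why no record of my previous behavior can settle which one I meant.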
Although the skeptic's challenge concerns my use of a mathematical expression, it is important to understand that what's at issue has nothing especially to do with mathematics. I might just as well have been challenged to show that in the past I meant horse by "horse." If the skeptic can secure his conclusion concerning my past uses of "plus," the point will generalize from "plus" to other words and from past uses to present ones. The conclusion he seeks is that there are no facts of the matter about what I, or anyone, ever meant, or mean, by any utterance or inscription. The argument may be understood to proceed by a process of elimination. Kripke adduces various answers to the skeptic's challenge and, one by one, shows each to be unsatisfactory.

Interpretationalism
Perhaps the most natural reply to the skeptic would make appeal to the notion of a rule or interpretation in something like the following manner: When I learned to add, and so learned what "plus" (and "+") mean, I didn't merely memorize the answers to a finite list of addition problems. I learned a rule for addition (and so for using "plus" or "+") that determines the sum of any pair of numbers, regardless of whether I happen to have yet encountered them. That the correct answer to "What is 68 + 57?" is "125," rather than "5," is determined by this rule, a rule that, Kripke suggests, might have taken the following form:

Take a huge bunch of marbles. First count out x marbles in one heap. Then count out y marbles in another. Put the two heaps together and count out the number of marbles in the union thus formed. The result is x + y. (Kripke 1982, 15)

The problem with this sort of answer to the skeptic's challenge is not that I may be misremembering the rule (the interpretation of "+") that I learned as a child. The skeptic is happy to grant that as a child I learned precisely this formulation and that I remember it clearly to this day. But he questions how this rule, let's call it R, ought to be understood. While the skeptic allows that R is the interpretation that I've long assigned to the plus sign, he claims that I am, at present, misapplying R. He suggests that given what, in the past, I meant by the word "count," the result that a correct application of R would yield, given 68 + 57, is 5.
At this point, I might reply to the skeptic by recalling an interpretation of the word "count" that I also internalized as a child. But he will just make the same sort of move again: He'll grant me the remembered interpretation and suggest that I am, at present, misapplying it. Thus, each new interpretation that I adduce seems to require another one standing behind it, and an infinite regress threatens. The moral (the problem with what might be called interpretationalism) may be put as follows: However tempting it is to say that our words derive their meanings from rules or interpretations, saying this just leaves us with the question of where these rules or interpretations get their meanings.

Dispositionalism
Another kind of reply to Kripke's skeptic would appeal to dispositions rather than to interpretations or rules. Thus, the skeptic's challenge might be answered as follows: While I have never before added 68 and 57, it's nonetheless true that had I been asked to add these numbers, I would have arrived at the answer "125." My having meant plus (rather than quus) by "plus" consists in my having been so disposed: disposed to answer questions of the form "What is x plus y?" with the sum (and not the quum) of x and y.
A problem with this sort of dispositionalism becomes apparent when one considers the fact that speakers are sometimes disposed to make mistakes. Imagine that when I'm sleepy, I am disposed to answer "115" if someone asks me to add 68 and 57. We don't want an account of meaning according to which it would follow from this that when I'm sleepy, what I mean by "plus" is not addition but, instead, some function that yields 115, given 68 and 57 as arguments. Or consider a nonmathematical example: It might be that I have a disposition to answer the question "Is that a horse?" affirmatively when I'm shown an albino zebra or a donkey in dim light. Our account of meaning had better not commit us to claiming that the word "horse" in my idiolect therefore includes albino zebras and dimly lit donkeys in its extension.
A further problem with the dispositionalist answer to Kripke's skeptic (one that Kripke himself does not discuss) is this: The dispositionalist, at least as we (following Kripke) have imagined him or her, takes it as a datum that for much of my life, I have been disposed to answer "125" in response to being asked for the sum of 68 and 57 (disposed, as Kripke puts it, "when asked for any sum x + y to give the sum of x and y as the answer" [1982, 22]). But to say any such thing is to give a thoroughly intentional characterization of my behavior: to describe me not merely as having a disposition to produce certain physically describable movements and sounds (under some set of physically describable conditions), but as being disposed to answer a question in a particular way, that is, to say something (something with a particular semantic content) in response to being asked something. Thus dispositionalism, at least as we've imagined it, takes for granted the phenomenon (semantic content, meaning) that it pretends to vindicate and explain: the very phenomenon that Kripke's skeptic calls into doubt.

Reductionism and Platonism


As these and other replies to Kripke's skeptic are tried and shown to be unworkable, it begins to look as if there may be no way to meet his challenge, that is, no way to justify my present answer to the question "What's 68 plus 57?" by adducing facts that constitute my having meant plus by "plus." Perhaps the chief virtue of Kripke's book on Wittgenstein is that it makes a strong case for thinking that persons' meaning something by their words cannot be reduced to a set of, as it were, semantically neutral facts: nonintentionally characterized truths concerning, for example, their (or, for that matter, their community's) behavior or dispositions to behave. As Kripke reads Wittgenstein, however, the latter takes the failure of this sort of reductionism to show that there are no facts about what persons mean by any of their utterances or inscriptions. (Or, anyway, this is how Kripke's book is most often read. There are grounds for claiming that he equivocates on the question of how much Wittgenstein ultimately grants to the skeptic.) If one is convinced that meaning-facts cannot be reduced to semantically neutral ones, must one accept this skeptical conclusion? Why not instead reject reductionism about meaning? Why not claim, for example, that I have always meant plus, not quus, by "plus" but that this fact about me does not consist in any set of facts that could be characterized without recourse to intentional vocabulary? Notice that given such a claim, my inability to satisfy Kripke's skeptic wouldn't come with any earth-shaking skeptical implications. But to many philosophers (Kripke's Wittgenstein among them) this sort of antireductionism is liable to look like a kind of mystery-mongering Platonism.
Think of Platonism as a desperate attempt to resuscitate interpretationalism (in the sense of the preceding sections) by blocking the infinite regress that seems to undermine it. Platonism represents meanings as mysterious regress stoppers that stand behind our in-themselves-empty utterances and inscriptions: superinterpretations that, unlike mere vocalizations or marks on a page, neither need nor brook any further interpretation. Wittgenstein characterizes the impulse toward this sort of position when he writes, "What one wishes to say is: Every sign is capable of interpretation; but the meaning mustn't be capable of interpretation. It is the last interpretation" (Wittgenstein 1958, 34). Platonism is a target of criticism in his late writings. But it is a mistake to read Wittgenstein as holding that in order to reject reductionism of the sort that Kripke's skeptic presupposes, one must embrace Platonism.
Getting Wittgenstein's response to Platonism into view requires that one appreciate how much the semantic skeptic, the interpretationalist, the dispositionalist, and the platonist have in common. All four see our communicative activities as the production of in-themselves-meaningless movements, noises, and marks. All four agree, in effect, with Wittgenstein's interlocutor at Philosophical Investigations 431 when he says, "The order – why, that is nothing but sounds, ink-marks" (1953). Disagreement arises concerning how best to answer a question that might be put as follows: What, if anything, gives semantic significance to the (mere) sounds and ink-marks that our orders, assertions, questions, greetings, and so on really are? Wittgenstein does not offer a fifth answer to this question. Instead, he tries to expose and undermine the philosophical moves that make it seem pressing, moves whose effect is to induce us, first, to consider our words apart from the contexts in which they have semantic significance, thus giving them the appearance of being nothing more than sounds and ink marks, and, then, to search in vain for the special somethings (interpretations or dispositions or meanings-in-the-sky) that, as it were, bestow significance upon the now flat-seeming words.
Wittgenstein understands this seeming flatness as a kind of self-induced illusion. If he is right, freeing oneself from this illusion would allow one to reject the sort of reductionism that Kripke's skeptic takes for granted without lapsing into Platonism.
David H. Finkelstein
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Finkelstein, D. H. 2000. Wittgenstein on rules and Platonism. In The New Wittgenstein, ed. A. Crary and R. Read, 53–72. London: Routledge.
———. 2003. Expression and the Inner. Cambridge: Harvard University Press.
Kripke, S. A. 1982. Wittgenstein on Rules and Private Language. Cambridge: Harvard University Press.
McDowell, J. 1998a. Meaning and intentionality in Wittgenstein's later philosophy. In Mind, Value, and Reality, 263–78. Cambridge: Harvard University Press.
———. 1998b. Wittgenstein on following a rule. In Mind, Value, and Reality, 221–62. Cambridge: Harvard University Press.
Wittgenstein, L. 1953. Philosophical Investigations. Trans. G. E. M.
Anscombe. New York: Macmillan.
———. 1958. The Blue and Brown Books. New York: Harper and Row.

SCHEMA
A schema is a high-level conceptual structure or framework
that organizes prior experience and helps us to interpret new
situations. The key function of a schema is to provide a summary of our past experiences by abstracting out their important
and stable components. For example, we might have a schema
for a classroom that includes the fact that it typically contains a
chalkboard, bookshelves, and chairs. Schemas provide a framework for rapidly processing information in our environment. For
example, each time we enter a classroom, we do not have to consider each element in the room individually (e.g., chair, table,
chalkboard). Instead, our schemas fill in what we naturally
expect to be present, helping to reduce cognitive load. Similarly,
schemas also allow us to predict or infer unknown information in
completely new situations. If we read about a third grade classroom in a book, we can use our established classroom schema
to predict aspects of its appearance, including the presence of a
coatroom and the types of posters that might decorate the walls.
Schemas play an important role in language and linguistic
processing by helping to frame the semantic content of a situation. Even when linguistic input is sparse or vague, activation of
the appropriate schema can aid in the comprehension and retention of linguistically communicated material (see next section for
an example). In addition, schemas and scripts often help us to
define and interpret the discourse associated with particular contexts. In the classroom example, certain aspects of the communication between a student and teacher are captured by the schema, including the facts that students should quietly raise their hands in order to get the teacher's attention and that the teacher will stand facing the class and may call upon the student.
In a functional sense, schemas share much in common with
categories (see categorization) or concepts. However, a
distinguishing feature of schemas is that they are structured mental representations made up of multiple components. Schemas
typically contain various slots, which each take on any number
of values, and a set of relational structures that organize the slots
and represent their interconnections. The values of particular
slots are usually determined by the current context, percepts,
or situation. For example, the schema for a generic room might include slots for walls, doors, and windows, which could be filled with specific values (e.g., wooden door, bay windows). Slots
that are left unspecified in the current situation are given default
values, which reflect expectations or inferences about unseen or
unknown information. Schemas derive their predictive power
through a process of filling in default values so that incomplete
knowledge about the current situation can be supplemented by
past experience. In addition, a slot may be filled by other schemas, allowing for the compositional construction of more
complex structures.
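The slot-and-default machinery described above can be sketched in a few lines (the class and slot names here are invented for illustration and are not drawn from any particular cognitive model):

```python
class Schema:
    """A minimal slot/filler structure with default values."""

    def __init__(self, name, defaults):
        self.name = name
        self.defaults = defaults  # expectations used when a slot is unobserved

    def instantiate(self, **observed):
        """Fill slots from the current situation, falling back to defaults
        for anything left unspecified (the 'filling in' the text describes)."""
        return {slot: observed.get(slot, default)
                for slot, default in self.defaults.items()}

classroom = Schema("classroom", {
    "board": "chalkboard",
    "seating": "chairs",
    "walls": "posters",
})

# Only 'board' is observed; 'seating' and 'walls' are supplied by defaults.
instance = classroom.instantiate(board="whiteboard")
```

A slot value could itself be another `Schema` instance, giving the compositional embedding of schemas within schemas that the text mentions.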

The History of Schemas


Contemporary work on schemas was initiated by F. Bartlett (1932), who was interested in the role that prior knowledge played in the interpretation of and memory for stories. His work largely

challenged the perspective of behaviorism by emphasizing the


role of internalized representations in the control of behavior
and thought. In one set of studies, participants were told a Native
American folktale that included a number of unfamiliar cultural
elements. On subsequent occasions, participants were brought back to the lab and asked to retell the story. Over time, participants' accounts of the story drifted in systematic ways, including
the omission of information that did not make sense to them and
the reinterpretation of certain facts in order to match their own
cultural backgrounds.
Schemas actively guide our interpretation of new events and
are, thus, highly related to stereotypes and scripts (Schank
and Abelson 1977). Consider the following passage:
It can be hard work going down, but luckily the facilities make it
much easier going up. Keep them pointed upwards, and be careful when you exit so you dont stop things from moving. Be on
the lookout for others who are having difficulty, and watch out
for the edges!

J. Bransford and M. K. Johnson (1972) showed how ambiguous


passages similar to this one are at first difficult to interpret; however, when cues about the appropriate schema to apply (snow
skiing) are provided, the information makes more sense and is
easier to remember.
Bartletts pioneering ideas led the way for considerable
research demonstrating the role of prior knowledge on memory
and encoding. However, schemas remained a relatively vague
and ill-specified construct until the work of artificial intelligence
pioneer Marvin Minsky (1975). Minsky was interested in developing computer systems with intelligent real-world behaviors.
Like a number of later theorists (e.g., Rumelhart 1980), Minsky
believed that the basic unit of knowledge representation should
be a predicated structure that he called a frame. Frames are symbolic knowledge structures that contain fixed structural relationships between a number of attributes. The modern conception of
schemas as being composed of slots and fillers inherits directly
from Minskys frames.
In contrast to the highly structured, symbolic processes
assumed by Minsky, other theorists have attempted to develop
accounts of schema representation from the perspective of connectionism. Connectionist networks represent knowledge as a
set of simple processing units that are connected to one another
with weights. Activation flows through the network and causes
different units to become more or less active on the basis of
the patterns of input and the way the units are connected (see
spreading activation). Weights can be positive or negative
and can thus represent either excitatory or inhibitory relationships between units.
D. Rumelhart and colleagues (1986) showed how special
types of connectionist networks (called constraint satisfaction
networks) could provide many of the same processing features
as symbolic frames, including the ability to represent structured
attributes and default values. For example, units in the network
might represent objects that one might encounter in a typical
room such as a globe, blackboard, bed, or desk. Positive association weights between these units are used to represent the fact
that these items often occur together. Thus, the units for globes, blackboards, and desks might be mutually interconnected with positive weights, while blackboards and beds might be linked by a strongly negative weight. If a partial description or view of a classroom activates the blackboard and pencil sharpener elements, other classroom elements (such as desk or globe) will also become active through their positive links to the observed events, while classroom-irrelevant information (such as a bed) would be inhibited. Structured attributes could be formed by subsets of mutually inhibitory elements. For example, most classrooms have either a blackboard or a whiteboard. Mutually inhibitory weights between blackboard and whiteboard units can ensure that only one value fills this slot at a time. Critically, the parallel distributed processing (PDP) approach to schemas dispenses with the traditional structure of schema representations, with slots, fillers, and relations, favoring emergent and implicit structures, such as part–whole relations and hierarchies.
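A toy network in this spirit can be sketched as follows. The unit names and weights are invented for illustration, and the update rule is a simple clipped relaxation rather than Rumelhart and colleagues' exact scheme:

```python
units = ["blackboard", "desk", "globe", "bed", "whiteboard"]

# Symmetric weights: positive = tend to co-occur, negative = mutually exclusive.
W = {
    ("blackboard", "desk"): 0.5, ("blackboard", "globe"): 0.5,
    ("desk", "globe"): 0.5,
    ("blackboard", "bed"): -0.8, ("desk", "bed"): -0.4,
    ("blackboard", "whiteboard"): -0.9,  # a room has one board or the other
}

def weight(a, b):
    return W.get((a, b)) or W.get((b, a)) or 0.0

def settle(activations, steps=50, rate=0.2):
    """Repeatedly nudge each unit toward its net input, clipped to [0, 1]."""
    a = dict(activations)
    for _ in range(steps):
        for u in units:
            net = sum(weight(u, v) * a[v] for v in units if v != u)
            a[u] = min(1.0, max(0.0, a[u] + rate * net))
    return a

# Observing only a blackboard activates schema-consistent items
# (desk, globe) and suppresses inconsistent ones (bed, whiteboard).
start = {u: 0.0 for u in units}
start["blackboard"] = 1.0
final = settle(start)
```

The stable pattern the network settles into plays the role of an instantiated schema, with default "fillers" emerging from the weights rather than being stored explicitly.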

Schemas and Memory


The schemas we use when interpreting a new situation heavily influence what we encode about a situation and are able to
remember. For example, W. F. Brewer and J. C. Treyens (1981)
were interested in how schemas might influence the way that
objects are encoded into memory. In their studies, participants
were shown a picture of a typical office and were tested for their
memory of objects in the room. Their results showed that the
schema recruited by participants (i.e., a typical office) influenced
what they remembered about the scene. For example, people
were able to accurately recall office-appropriate items such as
desks, chairs, and bookshelves. However, the schema also filtered out from memory surprising or irregular items, such as a
human skull that was placed in the room.
While these studies suggested that processing the world
through the lens of a schema favors schema-consistent information over schema-inconsistent information, other work has
found the opposite effect (cf. Bower, Black, and Turner 1979).
The apparent contradiction between these two views of schema-mediated memory processing was resolved by K. Rojahn and
T. Pettigrew (1992), who found that after accounting appropriately for false alarm rates, schema-inconsistent information is
generally remembered better than schema-consistent information. Similarly, categorizing objects by providing their basic
level category label can lead to worse memory because doing
so recruits schema-based representations of the object category
(Lupyan 2005). In fact, young children sometimes show better
memory for items than do adults because adults process items
in terms of well-established categories and schemas, whereas
children use similarity-based processes tied more directly to
the perceptual features of the input (Sloutsky and Fisher 2004).
In this sense, applying a schema (or label) can remove from the
encoding process the details necessary for identification.

Connecting Schemas, Bodies, and Worlds


The traditional view of schemas (inherited from Minsky) emphasizes, to a large extent, amodal, symbolic processes that operate over highly processed and abstracted units. Contemporary
work has attempted to extend the schema concept to include
processes grounded in bodies and the external environment (see
embodiment). Perhaps most notable is L. Barsalou's (1999)
perceptual symbol system theory. Like schemas, perceptual symbol systems involve framelike structures that have slots, preserve fixed relationships between attributes, and may take on
default values. However, the objects upon which these framelike structures operate are direct, multimodal, sensory-motor
representations. Evidence in favor of this approach includes the
fact that brain areas representing particular concepts appear to
overlap with perceptual processing of that concept. For example,
damage to the visual cortex impairs conceptual processing of
categories that are primarily visual in nature. The fundamental
contribution of the perceptual symbols approach is to rethink
the relationship between conceptual and perceptual processing and to suggest how symbolic, schema-like mental structures
might emerge from perceptual experience.
Similarly, in cognitive linguistics, image schemas have
been proposed as a way to link bodily actions, perceptual experience, and semantic processing (Johnson 1987; Lakoff 1987).
Image schemas are embodied, prelinguistic structures of experience that can provide the basis for conceptual metaphors.
For example, one might have a generalized containment schema
that represents an object being inside another object (just like a
small ball might physically fit inside a cup). This representation
is assumed to be an abstracted, but perceptual, instantiation of
the concept and includes a number of structured spatial relationships. Generative meanings can be produced on the basis of this
schema through metaphorical mappings. For example, understanding a phrase such as "a deep depression takes a long time to get out of" is accomplished through metaphorical mapping onto the general containment schema (so one might envision the depressed person rising out of the depression like an element out of its physical container; Johnson 1987). Like perceptual symbol systems,
image schemas emphasize perceptual, multisensory, embodied
content in schema-like representations.

The Future of Schemas


The influence of schematized background knowledge on cognitive processes such as memory, interpretation, and inference
is now well appreciated in the literature. However, there remains
considerable debate about the precise mechanisms that support
this behavior. Several of the more exciting avenues for future
work include reconceptualizing the notion of schema outside of Minsky's literal, symbolic frames with slots and fillers. These
developments include dynamical systems models (where a
schema would be a robust attractor state), neural networks (with
mutually interacting microfeatures), and perceptually grounded,
modal symbol systems.
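The idea of a schema as a robust attractor state can be illustrated with a minimal Hopfield-style network. This is a sketch under assumed values: the eight plus/minus-one "microfeatures" and the stored pattern are invented for illustration, not drawn from any cited model. A stored pattern acts as an attractor; a partially corrupted input settles back onto it.

```python
# Minimal Hopfield-style sketch of a schema as a robust attractor state.
# The pattern below is an arbitrary vector of +/-1 "microfeatures",
# invented purely for illustration.

def train(patterns):
    """Hebbian outer-product weights with a zero diagonal."""
    n = len(patterns[0])
    w = [[0.0] * n for _ in range(n)]
    for p in patterns:
        for i in range(n):
            for j in range(n):
                if i != j:
                    w[i][j] += p[i] * p[j] / len(patterns)
    return w

def settle(w, state, steps=10):
    """Synchronously update units until the state stops changing."""
    for _ in range(steps):
        new = [1 if sum(w[i][j] * state[j] for j in range(len(state))) >= 0
               else -1 for i in range(len(state))]
        if new == state:
            break
        state = new
    return state

office = [1, 1, 1, -1, -1, 1, -1, 1]       # the stored "office schema"
w = train([office])
noisy = office[:]
noisy[0], noisy[3] = -noisy[0], -noisy[3]  # corrupt two microfeatures
recovered = settle(w, noisy)
```

The corrupted input falls within the attractor's basin, so the network restores the full pattern, capturing the sense in which a schema fills in missing or distorted details.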
Todd M. Gureckis and Robert L. Goldstone
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Barsalou, L. 1992. Frames, concepts, and conceptual fields. In Frames,
Fields and Contrasts: New Essays in Semantic and Lexical Organization,
ed. A. Lehrer and E. Kittay, 21–74. Hillsdale, NJ: Lawrence Erlbaum. An
excellent introduction to the issues involving the representation of
schemas and frames.
. 1999. Perceptual symbol systems. Behavioral and Brain Sciences
22: 577–660.
Bartlett, F. 1932. Remembering. Cambridge: Cambridge University Press.
Bower, G., J. Black, and T. Turner. 1979. Scripts in memory for text.
Cognitive Psychology 11.2: 177–220.

Bransford, J., and M. K. Johnson. 1972. Contextual prerequisites for
understanding: Some investigations of comprehension and recall.
Journal of Verbal Learning and Verbal Behavior 11: 717–26.
Brewer, W. F., and J. C. Treyens. 1981. Role of schemata in memory for
places. Cognitive Psychology 13: 207–30.
Johnson, M. 1987. The Body in the Mind: The Bodily Basis of Meaning,
Imagination, and Reason. Chicago: University of Chicago Press.
Lakoff, G. 1987. Women, Fire, and Dangerous Things. Chicago: University
of Chicago Press.
Lupyan, G. 2005. When naming means forgetting: Verbal classification
leads to worse memory. In Proceedings of the 27th Annual Meeting of
the Cognitive Science Society, ed. B. Bara, L. Barsalou, and M. Bucciarelli.
Hillsdale, NJ: Lawrence Erlbaum.
Mandler, J. 1984. Stories, Scripts, and Scenes: Aspects of Schema Theory.
Hillsdale, NJ: Lawrence Erlbaum.
Minsky, M. 1975. A framework for representing knowledge. In
The Psychology of Computer Vision, ed. P. Winston, 211–77. New
York: McGraw-Hill.
Rojahn, K., and T. Pettigrew. 1992. Memory for schema-relevant
information: A meta-analytic resolution. British Journal of Social
Psychology 31: 2.
Rumelhart, D. 1980. Schemata: The building blocks of cognition. In
Theoretical Issues in Reading and Comprehension, ed. R. Spiro, B.
Bruce, and W. Brewer, 33–58. Hillsdale, NJ: Lawrence Erlbaum. An interesting and influential argument that schemas may represent the basic units of everyday knowledge in cognitive systems.
Rumelhart, D., P. Smolensky, J. McClelland, and G. Hinton. 1986.
Schemata and sequential thought processes in PDP models. In
Parallel Distributed Processing: Explorations in the Microstructure of
Cognition. Vol. 2. Ed. J. McClelland and D. Rumelhart, 7–57. Cambridge,
MA: MIT Press.
Schank, R., and R. Abelson. 1977. Scripts, Plans, Goals and Understanding. Hillsdale, NJ: Lawrence Erlbaum.
Sloutsky, V. M., and A. V. Fisher. 2004. When development and learning decrease memory. Psychological Science 15: 553–8.

SCRIPTS
Let me tell you a simple story.
John went to a restaurant. He ordered lobster. He paid the check
and left.

Now let me ask you some questions about your understanding of this story.
What did John eat? Did he sit down? Who did he give money to?
Why?

These questions are easy to answer. Unfortunately, your answers to them have no basis in actual fact. He may have put
the lobster in his pocket. He might have been standing on one
foot while eating (if he was eating). Who really knows whom
he paid?
We feel we know the answer to these questions because we
are relying on knowledge we have about common situations
encountered in our own lives. What kind of knowledge is this?
Where does it reside? How is it that our understanding depends
upon guessing?
People have scripts. A script can be best understood as a
package of knowledge that people have about particular kinds of
situations that they have encountered frequently. There are culturally common scripts (everyone you know shares them) and there are idiosyncratic scripts (only you know about them).


When I refer to something that takes place in a restaurant, I can
leave out most of the details because I know that my listener can
fill them in. I know what you know. But if I were telling a story
about a situation with which only I was familiar, I would have
to explain what was happening in great detail. Knowing that
you have the baseball script, I can describe a game to you quite
quickly. But if I were speaking to someone who had never seen a
baseball game, either I would have to make reference to a script
he or she already had (cricket perhaps) or else I would be in for
a long explanation.
Scripts help us understand what others are telling us, and they
also help us comprehend what we are seeing and experiencing.
When we listen to people talk about scripts we don't know, we cannot even comprehend what they are saying even though we likely know every word. What does "they decided to go for it on fourth and one on their own two despite being up a field goal and were stuffed at the line of scrimmage for a safety" mean to
someone who speaks perfect English and knows nothing about
American football?
The world, and especially language, is incomprehensible
without the background knowledge that scripts provide. When
a small child fails to understand what was said to him or her, the
lack of appropriate scripts is more likely the root of the problem
than lack of appropriate words. Even a toddler who does not
speak knows the "morning routine" or the "ride in the car" script.
Scripts drive our expectations, and when they are violated, we
are confused.
When we want to order in a restaurant and start to talk to the
waiter and he hands us a piece of paper and a pencil, we are surprised. We may not know what to do. But we may have had experience with private clubs that want orders written down. If not,
we ask. When our expectations are violated, when a script fails
and things don't happen the way we expected, we must adjust.
In daily life, adjustments to script violations are the basis of
learning. Next time, we will know to expect the waiter to hand us
a paper and pencil. Or we might generalize and decide that "next time" doesn't mean only in this restaurant but in any restaurant of
this type. Making generalizations about type is a major aspect of
learning. Every time a script is violated in some way, every time
our expectations fail, we must rewrite the script so that we are
not fooled next time.
Since scripts are really just packages of expectations about
what people will do in given situations, we are constantly surprised because people don't always do what we expect. This
means, in effect, that while scripts serve the obvious role of telling us what will happen next, they also have a less obvious role as
organizers of memories of experiences we have had.
Remember that time in the airplane when the flight attendant
threw the food packages at the passengers? You would remember such an experience and might tell people a story about it: "You know what happened on my flight?" Stories are descriptions of script violations of an interesting sort. But suppose that
this happened twice or five times, or suppose it happened every
time you flew a particular airline. Then, you would want to match
one script violation with another to come to the realization that
it wasn't a script violation at all, just a different script you hadn't
known about. Learning depends upon being able to remember when and how a script failed, marking that failure with a memory
or story about the failure event, and then being able to recognize
a similar incident and make a new script.
Scripts fail all the time. This is why people have trouble
understanding one another. Their scripts are not identical. What
one person assumes about a situation (the script built because of the experiences he or she has had) may not match another's because that person has had different experiences. Children get
upset when their scripts fail. They cry because what they assumed
would happen didn't happen. Their world model is naive and
faulty. But they recover day by day, growing scripts that are just
like the ones that adults have. They do this by expecting, failing,
explaining their failure (maybe they ask someone for help), and
then making a new expectation that will probably fail, too, someday. This cycle of understanding is a means by which people can
learn every day from every experience.
Now, of course some people stop learning. They expect all
scripts to be followed the way they always were. They get angry
when a fork is on the wrong side of a plate because that's the way it has always been and has to be. We all have such rigidity in following our scripts. There are some that we wouldn't consider
violating because we want to live in an orderly world. We confuse
people when we fail to follow culturally agreed-upon scripts. We
depend upon people to follow the rules. And our understanding
of the behavior of others depends upon everyone agreeing to
behave in restaurants the way people behave in restaurants. It is
so much easier to communicate that way.
Scripts dominate our thinking lives. They organize our memories, they drive our comprehension, and they cause learning to
happen when they fail. They provide the background knowledge
for understanding the world we live in. That understanding has
little to do with words or vision. We don't know what we are seeing or what we are hearing if we are witnessing or hearing about
something for which we are lacking a script. We may not know
why we do what we do when we are in a script. When we are told
on an airplane to turn off all electronic devices, we turn off the
computer and the iPod, but not our watch or our pacemaker. We
know the script. The words don't matter all that much.
Roger C. Schank
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Schank, R. 1999. Dynamic Memory Revisited. Cambridge: Cambridge University Press.
Schank, R., and R. P. Abelson. 1977. Scripts, Plans, Goals, and
Understanding. Hillsdale, NJ: Lawrence Erlbaum.

SECOND LANGUAGE ACQUISITION


At its inception in the 1960s, the research emphasis in modern
second language acquisition (SLA) was on non-native language
development (but also simultaneous childhood bilingualism) by
children or adults learning naturalistically and/or with the aid of
formal instruction, as individuals or in groups, in foreign and second language environments. Work in SLA now encompasses all
that, plus the same phenomena in lingua franca settings, second
language attrition and loss, second dialect acquisition (SDA),
and third (fourth, etc.) language acquisition.

There is no dominant SLA theory. Sophisticated work has been conducted within a universal grammar (UG) framework
(White 2003a), the general nativist model of William O'Grady (2003), functionalist approaches like Talmy Givón's (Sato
1990), and emergentist positions (Ellis 1998). Psychological
models motivate work on cognitive mechanisms and processes,
including attention and memory (Robinson 2003; Tomlin and
Villa 1994), automaticity (Segalowitz 2003), implicit and explicit
learning (DeKeyser 2003), and intentional and incidental learning (Hulstijn 2003).
Altogether, some 40–60 theories (loosely defined) of or in SLA coexist uneasily, some complementary but many oppositional (Beretta 1991), and theoretical proliferation is arguably one of the field's chief obstacles to progress. The theories differ
in source, with some imported from linguistics and psychology,
others developed, data first, within SLA itself. They differ in scope
(child or adult, formal or informal), form (causal-process and
set-of-laws), type (special nativist, general nativist, hybrid nativist, and empiricist), and content (primarily linguistic, cognitive,
or social variables considered important). Examinations of the
theory construction process have appeared in recent years, with
proposals offered for minimum criteria for adequate theories
(Crookes 1992; Gregg 2003; Jordan 2004) and for comparative
theory evaluation (Long 2007, 340).
Most SLA research falls within the domain of cognitive science, and internally, within one of three areas: i) patterns and
processes in interlanguage (IL, the psycholinguistic equivalent of an idiolect) development, ii) the linguistic environment,
and iii) individual differences. While far from an exhaustive
listing, judging from the number of studies over time reported
in the leading refereed journals (Studies in Second Language
Acquisition, Language Learning, Second Language Research,
Bilingualism: Language and Cognition, etc.), the following is representative of work within each of the three areas, and indicates
what, in the sense of Laudan (1977), SLA researchers consider
the most salient problems to be solved. The database on some
is sufficient to have merited qualitative and statistical meta-analyses of the findings (Norris and Ortega 2006).

Patterns and Processes in IL


DEVELOPMENTAL SEQUENCES. While each learner's idiosyncratic version of a second language (L2), his or her IL, is different, at
least in detail, all exhibit some common patterns and features, and
those from particular learner groups (e.g., learners with the same
first language, or L1) share still more. Thus, instructed and naturalistic learners from different L1 backgrounds commit many of the
same errors and error types, albeit in different frequencies at different proficiency levels (Pica 1983). Again, regardless of acquisition
context and L1 background, they traverse broadly similar developmental sequences. An example is the well-documented four-stage
sequence for L2 English (ESL) negation: no V (*No have job), don't V (*He don't work on Friday), aux-neg (He can't play the guitar), and analyzed don't (He doesn't like me). Stages in a sequence
(by definition) cannot be skipped (e.g., as a result of instruction),
and L1 differences can modify, but not change, the sequence. For
instance, the L1 position of the negator can influence the time a
learner spends at a given stage, such that learners whose native
language, for example, Spanish, has preverbal negation will tend to spend longer in preverbal stages 1 and 2. Learners with postverbal L1 negation, for example, Japanese, also start with preverbal
negation, however, even when the L2 also has a postverbal system. An attempt to explain such observed sequences in (German
L2) morphology and syntax as a function of universal processing
constraints (Meisel, Clahsen, and Pienemann 1981) evolved, with
subsequent incorporation of lexical-functional grammar,
into processability theory (Pienemann 1998). The role of perceptual salience in learning difficulty and, hence, in determining morpheme accuracy orders, was shown by Jennifer M. Goldschneider
and R. M. DeKeyser (2001).
AUTONOMOUS SYNTAX. Preverbal negation in an IL when both
L1 and L2 have postverbal systems is an example of autonomous
syntax, that is, interlingual structures not easily explained as the
product of either L1 transfer or L2 input. Another is the use of
resumptive pronouns when neither L1 nor L2 allows them, for
example, Italian learners' production of utterances like "That is the man who I saw him" (Pavesi 1986). Such findings suggest an
important role for cognitive factors strong enough to override
environmental ones. Coupled with studies showing mismatches
between instructional and acquisitional sequences (Doughty
2003), a corollary for language instruction is that teachability
is constrained by learnability (Pienemann 1989), that is, by
what learners can process.
TRANSFER. While L1 transfer is not at work in such cases as
preverbal negation and illicit resumptive pronouns, it remains
a powerful force in L2 development overall (Odlin 1989, 2003),
most pervasively and perceptibly in phonology, but in all interlingual subsystems. An early assumption that L1–L2 differences predicted learning difficulty has been replaced by a more sophisticated understanding. L1–L2 similarities, for example, can often
promote transfer as readily as differences, as illustrated by James
E. Flege (1995) in L2 phonology. Also, cross-linguistic influence
is subject to certain constraints, "factors that either prevent a learner from noticing an L1–L2 similarity or from deciding that it is a helpful similarity" (Odlin 2003, 454), the latter often referred to as
perceived transferability. For example, learners are less likely to
transfer items, such as idioms, that they consider marked in their
L1, and so potentially peculiar to it (Kellerman 1978).
LANGUAGE UNIVERSALS AND MARKEDNESS. Language universals constitute another important influence on IL development.
An example is learners' preference for the unmarked, open syllable, or, at least, for simpler syllable structures achieved through
such means as reduction of consonant clusters and devoicing of
voiced obstruents in clusters (Sato 1984). Incorporating notions
of typological markedness, Fred R. Eckman's (1991) structural
conformity hypothesis predicts greater learner difficulty with
more marked than less marked structures in the L2 in all areas,
not just areas of difference, following the pattern observed in the
development of native language phonologies. In place of rule-based accounts, constraint-based optimality theory has
been utilized successfully (Broselow, Chen, and Wang 1998) to
account for the role of markedness in producing patterns in IL
phonology observed in neither L1 nor L2. (For an excellent historical overview, see Eckman 2004).

FOSSILIZATION. Constrained though it is, transfer is considered one of two processes that distinguish SLA from L1A. The other is
fossilization (Selinker 1972), understood variably in the literature
as i) explanandum, the permanent cessation of learning short
of target competence in one or more, usually many, linguistic
domains, despite adequate ability, motivation, and opportunity
to progress; and ii) explanans, with an array of biological, linguistic, cognitive, and social factors adduced to account for premature linguistic rigor mortis, none conceptually satisfactory or free
of problematic empirical findings (Long 2003). Due to methodological limitations, few, if any, studies have unequivocally
documented fossilization, but it remains widely accepted as psychologically real, and divergent end-state grammars are a topic
of considerable research interest (Han and Odlin 2006; Lardiere
2007; Sorace 2003; White 2003b).

The Role of the Linguistic Environment


First explored by Diane E. Larsen-Freeman (1976), the role of
input frequency has been recently revived as an area of concern,
prompted by interest in connectionist and emergentist models
and in ascertaining the extent of implicit learning by adults (Ellis
2002, 2003). Frequency, in turn, is one way in which saliency is
promoted, and more salient forms in the input are more likely to
be noticed or encoded by learners. Noticing is argued by Schmidt
to be necessary and sufficient for acquisition, a strong claim that
has stimulated studies of the role of attention (Robinson 2003;
Schmidt 2001; Tomlin and Villa 1994) and related work on focus on form during language instruction (Doughty 2001). (For a summary of findings on the influence of the wider social and sociolinguistic context on SLA and SDA, see Sato 1989 and Siegel 2003.)
Inspired by work on caretaker talk in L1A in the 1970s,
research has been conducted on foreigner talk and native/nonnative conversation to identify the linguistic characteristics of
unmodified, simplified, and elaborated spoken and written
input to L2 learners, and their effects on comprehension and
development (Long 1996). Findings largely reflect those in the
L1A literature, with the facilitative role of interactional modifications on communication and on the acquisition of both grammar and lexis extensively documented (Gass 1997; Mackey and
Goo 2007). Particular attention has been focused on the role of
negative feedback, especially the relative effectiveness of models and corrective recasts in promoting learner uptake and IL
change. Negative feedback is of potentially greater importance
in SLA than in L1A, due to the fact that L2 learners frequently
have to unlearn options available in their L1 in situations
where the L2 system is in a subset relationship to that in the target language (e.g., L1 French verb-adverb-object or verb-object-adverb, learning L2 English with V-O-Adv only), and so where
positive evidence can help only indirectly, if at all. Pervasive in
naturalistic and classroom settings, recasts have been shown to
be usable and used by learners (provided they are developmentally ready to process them; Mackey 1999), capable of facilitating
acquisition of perceptually salient items, and superior to models
in some (rather limited) experiments (Long 2007, 75–116).

Individual Differences
Whole theories have been built around social-psychological and
affective variables (for a review of research, see Dornyei 2005), for example, attitude, motivation, and acculturation in situations
of SLA in multilingual, multicultural settings (Gardner 1988;
Schumann 1986). Cognitive variables, such as attention, awareness, and working memory, appear more centrally involved,
however, in aptitude and maturation in particular.
LANGUAGE APTITUDE. Second only to age of onset in the amount
of variance in L2 proficiency that it explains, language aptitude is a
reliable predictor of success in SLA, especially achievement up to
intermediate proficiency levels in classroom foreign language
learning, this despite most aptitude measures dating back 40
years. Research is under way to develop theoretically better-motivated instruments capable of predicting high-level achievement
in both formal and informal contexts. Aptitude, like intelligence,
may be partly molded by experience, and if so, trainable. (For
reviews, see Dornyei and Skehan 2003; Robinson 2005.)
AGE EFFECTS. One of the most salient differences between child
and adult language acquisition is the near-uniform success and
homogeneity of the former, and the considerable variance in
achievement, often amounting to failure, of the latter. Young
children start more slowly, but can attain nativelike abilities in
the long run. Conversely, nativelike achievement in languages
to which first exposure occurs after the midteens is vanishingly
rare, and nativelike L2 or second dialect phonology appears
impossible starting later than ages 4–6 in some individuals and
after 13 in the rest.
These findings are robust, many researchers concluding that
L2A, like L1A, is maturationally constrained, and positing one
or more critical periods for second language development
(DeKeyser 2000; Hyltenstam and Abrahamsson 2003). Studies
of L2 parameter (re)setting have been subjected to a statistical meta-analysis (Dinsmore 2006), with results indicating that
postcritical period learners' performance on tests of grammatical
knowledge is fundamentally different from that of native speaker
groups, and that UG does not fully operate in adolescent and adult
learning. Scholars differ, nevertheless, on the scope and timing of
the putative constraints, and some (e.g., Birdsong and Molis 2001)
dispute their existence altogether, claiming to have shown that at
least a few adults can acquire nativelike L2 abilities, even if the vast
majority in practice do not. The supposed counterevidence has
been questioned on methodological grounds (Long 2005).
The search for an explanation for age effects is arguably the
single most important research issue in SLA. The eventual findings, whatever they may be, will be important for SLA theorists,
who need to know whether older learners bring an intact or
qualitatively or quantitatively diminished acquisition capacity to
the learning task, and, hence, whether theories of child and adult
first or second language acquisition must differ. For the light
they will throw on the maturing capacity of the human mind, as
with work in many areas of SLA, the findings will have important
implications for cognitive science.
Michael H. Long
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Beretta, Alan. 1991. Theory construction in SLA: Complementarity and
opposition. Studies in Second Language Acquisition 13.4: 493–511.
Birdsong, David, and M. Molis. 2001. On the evidence for maturational constraints in second-language acquisition. Journal of Memory and Language 44: 235–49.
Broselow, Ellen, S. Chen, and C. Wang. 1998. The emergence of the
unmarked in second language phonology. Studies in Second Language
Acquisition 20.2: 261–80.
Crookes, Graham. 1992. Theory format and SLA theory. Studies in
Second Language Acquisition 14.4: 425–49.
DeKeyser, Robert. 2000. The robustness of critical period effects in second language acquisition. Studies in Second Language Acquisition
22.4: 499–533.
. 2003. Implicit and explicit learning. In Handbook of SLA, ed.
Catherine J. Doughty and M. H. Long, 313–48. Oxford: Blackwell.
Dinsmore, Thomas H. 2006. Principles, parameters, and SLA: A retrospective meta-analytic investigation into adult L2 learners access to
universal grammar. In Synthesizing Research on Language Learning
and Teaching, ed. John M. Norris and L. Ortega, 53–90. Amsterdam and
Philadelphia: John Benjamins.
Dornyei, Zoltan. 2005. The Psychology of the Language Learner: Individual
Differences in Second Language Acquisition. Mahwah, NJ: Lawrence
Erlbaum.
Dornyei, Zoltan, and P. Skehan. 2003. Individual differences in L2 learning. In Handbook of SLA, ed. Catherine J. Doughty and M. H. Long,
589–630. Oxford: Blackwell.
Doughty, Catherine J. 2001. Cognitive underpinnings of focus on form.
In Cognition and Second Language Instruction, ed. Peter Robinson,
206–57. Cambridge: Cambridge University Press.
. 2003. Instructed SLA: Constraints, compensation, and enhancement. In Handbook of SLA, ed. Catherine J. Doughty and M. H. Long,
256–310. Oxford: Blackwell.
Eckman, Fred R. 1991. The structural conformity hypothesis and the
acquisition of consonant clusters in the interlanguage of ESL learners.
Studies in Second Language Acquisition 13.1: 23–41.
. 2004. From phonemic differences to constraint rankings: Research on second language phonology. Studies in Second Language Acquisition 26.4: 513–49.
Ellis, Nick C. 1998. Emergentism, connectionism and language learning. Language Learning 48.4: 631–64.
. 2002. Frequency effects in language acquisition: A review with
implications for theories of implicit and explicit language acquisition.
Studies in Second Language Acquisition 24.2: 143–88.
. 2003. Constructions, chunking, and connectionism: The emergence of second language structure. In Handbook of SLA, ed.
Catherine J. Doughty and M. H. Long, 63–103. Oxford: Blackwell.
Flege, James E. 1995. Second language speech learning: Theory, findings,
and problems. In Speech Perception and Linguistic Experience: Issues in
Cross-Language Research, ed. W. Strange, 233–77. Timonium, MD: York.
Gardner, Robert C. 1988. The socio-educational model of second language learning: Assumptions, findings, and issues. Language Learning
38.1: 101–26.
Gass, Susan M. 1997. Input, Interaction, and the Second Language
Learner. Mahwah, NJ: Lawrence Erlbaum.
Goldschneider, Jennifer M., and R. M. DeKeyser. 2001. Explaining the
natural order of L2 morpheme acquisition in English: A meta-analysis
of multiple determinants. Language Learning 51.1: 1–50.
Gregg, Kevin R. 2003. SLA theory: Construction and assessment. In
Handbook of SLA, ed. Catherine J. Doughty and M. H. Long, 831–65.
Oxford: Blackwell.
Han, Zhaohong, and T. Odlin, eds. 2006. Studies of Fossilization in Second
Language Acquisition. Clevedon, UK: Multilingual Matters.
Hulstijn, Jan H. 2003. Incidental and intentional learning. In Handbook
of SLA, ed. Catherine J. Doughty and M. H. Long, 349–81. Oxford:
Blackwell.
Hyltenstam, Kenneth, and N. Abrahamsson. 2003. Maturational constraints in second language acquisition. In Handbook of SLA, ed. Catherine J. Doughty and M. H. Long, 539–88. Oxford: Blackwell.
Jordan, Geoff. 2004. Theory Construction in Second Language Acquisition.
Amsterdam and Philadelphia: John Benjamins.
Kellerman, Eric. 1978. Giving learners a break: Native language intuitions
about transferability. Working Papers in Bilingualism 15: 59–92.
Lardiere, Donna. 2007. Ultimate Attainment in Second Language
Acquisition: A Case Study. Mahwah, NJ: Lawrence Erlbaum.
Larsen-Freeman, Diane E. 1976. An explanation for the morpheme
acquisition order of second language learners. Language Learning
26.1: 125–34.
Laudan, Larry. 1977. Progress and Its Problems: Towards a Theory of
Scientific Growth. Berkeley and Los Angeles: University of California
Press.
Long, Michael H. 1996. The role of the linguistic environment in second
language acquisition. In Handbook of Second Language Acquisition,
ed. William C. Ritchie and T. J. Bhatia, 41368. New York: Academic
Press.
. 2003. Stabilization and fossilization in interlanguage development. In Handbook of second language acquisition, ed. Catherine J.
Doughty and M. H. Long, 487535. Oxford: Blackwell.
. 2005. Problems with supposed counter-evidence to the critical period hypothesis. International Review of Applied Linguistics
43.4: 287317.
. 2007. Problems in SLA. Mahwah, NJ: Lawrence Erlbaum.
Mackey, Alison. 1999. Input, interaction and second language development. Studies in Second Language Acquisition 21.4: 55787.
Mackey, Alison, and Jaemyung Goo. 2007. Interaction research in SLA: A
meta-analysis and research synthesis. In Conversational Interaction
and Second Language Acquisition. A Series of Empirical Studies, ed.
Alison Mackey, 377419. New York: Oxford University Press.
Meisel, Jurgen M., Harold Clahsen, and Manfred Pienemann. 1981. On
determining developmental stages in natural second language acquisition. Studies in Second Language Acquisition 3.1: 10935.
Norris, John M., and Lourdes Ortega, eds. 2006. Synthesizing Research on
Language Learning and Teaching. Amsterdam and Philadelphia: John
Benjamins.
Odlin, Terence. 1989. Language Transfer: Cross-Linguistic Influence in
Language Learning. Cambridge: Cambridge University Press.
. 2003. Cross-linguistic influence. In Handbook of SLA, ed.
Catherine J. Doughty and M. H. Long, 43686. Oxford: Blackwell.
OGrady, William. 2003. The radical middle: Nativism without universal grammar. In Handbook of SLA, ed. Catherine J. Doughty and
M. H. Long, 4362. Oxford: Blackwell.
Pavesi, M. 1986. Markedness, discoursal modes, and relative clause
formation in a formal and an informal context. Studies in Second
Language Acquisition 8.1: 3855.
Pica, Teresa. 1983. Adult acquisition of English as a second language under different conditions of exposure. Language Learning
33.4: 46597.
Pienemann, Manfred. 1989. Is language teachable? Psycholinguistic
experiments and hypotheses. Applied Linguistics 10.1: 5279.
. 1998. Language Processing and Second Language
Development: Processability Theory. Amsterdam and New York: John
Benjamins.
Robinson, Peter. 2003. Attention and memory. In Handbook of SLA, ed.
Catherine J. Doughty and M. H. Long, 63178. Oxford: Blackwell.
.. 2005. Language aptitude. Annual Review of Applied Linguistics
24: 4573.
Sato, Charlene J. 1984. Phonological processes in second language acquisition: Another look at interlanguage syllable structure. Language
Learning 34.1: 4357.

———. 1989. A nonstandard approach to standard English. TESOL Quarterly 23.2: 259–82.
———. 1990. The Syntax of Conversation in Interlanguage Development. Tübingen: Gunter Narr.
Schmidt, Richard W. 2001. Attention. In Cognition and Second Language Instruction, ed. Peter Robinson, 3–32. Cambridge: Cambridge University Press.
Schumann, John H. 1986. Research on the acculturation model for second language acquisition. Journal of Multilingual and Multicultural Development 7: 379–92.
Segalowitz, Norman. 2003. Automaticity and second languages. In Handbook of SLA, ed. Catherine J. Doughty and M. H. Long, 382–408. Oxford: Blackwell.
Selinker, L. 1972. Interlanguage. International Review of Applied Linguistics 10.3: 209–31.
Siegel, Jeff. 2003. Social context. In Handbook of SLA, ed. Catherine J. Doughty and M. H. Long, 178–223. Oxford: Blackwell.
Sorace, Antonella. 2003. Near-nativeness. In Handbook of SLA, ed. Catherine J. Doughty and M. H. Long, 130–51. Oxford: Blackwell.
Tomlin, Russell, and V. Villa. 1994. Attention in cognitive science and SLA. Studies in Second Language Acquisition 15.2: 185–204.
White, Lydia. 2003a. Second Language Acquisition and Universal Grammar. Cambridge: Cambridge University Press.
———. 2003b. Fossilization in steady state L2 grammars: Persistent problems with inflectional morphology. Bilingualism: Language and Cognition 6.2: 129–41.

SELF-CONCEPT
The self and self-related perceptions are widely recognized for
their significance in explaining human behavior and in individuals' mental well-being. Different disciplines study the self
from different perspectives. The present entry relies on findings
derived from cognitive-developmental psychological research.
The self is an explanatory and organizing framework that provides the person with a sense of individual identity and continuity. Early psychologists made the distinction between self as a subject (the "I," self as active agent and as knower) and self as an object of what is known (the "Me"). The self as an object of knowledge
is reflected in the construct of self-concept. There is a lack of
a universally accepted definition of self-concept due mainly to
different theoretical perspectives. Self-concept is the person's beliefs, attitudes, and perceived characteristics about the self. It
includes self-representations of abilities, past actions, goals,
intentions, interests, future self, ideal self, and so on. Perceived
competence is considered a key component of self-concept that
plays a significant role in motivation, learning, and achievement.
Therefore, self-concept represents descriptive as well as evaluative aspects of the self that are reflected in self-description of one's strengths and weaknesses in given domains of functioning. For instance, academic self-concept reflects individuals' beliefs
and attitudes about their own competence in academic settings.
Before the 1990s, many studies treated the various self
terms, such as self-concept, self-esteem, and self-efficacy, interchangeably. These constructs represent different aspects of the
self-system. Self-concept is essentially descriptive, and it includes
personal beliefs and attitudes regarding one's state of abilities. Self-esteem is the overall evaluation of one's worth or value as a person. It includes emotional appraisals of the self, one's likes or
dislikes as well as feelings of self-acceptance. Therefore, it reflects

the affective aspect of the self-system. Self-efficacy is an individual's judgments and expectations for successfully performing
given tasks at designated levels of performance, and it should not
be regarded as synonymous with self-concept (Bandura 1997).
Self-concept is both a cognitive and a social construction.
Sources for self-concept are the individual's mastery experiences, internal and social comparison processes, and perceived appraisals by others (Harter 1999, 166–94). For instance,
it has been shown that aspects of adolescents' self-concept
regarding abilities in school language were influenced by their
performance in language and by perceived appraisals by
others (Dermitzaki and Efklides 2000, 629–34). Self-concept is
considered a dynamic structure that changes with experience,
especially its more peripheral and situation-specific aspects. The
development of memory and language in toddlers allows for
the verbal expression and representation of the self as an object
and for gradually constructing an inner self by developing self-representations.
Psychological research has documented that self-concept
affects cognition, emotions, motivation, and behavior, and it
is associated with social adjustment, leadership, and level of
anxiety. Self-concept has strong motivational power since it is
tied closely to how we choose to invest our time and effort and
whether we persist during our efforts. Therefore, positive self-concept is associated with high achievement and active involvement in learning situations. In educational settings, academic
self-concept is considered an important mediator of academic
motivation and performance. Systematic relations have been
documented between the academic self-concept of students and
teachers and such factors as students' intrinsic motivation, goal
setting, anxiety, course selection, effort spent, study skills, and
self-regulation, as well as teachers' level of engagement and persistence in classroom activities.
Before the 1980s, self-concept was approached as a monolithic, unidimensional construct. Nowadays, it is believed that
self-concept is an entity with multiple facets and dimensions.
Various profiles of increasing multidimensionality that assess a
complex set of perceived competencies have been progressively
developed (see Byrne 1996). Although the multidimensionality of
self-concept is rarely disputed, researchers do not always agree
on the self-concept structure, that is, on the nature of relations
among the different aspects and dimensions of self-concept (see
Marsh and Hattie 1996). An influential theoretical model on the
self-concept structure is the hierarchical one, first proposed by R.
J. Shavelson and colleagues in 1976. The evidence is not conclusive. However, analysis using sophisticated statistical procedures,
such as confirmatory factor analysis, tends to support a potential
self-concept hierarchy. Within this model, there is a superordinate, general self-concept as a higher-order factor that subsumes
domain-specific self-concepts, which may be correlated, though
they can be interpreted as separate constructs. Academic, social,
emotional, and physical self-concepts are some of the main
domains identified within this model. Within a domain of action,
there are several levels of specific self-perceptions when moving from the top to the bottom of the hierarchy. For instance,
academic self-concept includes self-perceptions regarding various subdomains of knowledge, such as self-concept in English
language, self-concept in mathematics, self-concept in science,

and so on. Self-concept in language reflects students' beliefs and attitudes about their competence in the language domain. Self-concept in language subsumes self-perceptions from more general aspects of the language domain, such as grammar, syntax,
reading, writing, and so on, to situation-specific self-perceptions, that is, with regard to the task at hand (see, for example,
Lau et al. 1999).
There are still many issues to investigate about self-concept
and its relations to human behavior and well-being. What we
should always bear in mind is that the fundamental goal of self-concept research is to help persons function, grow, and adapt
better in various demanding settings.
Irini Dermitzaki
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Bandura, Albert. 1997. Self-Efficacy: The Exercise of Control. New
York: Freeman.
Byrne, Barbara M. 1996. Measuring Self-Concept Across the Life
Span: Issues and Instrumentation. Washington, DC: American
Psychological Association.
Dermitzaki, Irini, and Anastasia Efklides. 2000. Aspects of self-concept
and their relationship to language performance and verbal reasoning
ability. American Journal of Psychology 113: 621–38.
Harter, Susan. 1999. The Construction of the Self: A Developmental Perspective. New York: Guilford.
Lau, Ivy Cheuk-yin, Alexander Seeshing Yeung, Putai Jin, and Renae Low.
1999. Toward a hierarchical, multidimensional English self-concept.
Journal of Educational Psychology 91: 747–55.
Marsh, Herbert W., and John Hattie. 1996. Theoretical perspectives on
the structure of self-concept. In Handbook of Self-Concept, ed. Bruce
A. Bracken, 38–90. New York: Wiley.
Shavelson, R. J., J. J. Hubner, and G. C. Stanton. 1976. Self-concept: Validation of construct interpretations. Review of Educational Research 46: 407–41.

SELF-ORGANIZING SYSTEMS
A broad division is made between systems that organize themselves via their own internal constraints and systems that are
organized by others, that is, by agencies that are exterior to them.
Although no single definition is agreed upon as yet, the nature
of systems that make themselves up as they go along has been
intensively studied, and there is a general understanding of the
basic requirements.
A self-organizing (S-O) system is typically composed of many
differently sized parts operating at many time scales. The organized
states or patterns exhibited by an S-O system at more macroscopic
scales arise strictly from local interactions among its parts. The
interactions are multiple and nonlinear, meaning that they produce disproportionate effects. Figure 1 shows that in a Belousov-Zhabotinsky (B-Z) chemical reaction involving dozens of molecule types, microscopic interactions abiding by few constraints result in
highly constrained collective behaviors (concentric circles or spirals
in a petri dish viewed from above) that then modulate the microscopic interactions in a circularly causal manner. These disproportionate effects are hidden in the nonlinearities and made manifest
by circumstances internal to the S-O system (specifically, fluctuations or noise) or external to the S-O system (e.g., energy flows from the embedding environment). The external circumstances, referred to as control parameters and frequently nonobvious (e.g., the inclination of the petri dish in the B-Z reaction), enable the patterns of an S-O system (in the B-Z figure, either concentric circles or spirals) but do not prescribe them.

Figure 1. A Belousov-Zhabotinsky chemical reaction.
The processes resulting in the transformation from a less
ordered to a more ordered (collective) state are studied formally
under the labels of bifurcation and symmetry breaking. The collective states themselves are studied formally under the labels of
attractors and stability. The intuitive understanding of an attractor is a collective state to which the parts are drawn. An independence of parts evolves to a cooperation of parts. It is of particular
note that an S-O system may more typically evolve to a state that
is neither one of cooperation nor one of separation. Instead of
an attractor, there is a coexistence of competing tendencies: to
cooperate (in a particular way) and to separate. Metastability,
as the latter condition is called, expresses a delicate balance
between dependence and independence, between integration
and segregation. It confers on an S-O system the key ability to
be simultaneously adaptive (to fit the conditions) and flexible (to
change quickly, as needed).
The concept of self-organization provides a theory-constitutive metaphor for language and a natural framework for language study. Although the metaphor and framework are not in
common usage, their virtues, and even their necessity, are under
discussion in the contemporary literature (e.g., Ke et al. 2002;
Oudeyer 2005; Turvey and Moreno 2006; Van Orden, Holden,
and Turvey 2003).
Theoretical approaches to S-O systems include dissipative
structures (Nicolis and Prigogine 1989), homeokinetics (Iberall and
Soodak 1987), synergetics (Haken 1983), self-organized criticality
(Bak 1996), and catalytic closure (Kauffman 1995). Each approach
to S-O systems has its promoters and detractors for a variety of reasons, both conceptual and technical. The named approaches differ in the range of phenomena they encompass, in the degree and
nature of their dependence on nonlinear thermodynamics, and in
their use of the mathematics of dynamical systems. Their shared
virtue, given the present state of development, is the methods they
make available and encourage for the study of S-O processes.
Synergetics, for example, has proven useful in several domains
of cognitive science (Kelso 1995). Its methods flow from the so-called slaving principle. In a collection of interacting parts, there are pockets of temporary organizations, or modes (e.g., the depicted patterns in the B-Z reaction). Some modes will
become dominant and dictate how the others will evolve. The
synergetics strategy for addressing self-organization is, therefore, to identify the leading modes, derive collective equations for them in terms of order parameters, and interpret the influences of the enslaved modes as perturbations to the master dynamics. Identifying the methods by which the strategy can be
implemented in wide-ranging situations has been an important
contribution of the synergetics approach.
Lessons from homeokinetics and self-organized criticality
have shaped research on linguistic processes. The simple experimental tasks of naming a letter string or verbally reporting the
lexical status ("yes" or "no") of a spoken form are events that
embed processes at faster time scales and, in turn, are embedded
within processes at slower time scales. On the basis of homeokinetics and self-organized criticality, a process at any one time
scale must be reflective of all other processes, both slower and
faster. The expectation is that latency measures of naming or lexical decisions obtained under commonplace experimental procedures should exhibit the power-law scaling typically manifest by
S-O systems. Evidence for such scaling of linguistic performance
has been reported (e.g., Van Orden, Holden, and Turvey 2003).
It highlights the possibility that language competence might
reflect the interaction dynamics of multiple processes at multiple time scales rather than the (conventional) additive dynamics
of several modular components (e.g., syntax, lexicon).
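The power-law expectation lends itself to a small illustration: for a power-law (Pareto) distribution, the complementary cumulative distribution is a straight line on log-log axes, and its slope recovers the scaling exponent. The following sketch uses synthetic latencies; the Pareto form, the 300 ms floor, and the thresholds are illustrative assumptions, not values from the studies cited.

```python
import math
import random

def loglog_slope(xs, ys):
    """Ordinary least-squares slope of log(y) against log(x)."""
    lx = [math.log(x) for x in xs]
    ly = [math.log(y) for y in ys]
    n = len(lx)
    mx, my = sum(lx) / n, sum(ly) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(lx, ly))
    var = sum((a - mx) ** 2 for a in lx)
    return cov / var

random.seed(42)

# Synthetic "latencies" (ms): Pareto-distributed, so P(T > t) ~ t^(-alpha).
alpha = 1.5
latencies = [300.0 * random.random() ** (-1.0 / alpha) for _ in range(20000)]

# Empirical complementary CDF at a few thresholds above the 300 ms floor.
thresholds = [400, 600, 900, 1350, 2000, 3000]
ccdf = [sum(t > th for t in latencies) / len(latencies) for th in thresholds]

# A power law shows up as a straight line on log-log axes; the fitted
# slope estimates -alpha.
estimate = -loglog_slope(thresholds, ccdf)
print(estimate)  # should be close to alpha = 1.5
```

The straight-line signature across scales is the fingerprint that the cited studies look for in naming and lexical-decision latencies; curvature on log-log axes would instead suggest a characteristic scale, as an additive mix of modular components would predict.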
Miguel A. Moreno and Michael T. Turvey
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Bak, Per. 1996. How Nature Works: The Science of Self-Organized
Criticality. New York: Copernicus Springer-Verlag.
Haken, Hermann. 1983. Synergetics. 3d ed. Berlin: Springer-Verlag.
Iberall, Arthur, and Harold Soodak. 1987. A physics for complex
systems. In Self-Organizing Systems: The Emergence of Order, ed.
F. E. Yates, 499–520. New York: Plenum.
Kauffman, Stuart. 1995. At Home in the Universe: The Search for the Laws
of Self-Organization and Complexity. New York: Oxford University
Press.
Ke, J., J. W. Minett, C.-P. Au, and W. S.-Y. Wang. 2002. Self-organization in the emergence of vocabulary. Complexity 7: 41–54.
Kelso, J. A. S. 1995. Dynamic Patterns. Cambridge, MA: MIT Press.
Nicolis, Gregoire, and Ilya Prigogine. 1989. Exploring Complexity. New
York: Freeman.
Oudeyer, P-Y. 2005. The self-organization of speech sounds. Journal of
Theoretical Biology 233: 435–49. Identifies how discrete speech codes
might arise from sensory-motor interactions without presupposing
linguistic capability.
Turvey, Michael, and Miguel Moreno. 2006. Physical metaphors for the mental lexicon. The Mental Lexicon 1: 7–33. Surveys a variety of self-organizing principles as possible metaphors for understanding the
mental lexicon.
Van Orden, Guy, Jay Holden, and Michael Turvey. 2003. Self-organization
of cognitive performance. Journal of Experimental Psychology: General
132: 331–51.


SEMANTIC CHANGE
The study of semantic change is a core area of historical semantics, that is, the diachronic study of the meaning of natural language expressions. Next to the study of meaning changes and the
principles governing such changes, historical semantics includes
the etymological reconstruction of the meaning of proto-forms,
and the semantic study of older states of languages regardless
of previous or later changes. This entry is concerned exclusively
with lexical change of meaning, given that the study of word
meaning has always taken up a central position in the field of
historical semantics. (See the entry on grammaticalization
for issues involving grammatical meaning.)
Historically speaking, the study of meaning change reigned
supreme within semantics in the era of prestructuralist semantics, from roughly 1830 to 1930, receded to the background in
the structuralist and generativist periods, and received
a new impetus with the emergence of cognitive approaches to
linguistics (see Nerlich 1992 and Ullmann 1962 for the older
traditions). The contemporary field of diachronic lexical semantics may be charted according to the linguistic level on which
changes are envisaged: the level of the individual readings, that
of the word, and that of the lexicon at large.

The Level of Senses: Conceptual Mechanisms of Semantic Change
The fundamental event in semantic change involves the process
by which a given meaning of a word develops into a new one, as
when match develops from the meaning 'the wick of a candle or a lamp' to 'a small piece of wood tipped with a composition that bursts into flame when rubbed.' These transitions are not arbitrary; they are governed by conceptual associations between the
source reading and the target reading. In the example, there is a
functional similarity between source and target. More generally, if the target is not conceptually accessible from the source,
no shift of meaning can occur. A basic question for diachronic
semantics, therefore, involves which elementary associative
links should be distinguished.
Although the matter is still vividly debated (see Blank 1997
for a comprehensive treatment of competing classifications, old
ones and novel ones), the most common classification of semantic change involves four mechanisms: metonymy as based on
conceptual contiguity, metaphor as based on similarity, and
specialization (narrowing) and generalization (broadening) as
based on semantic inclusion.
Two additions must be mentioned with regard to this traditional
quartet of semantic mechanisms. First, the four mechanisms apply
to denotational meanings only. For changes of nondenotational
meaning, the usual distinction is that between pejorative changes
(shifts of emotive meaning in a negative direction) and meliorative
changes (shifts of emotive meaning in a positive direction).
Second, denotational changes of meaning may be analogical
or not, according to whether the new meaning does or does not copy
the semantics of another, related expression. Semantic borrowing,
for instance, is the process by means of which a word x in language A
that translates the primary meaning of word y in language B copies
a secondary meaning of y: French forme 'shape' copies the reading 'physical condition' from English form used as a sports term.


The Level of Lexical Items: The Prototype Structure of Change
Semantic changes do not occur at random. They follow the
paths set out by the conceptual mechanisms of such change,
but they are also restricted by their starting point: the existing
meanings of words and the mutual relationships between those
meanings. The study of semantic change, in other words, has to
take into account the internal semantic architecture of words.
Contemporary lexicology conceives of this structure primarily in terms of a prototype model, highlighting two architectural characteristics: differences of structural weight, on
the one hand, and fuzziness and flexibility, on the other. This
model has been fruitfully applied to meaning change, showing
that the latter is not an atomistic process in which individual
word meanings develop separately (as might be suggested by
an exclusive focus on mechanisms of change). Rather, words
develop as radial networks of interrelated meanings, clustered
around core cases. For instance, the central meaning of foot 'lower part of a human leg' extends by metonymy to 'length of 12 inches,' and by comparison to 'supporting or lower part of a vertically oriented object' (foot of a glass). This meaning is further applied to horizontal objects with an implicit orientation
(foot of the table). The foot of a grave, in turn, could illustrate the
latter, or derive by a direct metonymical link (the place where
the feet are) from the central reading. (For a systematic treatment of the various prototypicality effects in semantic change,
see Geeraerts 1997.)

The Level of the Lexicon: Onomasiological Perspectives


The distinction between semasiology and onomasiology is
equivalent to the distinction between meaning and naming: Semasiology takes its starting point in the word as a form,
and charts the meanings with which the word can occur; onomasiology takes its starting point in a concept or entity, and
investigates the different lexical expressions by which the concept or entity can be designated, or named. The forms of diachronic semantic research mentioned earlier are semasiological
ones, starting as they do from individual words or senses. But an
onomasiological perspective is ultimately more important: The
choices that speakers make are essentially onomasiological
choices between alternative forms of expression (and as such,
onomasiology automatically involves the level of the vocabulary
at large).
What does it imply to study lexical change onomasiologically?
Traditional onomasiological research focuses on the mechanisms
of lexicogenesis, that is, the various ways in which new expressions
come about (including semasiological extensions of meaning;
see Tournier 1985). Contemporary research is more interested
in underlying cognitive principles: Do dominant onomasiological
choices or developments reveal general cognitive mechanisms?
Two research paradigms currently put this perspective into practice. A comparative approach (Blank and Koch 2003) investigates
cross-linguistically whether there are any preferred naming patterns for given domains of human experience. A dynamic approach
(Sweetser 1990; Traugott and Dasher 2005) looks for conceptual
regularity in the semantic development of related expressions.
Dirk Geeraerts

WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Blank, Andreas. 1997. Prinzipien des lexikalischen Bedeutungswandels
am Beispiel der romanischen Sprachen. Tübingen: Niemeyer.
Blank, Andreas, and Peter Koch, eds. 2003. Kognitive romanische
Onomasiologie und Semasiologie. Tübingen: Niemeyer.
Geeraerts, Dirk. 1997. Diachronic Prototype Semantics. Oxford: Clarendon.
Nerlich, Brigitte. 1992. Semantic Theories in Europe, 1830–1930.
Amsterdam: John Benjamins.
Sweetser, Eve. 1990. From Etymology to Pragmatics. Cambridge:
Cambridge University Press.
Tournier, Jean. 1985. Introduction descriptive à la lexicogénétique de l'anglais contemporain. Paris and Geneva: Champion, Slatkine.
Traugott, Elizabeth C., and Richard B. Dasher. 2005. Regularity in
Semantic Change. Cambridge: Cambridge University Press.
Ullmann, Stephen. 1962. Semantics: An Introduction to the Science of
Meaning. Oxford: Basil Blackwell.

SEMANTIC FIELDS
A semantic field consists of two parts: 1) a conceptual domain,
for example, color, containers, cooking (see concepts); and
2) lexical items that are mapped onto the conceptual domain.
Semantic field theory assumes that vocabularies of languages
are structured, not just lists of atomic units. A major part of field
theory provides an inventory of lexical relations, that is, how lexemes can be related to one another, such as by synonymy and
antonymy. The lexeme, the basic semantic unit, is generally a
word or an idiomatic phrase like wake up or kick the bucket.
Ferdinand de Saussure ([1916] 1966) is credited with the beginning of structuralism, in which the elements of language form an interconnected system of signs. Jost Trier (1932) distinguished between
conceptual fields and lexical fields and showed how lexical items
divide up the conceptual field, focusing on paradigmatic relationships, while W. Porzig (1950) emphasized syntagmatic relationships. John Lyons (1963, 1968, 1977, Vol. 1, 231–335) expanded
semantic field theory, stressing the role of context in semantic
analysis and embedding his work into a generative grammar. Adrienne Lehrer (1974), Peter R. Lutzeier (1981), Richard
Grandy (1987), and Eva F. Kittay (1987), among others, have further developed, applied, or defended field theory.
A version of semantic fields has also been used by cognitive
semanticists, for example, Zoltán Kövecses (1986), George Lakoff
and Mark Johnson (1980), and many others. However, whereas
semantic field practitioners start with the lexemes and then analyze the structural relations and conceptual structure, cognitive
semanticists start with the conceptual structure.

Lexical Relations: Paradigmatic and Syntagmatic


PARADIGMATIC RELATIONS. Lexical relations can be divided into
paradigmatic and syntagmatic. The principal paradigmatic ones
are synonymy, incompatibility, antonymy, meronomy, hyponymy, and converseness.
Synonyms are lexemes that substitute for each other. Big and large are synonyms in the domain of size: My house is big ↔ My house is large.
Hyponymy is the 'kind of' relationship. B is a hyponym of
A; B entails A. A is a superordinate (or hypernym) of B. Rose is
a hyponym of flower and flower is the superordinate of rose.

Figure 1. A simplified example of a taxonomy: animal > mammal, fish, bird; mammal > feline, canine; feline > tiger, lion; canine > dog, wolf.

Figure 2. A functional hyponymy-superordinate structure: animal > domestic animal, wild animal; domestic animal > pet, livestock; pet > cat (e.g., Siamese), dog (e.g., retriever), bird (e.g., canary); livestock > cow, horse, pig.

Figure 3. An antonymous scale: big and little flank a mid region, with each term gradable in the 'more' and 'less' directions.

Usually a superordinate has several hyponyms. Other hyponyms of flower are lily, petunia, dahlia, geranium, and bachelor's button. All the co-hyponyms are incompatible with one another; that is, they form a contrast set (Grandy 1987). A is a rose entails A is not a lily, or petunia, or dahlia, and so on.
A taxonomy, especially a scientific taxonomy, is a subtype of
hyponymy. Figure 1 illustrates this structure. However, a different structure for animals, a functional one, is possible,
where animals are classified by their relationship to people, as
in Figure 2. In functional relationships, the 'kind of' relation is weaker
than entailment. We think of a cat as a kind of pet, but a feral
cat is not a pet. Most birds are not pets, and few people consider snakes to be appropriate pets.
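Read computationally, these relations form a small tree: hyponymy entailment is reachability along superordinate links, and the co-hyponyms of a shared superordinate form a contrast set. The following Python sketch mirrors Figure 1; the dictionary encoding and function names are our own illustration, not part of field theory.

```python
# Each lexeme maps to its immediate superordinate, as in Figure 1.
SUPERORDINATE = {
    "mammal": "animal", "fish": "animal", "bird": "animal",
    "feline": "mammal", "canine": "mammal",
    "tiger": "feline", "lion": "feline",
    "dog": "canine", "wolf": "canine",
}

def is_hyponym_of(b, a):
    """True if b entails a, i.e. a lies on b's chain of superordinates."""
    node = b
    while node in SUPERORDINATE:
        node = SUPERORDINATE[node]
        if node == a:
            return True
    return False

def contrast_set(lexeme):
    """Co-hyponyms: terms sharing a superordinate, mutually incompatible."""
    parent = SUPERORDINATE.get(lexeme)
    return {w for w, p in SUPERORDINATE.items() if p == parent and w != lexeme}

print(is_hyponym_of("tiger", "animal"))  # True: a tiger is an animal
print(is_hyponym_of("tiger", "canine"))  # False
print(contrast_set("dog"))               # {'wolf'}
```

The entailment test makes "A is a tiger entails A is an animal" mechanical, while the contrast set captures the incompatibility of co-hyponyms; the weaker, defeasible 'kind of' of the functional structure in Figure 2 would need a different, non-entailment encoding.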
Antonymy is a special type of incompatibility that consists of
two opposing terms. Complementary antonyms divide the space
exhaustively, such as dead–alive or single–married. In such cases, A entails not B and B entails not A. If Al is married, Al is not single. If Bee
is single, Bee is not married. More interesting are the gradable
antonyms where the structure consists of a center point or center
region with bidirectionality. Figure 3 illustrates this relationship
with big and little. These antonyms are intimately connected to
comparison. A is bigger than B ↔ B is littler than A. There are various types of antonymy, which involve markedness and subtle
entailments (see Cruse 1986; Lehrer 1983; Murphy 2003).
Closely related to antonymy is converseness, which interacts with syntax to provide sentence paraphrases. Examples are buy–sell and husband–wife. Ann sold Bill a car ↔ Bill bought a car from Ann. Ann is Bill's wife ↔ Bill is Ann's husband.

Table 1.
Actor-action: surgeon–operate (The surgeon operated on several patients.); chef–cook (This chef cooks great food.); dog–bark (Your dog barks too much.)
Action-object: play–game (Let's play a game now.); drive–vehicle (He drives a large vehicle.)
Action-instrument: see–eye (We see with our eyes.); hear–ear (We hear with our ears.)
Cause-effect: kill–die (The murderer killed the man, who died immediately.)

Figure 4. English cooking words: cook > boil, fry, broil, roast, bake; boil > simmer > poach, stew; fry > sauté, deep-fry (= French-fry).

Figure 5. Temperature words: outer antonyms hot–cold and inner antonyms warm–cool around a mid region (tepid); burning above hot, freezing and frigid below cold; each term gradable in the 'more' and 'less' directions.

Figure 6. German cooking words: kochen1 'to cook' > kochen2 'to boil', braten 'fry, grill', rösten 'roast, pan fry', backen 'bake', grillen 'broil'; kochen2 > sieden 'simmer', dünsten 'stew'.
Meronomy, the 'part of' relation, is illustrated by items like eyes–face; handle–cup, suitcase, door; roof–house, building; and act–play.
SYNTAGMATIC RELATIONS. Syntagmatic relations include verbs and
their semantically related subjects and objects. These generally
involve a verb or predicate and thematic relations, as in Table 1.
SOME EXAMPLES. Cooking and temperature lexemes in English
illustrate a semantic field analysis, as in Figures 4 and 5. (This
example is a much simplified version of Lehrer 1974.)
In the cooking domain, cook is the superordinate of all terms
under it. Boil, fry, broil, roast, and bake are the immediate hyponyms of cook. Boil has one hyponym, simmer, which in turn has
poach and stew as hyponyms. Sauté and deep fry, with its synonym French fry, are co-hyponyms of fry. Broil is a U.S. lexeme, a
concept expressed by grill in the UK. (Grill in U.S. dialects is not
an immediate hyponym of cook.) The structural relationships are
only a part of a semantic analysis, which must be completed by
definitions or other characterizations (e.g., componential analysis) to explicate the meanings of the lexemes:


cook: prepare food by heating
boil: cook with water
fry: cook with fat
bake: cook in an oven
broil: cook under a flame

Figure 7. Polish cooking words. [gotować1 'cook', gotować2 'boil'; smażyć 'fry'; piec 'bake, roast'; dusić 'stew'.]

Another part of the semantic analysis would contain the selection restrictions or collocations for each lexeme. Poach typically
occurs with foods like eggs or fruit. We bake hams but roast meat,
even though both are nowadays cooked in ovens. The phrase
roast meat is a relic of earlier periods when meat was cooked on
a spit over an open fire.
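The taxonomic skeleton of such a field can be represented as a simple tree of hyponymy links. The sketch below is illustrative only; the lexeme inventory follows the simplified English cooking field of Figure 4, and the function names are our own.

```python
# Illustrative sketch: the English cooking field (after Figure 4)
# stored as a hyponymy tree (each lexeme -> its immediate superordinate).
HYPERNYM = {
    "boil": "cook", "fry": "cook", "broil": "cook",
    "roast": "cook", "bake": "cook",
    "simmer": "boil", "poach": "simmer", "stew": "simmer",
    "saute": "fry", "deep-fry": "fry",
}

def superordinates(lexeme):
    """Chain of superordinates up to the field's top term."""
    chain = []
    while lexeme in HYPERNYM:
        lexeme = HYPERNYM[lexeme]
        chain.append(lexeme)
    return chain

def hyponyms(lexeme):
    """All lexemes directly or transitively under the given term."""
    return sorted(h for h in HYPERNYM if lexeme in superordinates(h))

print(superordinates("poach"))   # ['simmer', 'boil', 'cook']
print(hyponyms("boil"))          # ['poach', 'simmer', 'stew']
```

Encoding only the immediate-superordinate links keeps the structural relationships separate from the definitions, mirroring the point above that the taxonomy is only one part of the analysis.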
Temperature words illustrate a field with antonyms, as in
Figure 5. In the temperature field, there are two pairs of antonyms, an outer pair, hot–cold, and an inner pair, warm–cool.
The middle region has a lexeme, tepid, and its synonym, lukewarm, which are limited in distribution, used literally only for the
temperature of liquids. The outer words have various hyponyms,
only a couple of which are illustrated here. There are interesting
asymmetries in these structures (see Cruse 1986; Lehrer 1983).

Applications
Semantic field theory has at least four important applications: 1)
comparative lexicology, 2) pedagogy, 3) metaphor, and 4) philosophical semantics.
COMPARATIVE LEXICOLOGY. Lexemes in one language are rarely
equivalent in meaning to those of other languages. The space of
a semantic field can be structured in various ways, and field theory provides an excellent way of comparing the semantic structures of different languages. For example, German, Polish, and
Japanese show alternative structures to English cooking words
(see Figures 6, 7, and 8). In both German and Polish, the words
for cook have a general meaning for preparing food with heat
and also a specific meaning of cooking with water, for example,
boil. And whereas English has different lexemes for bake, broil,
fry, and roast, German braten covers fry, roast, and broil.
In Polish, piec includes bake and broil but has a separate word

Figure 8. Japanese cooking words. [nitaki 'cook'; niru 'boil'; taku 'boil <rice>'; musu 'steam'; ageru 'deep fry'; yaku 'bake, roast, broil, fry'; itameru 'stir fry'; aburu 'cook with direct heat'.]

for fry, smażyć, while in Japanese yaku refers to all these processes. Japanese also has a separate word for cooking rice.
That a language lacks a specific word for a concept in English is
of no consequence, since the meaning can easily be specified by a
phrase. German im Ofen braten for roast and in der Pfanne braten
for fry can be used whenever needed to clarify the sense.
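A cross-language field comparison of this kind can be made explicit as a mapping from each lexeme to the region of the field it covers. The sketch below is a minimal illustration using the simplified German data of Figure 6; the dictionary and function are our own, not part of any published analysis.

```python
# Sketch: which German cooking lexemes cover which English senses
# (simplified field data after Figure 6).
GERMAN_COVERS = {
    "kochen": {"cook", "boil"},         # general term and 'cook in water'
    "braten": {"fry", "roast", "broil"},
    "backen": {"bake"},
    "sieden": {"simmer"},
    "dünsten": {"stew"},
}

def german_candidates(english_verb):
    """All German lexemes whose field region includes the English sense."""
    return sorted(g for g, senses in GERMAN_COVERS.items()
                  if english_verb in senses)

print(german_candidates("roast"))  # ['braten'] -- clarify as 'im Ofen braten'
print(german_candidates("boil"))   # ['kochen']
```

Because braten covers fry, roast, and broil, a lookup for any of the three returns the same lexeme, which is exactly why the clarifying phrases im Ofen braten and in der Pfanne braten are needed.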
PEDAGOGY. Closely related to comparative lexicology is pedagogy, especially in teaching foreign languages (see teaching
language and bilingual education). Although textbooks
frequently introduce related lexemes together, such as words for
parts of the body, spatial relations, furniture, and so on, a word
is typically translated by the nearest equivalent in the student's
native language. However, lexical items in different languages are rarely exact equivalents. A word in one language can
be translated by several in another language, depending on subtleties of meaning and on selection restrictions. A semantic field
analysis that compares and explicates partial synonyms can help
students greatly. For example, the two textbooks for advanced
English learners by B. Rudzka and colleagues (1981, 1985) offer
students sets of closely related words that are differentiated by
their definitions and collocations. One of their examples is a lexical set meaning roughly 'being insufficient, inadequate or thinly
distributed': meager, scant, sparse, frugal. Each of these words is
defined, semantically and pragmatically contrasted with others
in the set, and illustrated with relevant examples.
METAPHOR. A third application of semantic fields is to the study
of metaphor and extended word meanings. Often, when words
from one domain are extended metaphorically to another domain,
called bridge terms by Kittay (1987), other lexemes from the first are
also transferred to the second, retaining their relationship with one
another (Lehrer 1974; Lakoff and Johnson 1980; Kittay 1987). This
can be illustrated with the temperature terms hot, warm, cool, cold. In
addition to temperature (of weather, objects, sensations), they are all
used to refer to emotional states or friendship or passion or the lack
thereof. A sexually excited person can be described as hot and typically experiences a rise in body temperature. A cold or frigid person
does not experience a lowered body temperature, but the senses of
unfriendly or unresponsive derive by virtue of antonymy with hot
and warm. Finally, consider the sentence He traded his hot car
for a cold car. Hot car has at least two conventional metaphorical meanings: fast (for example, a Corvette) and stolen.
Because hot and cold are antonyms, one can figure out a plausible
meaning: he traded his fast/sports car for a conventional one (e.g., a
family sedan), or he traded his stolen car for a legally acquired one.
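The inference the reader makes here can be mimicked mechanically: given the conventional metaphorical senses of hot, the field relation of antonymy licenses candidate senses for cold. The following toy sketch invents its sense inventory purely for illustration.

```python
# Toy sketch: projecting metaphorical senses across an antonym pair.
METAPHORICAL = {"hot": ["fast", "stolen"]}   # conventional senses of 'hot car'
ANTONYM = {"hot": "cold", "cold": "hot"}
OPPOSITE_SENSE = {"fast": "conventional", "stolen": "legally acquired"}

def candidate_senses(word):
    """Senses a hearer can infer for 'word car' via the hot/cold antonymy."""
    source = ANTONYM.get(word)
    if source in METAPHORICAL:
        # 'cold' inherits the opposites of each conventional sense of 'hot'.
        return [OPPOSITE_SENSE[s] for s in METAPHORICAL[source]]
    return METAPHORICAL.get(word, [])

print(candidate_senses("cold"))  # ['conventional', 'legally acquired']
print(candidate_senses("hot"))   # ['fast', 'stolen']
```

The point is structural: cold car has no conventional metaphorical sense of its own; its candidate readings are computed from its position in the field.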
Extended metaphors are widely used in literature (see poetic
metaphor). In William Wordsworth's poem "On the Extinction
of the Venetian Republic," the history of Venice is described in
terms of the life cycle of a noble woman. Each historical period
corresponds to a stage in a woman's life: birth, childhood, maidenhood, marriage, old age, death, mourning after death.
PHILOSOPHY OF LANGUAGE. Finally, in the philosophy of language
a semantic field provides an intermediate chunk of language,
between complete atomism, in which all words are independent
of one another, and holism, in which the entire language must be
treated. Grandy (1987) argues that semantic fields provide tools
for philosophers of language to deal with word meaning, a topic
most have ignored by focusing on sentence meaning.
Adrienne Lehrer
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Cruse, D. A. 1986. Lexical Semantics. Cambridge: Cambridge University
Press. This textbook provides a good overview and discussion of lexical
relationships.
Grandy, Richard. 1987. In defense of semantic fields. In New Directions
in Semantics, ed. Ernest LePore, 259–80. London: Academic Press.
Kittay, Eva F. 1987. Metaphor: Its Cognitive Force and Linguistic Structure.
Oxford: Clarendon. This work provides an excellent theoretical discussion of semantic field theory and its application to the understanding
of metaphor.
Kövecses, Zoltán. 1986. Metaphors of Anger, Pride, and Love: Pragmatics
and Beyond. Amsterdam: John Benjamins.
Lakoff, George, and Mark Johnson. 1980. Metaphors We Live By.
Chicago: University of Chicago Press.
Lehrer, Adrienne. 1974. Semantic Fields and Lexical Structure.
Amsterdam: North Holland. This book analyzes a variety of semantic
fields in English and other languages.
———. 1983. Markedness and antonymy. Journal of Linguistics
21: 397–429.
Lyons, John. 1963. Structural Semantics: An Analysis of Part of the
Vocabulary of Plato. Oxford: Blackwell.
———. 1968. Introduction to Theoretical Linguistics. Cambridge:
Cambridge University Press.
———. 1977. Semantics. Vol. 1. Cambridge: Cambridge University Press.
The two chapters on pages 231–335 provide an excellent discussion of
theoretical issues.
Lutzeier, Peter R. 1981. Wort und Feld. Tübingen: Niemeyer. This work
provides many analyses of semantic fields in German.
———. 1993. Studien zur Wortfeldtheorie. Tübingen: Niemeyer. English
translation: Studies in Lexical Field Theory.
Murphy, M. Lynne. 2003. Semantic Relations and the Lexicon.
Cambridge: Cambridge University Press. This work concentrates on
paradigmatic relationships. Murphy's theoretical treatment of lexical
relations is different from those of Lyons and Lehrer.
Porzig, W. 1950. Das Wunder der Sprache. Bern: Francke.
Rudzka, B., J. Channell, Y. Putseys, and P. Ostyn. 1981. The Words You
Need. London: Macmillan.
———. 1985. More Words You Need. London: Macmillan.


Saussure, Ferdinand de. [1916] 1966. A Course in General Linguistics.
Trans. W. Baskin. New York: Philosophical Library.
Trier, Jost. 1932. Der Deutsche Wortschatz im Sinnbezirk des Verstandes.
Heidelberg: Winter.

SEMANTIC MEMORY
Semantic memory refers to a major division of long-term
memory that includes knowledge of facts, events, ideas, and
concepts. Thus, semantic memory covers a vast cognitive terrain, ranging from information about historical and scientific
facts, details of public events, and mathematical equations to the
information that allows us to identify objects and understand the
meaning of words.
The idea of semantic memory as a distinct form of memory has
a long history, dating back at least to the late nineteenth century.
More recently, however, the idea of semantic memory as a separate memory system began in 1972 with Endel Tulving's distinction between semantic and episodic memory. While the notion
of episodic memory has undergone considerable evolution since
that original formulation (Tulving 2002), it remains helpful to
describe the properties of semantic memory in relation to episodic memory. In current formulations, episodic memory can
be thought of as synonymous with autobiographical memory.
Episodic memory is the system that allows us to remember (consciously recollect) past experiences and, perhaps, may also be
critical for imagining and/or simulating future events. Semantic
memories, in contrast, are devoid of information about personal
experience. Thus, unlike episodic memories, semantic memories
lack information about the context of learning, including situational properties like time and place, as well as personal dimensions like how we felt at the time the event was experienced. In
relation to episodic memory, semantic memory is considered to
be both a phylogenetically and an ontogenetically older system. In
fact, rather than arising as an independent evolutionary development, episodic memory is commonly assumed to have emerged as
an add-on or embellishment to semantic memory.

Cortical Lesions and the Breakdown of Semantic Memory


Neuropsychological investigations have established that semantic memories are stored in the cerebral cortex. Many of these
studies have focused predominantly on measures designed to
probe knowledge of object concepts. An object concept refers
to the representation (i.e., information stored in memory) of
an object category (a class of objects in the external world; see
Murphy 2002 and categorization). The primary function
of a concept is to allow us to quickly draw inferences about an
object's properties. That is, identifying an object as, for example,
a hammer means that we know that this is an object used to
pound nails, so that we do not have to rediscover this property
each time the object is encountered.
Several neurological conditions can result in a relatively
global or general disorder of conceptual knowledge. These disorders are considered general in the sense that they cut across
multiple category boundaries; they are not category specific.
Many of these patients suffer from a progressive neurological
disorder of unknown etiology referred to as semantic dementia
(SD). SD is most commonly associated with extensive atrophy


of the temporal lobes, especially to the anterior and lateral
regions of temporal lobe cortex. General disorders of semantic
memory are also prominent in patients with Alzheimer's disease
(who typically have a greater episodic memory impairment than
SD patients) and can also occur following left hemisphere
stroke, prominently involving the left temporal lobe. The
defining characteristics of this disorder, initially described by
Elizabeth Warrington (1975), are deficits on measures designed
to probe knowledge of objects and their associated properties.
These deficits include impaired object naming (typically semantic errors: retrieving the name of another basic level object
from the same category, or retrieval of a superordinate category
name), impaired generation of the names of objects within a
superordinate category, and an inability to retrieve information about object properties, such as sensory-based information
(shape, color) and functional information (motor-based properties related to the object's customary use, as well as other
kinds of information not directly related to sensory or motor
properties). The impairment is not limited to stimuli presented
in a single modality like vision but, rather, extends to all tasks
probing object knowledge regardless of stimulus presentation
modality (visual, auditory, tactile) or format (words, pictures).
The semantic deficit is hierarchical in the sense that broad levels of knowledge are often preserved while specific information
is impaired. Thus, these patients can sort objects into superordinate categories, having, for example, no difficulty indicating
which are animals, which are tools, which are foods, and the like.
The difficulty is manifest as a problem distinguishing among the
basic level objects as revealed by impaired performance on measures of naming and object property knowledge.
Disorders of semantic memory can also be quite circumscribed. Perhaps the best known of these types of disorders are
the so-called category-specific disorders of semantic memory
(Warrington and Shallice 1984). Category-specific disorders
have the same functional characteristics as SD, except that the
impairment is largely limited to members of a single superordinate object category. For example, a patient with a category-specific disorder for animals will have greater difficulty naming
and retrieving information about members of this superordinate
category relative to members of other superordinate categories
(e.g., tools, furniture, flowers, etc.). Similar to patients with SD,
patients with category-specific disorders have difficulty distinguishing among basic level objects (e.g., between dog, cat, horse,
etc.), thereby suggesting a loss or degradation of information
that uniquely distinguishes among members of the superordinate category (e.g., four-legged animals).
Although a variety of different types of category-specific disorders have been reported (e.g., for fruits and vegetables), most
common have been reports of patients with relatively greater
knowledge deficits for animate entities (especially animals) than
for a variety of inanimate object categories. While less commonly
reported, other patients show the opposite dissociation: a
greater impairment for inanimate manmade objects (including
common tools) than for animals and other living things.
There is considerable variability in the location of lesions associated with category-specific disorders for animate and inanimate
entities. Nevertheless, some general tendencies can be observed.
In particular, category-specific knowledge disorders for animals

are disproportionately associated with damage to the temporal
lobes. The most common etiology is herpes simplex encephalitis, a viral condition with a predilection for attacking antero-medial and inferior or ventral temporal cortices. Category-specific
knowledge disorders for animals also have been reported following focal, ischemic lesions to the more posterior regions of the
ventral temporal cortex associated with visual object perception.
In contrast, category-specific knowledge disorders for tools and
other manmade objects most commonly occur with focal damage to lateral frontal and parietal cortices of the left hemisphere associated with object manipulation. These anatomical
findings are broadly consistent with property-based views of
knowledge organization that maintain that category-specific
disorders occur when a lesion disrupts information about a
particular property or set of properties critical for defining, and
for distinguishing among, category members. Thus, damage
to regions that store information about object form, and form-related properties like color and texture, will produce a disorder
for animals. This is because visual appearance is assumed to be a
critical property for defining animals and because the distinction
between different animals is assumed to be heavily dependent
on knowing about subtle differences in their visual forms. In a
similar fashion, damage to regions that store information about
how an object is used should produce a category-specific disorder for tools, and all other categories of objects defined by how
they are manipulated.
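The property-based account just described can be caricatured in a few lines of code. This is a toy model of our own, not any published simulation: concepts are bundles of features grouped by property store, a "lesion" removes one store, and basic-level discrimination then fails selectively for the category that depends on that store.

```python
# Toy model of property-based knowledge organization (illustrative only).
# Each concept: property type -> distinguishing features in that store.
CONCEPTS = {
    "dog":    {"form": {"four legs", "snout"},    "motion": {"runs"}},
    "cat":    {"form": {"four legs", "whiskers"}, "motion": {"runs"}},
    "hammer": {"form": {"handle"}, "use": {"pound"}},
    "saw":    {"form": {"handle"}, "use": {"cut"}},
}

def distinguishable(a, b, lesioned=()):
    """Can a and b be told apart using the surviving property stores?"""
    for prop in CONCEPTS[a].keys() & CONCEPTS[b].keys():
        if prop in lesioned:
            continue
        if CONCEPTS[a][prop] != CONCEPTS[b][prop]:
            return True
    return False

# Intact system distinguishes both pairs:
print(distinguishable("dog", "cat"))                        # True (forms differ)
print(distinguishable("hammer", "saw"))                     # True (uses differ)
# "Lesion" the form store: animals merge, tools survive via use:
print(distinguishable("dog", "cat", lesioned={"form"}))     # False
print(distinguishable("hammer", "saw", lesioned={"form"}))  # True
```

In this caricature, animals are distinguished only by visual form and tools only by manner of use, so damage to one store yields exactly the category-specific pattern described in the text.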
It is important to stress, however, that the lesions in patients
presenting with category-specific knowledge disorders are often
large and show considerable variability in their location from
one patient to another. As a result, these cases have been relatively uninformative for questions concerning the organization
of object memories in the cerebral cortex. In contrast, recent
functional neuroimaging studies of the intact human brain
have begun to shed some light on this issue.

The Organization of Conceptual Knowledge: Neuroimaging Evidence
Functional neuroimaging studies have consistently identified a
region in the left inferior frontal cortex, the left ventrolateral prefrontal cortex (VLPFC), during performance of a wide variety of
semantic memory tasks. Detailed investigations of left VLPFC
have shown that this region is critically involved in the top-down
control of semantic memory. Specifically, left VLPFC is responsible for guiding retrieval and postretrieval selection of conceptual
information stored elsewhere in the brain. This neuroimaging
finding is consistent with studies of patients with left inferior frontal lesions who have word retrieval difficulties but retain conceptual knowledge of the words they have difficulty retrieving.
In addition, a large body of neuroimaging evidence indicates that semantic memory representations, especially
representations of specific object properties, are
stored in the cerebral cortex. These studies have documented
that information about different types of object-associated properties (e.g., color, action) is represented in different anatomical
regions. Moreover, these regions directly overlap with sites that
mediate perception of these object properties. For example, asking subjects to generate the name of a color associated with an
object (e.g., saying "yellow" in response to an achromatic picture

of a pencil or its written name) yields activity in neural systems
engaged during color perception; asking subjects to generate the
name of an action associated with an object (e.g., saying "write"
in response to pencil) yields activity in neural systems engaged
during object manipulation and motion perception (Martin
et al. 1995). These and similar findings suggest that the same
neural systems are involved, at least in part, in perceiving, acting
on, and knowing about specific object properties.
Studies that contrast patterns of neural activity associated
with performing conceptual tasks with different categories of
objects have provided additional evidence for this property-based view of semantic memory organization. Many of these
studies were motivated by the neuropsychological evidence for
category-specific disorders discussed here. As a result, these
studies have concentrated on the neural systems for perceiving
and knowing about two broad domains: animate agents (living
things that move on their own) and tools (manmade objects with
a close association between their function and the motor movements connected with their use).
These studies have provided evidence for four major points
about the neural organization of object concepts. First, information about a specific object category is not represented in a single
cortical region but, rather, is represented by a network of discrete
regions that may be widely distributed throughout the brain.
Second, consistent with the property-generation studies mentioned, the informational contents of these regions are related
to specific properties associated with the object. Third, some
of these property-based regions are automatically active when
objects are identified, thus suggesting that object perception is
associated with the automatic retrieval of a limited set of associated or inferred properties that may be necessary and sufficient
to identify that object. Fourth, this object property-based information is stored in sensory and motor systems active when that
information was acquired. These properties prominently include
information about what the object looks like (stored in systems
associated with object form perception), how it moves (stored
in systems associated with object motion perception), and, for
tools, information about its use (stored in systems associated with
object grasping and manipulation). Thus, functional neuroimaging studies of object concept representation provide strong
evidence for embodied cognition accounts of knowledge representation, which claim that conceptual knowledge is grounded in
the neural systems that support perception and action.
Alex Martin
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Barsalou, Larry W. 1999. Perceptual symbol systems. Behavioral and
Brain Sciences 22: 637–60.
Forde, Emer M. E., and Glyn W. Humphreys. 2002. Category Specificity in
the Brain and Mind. Hove and New York: Psychology Press.
Martin, Alex. 2007. The representation of object concepts in the brain.
Annual Review of Psychology 58: 25–45.
Martin, Alex, and Alfonso Caramazza. 2003. The Organization of
Conceptual Knowledge in the Brain: Neuropsychological and
Neuroimaging Perspectives. Hove and New York: Psychology Press.
Martin, Alex, James V. Haxby, Francois M. Lalonde, Cheri L. Wiggs, and
Leslie G. Ungerleider. 1995. Discrete cortical regions associated with
knowledge of color and knowledge of action. Science 270: 102–5.




Murphy, Gregory L. 2002. The Big Book of Concepts. Cambridge, MA: MIT
Press.
Tulving, Endel. 1972. Episodic and semantic memory. In Organization
of Memory, 381–403. London: Academic Press.
———. 2002. Episodic memory: From mind to brain. Annual Review of
Psychology 53: 1–25.
Warrington, Elizabeth K. 1975. The selective impairment of semantic
memory. Quarterly Journal of Experimental Psychology 27: 635–57.
Warrington, Elizabeth K., and Tim Shallice. 1984. Category specific
semantic impairments. Brain 107: 829–54.

SEMANTIC PRIMITIVES (PRIMES)


At its broadest, semantic primitives refers to a set of postulated
irreducible meanings from which complex meanings are composed. (See semantics and lexical semantics.) The traditional view is that they are the simplest word meanings in
ordinary language. Antoine Arnauld expressed a common view
among seventeenth-century philosophers, including Descartes,
Leibniz, Locke, and Pascal, when he wrote:
I say it would be impossible to define every word. For in order
to define a word it is necessary to use other words designating
the idea we want to connect to the word being defined. And if we
again wished to define the words used to explain that word, we
would need still others, and so on to infinity. Consequently, we
necessarily have to stop at primitive terms which are undefined.
(Arnauld and Nicole [1662] 1996, 64)

In more recent times, similar reasoning led to the advocacy
of semantic primitives by Holger Steen Sørensen (1958), Andrzej
Bogusławski (1965, 1970), and Anna Wierzbicka (1972, 1996, and
other works), among others. As Sørensen (1958, 423) put it: "A
sign belonging to the smallest set of signs from which all the signs
of V [i.e., vocabulary] can be derived is a semantically primitive
sign." The most prolific and definitive work on semantic primitives in modern times has been carried out by Wierzbicka and
her colleagues. This is dealt with in subsequent sections, but
before that we survey some other contributions.
The early work of the Moscow School linguists included some
concrete proposals about semantic primitives. Juri D. Apresjan
([1974] 1992) insisted on the theoretical importance of a semantic language or conceptual language, with a vocabulary of elementary senses. Building on proposals by A. K. Žolkovskij, he
advanced an inventory of about 28 semantic primitives, including
I, person, want, think, thing, property, number, more, time, space,
no, set/group, all, and, or, if-then, true, act, have, be able, can, manner, and one. Since the 1970s, however, semantic primitives have
largely slipped off the agenda of the Moscow School linguists.
The term semantic (or conceptual) primitive is sometimes used
to designate postulated elementary meanings (atomic concepts,
etc.) that do not necessarily correspond to the meanings of ordinary words. Here, the most prominent tradition is that of componential analysis, in both its structuralist version (e.g., Hjelmslev,
Greimas, Pottier, Coseriu) and its generative incarnation (e.g.,
Bendix, Bierwisch, Katz, McCawley, Fillmore, Jackendoff,
Pustejovsky). Some artificial intelligence researchers (e.g., Wilks,
Schank) have also upheld a componential approach toward
semantic primitives. Among the more plausible commonly proposed primitives from work of this kind are cause, act (or do),

740

Table 1. Semantic primes – English exponents

Substantives:                      I, YOU, SOMEONE, SOMETHING/THING, PEOPLE, BODY
Relational substantives:           KIND, PART
Determiners:                       THIS, THE SAME, OTHER/ELSE
Quantifiers:                       ONE, TWO, MUCH/MANY, SOME, ALL
Evaluators:                        GOOD, BAD
Descriptors:                       BIG, SMALL
Mental predicates:                 THINK, KNOW, WANT, FEEL, SEE, HEAR
Speech:                            SAY, WORDS, TRUE
Actions, events,
movement, contact:                 DO, HAPPEN, MOVE, TOUCH
Location, existence,
possession, specification:         BE (SOMEWHERE), THERE IS, HAVE, BE (SOMEONE/SOMETHING)
Life and death:                    LIVE, DIE
Time:                              WHEN/TIME, NOW, BEFORE, AFTER, A LONG TIME, A SHORT TIME, FOR SOME TIME, MOMENT
Space:                             WHERE/PLACE, HERE, ABOVE, BELOW, NEAR, FAR, SIDE, INSIDE
Logical concepts:                  NOT, MAYBE, CAN, BECAUSE, IF
Intensifier, augmentor:            VERY, MORE
Similarity:                        LIKE/AS

Notes: Primes exist as the meanings of lexical units (not at the level of lexemes).
Exponents of primes may be words, bound morphemes, or phrasemes. They can be
formally complex. They can have different morphosyntactic properties, including word
class, in different languages. They can have combinatorial variants (allolexes). Each
prime has well-specified syntactic (combinatorial) properties.

inch (or become), not, and exist. Some took the form of binary
features, for example, +human, +contact. Often, componentialists employed plainly complex and/or arbitrary features, with no
plausible claim for primitive status. The componential tradition
did not yield a comprehensive set of semantic primitives that
could sustain any claim to be adequate for the semantic analysis
of the entire vocabulary. Most of the work was advanced on a provisional basis and in relation to small sets of data. Ray Jackendoff
(1990, 2001) is a partial exception. He has continued to elaborate
and diversify his account of abstract conceptual primitives over
some years, but without yet proposing a well-defined set.

Semantic Primes: The NSM Model


The most highly developed model of semantic primitives is the
natural semantic metalanguage (NSM) approach, inaugurated
by Wierzbicka in her 1972 book Semantic Primitives. In recent
work in this approach, the term semantic prime is now standard, and will be adopted from here on. The twin hallmarks of
the NSM approach are the convictions that i) semantic primes
and their elementary syntax exist as a subset of ordinary
natural language, and ii) the semantic primes of different languages coincide, as does their elementary syntax (see semantics, universals of). The current NSM inventory of primes
stands at 63 in number. They are listed in Table 1. Although this



table is given in English, comparable tables have been drawn
up for a range of typologically and genetically different languages, including French, Spanish, Polish, Russian, Malay, Lao,
Mandarin Chinese, Mbula (PNG), Korean, East Cree, Amharic,
and Japanese (Goddard and Wierzbicka 2002; Peeters 2006;
Goddard 2008).
Semantic primes are identified by a trial-and-error process of
reductive paraphrase. For example, how could one paraphrase
the meaning of say, in a sentence like 'Mary said something to me',
using simpler words? An expression like verbally express would
not do, because the words verbally and express are more complex and difficult to understand than say is in the first place. The
only plausible line of explication would be something like 'Mary
did something because she wanted me to know something'; but
this fails, since there are many actions a person could undertake
because of wanting someone to know something, aside from
saying. On account of its resistance to paraphrase, say is a good
candidate for the status of semantic prime. Furthermore, say is
clearly required for the explication of many other lexical items
involving speaking and communication, such as speech-act
verbs and many discourse particles, and, on available evidence,
exponents of say can be found in all languages.
A crucial part of the NSM account of semantic primes is
that they have an inherent grammar (a conceptual grammar)
that is the same in all languages, though its formal realization
(marking patterns, word order, etc.) of course differs from language to language. The syntax of semantic primes includes i)
basic combinatorics; for example, substantives can combine
with specifiers: this thing, someone else, the same place,
one part, many kinds; ii) basic and extended valencies; for
example, do can occur not only in the minimal frame someone
does something but also with valency extensions for a
patient, instrument, or comitative (do something to someone/
something, do something with something, do something
with someone) (see thematic roles); and iii) propositional
complement possibilities of primes like know, think, and
want; for example, think can take an embedded propositional
complement (think that ...) or a quasi-quotational complement (think like this: ...).

Using Semantic Primes


NSM researchers claim that the semantic primes in Table 1
provide the resources for adequately paraphrasing the entirety
of the vocabulary of any language. A large body of descriptive-analytical work exists in the framework. To give an impression of
how complex semantic explications can be composed inside the
small vocabulary of semantic primes, consider explication [A] for
the English verb to miss (someone).
[A] Someone X misses someone else Y:
(a) when this someone X thinks about someone else Y, this someone feels something bad
(b) like people can feel when they think like this about someone:
(c) I was with this someone before,
(d) when I was with this someone, I felt some good things
(e) I know that I can't be with this someone now

Many emotion concepts have a similar semantic structure
(Wierzbicka 1999). Notice that the analogy clause in line (b),
introduced by like, brings in a prototypical element: a reference to how people can feel when they think like this about
someone. The occurrence of think in its quasi-quotational
frame allows the presentation of a set of subjective components,
labeled as (c)–(e), which set out the prototypical cognitive scenario linked with the emotion verb in question.
As an example from the concrete lexicon, consider the
English causative verb break. In the general literature, it is frequently analyzed as "cause to become broken," where, needless
to say, the presence of the self-evidently complex term broken is
an embarrassment. An NSM explication is given in [B]. English
break is, of course, polysemous; the explication applies only to
the sense found in examples like to break a stick (egg, lightbulb,
vase). The explication gives a much more articulated account
of the event structure than the standard analysis. It depicts an
action by agent X with a concurrent effect on patient Y, resulting in the cessation of a prior state ("this was not one thing anymore"). It includes an aspectual specification, that the effect
on Y happened in one moment, and a subjective component
indicating that the result can be seen as irreversible. Consistent
with the somewhat schematic nature of this explication, many
languages lack any comparably broad term that would subsume
many different manners of breaking.
[B] Someone X broke thing Y:
(a) this someone X did something to thing Y
(b) because of this, something happened to this thing at the same time
(c) it happened in one moment
(d) because of this, afterwards this thing was not one thing anymore
(e) people can think about it like this: it can't be one thing anymore
As mentioned, the corpus of NSM semantic analyses is
extensive. In terms of length and complexity, the explications
presented here are toward the simpler end of the scale. Many
explications run to 20 or more lines of semantic text.
The metalanguage of semantic primes can be used not only
for lexical and grammatical semantics but also as a notation
for writing cultural scripts, that is, hypotheses about culturally
shared assumptions, norms, and expectations that help regulate interaction in different cultural settings (Goddard ed. 2006).
It can also be used to spell out the meanings conveyed by nonverbal signals, such as facial expressions, gestures, and bodily
postures.
Cliff Goddard
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Apresjan, Juri D. [1974] 1992. Lexical Semantics: User's Guide to Contemporary Russian Vocabulary. Ann Arbor, MI: Karoma. Originally published as Leksičeskaja semantika: Sinonimičeskie sredstva jazyka. Moscow: Nauka.
Apresjan, Juri D. 2000. Systematic Lexicography. Trans. Kevin Windle. Oxford: Oxford University Press.
Arnauld, Antoine, and Pierre Nicole. [1662] 1996. Logic or The Art of Thinking. Trans. Jill Vance Buroker. Cambridge: Cambridge University Press.
Boguslawski, Andrzej. 1965. Semantyczne pojęcie liczebnika. Wroclaw: Ossolineum.
Boguslawski, Andrzej. 1970. On semantic primitives and meaningfulness. In Sign, Language and Culture, ed. R. J. A. Greimas, M. R. Mayenowa, and S. Zolkiewski, 143-52. The Hague: Mouton.
Goddard, Cliff, ed. 2006. Ethnopragmatics: Understanding Discourse in Cultural Context. Berlin: Mouton de Gruyter.
Goddard, Cliff. 2008. Cross-Linguistic Semantics. Amsterdam: John Benjamins.
Goddard, Cliff, and Anna Wierzbicka, eds. 2002. Meaning and Universal Grammar: Theory and Empirical Findings. Vols. 1 and 2. Amsterdam: John Benjamins.
Jackendoff, Ray. 1990. Semantic Structures. Cambridge, MA: MIT Press.
Jackendoff, Ray. 2002. Foundations of Language. Oxford: Oxford University Press.
Peeters, Bert, ed. 2006. Semantic Primes and Universal Grammar: Evidence from the Romance Languages. Amsterdam: Benjamins.
Schank, Roger C. 1972. Conceptual dependency: A theory of natural language understanding. Cognitive Psychology 3: 552-631.
Sørensen, Holger Steen. 1958. Word-Classes in Modern English. Copenhagen: Gad.
Wierzbicka, Anna. 1972. Semantic Primitives. Frankfurt: Athenäum.
Wierzbicka, Anna. 1996. Semantics: Primes and Universals. Oxford: Oxford University Press.
Wierzbicka, Anna. 1999. Emotions Across Languages and Cultures. Cambridge: Cambridge University Press.
Wierzbicka, Anna. 2006. English: Meaning and Culture. New York: Oxford University Press.
An extensive bibliography on semantic primes is available online at: http://www.une.edu.au/bcss/linguistics/nsm.

SEMANTICS
Semantics, the study of linguistic meaning, has been traditionally
distinguished from syntax, the study of the grammatical structures of languages (see also grammaticality), and pragmatics, the study of the uses of language, though there are important
interconnections between them.
As natural languages like English, Chinese, and Arabic have an
infinite number of nonsynonymous sentences, some expressions
must be semantically primitive and others complex, given the finitude of speakers. An expression is a semantical primitive provided
"the rules which give the meaning for the sentences in which it
does not appear do not suffice to determine the meaning of the
sentences in which it does appear" (Davidson [1984] 2001, 9). A
language with such a division of terms is compositional, and this
feature of it is its compositionality. In a compositional language,
complex expressions are understood on the basis of understanding primitive expressions and rules for their combination. This
is not a matter of explicit propositional knowledge but is, rather,
exhibited in a speaker's competence in using words appropriately in response to his or her environment and what others say.
lexical semantics focuses on word meaning. Lexical
semantics is concerned with the semantical categories of words,
their structures, what individual words mean, meaning connections between words, and, more generally, what it is for words
to have meaning. Compositional semantics focuses on how we
understand complex expressions on the basis of their significant
parts. Lexical semantics contributes to compositional semantics
because a word's semantical category determines how it contributes to the complex expressions in which it appears and also
because not all words are semantical primitives: verb inflection
for tense, for example, involves a structural element.
Although we understand sentences on the basis of their contained words, there is a sense in which the sentence is the basic
unit of meaning in a language. Sentences are the main vehicle
for linguistic communication, and words have their point only
in giving sentences theirs. Sentences have their point, in turn, in
what they are used to do. It follows that speech-acts are the
basic unit of linguistic evaluation. A speech-act is an intentional
action with an illocutionary force and content. The illocutionary force of a speech-act is the dimension along which
we distinguish asserting, commanding, promising, and the like
(Searle 1979). Although speech-act theory is part of the subject
matter of pragmatics, broadly speaking, it must also be part of
the subject of semantics, since sentence meaning cannot be
understood apart from the central purposes it serves. The connection with language use is also seen in the need for rules to
handle context-sensitive elements in sentences, such as tense
and indexicals. For example, "I am hungry" expresses a different proposition depending on who uses it and at what time.
The rule attaching to "I" is that it refers to the person using it. The
rule for present tense is that it is about the time of the utterance of
the sentence. Speaker and time are two basic features of speech
contexts. These arguably suffice to determine other elements of
context relevant to interpretation. "That" refers to what the speaker
demonstrates in using it, "there" refers to the location the speaker
intends when using it, and so on.
The most basic question of semantics, "What is meaning?", is best approached by way of asking other, more specific
questions. What different aspects can be distinguished in and
between the meanings of expressions? How do we understand
complex expressions on the basis of their constituents? What is
it to understand a word? What is it to speak a language? How is
word and sentence meaning determined in the mouth of a particular speaker? How is it determined across a linguistic community? The rest of this entry considers theories that address these
questions, beginning with formal compositional semantic theories (see formal semantics).
A compositional semantic theory gives the meanings of complex expressions on the basis of their primitive constituents
and mode of combination. A reasonable requirement is that it
produce for each declarative sentence of the studied language
(the object language, L) a theorem of the form (M), where "s" is
replaced by a description of the object language sentence as
formed from its significant parts, and "p" is replaced by its translation into the metalanguage. (Something parallel is needed
for nondeclaratives if they are not reducible to declaratives; see
Boisvert and Ludwig 2006.)
(M) s means in L that p.

This presupposes a recursive syntax for the language because
rules assigning meanings to complexes operate over grammatical expressions.
Compositional semantic theories divide into extensional and
intensional theories (see also intension and extension).
The distinction has a long history. A foundational expression of
it for modern semantics is found in Gottlob Frege's distinction
between sense and reference ([1892] 1997). An intensional
theory deals with the senses of expressions. Senses were traditionally held to determine, given how the world is, expressions'
extensional properties. Frege introduced sense to account for the
cognitive significance of identity statements like "The Morning
Star is The Evening Star." Since the referent of each term is Venus,
the significance has to be accounted for by appeal to something
else. Frege assigned each expression "The Morning Star" and
"The Evening Star" a distinct mode of presentation of Venus,
which he called their senses, and then generalized the notion to
expressions in other categories. New information is conveyed
about Venus, on this view, by way of the association of distinct
modes of presentation with one object. If, generally, extensions
are determined by terms' intensions (a matter of controversy),
an intensional theory suffices for an extensional one. Intensional
theories, following Frege, typically reify expression senses and
assign them to expressions. These abstract intensional entities
are to be as finely individuated as meaning. They may be concepts (Fregean senses), properties, relations, propositions, or
similar objects (for example, functions from possible worlds to
extensions). Intensions of complex expressions are assigned on
the basis of the intensions of their parts. An extensional semantic
theory, in contrast, assigns referents to referring terms, extensions to predicates (sets of things or ordered n-tuples they are
true of, or something determinative of these), and truth values
to sentences (denoted with "True" or "False"). An extensional
semantic theory is not a full semantic theory, but a promising approach (discussed later) suggests that with appropriate
constraints, an extensional theory can achieve the aim of a full
semantic theory.
An influential formal framework in linguistics is Montague
semantics (Montague 1974). Montague semantics employs formal logic and the ontology of possible worlds (introduced to
provide a semantics for statements about necessity and possibility [Kripke 1963]; see also possible worlds semantics)
to develop formal intensional compositional theories of natural
languages. The approach originates in Frege ([1892] 1997) and
was developed in important work by Alonzo Church (1951) and
Rudolf Carnap (1947). The basic idea is that semantic compositionality is functional application: When significant expressions
are concatenated, one is a functional term and the other an argument term. Each expression has an intension and an extension.
The intension is a function from possible worlds to extensions.
This formally models expression sense as a determinant of extension relative to how the world is. The extension of an expression
is an entity of a sort appropriate for its semantic type.
To provide a rough idea of the approach, take as an example
the sentence "Every ball rolled." The extension (referent, essentially) of each word, in Montague's approach, is a function. The
sentence is formed by first concatenating "Every" with "ball,"
where "ball" is an argument for the function denoted by "Every,"
which yields as its value another function denoted by "Every ball"
(thus, it is of the form Every(ball), where Every() represents
the function and ball supplies its argument). "Every ball" is then
concatenated with "rolled," which is treated as supplying the
argument for the function denoted by "Every ball." Thus, "Every
ball rolled" is treated as having the form (Every(ball))(rolled).

The extensions of "ball" and "rolled" are functions from individuals (or for Montague, strictly speaking, individual concepts) to
truth values (the function determines a set of individuals the word
is true of, and so a predicate extension). Their intensions, in turn,
are functions from possible worlds to such functions. "Every" is
treated as referring to a second-order function. It takes the intensions of common nouns like "ball" as arguments to yield a function
that takes intensions of verbs like "rolled" as their arguments and
yields truth values. The intension of "Every," in turn, is a function from possible worlds to functions of the sort it denotes. The
extension of Every(ball), in particular, is a function that takes
the intension of a predicate F (e.g., "rolled") to True if, and only
if, F (e.g., "rolled") is true of anything that "ball" is true of (for our
example, if, and only if, every ball rolled).
The point of taking argument terms generally to supply their
intensions is that it provides a uniform treatment that handles
so-called intensional contexts, in which one cannot intersubstitute on the basis of sameness of extension. Consider an intensional transitive verb such as "want." One can want a basketball
without wanting an orange ball, even if all and only basketballs
are orange balls (which means that "a basketball" and "an orange
ball" have the same extension). "Want" then must not take extensions as arguments, for a function yields one value per argument.
However, the intensions of "a basketball" and "an orange ball"
differ, since there are possible worlds in which some basketballs
are not orange. So if "want" takes the intensions of the terms
that follow it as arguments, we can explain why we can't substitute in the context and be guaranteed to preserve truth value.
On this approach, compositionality is understood as a matter
of the meanings of complex expressions being a function of the
meanings of their syntactic constituents; that is, it requires only
that there be a mapping from the intensions of the constituents
of complex expressions (in a corresponding order) to the intensions of those complex expressions.
Two points are worth noting about this framework. First, treating intensions as functions from possible worlds to extensions
does not allow us to distinguish between the meanings of necessarily coextensive expressions, such as "triangle" and "trilateral." This might be overcome by some refinement of the notion
of intension (e.g., Thomason 1980). Second, the approach does
its work by assigning a referent to every significant expression,
whether or not it appears to be a referring term. Two questions
arise about this feature of the approach (and others that assign
intensional entities to ostensibly nonreferring expressions, e.g.,
situation semantics). The first is whether it is necessary. The
second is whether it is sufficient. We consider both questions
by looking at a more ontologically parsimonious approach proposed by Donald Davidson ([1984] 2001, Chap. 2).
Davidson's approach also exploits formal logic. The central
idea is that an axiomatic truth theory for a language that is
known to satisfy certain constraints can be used to provide interpretations of its sentences in a way that reveals how we understand complex expressions on the basis of their parts. Alfred
Tarski ([1934] 1983) showed how to define a truth predicate for a
formal language, a predicate that had in its extension all and only
the true sentences of the language. The desideratum he placed
on a successful definition (Tarski's Convention T) required
that it entail all instances of the schema (T), where "is T" is the
defined predicate, "s" is replaced by a structural description of an
object language sentence and "p" by a metalanguage sentence
that translates it. (S), for example, is an instance of (T), where
"⌒" is the symbol for concatenation (A⌒B =df the concatenation of A with B).
(T) s is T if and only if p.
(S) "La"⌒"neige"⌒"est"⌒"blanche" is T if and only if snow is white.

Davidson noted that if we had a theory that met Tarski's constraints and could identify formally the sentences of the form (T)
satisfying them, then we would have in effect a way of providing
an interpretation for every sentence of the language. For if "p"
translates "s," "means that" can replace "is T if and only if" without loss of truth, yielding a sentence of the form (M) for every
object language sentence. Moreover, if the axioms meet a similar
constraint, the proof of the theorem for a sentence would reveal
its compositional structure, that is, how its meaning-giving truth
conditions were determined by meaning-giving axioms and its
structure.
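The recursive character of a Tarski-style truth definition can be sketched as a small interpreter: base axioms give the truth conditions of atomic sentences, and recursive clauses fix the truth of complexes from the truth of their parts. The object-language sentence names and the stipulated facts below are invented for illustration:

```python
# Hedged sketch of a Tarski-style recursive truth definition for a toy
# propositional object language.  Atomic sentences get base axioms;
# clauses for "not" and "and" show how the truth of a complex sentence
# is determined by the truth of its constituents.

facts = {"la-neige-est-blanche": True,   # "snow is white"
         "l-herbe-est-rouge": False}     # "grass is red"

def is_true(sentence):
    if isinstance(sentence, str):        # atomic sentence: base axiom
        return facts[sentence]
    op, *args = sentence
    if op == "not":                      # ("not", S)
        return not is_true(args[0])
    if op == "and":                      # ("and", S1, S2)
        return is_true(args[0]) and is_true(args[1])
    raise ValueError(f"unknown connective: {op}")

print(is_true("la-neige-est-blanche"))                  # True
print(is_true(("not", ("and", "la-neige-est-blanche",
                       "l-herbe-est-rouge"))))          # True
```

The point mirrored here is that a proof of a sentence's truth conditions traverses exactly its compositional structure, one axiom per constituent.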
Tarski's approach can be modified to apply to natural languages containing context-sensitive expressions (Larson and
Segal 1995; Lepore and Ludwig 2005, 2007). It can also be
extended to nondeclarative sentences (Boisvert and Ludwig
2006). Since the approach gives a central place to a truth theory for a language, it is often called truth-theoretic semantics or
truth conditional semantics, though this label is also
applied to Montague semantics. Notably, it requires no more
ontology than is needed for an extensional semantic theory.
If this approach can be elaborated successfully, it shows that
assigning intensional entities to all expressions is not essential to
compositional semantics.
There are reasons to think that assigning intensional entities to
expressions is also not itself sufficient. A compositional meaning
theory must enable understanding of the object language, but the
utility of a theory that assigns intensional entities to expressions in
enabling understanding depends on the use of terms we already
understand to denote their meanings. This shows that reference to
the meaning entities per se is not sufficient. A simple example illustrates the difficulty. Let an italicized expression denote its meaning. Let f(x, y) be a function that takes the meaning of an adjective
in the place of x and of a noun in the place of y to the meaning
of the expression gotten from adjectival modification of the noun.
Suppose as axioms: i) the adjective "rouge" means red; ii) the noun
"boule" means ball; and iii) for any adjective A and noun N, for any
meanings x and y, if x is the meaning of A and y is the meaning of
N, then N⌒A means f(x, y). It follows that "boule rouge" means
f(red, ball). We see that f(red, ball) denotes the meaning of "red
ball," which we understand, and so come to understand "boule
rouge." However, if instead the meanings were denoted using "the
meaning of 'rouge'" and "the meaning of 'boule'" instead of "red"
and "ball" in our axioms, the theorem would still assign the right
meaning, but knowing it would not suffice to understand "boule
rouge." Thus, understanding is not conveyed (where it is) by the
reference to entities but by appropriate matching of understood
terms with terms of the object language understood to be the same
in meaning (see Lepore and Ludwig 2006).
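The three axioms of the "boule rouge" example can be written out as a small program. The representation of meanings as metalanguage strings and the body of f are stand-ins invented for illustration; the point carried over from the entry is that the theory is informative only because "red" and "ball" are terms the theorist already understands:

```python
# Hedged sketch of axioms (i)-(iii) for "boule rouge".  Meanings are
# represented here, purely for illustration, by the metalanguage terms
# that express them.

adjectives = {"rouge": "red"}   # axiom (i):  "rouge" means red
nouns = {"boule": "ball"}       # axiom (ii): "boule" means ball

def f(adj_meaning, noun_meaning):
    # Axiom (iii): f maps an adjective meaning and a noun meaning to
    # the meaning of the modified noun phrase.
    return f"{adj_meaning} {noun_meaning}"

def meaning(noun, adjective):
    """Meaning theorem for a French noun-adjective string like 'boule rouge'."""
    return f(adjectives[adjective], nouns[noun])

print(meaning("boule", "rouge"))   # red ball
```

Had the dictionaries instead mapped "rouge" to the uninformative label "the meaning of 'rouge'", the derivation would go through unchanged, which is exactly why reference to meaning entities alone does not yield understanding.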
What semantical primitives mean in a natural language is ultimately a matter of their use in a linguistic community. Further
illumination should be sought in understanding how word
use determines meaning. The two most important approaches
to this question are the intention-based and interpretational
approaches.
The first was introduced by Paul Grice (1957). Grice aimed
to analyze what it was for an utterance act to mean something,
and to mean something in particular, in terms of the intentions a
speaker had in performing it. He initially proposed that
A speaker S means something by an utterance u if and only if S
utters u intending (1) an auditor A to produce a response r, (2) A
to recognize S intends (1), and (3) A to produce r on the basis of
his fulfillment of (2).

On this approach, utterance meaning is determined by the communicative intention with which it is produced.
Grice refined the proposal further in response to counterexamples (1969, 1989). Another intention-based approach, though
detached from the context of communication, can be found in
Searle (1983, Chap. 6). These accounts aim to explain utterance
meaning first. Expression meaning is then to be understood in
terms of conventions governing expression types that aid in the
recognition of speakers' intentions.
The interpretational approach, elaborated by W. V. O. Quine
(1960) and then Davidson ([1984] 2001, Chap. 9), aims to illuminate
the family of concepts used in interpreting others, not by reduction to other concepts, but by showing how to marshal our most
basic evidence for an interpretation theory for another speaker.
The motivation is that it is conceptually central to being a speaker
that one is interpretable by another. Quine and Davidson argue
that meaning facts and related matters, such as what propositional attitudes a speaker has, must therefore be, in principle,
recoverable from intersubjectively available behavioral evidence
that presupposes no prior knowledge of what the speaker means
or thinks. While compatible with an intention-based approach,
this starts with more basic evidence. An interpreter who starts
from this evidential base is a radical interpreter (see radical
interpretation). On Davidson's approach, a central aim of the
interpreter is to confirm a truth theory for the speaker that meets
an analog of Tarski's Convention T. Davidson argues that the radical interpreter must accept the principle of charity: A speaker is
mostly right about the world and about his or her meanings and
thoughts (see charity, principle of). If we read interpretability in a strong vein, it follows that any two speakers are mutually
interpretable and mostly right about their thoughts, meanings,
and environments, guaranteeing a shared, largely true, view of the
world. It is controversial whether the evidential base of the radical
interpreter suffices for correct interpretation. However, the interpretational strategy is promising, even if we relax the restriction on
evidence, because it draws attention to constraints that fall out of
the central function of language, which must structure any fundamental understanding of linguistic meaning.
Kirk Ludwig
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Boisvert, Daniel, and Kirk Ludwig. 2006. Semantics for nondeclaratives. In The Oxford Handbook of the Philosophy of Language, ed. Ernest Lepore and B. Smith, 864-92. New York: Oxford University Press.
Carnap, Rudolf. 1947. Meaning and Necessity. Chicago: University of Chicago Press.
Church, Alonzo. 1951. A formulation of the logic of sense and denotation. In Structure, Method and Meaning: Essays in Honour of H. M. Sheffer, ed. P. Henle, H. Kallen, and S. Langer, 3-24. New York: Liberal Arts.
Davidson, Donald. [1984] 2001. Inquiries into Truth and Interpretation. 2d ed. New York: Clarendon.
Frege, Gottlob. [1892] 1997. On Sinn and Bedeutung. In The Frege Reader, ed. M. Beaney, 151-71. Oxford: Blackwell.
Grice, Paul. 1957. Meaning. Philosophical Review 66: 377-88.
Grice, Paul. 1969. Utterer's meaning and intentions. Philosophical Review 78: 147-77.
Grice, Paul. 1989. Studies in the Way of Words. Cambridge: Harvard University Press.
Kripke, Saul. 1963. Semantical considerations on modal logics. Acta Philosophica Fennica 16: 83-94.
Larson, Richard, and Gabriel Segal. 1995. Knowledge of Meaning. Cambridge, MA: MIT Press.
Lepore, Ernest, and Kirk Ludwig. 2005. Donald Davidson: Truth, Meaning, Language and Reality. New York: Oxford University Press.
Lepore, Ernest, and Kirk Ludwig. 2006. Ontology in the theory of meaning. International Journal of Philosophical Studies 14.3: 321-31.
Lepore, Ernest, and Kirk Ludwig. 2007. Donald Davidson's Truth-Theoretic Semantics. New York: Oxford University Press.
Montague, Richard. 1974. Formal Philosophy: Selected Papers of Richard Montague. Ed. R. Thomason. London: Yale University Press.
Quine, Willard Van Orman. 1960. Word and Object. Cambridge, MA: MIT Press.
Searle, John. 1979. A taxonomy of illocutionary acts. In Expression and Meaning, 1-29. Cambridge: Cambridge University Press.
Searle, John. 1983. Intentionality: An Essay in the Philosophy of Mind. Cambridge: Cambridge University Press.
Tarski, Alfred. [1934] 1983. The concept of truth in formalized languages. In Logic, Semantics, Metamathematics, 152-278. Indianapolis: Hackett.
Thomason, R. 1980. A model theory for propositional attitudes. Linguistics and Philosophy 4: 47-70.

SEMANTICS, ACQUISITION OF
Without special training or carefully sequenced input, by the
age of four or five children are effectively adults in their abilities
to produce and understand novel sentences and to judge the
truth or falsity of endlessly many statements. There are two main
accounts of this remarkable acquisition scenario, one emphasizing the contribution of innate knowledge (see innateness
and innatism) and the other emphasizing children's abilities
to extract generalizations from the input. The alternatives can
be traced back to the nature versus nurture debate about how
knowledge is acquired in any cognitive domain.

One approach views language acquisition on a par with the
acquisition of social skills, learning to count, learning to read,
and so on. This nurture approach views language development
as recruiting domain-general learning mechanisms and highlights the availability of relevant cues in the input to children.
These cues serve as the basis for the linguistic generalizations
that children form. The alternative nativist approach attributes
domain-specific knowledge to children and highlights the contributions of human nature to the emergence of linguistic knowledge. On this view, there is a considerable gap between children's
experience and the knowledge they achieve. Innate principles
that are specific to language are thought to fill this gap, enabling
children to rapidly acquire any human language despite the considerable latitude in experience for different children.
The acquisition of semantic knowledge is a good testing
ground in adjudicating between these alternative approaches
because the principles of semantic interpretation are quite complex, and the input corresponding to these principles is sparse at
best. The nurture approach, therefore, anticipates that knowledge
of semantic principles should have a protracted time course. The
nature approach anticipates that children will know these principles at an early age. We describe experimental investigations
concerning what children know about the semantics of human
languages and when they know it. Due to space limitations, our
discussion concentrates on one central set of phenomena in
semantic development: children's knowledge of the meanings of
several logical expressions (for other aspects of semantic development, see constraints in language acquisition). We
report research findings on the acquisition of the meaning of the
logical connectives for disjunction (or) and for negation (not), as
well as the interpretation of the universal quantifier every (see
quantification) and the focus operator only. It turns out
that the meaning of disjunction is a useful diagnostic in assessing children's interpretation of other logical words, and so we
begin there.

Disjunction
In classical logic, disjunction has the truth conditions associated with inclusive-or, such that a statement of the form A or
B is true if only A is true, if only B is true, and if both A and B
are true. However, the findings from several studies have been
interpreted as showing that even children 10-12 years old assign
the truth conditions associated with exclusive-or to natural language or (e.g., Braine and Rumain 1983). Although Martin Braine
and Barbara Rumain acknowledge the view that equates or with
standard logic, they ultimately reject this view on the grounds
that coherent judgments of the truth of or-statements "emerge
relatively late and are not universal in adults" (1983, 291). The
conclusion that children assign the exclusive-or (not both)
interpretation to disjunctive statements is unwarranted, however, because it rests largely on tasks that involve requests to perform actions, which encourage a "choose one" interpretation.
But an act-out methodology has limited value in determining the
full range of truth conditions that are associated with a particular linguistic expression (Crain and Thornton 1998). At most, the
act-out task provides evidence that the subject's grammar allows
one interpretation (the one that is acted out), but findings from
this task cannot be used to infer that children do not have other
interpretations available to them. Children may simply favor one
reading over others in a certain experimental context.
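The difference between the two candidate truth conditions for or can be laid out as a truth table. A minimal sketch:

```python
# Hedged sketch: truth conditions of inclusive-or (classical logic)
# versus exclusive-or.  Inclusive-or is true whenever at least one
# disjunct is true; exclusive-or additionally rules out the case in
# which both disjuncts are true.

def inclusive_or(a, b):
    return a or b

def exclusive_or(a, b):
    return a != b

for a in (True, False):
    for b in (True, False):
        print(a, b, inclusive_or(a, b), exclusive_or(a, b))
# The two connectives differ in exactly one row: when both disjuncts
# are true, inclusive-or yields True and exclusive-or yields False --
# the row probed by the truth value judgment task described below.
```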
A research methodology has been devised to distinguish
between a preference for one interpretation and the availability of alternative interpretations. It is called the truth value
judgment task (Crain and Thornton 1998). In this task, children
judge whether or not statements made by a puppet, played by
one experimenter, are accurate descriptions of what happened
in stories acted out by a second experimenter. For example, suppose that Bunny Rabbit ate a carrot and a pepper. The puppet
could then produce a variety of different statements: "Bunny
Rabbit ate a carrot and a pepper" versus "Bunny Rabbit ate a
carrot or a pepper." Notice that the statement with disjunction is
true from a logical point of view since the truth conditions associated with disjunction in logic are those of inclusive-or. In fact,
children younger than six or seven consistently judge or-statements to be true in circumstances where and-statements are also
true (e.g., Chierchia et al. 2004). By contrast, adults consistently
judge or-statements to be false in circumstances where and-statements are true. Adult judgments are influenced by a pragmatic principle of cooperation. This principle dictates that
a speaker's intended meaning should be conveyed as directly
as possible. In circumstances in which both or-statements and
and-statements are true, and-statements convey the intended
meaning more perspicuously since or-statements would be true
in other circumstances as well. Since hearers infer that a cooperative speaker would use an and-statement if he or she wanted
to convey the "both" interpretation, the use of or is taken to imply
exclusivity (not both).

Negation
One of the laws of classical logic governs negative statements
with disjunction: ¬(A ∨ B) ⊨ ¬A ∧ ¬B. According to this law, a
negative disjunctive statement, ¬(A ∨ B), generates a conjunctive
entailment, ¬A ∧ ¬B. This entailment only holds if disjunction is
interpreted as inclusive-or. In human languages, too, we can determine whether or not disjunction (e.g., English or, Japanese ka)
corresponds to inclusive-or by asking whether negative statements with disjunction license conjunctive entailments.
Consider (1).
(1) Suzi didn't see Max eat sushi or pasta.
⇒ Suzi didn't see Max eat sushi and she didn't see him eat pasta.

English-speakers judge (1) to be true in circumstances (as indicated by ⇒) in which Suzi did not see Max eat sushi and she
did not see Max eat pasta. So, in English, negative statements
with disjunction generate a conjunctive entailment, just as a
negative disjunctive statement generates a conjunctive entailment in classical logic. In (1), the clause that contains disjunction, "Max eat sushi or pasta," is embedded in the clause with
negation, "Suzi didn't see." In the syntax, the clause with
negation is higher than the clause that contains disjunction.
Semantically, the critical observation is that (1) generates a conjunctive entailment.
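The classical law behind the entailment in (1) can be checked mechanically by enumerating truth values, which also shows why the entailment requires inclusive-or. A minimal sketch:

```python
# Hedged sketch: verify that not(A or B) entails (not A) and (not B)
# by brute-force enumeration.  An entailment holds iff the conclusion
# is true in every assignment where the premise is true.
from itertools import product

def entails(premise, conclusion):
    return all(conclusion(a, b)
               for a, b in product((True, False), repeat=2)
               if premise(a, b))

premise = lambda a, b: not (a or b)            # not(A or B), inclusive-or
conclusion = lambda a, b: (not a) and (not b)  # not A and not B

print(entails(premise, conclusion))   # True: conjunctive entailment holds

# With exclusive-or under the negation, the entailment fails: not(A xor B)
# is true when A and B are both true, where the conclusion is false.
xor_premise = lambda a, b: not (a != b)
print(entails(xor_premise, conclusion))   # False
```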
Cross-linguistic research has found that conjunctive entailments are also licensed in other languages in statements in
which negation appears in a higher clause than the clause
that contains disjunction; that is, in statements of the form not
S[A or B]. Statements of this form are interpreted as excluding both the possibility of A and of B. This is clearly the case
in English, as (1) illustrates. Moreover, when such statements
are translated into Japanese, German, Russian, Hungarian,
and so forth, the corresponding statements also carry the same
conjunctive entailments. The cross-linguistic findings invite us
to conclude that disjunction has the truth conditions associated with inclusive-or in all human languages (see Crain and
Thornton 2006).

Things are more complicated than this, however, because languages differ in the interpretation of sentences in which both disjunction and negation appear in a single clause, rather than in different clauses in two-clause sentences. In one class of languages, which includes English, a simple negative statement like Max didn't eat sushi or pasta licenses a conjunctive entailment that Max didn't eat sushi and Max didn't eat pasta. This is further evidence that English disjunction is inclusive-or, as in classical logic. However, if a statement of this form is translated into Japanese, Russian, or Chinese, the corresponding sentences are not typically judged by adult speakers to generate a conjunctive entailment. This is in contrast to two-clause sentences.
Example (2) illustrates a simple negative disjunctive statement in Japanese.

(2) Butasan-wa ninjin ka piiman-wo tabe-nakat-ta.
    pig-TOP carrot or pepper-ACC eat-NEG-PAST

Adult speakers of Japanese interpret (2) to mean that the pig didn't eat the carrot or the pig didn't eat the pepper. That is, a statement of the form not A or B implies exclusivity (not both) for Japanese-speaking adults. So a paraphrase of (2) would be: It is a carrot or a pepper that the pig didn't eat.
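The difference between the English-type and the Japanese-type reading can be pictured as a difference in the scope of negation relative to disjunction. A toy Python sketch (the scenario encoding is a hypothetical illustration):

```python
# A = "the pig ate the carrot", B = "the pig ate the pepper".
neg_over_or = lambda a, b: not (a or b)        # NEG > OR: "it ate neither" (English-type)
or_over_neg = lambda a, b: (not a) or (not b)  # OR > NEG: "one of them it didn't eat" (Japanese adults)

# Scenario: the pig ate the carrot but not the pepper.
a, b = True, False
print(neg_over_or(a, b))  # False: the neither-reading is not satisfied
print(or_over_neg(a, b))  # True: the not-both reading is satisfied
```
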
Takuya Goro (2004) made an intriguing prediction that young Japanese-speaking children would initially generate a conjunctive entailment in simple negative disjunctive sentences, in contrast to adult speakers of Japanese. The prediction was confirmed by experimental investigation of Japanese-speaking children by Goro and Sachie Akiba (2004), who found that young Japanese-speaking children consistently licensed a conjunctive entailment in response to statements like (2), whereas Japanese-speaking adults did not (see also Crain, Goro, and Thornton 2006). This finding is prima facie evidence against an experience-based nurture account of language development inasmuch as Japanese-speaking children are clearly not using adult input as the basis for their interpretation of disjunction in negative sentences. Japanese-speaking children adopt an interpretation that is appropriate for English but not for Japanese.

The Universal Quantifier


Children's understanding of the universal quantifier (e.g., every in English) has been investigated since the work of Bärbel Inhelder and Jean Piaget (1964). Children are sometimes found to produce nonadult responses to sentences containing the universal quantifier. However, studies of children's interpretation of disjunction in sentences with the universal quantifier reveal a deep understanding of its semantic properties. First, let us see how disjunction is interpreted in sentences with the universal quantifier every, for adults.
Suppose that you, Max, and Suzi all board an international flight. During the flight, Max and Suzi order pasta for their meals, but you order sushi. Later, every passenger who ordered pasta, including Max and Suzi, becomes ill. But, fortunately, you feel fine. Now, suppose you overhear someone say: Everyone who ordered pasta or sushi became ill. Would you contradict this person, saying No, I ordered sushi, and I feel fine? That's what English-speaking adults would do. Moreover, if the sentence Everyone who ordered pasta or sushi became ill is translated into Japanese, Russian, or Chinese, the resulting sentence carries the same conjunctive entailment that everyone who ordered pasta became ill and everyone who ordered sushi became ill (contrary to fact). This shows us that disjunction licenses a conjunctive entailment in the subject phrase of the universal quantifier everyone who ordered pasta or sushi.
(3) a. Everyone who ordered pasta or sushi became ill.
    b. ⇒ everyone who ordered pasta became ill and everyone who ordered sushi became ill

A conjunctive entailment is not generated, however, when disjunction is in the predicate phrase of the universal quantifier every. To see this, notice that (4a) and (4b) are not contradictory, as would be the case if (4a) made the conjunctive entailment in (4c). (The asterisk indicates that the entailment is not licensed.)
(4) a. Everyone who became ill ordered pasta or sushi.
    b. everyone who became ill ordered pasta, but no one who became ill ordered sushi
    c. *⇒ everyone who became ill ordered pasta and everyone who became ill ordered sushi
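The asymmetry between disjunction in the subject phrase and in the predicate phrase can be modeled by treating every as the subset relation between the set its subject phrase denotes and the set its predicate phrase denotes. A small Python sketch using the airline scenario above (the sets are hypothetical stand-ins):

```python
# Model "every" as subset inclusion: every(R, S) is true iff R ⊆ S.
def every(restrictor, scope):
    return restrictor <= scope

# Scenario: Max and Suzi ordered pasta and became ill; "you" ordered sushi and feel fine.
pasta, sushi, ill = {"Max", "Suzi"}, {"you"}, {"Max", "Suzi"}

# (3a) "Everyone who ordered pasta or sushi became ill":
# disjunction in the subject phrase = union of the two sets.
print(every(pasta | sushi, ill))                  # False: "you" is a counterexample
print(every(pasta, ill) and every(sushi, ill))    # False: both conjuncts must hold

# (4a) "Everyone who became ill ordered pasta or sushi":
# disjunction in the predicate phrase. True here even though no ill
# passenger ordered sushi, so no conjunctive entailment arises.
print(every(ill, pasta | sushi))                  # True
```
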

We now turn to the literature on child language. Several studies have investigated the truth conditions that children associate
with disjunction in the subject phrase and in the predicate phrase
of the universal quantifier. Using a truth value judgment task, one
study asked children four and five years old to evaluate sentences
like those in (5) and (6) (Gualmini, Meroni, and Crain 2003).
(5) Every woman bought eggs or bananas.

(6) Every woman who bought eggs or bananas got a basket.

In one condition, sentences like (5) were presented to children in a context in which some of the women bought eggs but none bought bananas. The child subjects consistently accepted test sentences like (5) in this condition, showing that they assigned a disjunctive interpretation to or in the predicate phrase of the universal quantifier every. In a second condition, children were presented with sentences like (6) in a context in which women who bought eggs received a basket, but not women who bought bananas. The child subjects consistently rejected the test sentences in this condition. This finding is taken as evidence that children generated a conjunctive entailment for disjunction in the subject phrase of every. This asymmetry in children's responses in the two conditions demonstrates their knowledge of the asymmetry in the two grammatical structures associated with the universal quantifier: the subject phrase and the predicate phrase. It is difficult to see how the experience-dependent nurture approach can explain children's early mastery of the asymmetry in the interpretation of disjunction or in sentences with the universal quantifier.

Focus Expressions
The final topic is the acquisition of focus operators: English only,
Japanese dake, Chinese zhiyou. The meaning of focus operators
is quite complex. Consider the statement: Only Bunny Rabbit ate a carrot or a pepper. This statement expresses two propositions. One is called the presupposition, and the other is called the assertion (e.g., Horn 1969, 1996). The presupposition is derived simply by deleting the focus expression from the original sentence; this yields Bunny Rabbit ate a carrot or a pepper. For many speakers, there is an implicature (see conversational implicature) of exclusivity (not both) in the presupposition.
The second proposition in sentences with focus expressions is the assertion. To derive the assertion, the sentence can be further partitioned into a) a focus element and b) a contrast set. Focus expressions are typically associated with a particular linguistic expression somewhere in the sentence. This is the focus element. In the sentence Only Bunny Rabbit ate a carrot or a pepper, the focus element is Bunny Rabbit. The assertion also concerns individuals that are not mentioned in the sentence. They form a contrast set, which is part of the background information. In the present example, the contrast set consists of individuals being contrasted with Bunny Rabbit.
The assertion is about the contrast set. The assertion states that the members of the contrast set lack the property being attributed to the focus element. Returning to the example Only Bunny Rabbit ate a carrot or a pepper, the assertion makes the following claim: everyone else (being contrasted with Bunny Rabbit) did not eat a carrot or a pepper. The critical point is that the assertion contains disjunction under (local) negation, and the assertion licenses a conjunctive entailment: everyone else didn't eat a carrot and everyone else didn't eat a pepper.
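The presupposition/assertion decomposition can be sketched in Python; the domain, characters, and facts below are the hypothetical ones from the experimental scenarios described next:

```python
# Toy decomposition of "Only Bunny Rabbit ate a carrot or a pepper".
domain = {"Bunny Rabbit", "Winnie the Pooh", "Cookie Monster"}
ate = {"Bunny Rabbit": {"carrot"},   # Condition I-style facts
       "Winnie the Pooh": set(),
       "Cookie Monster": set()}

# P(x) = "x ate a carrot or a pepper" (inclusive disjunction).
P = lambda x: "carrot" in ate[x] or "pepper" in ate[x]

focus = "Bunny Rabbit"
presupposition = P(focus)                            # the focus element has property P
assertion = all(not P(x) for x in domain - {focus})  # the contrast set lacks P
print(presupposition, assertion)  # True True

# The assertion places disjunction under negation, so it entails both
# "no one else ate a carrot" and "no one else ate a pepper".
ate["Cookie Monster"].add("pepper")  # Condition II: a contrast-set member ate a pepper
print(all(not P(x) for x in domain - {focus}))  # False: the assertion now fails
```
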
The interpretation of or by English-speaking children and ka by Japanese-speaking children was used to assess their knowledge of the semantics of only/dake in a series of experiments by Goro, Utako Minai, and Stephen Crain (2006). In one experiment, 21 English-speaking children (mean age = 5.0) and 20 Japanese-speaking children (mean age = 5.4) participated.
The first aim of the study was to see if children assign disjunctive truth conditions to or in the presupposition of sentences with only/dake, in test sentences like the one illustrated in (7). To assess children's interpretation of the presupposition
in (7), the sentence was presented in a situation in which Bunny
Rabbit ate a carrot but not a pepper. The other characters in the
story, Winnie the Pooh and Cookie Monster, did not eat a carrot
or a pepper. Call this Condition I.
(7) Only Bunny Rabbit ate a carrot or a pepper.
    Usagichan-dake-ga ninjin ka piiman-wo taberu-yo.
    rabbit-only-NOM carrot or green pepper-ACC eat-DEC
    a. Presupposition: Bunny Rabbit ate a carrot or a pepper.
    b. Assertion: Everyone else (being contrasted with Bunny Rabbit) did not eat a carrot or a pepper.

The second aim was to see if children generate a conjunctive entailment in the assertion of sentences with only/dake. This was accomplished by presenting the same test sentences to children in a second condition (Condition II). Bunny Rabbit ate a carrot but not a pepper in Condition II, as in Condition I, but one member of the contrast set, Cookie Monster, ate a pepper. Because the assertion introduces (covert) negation, (7) is expected to generate a conjunctive entailment that everyone being contrasted with Bunny Rabbit did not eat a carrot and did not eat a pepper. Therefore, children should reject the test sentences in Condition II on the grounds that Cookie Monster ate a pepper.
Children responded as predicted. In Condition I, both
English-speaking and Japanese-speaking children accepted

the test sentences over 90% of the time. This finding suggests
that children assigned disjunctive (not both) truth conditions
to or in the presupposition in sentences like (7). In contrast,
the same children rejected the test sentences in Condition
II over 90% of the time. The high rejection rate by children
in Condition II suggests that they know that the disjunction
operator or creates conjunctive entailments in the assertion of
sentences with only/dake. The findings invite the inference that
children have adultlike knowledge about the semantics of only/
dake and are able to compute its complex semantic interaction
with disjunction.
To conclude, the contribution of logic to human languages
is far from transparent due to the intrusion of pragmatic factors
and due to cross-linguistic variation. Children appear to be more
logical than adults, however, because children are less sensitive to these factors, at least initially. Therefore, studies of child
language can sometimes be more revealing about the interplay
of logic and language than studies of adults. Yet there are also
examples of universal linguistic principles that reveal properties
in common between logic and human languages for both children and adults.
Stephen Crain and Rosalind Thornton
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Braine, Martin, and Barbara Rumain. 1983. Logical reasoning. In Handbook of Child Psychology. Vol. 3: Cognitive Development, ed. J. Flavell and E. Markman, 261–340. New York: Academic Press.
Chierchia, Gennaro, Maria Teresa Guasti, Andrea Gualmini, Luisa Meroni, and Stephen Crain. 2004. Semantic and pragmatic competence in children's and adults' interpretation of or. In Experimental Pragmatics, ed. I. Noveck and S. Wilson, 283–300. London: Palgrave.
Crain, Stephen, and Rosalind Thornton. 1998. Investigations in Universal Grammar: A Guide to Experiments in the Acquisition of Syntax and Semantics. Cambridge, MA: MIT Press.
———. 2006. Acquisition of syntax and semantics. In Handbook of Psycholinguistics (2d ed.), ed. M. Traxler and M. Gernsbacher, 1073–110. Oxford: Elsevier.
Crain, Stephen, Takuya Goro, and Rosalind Thornton. 2006. Language acquisition is language change. Journal of Psycholinguistic Research 35: 31–49.
Goro, Takuya. 2004. The emergence of universal grammar in the emergence of language: The acquisition of Japanese logical connectives and positive polarity. Manuscript, University of Maryland at College Park.
Goro, Takuya, and Sachie Akiba. 2004. The acquisition of disjunction and positive polarity in Japanese. In West Coast Conference on Formal Linguistics 23, ed. V. Chand, A. Kelleher, A. Rodríguez, and B. Schmeiser, 251–64. Somerville, MA: Cascadilla.
Goro, Takuya, Utako Minai, and Stephen Crain. 2006. Bringing out the logic in child language. In Proceedings of North Eastern Linguistic Society 35, ed. L. Bateman and C. Ussery, I: 245–56. Amherst, MA: GLSA Publications.
Gualmini, Andrea, Luisa Meroni, and Stephen Crain. 2003. An asymmetric universal in child language. In Proceedings of Sinn und Bedeutung 7, ed. Matthias Weisgerber, 136–48. Constance, Germany: Konstanz Linguistics Working Papers.
Horn, Laurence. 1969. A presuppositional approach to only and even. Proceedings of the Chicago Linguistic Society 5: 98–107.
———. 1996. Presupposition and implicature. In Handbook of Contemporary Semantic Theory, ed. S. Lappin, 299–319. Oxford: Blackwell.



Inhelder, Bärbel, and Jean Piaget. 1964. The Early Growth of Logic in the Child. London: Routledge and Kegan Paul.

SEMANTICS, EVOLUTION AND


In the earliest forms of communication, it is the communicative act in itself and the context in which it occurs that are most important, not the expressive form of the act. As a consequence, the pragmatics of natural language is the most fundamental from an evolutionary point of view. As communicative acts during hominid evolution become more varied and eventually conventionalized, and their contents become detached from the immediate context, the different meanings of the acts can start to be analyzed. Then semantics becomes salient. Finally, when linguistic communication becomes even more conventionalized and combinatorially richer, certain markers, also known as syntax, are used to disambiguate the contents when the context is not sufficient.
In support of the position that pragmatics is evolutionarily primary (see pragmatics, evolution and), it is clear that most human cognitive functions had been chiseled out by evolution before the advent of language. Language would not be possible without these cognitive capacities, in particular having a theory of mind and being able to represent future goals (see Gärdenfors 2003, 2004). In contrast, some researchers argue that human thinking cannot exist in its full sense without language (e.g., Dennett 1991). According to the latter view, the emergence of language is a cause of many forms of thinking, such as concept formation.
Seeing language as a cause of human thinking, however, is
like seeing money as a cause of human economics (Tomasello
1999, 94). Humans have been trading goods as long as they have
existed. But when a monetary system does emerge, it makes
economic transactions more efficient. The same applies to language: Hominids had been communicating long before they
had a language, but language makes the exchange of meanings
more effective. The analogy carries further. When money is introduced in a society, a relatively stable system of prices emerges.
Similarly, when linguistic communication develops, individuals
will come to share a relatively stable system of meanings, that
is, components in their mental spaces, which communicators
can exchange between each other. In this way, language fosters
a common structure of the mental spaces of the individuals in a
society.
In an evolutionary setting, semantics is best seen as a product of communication: meanings emerge as a result of communicative interactions. The mental space that generates the meanings for a particular individual is determined partly by the individual's interaction with the world and partly by his or her interaction with others. Semantics can thus be seen as a meeting of minds (Gärdenfors and Warglien 2006).
The evolution of semantics must be discussed in relation to the type of communication systems used. Here, the analysis is based on an evolutionary series of four types of communication: signaling, dyadic mimesis, triadic mimesis, and symbols (Zlatev, Persson, and Gärdenfors 2005), and the types of semantics they involve. This is arguably the order in which the communication systems appeared in human evolution, and it also has correspondences in the communicative development of a human infant. In contrast to the traditional dichotomy between animal communication and human language, such a progression has strong implications for where the line between human communication and that of animals should be drawn.
All known natural animal communication systems involve
signaling. The function of a signal is to draw attention to something in the environment or to the emotional state of the signaler.
The reference of the signal is something that is present (or was
recently present) in the nearby environment. Animal signals, be
they the dancing of bees or the play-face of chimpanzees, are by
and large innate rather than learned (Hauser 1996). The references of the signals are mainly fixed, although infants sometimes
use a signal nondiscriminatorily and then learn to narrow down
its reference. For example, D. Cheney and R. Seyfarth (1990)
describe how young vervets overgeneralize their alarm calls but
then learn to use them for the appropriate predators.
One can distinguish between animal signals that refer to the
outer world, such as alarm calls or food calls, and signals that
refer to the inner (emotional) state of the communicator. In
the human realm, the emotional role of communication with
small children has been highlighted by A. Fernald (1992), who
has compared the ways that mothers and infants in different
cultures communicate. The mothers use different phonemes
and different words when talking to their infants. However, the
prosody expresses the emotional content of the speech, which is,
to a large extent, independent of the words that are used. Such
emotional attention drawing seems to be the first form of meaning that is understood by an infant. It would seem that it has deep
evolutionary roots.
M. Donald (1991) puts forward the hypothesis that a form of
communication that he calls mimesis mediated between those
of the common ape ancestor and modern humans. In contrast
to signaling, mimetic communication is a bodily motion or a
vocal call that is volitional, that is, under conscious control.
Furthermore, the mimetic act is referential: It refers iconically to
some object, action, or event.
It is useful to make a distinction between dyadic and triadic mimesis (Zlatev, Persson, and Gärdenfors 2005). Apart from what is required for the dyadic form, an act of triadic mimesis also has a communicative sign function, in the sense that the sender intends the act to stand for some action, object, or event and intends that the addressee understands this. In contrast to dyadic mimesis, the triadic form presumes that the addressee understands the intentions of the sender (see communicative intention), which involves a fairly advanced form of theory of mind (Gärdenfors 2003; Tomasello et al. 2005). In the wild, one finds examples of dyadic mimesis, for example, gestures in chimpanzees (Pika and Mitani 2006), but so far there are no clear examples of triadic referential mimesis among nonhuman animals. However, language-trained apes exhibit triadic mimesis, as well as limited symbolic forms of communication.
Perhaps the best illustration of the differences among signaling, dyadic mimesis, and triadic mimesis comes from the different stages of pointing as they are found in human infants (Bates, Camaioni, and Volterra 1975; Brinck 2001, 2004). The first stage is reaching, which is a form of signaling. In the first six months, the human child coordinates his/her hand movements and reaches toward objects. Ape infants reach out to objects that attract their attention similarly to human children, but less so because they can move around by themselves already at the age of three months.
The second stage is imperative pointing. At about eight months
of age, a human infant begins to alternate its gaze between the
object that is pointed to and an addressee. The pointing has an
imperative function of attempting to make the addressee perform some desired action. Hence, imperative pointing can be
classified as a dyadic mimetic gesture.
The third stage in the development of human children's pointing, mastered around 14 months, is declarative pointing (Bates, Camaioni, and Volterra 1975; Brinck 2004). The crucial difference with respect to imperative pointing is that the child need
not desire the object pointed at, but rather that the act of achieving joint attention with the addressee on the object is the goal in
itself. Declarative pointing involves a communicative intention,
and the pointing is thus triadic mimetic.
I. Brinck (2004) proposes that at an early stage of a child's use, an additional purpose of declarative pointing is for the addressee
to evaluate the indicated object, achieved by an exchange of emotional information about it. The main benefit for the child of this
kind of exchange is that it can learn about objects vicariously. It
is possible that the evaluative side of declarative pointing was the
original cause of its proliferation among the hominids (Brinck
2001).
The fundamental role of human communication is to affect
the states of mind of others. Normally, the goal is to make the
minds of the communicators meet so that successful joint action
can arise. In contrast, animal signaling always refers to the external world. Here, minds meet when they attend to something that
is present in the environment.
Most of the content of human communication, however, concerns our inner worlds: our judgments, memories, plans, fantasies, and dreams. C. F. Hockett (1960) proposes that this
kind of displacement is one of the main factors that distinguish
symbolic language from signaling. The primary use of symbols is
to refer to objects that are not present on the scene of communication. Therefore, a semantic theory for communication by a
symbolic system that starts from reference to the external world
seems unnatural.
Another difference is that signaling and pointing concern single referents: objects, places, or actions. When inner worlds are coordinated, single referents are not sufficient, but concepts
become necessary. Words refer to types of entities: nouns to types
of objects, verbs to types of actions, and adjectives to properties
of objects and relations between them (see Grdenfors 2000,
151202). Only names refer directly to single entities. Symbols
are grounded via pointing and other means for direct reference.
However, only a small part of the meanings of words is learned
by ostension (Bloom 2000).
J. Freyd (1983) argues that the fact that knowledge is shared in a language community imposes constraints on individual cognitive representations. The structural properties of individuals' mental spaces have evolved because they provide for the most efficient sharing of concepts. She proposes that a dimensional

structure with a small number of values on each dimension will be especially sharable (cf. Gärdenfors 2000). This interplay between individual and social structures creating shared meanings is continually ongoing. The effects are magnified when communication takes place among many individuals.
The constraints of sharing concepts can be discussed in relation to the image schemas of cognitive linguistics. If the
image schema corresponding to a particular expression is markedly different for two individuals, it is likely that this will lead to
problems of communication. A desire for successful communication will, therefore, lead to a gradual alignment of the image
schemas among the members of a linguistic community. After
all, we can communicate even if we do not have identical mental
representations. For example, in communication between children and adults, children often represent their concepts using
fewer dimensions and dimensions that are different from those of
the adults (see conceptual development and change).
Peter Gärdenfors
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Bates, E., L. Camaioni, and V. Volterra. 1975. Performatives prior to speech. Merrill-Palmer Quarterly 21: 205–26.
Bloom, P. 2000. How Children Learn the Meaning of Words. Cambridge, MA: MIT Press.
Brinck, I. 2001. Attention and the evolution of intentional communication. Pragmatics and Cognition 9: 255–72.
———. 2004. The pragmatics of imperative and declarative pointing. Cognitive Science Quarterly 3: 429–46.
Cheney, D., and R. Seyfarth. 1990. How Monkeys See the World: Inside the Mind of Another Species. Chicago: University of Chicago Press.
Dennett, D. 1991. Consciousness Explained. Boston: Little, Brown.
Donald, M. 1991. Origins of the Modern Mind. Cambridge: Harvard University Press.
Fernald, A. 1992. Meaningful melodies in mothers' speech to infants. In Nonverbal Vocal Communication: Comparative and Developmental Approaches, ed. H. Papousek, U. Jürgens, and M. Papousek, 262–82. Cambridge: Cambridge University Press.
Freyd, J. 1983. Shareability: The social psychology of epistemology. Cognitive Science 7: 191–210.
Gärdenfors, P. 2000. Conceptual Spaces. Cambridge, MA: MIT Press.
———. 2003. How Homo Became Sapiens: On the Evolution of Thinking. Oxford: Oxford University Press.
———. 2004. Cooperation and the evolution of symbolic communication. In The Evolution of Communication Systems, ed. K. Oller and U. Griebel, 237–56. Cambridge, MA: MIT Press.
Gärdenfors, P., and M. Warglien. 2006. Cooperation, conceptual spaces and the evolution of semantics. In Symbol Grounding and Beyond: Proceedings of the Third International Workshop on the Emergence and Evolution of Linguistic Communication, ed. P. Vogt et al., 16–30. Berlin: Springer Verlag.
Hauser, M. 1996. The Evolution of Communication. Cambridge, MA: MIT Press.
Hockett, C. F. 1960. The origin of speech. Scientific American 203: 88–96.
Pika, S., and J. C. Mitani. 2006. Referential gestural communication in wild chimpanzees (Pan troglodytes). Current Biology 16.6: 191–2.
Tomasello, M. 1999. The Cultural Origins of Human Cognition. Cambridge: Harvard University Press.
Tomasello, M., M. Carpenter, J. Call, T. Behne, and H. Moll. 2005. Understanding and sharing intentions: The origins of cultural cognition. Behavioral and Brain Sciences 28: 675–91.

Zlatev, J., T. Persson, and P. Gärdenfors. 2005. Bodily Mimesis as the Missing Link in Human Cognitive Evolution. Lund, Sweden: Lund University Cognitive Studies 121.

SEMANTICS, NEUROBIOLOGY OF
semantics is the study of meaning, and unsurprisingly, for
different disciplines the term itself has come to mean different
things. The relation between neurobiology and semantics can
also be construed in any number of ways. Here, we refer to it
as the study of the ways in which knowledge is established and
organized in the brain.
Currently, most data in the field are derived from studies
of language. These studies can be summarized in the context
of two main goals: a) understanding the principles by which semantic knowledge (that is, information about objects and words) is organized, and b) understanding how the brain enables compositional operations over this knowledge. With
respect to the first goal, researchers have addressed the organization of conceptual knowledge in the brain and its relation
to taxonomic ontology, as well as the neural mechanisms for
operating on it. With respect to the second goal, neurobiological investigations of compositional processes address issues
such as comprehension of nominal combinations, metaphor,
and sentences. At the level of discourse, empirical research
has examined the comprehension of sentences in context, and
the relation between sentence comprehension and the evaluation of sentence meaning. Brain research has also addressed
the comprehension of meaningful nonlinguistic semiotic entities, such as graspable objects, hand gestures, environmental
sounds, pantomime, and music. Furthermore, the neural processes underlying compositional processes in language may
be, at least partially, subserved by networks that perform more
basic functions, such as the monitoring of predictability in an
input stream. Thus, investigating issues both within and outside the purely linguistic realm could lead to a better understanding of semantic processes in language and interpretive
processes in general.
In what follows, we review the research in a) the organization of conceptual knowledge, b) sentence-level and discourse-level processes, and c) research in nonlinguistic domains. Our
goal is to outline a general framework for the neurophysiology
of semantics.

Organization of Semantic Knowledge


From the earliest patient accounts outlined in the seminal work
of Paul Broca and Karl Wernicke, a central finding in the study
of neural bases of language and semantic organization was that
damage to specific brain regions differentially impairs semantic competence. This was initially demonstrated in the domain of language. Damage to the posterior two-thirds of the inferior frontal gyrus (often referred to as broca's area) was initially associated with production deficits (Broca's aphasia), whereas damage to the posterior superior temporal gyrus and adjacent regions (wernicke's area) was found to be associated with reduced comprehension and semantic deficits, such as production of less meaningful speech (Wernicke's aphasia; also see left hemisphere language processing).

Later research revealed that Wernicke's area was not a "concept center" (i.e., a repository of semantic information) but had a more complex role. Indeed, current research suggests that
no single cortical area serves as a centralized store of semantic
knowledge. This distributed nature of semantic memory has
been revealed by two scientific methods: lesion studies and
neuroimaging. Until the advent of neuroimaging, the neural basis of semantic knowledge was evaluated by establishing
associations between lesions to specific brain regions and the
concomitant behavioral deficits. This lesion deficit model, developed in the late nineteenth and early twentieth centuries, was
advanced significantly by a series of pivotal studies by Elizabeth
Warrington and her colleagues showing that multifocal lesions
(e.g., as caused by herpes simplex encephalitis) and focal lesions
(e.g., stroke, tumors) could result in larger deficits for some
semantic categories than others (for reviews, see Forde and
Humphreys 2002). The literature reports (relative) category-specific deficits along various axes: for example, abstract versus concrete, living versus nonliving, or nouns versus verbs (cf. Gainotti
2000, for meta-analysis). The existence of such categorical deficits from focal lesions, albeit of a graded nature, has led to multiple interpretations and is the focus of ongoing imaging studies.
These neuroimaging studies have shown that accessing semantic
information for such categories as tools, faces, or body parts is
associated with peaks of activity in particular regions that differ
across categories (e.g., Yovel and Kanwisher 2004). Attesting to
the conceptual essence of this knowledge, certain regions are
active whether the items for judgment are presented as words or
pictures (cf. Martin 2001). However, it is important to note that
many of these studies were designed to evaluate whether there
are neural regions that are relatively more sensitive to one category than another; the extent of common activity is typically of
less interest, though it is of major theoretical importance. Indeed,
a number of studies have shown that the distributed population
codes in less active regions also demonstrate sensitivity to categories (e.g., Haxby et al. 2001).
Not only is knowledge about different categories distributed,
but accessing different features of the same item can also lead
to activity in different cortical areas. For instance, naming a referent is associated with neural activity patterns in cortical areas
different from those implicated in access to other types of conceptual knowledge. Although deficits in naming are more likely
to be associated with lesions to the left hemisphere than to the
right, deficits in recognition, for example, the ability to say meaningful things about an object, are associated with more evenly
distributed lesions (H. Damasio et al. 2004; Gainotti 2000).
Thus, in the brain, there is a distinction between knowing object names (a unique feature of a referent) and more general
sorts of knowledge. Feature knowledge may also be distributed
among different regions, in particular, those associated with
perception and motor function. In support of this possibility, it
has been shown that verifying whether a certain color or a certain action holds for an object activates different brain regions
(Martin 2007).
Are these perceptual- or motor-related units of information
independent, or are they linked via a more abstract, modality-independent form of semantic knowledge? A recent case study
(Coccia et al. 2004) argues for the latter. It reported a gradual decline in the semantic abilities of two individuals diagnosed with semantic dementia. The decline was observed for object use,
object naming, and the ability to give definitions for objects.
These findings are inconsistent with models positing independently accessible, modality-specific knowledge organization.
The study also showed that interacting with an object improved
naming performance, which implies that use representations (kinesthetic information) can interact with name representations, supporting the notion of a central system.
The principles underlying the organization of conceptual
knowledge in the brain are the topic of ongoing investigation.
On one approach, knowledge is organized according to distinct
domains due to evolutionary constraints (e.g., living vs. nonliving things; Caramazza and Shelton 1998). A second view is that
categories differ on the extent to which their representation
relies on visual versus motor memories or sensory versus functional memories (Farah and McClelland 1991). This account
could explain why lesion-based deficits sometimes collapse
across, rather than partition between, taxonomically related
categories. For instance, lesions may result in a combined deficit for food, musical instruments, and living things (all of which
depend on visual information) or in a deficit for both man-made
artifacts and body parts (left frontal or parietal regions; perhaps related to motor knowledge; cf. Gainotti 2000). This view
is also supported by the finding that lesions in premotor/prefrontal regions are more associated with impairments to naming actions than with impairments to naming concrete entities
using nouns (A. R. Damasio and Tranel 1993). A third approach
(A. R. Damasio and Damasio 1994) is that knowledge of nonunique objects (those easily recognizable independent of context or awareness of distinct features) is coded in
more posterior regions, whereas knowledge of unique objects is
coded in more anterior temporal regions. This account is supported by findings showing that naming objects at the basic level
(e.g., dog) versus domain level (e.g., a living thing) is also associated with increased activity in anterior temporal regions (Tyler
et al. 2004). How such semantic memory is accessed and manipulated is an important independent issue that extends beyond
the current discussion, but has been addressed by a number of
recent reviews (e.g., Badre et al. 2005).

Single Sentence and Discourse Comprehension


Several studies have examined the neural processes underlying sentence- and discourse-level compositional processes.
In examining compositional processes at the single-sentence
level, researchers have examined the underpinnings of both
literal and metaphorical utterances (see metaphor, neural
substrates of). Dissociating the neural correlates of semantic processes from those associated with syntactic processes or
working memory is not trivial. A recent study identified temporal regions sensitive to the meaningfulness of sentences, but not
their syntactic structure (Humphries et al. 2006; see temporal
lobe). The left inferior frontal gyrus (IFG) may also be involved
in such processes, as it demonstrates more activity to sentences
with semantic violations (e.g., "Dutch trains are sour"; Hagoort et
al. 2004). However, studies utilizing this violation paradigm
have also reported other frontal regions (see Van Petten and
Luka 2006, for review).

While frontal and temporal regions are essential to semantic
processing during sentence comprehension, recent data suggest that modality-specific brain regions also play an important
role. M. Tettamanti and colleagues (2005) found that sentences
describing actions performed by the human body evoke greater
activity in posterior left IFG than do abstract sentences, suggesting that this region codes for action on a level that is sufficiently
abstract to be accessible to language. This finding is notable
because the homologue of this region in the monkey contains
mirror neurons, which fire during both observation and execution of certain goal-directed actions (see mirror systems). The
researchers also show that sentences referring to mouth, hand,
and leg actions evoked neural activity in premotor regions associated with the performance of the described actions (cf. Aziz-Zadeh et al. 2006 for similar results; cf. Buccino et al. 2005 and Hauk, Johnsrude, and Pulvermüller 2004 for related findings).
Language comprehension, at least in the action domain, may
therefore be supported by motor systems used to perform the
actions referenced in those sentences. These findings are consistent with theoretical approaches in the embodied cognition
framework (see cognitive linguistics; embodiment).
The interpretation of sentences relies on an integration of
world knowledge with the proposition conveyed by the sentence.
A central issue is whether at any stage of comprehension there is
a level of semantic interpretation that corresponds to the state of
affairs denoted by a sentence, but that is relatively encapsulated
from world knowledge or belief. P. Hagoort and colleagues (2004)
showed that false statements (in virtue of world knowledge) and
nonsensical statements (in virtue of lexical knowledge) are recognized as anomalous equally fast. Specifically, an electrophysiological signature associated with the ease of integrating a word
into context (the N400) was almost identical for the word "white" in the false statement "Dutch trains are white" as it was for the word "sour" in "Dutch trains are sour." This suggests that lexical
and world knowledge are integrated during the same time frame.
Functional magnetic resonance imaging (fMRI) showed that both
violations were associated with increased activity in anterior left
IFG. Related research (Berkum, Hagoort, and Brown 1999) has
shown that the recent discourse context similarly constrains the
integration of words during sentence comprehension. Thus, the
construction of sentence representation seems to immediately
incorporate world knowledge and prior discourse context.
The integration of meta-logical information with propositional
content has not been extensively researched. One exception is a
study that investigated the neural mechanisms underlying successful encoding of propositions said to be either true or false
(Mitchell, Dodson, and Schacter 2005). This study began from
behavioral findings showing that false sentences are more likely to
be (mis)recalled as true than vice versa, which suggests that their
encoding requires additional cognitive effort (cf. Gilbert 1991).
In an fMRI study, J. P. Mitchell, C. S. Dodson, and D. L. Schacter
found that accurate recall of falsity (i.e., correctly remembering
that a statement was false) was associated with increased activity
in IFG (more on the left) and left medial temporal regions during
encoding (see meaning and belief).
Several neuroimaging studies have examined processes of
meaning integration at the discourse level, but this domain of
research is in its initial stages. Discourse comprehension at the multiple-sentence level demands integration of ongoing discourse with prior context and world knowledge. An early study
(St. George et al. 1999) implicated the right hemisphere in such
processes, demonstrating that middle temporal regions in this
hemisphere showed reduced activity when a confusing text was
clarified by an informative title (interestingly, the left hemisphere
homologue showed increased activity). However, another study
employing a conceptually similar manipulation, in which texts were clarified by pictures, did not replicate this result, finding
increased activity in medial brain regions (Maguire, Frith, and
Morris 1999). Other studies show that the integration of consecutive discourse ideas is associated with activity in the anterior
temporal pole (bilaterally), IFG (bilaterally), and perhaps also
the dorso-medial prefrontal cortex. These regions differentiate
between narratives and unlinked sentences and also differentiate between sentences that differ in their consistency with prior
context (Ferstl, Rinck, and von Cramon 2005; Xu et al. 2005;
see coherence, discourse). Transitions between narrative
events have also been associated with activity in the right temporal regions (Speer, Zacks, and Reynolds 2007).
Two studies (Kuperberg et al. 2006; Mason and Just 2004)
have attempted to identify cortical regions mediating the integration of world knowledge by examining neural activity during
comprehension of causal statements. Both reported that comprehension of statements related by a causal relation of intermediate strength was associated with greater neural activity than
comprehension of statements in which the causal relation was
very weak or very strong. These findings are consistent with the
notion that scenarios with intermediate causal strength demand
particular enrichment with world knowledge. However, the
regions identified by these studies did not overlap. The picture
emerging from the study of discourse-level relations is that these
rely additionally on brain regions not typically involved in the
comprehension of single, context-independent sentences (Jung-Beeman 2005).

Neurobiology of Semantics: Specialized for Natural Languages?
Does the processing of words and sentences rely on brain networks specialized for processing information communicated
via language? Or is it the case that those networks also subserve
comprehension of other types of symbolic meanings, such as
meaningful environmental sounds (e.g., the sound of a drill),
gestured emblems (an OK sign), pantomimes (e.g., showing
how to use scissors), or even mathematical statements? The
emerging picture from imaging and lesion studies is that there
is good evidence to support a degree of modality-independent
semantic representation in the brain.
Deficits in language comprehension may be accompanied
by deficits in the comprehension or production of nonverbal,
symbolic, or pantomime gestures (see gesture ). For instance,
brain lesions can spare the ability to imitate pantomime gestures while impairing the ability to understand their meaning (Rothi, Mack, and
Heilman 1986). Similarly, aphasia can impair both verbal and
nonverbal semantic competence. A. P. Saygin and colleagues
(2003) reported that individuals with aphasia showed a strong
correlation between the ability to match environmental
sounds to pictures and the ability to match verbal phrases to pictures. Both deficits were associated with damage to posterior regions in the left superior temporal gyrus (STG) and the
inferior parietal lobule. A. Cummings and colleagues (2006)
used a method with very high temporal resolution (EEG/ERP)
to assess whether matching pictures to words or sounds is
associated with similar time lines of neural processing. When
either words or sounds mismatched a picture, they found a
typical index of mismatch processing (an N400 ERP component). Attesting to the high similarity between meaningful
words and meaningful sounds, there was no difference in the
onset time of this component, and if anything, it peaked earlier
for mismatching sounds than for mismatching words. These
results are consistent with other findings (Saygin et al. 2003)
showing that environmental sounds are recognized faster than
the corresponding word referents, which suggests that they are
not recoded as lexical items. As a whole, these studies demonstrate that similar networks process meaningful auditory
information, whether expressed verbally or not. Knowledge
of environmental sounds may also be organized in a distributed manner. Certain regions are particularly sensitive to such
categories as animal sounds or sounds associated with actions
performed by body parts (Gazzola, Aziz-Zadeh, and Keysers
2006; Kraut et al. 2006).
Compositional processes underlying language comprehension may also be instantiations of more basic processes that are
not specific to language. C. Humphries and colleagues (2001)
demonstrated that both meaningful sentences and meaningful
environmental sound sequences were associated with neural
activity in bilateral temporal regions associated with semantic
processing, including posterior aspects of the superior temporal gyrus and sulcus. Sentence comprehension was associated
with increased activity in several temporal regions. However,
as discussed, the increased activity for sentences as compared
to environmental sounds could be due to easier semantic access
afforded by environmental sounds. Studies of the neurobiology of music processing similarly suggest some overlap with a
semantic system subserving language comprehension. Listening
to structured musical pieces as opposed to scrambled versions
results in increased activity in a region of IFG known to play
an important role in the integration of information over time
(Levitin and Menon 2003). And music, like spoken language, can
activate broad semantic domains (Koelsch 2005). More basic
functions can also potentially explain activity in certain regions
during the processing of more and less surprising linguistic information: It has
been shown that decreased predictability of events in a nonverbal input stream is associated with greater activity in Wernicke's
area and its right homologue, as well as posterior IFG bilaterally
(Bischoff-Grethe et al. 2000). Thus, the neural regions subserving
semantic processes in language may have a more general role in
interpreting meaningful communication streams.
Uri Hasson and Steven L. Small
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Aziz-Zadeh, L., S. M. Wilson, G. Rizzolatti, and M. Iacoboni. 2006.
Congruent embodied representations for visually presented
actions and linguistic phrases describing actions. Current Biology
16.18: 1818–23.

Badre, D., R. A. Poldrack, E. J. Paré-Blagoev, R. Z. Insler, and A. D. Wagner. 2005. Dissociable controlled retrieval and generalized selection mechanisms in ventrolateral prefrontal cortex. Neuron 47.6: 907–18.
Berkum, J. J. A., P. Hagoort, and C. M. Brown. 1999. Semantic integration in sentences and discourse: Evidence from the N400. Journal of
Cognitive Neuroscience 11.6: 657–71.
Bischoff-Grethe, A., S. M. Proper, H. Mao, K. A. Daniels, and G. S. Berns.
2000. Conscious and unconscious processing of nonverbal predictability in Wernicke's area. Journal of Neuroscience 20.5: 1975–81.
Buccino, G., L. Riggio, G. Melli, F. Binkofski, V. Gallese, and G. Rizzolatti.
2005. Listening to action-related sentences modulates the activity
of the motor system: a combined TMS and behavioral study. Brain
Research. Cognitive Brain Research 24.3: 355–63.
Caramazza, A., and J. R. Shelton. 1998. Domain-specific knowledge
systems in the brain: The animate-inanimate distinction. Journal of
Cognitive Neuroscience 10.1: 1–34.
Coccia, M., M. Bartolini, S. Luzzi, L. Provinciali, and M. A. Lambon Ralph.
2004. Semantic memory is an amodal, dynamic system: Evidence
from the interaction of naming and object use in semantic dementia.
Cognitive Neuropsychology 21.5: 513–27.
Cummings, A., R. Ceponiene, A. Koyama, A. P. Saygin, J. Townsend,
and F. Dick. 2006. Auditory semantic networks for words and natural
sounds. Brain Research 1115: 92–107.
Damasio, A. R., and H. Damasio. 1994. Cortical systems for retrieval
of concrete knowledge: The convergence zone framework. In Large-Scale Neuronal Theories of the Brain, ed. C. Koch, 61–74. Cambridge,
MA: MIT Press.
Damasio, A. R., and D. Tranel. 1993. Nouns and verbs are retrieved with
differently distributed neural systems. Proceedings of the National
Academy of Sciences 90.11: 4957–60.
Damasio, H., D. Tranel, T. J. Grabowski, R. Adolphs, and A. Damasio.
2004. Neural systems behind word and concept retrieval. Cognition
92.1/2: 179–229.
Farah, M. J., and J. L. McClelland. 1991. A computational model of
semantic memory impairment: Modality specificity and emergent
category specificity. Journal of Experimental Psychology: General
120.4: 339–57.
Ferstl, E. C., M. Rinck, and D. Y. von Cramon. 2005. Emotional and temporal aspects of situation model processing during text comprehension: An event-related study. Journal of Cognitive Neuroscience 17.5: 724–39.
Forde, E. M. E., and G. W. Humphreys. 2002. Category Specificity in Brain
and Mind. New York: Psychology Press.
Gainotti, G. 2000. What the locus of brain lesion tells us about the nature
of the cognitive defect underlying category-specific disorders: A
review. Cortex 36.4: 539–59.
Gazzola, V., L. Aziz-Zadeh, and C. Keysers. 2006. Empathy and the
somatotopic auditory mirror system in humans. Current Biology
16: 1824–9.
Gilbert, D. T. 1991. How mental systems believe. American Psychologist 46.2: 107–19.
Hagoort, P., L. Hald, M. Bastiaansen, and K. M. Petersson. 2004.
Integration of word meaning and world knowledge in language comprehension. Science 304.5669: 438–41.
Hauk, O., I. Johnsrude, and F. Pulvermüller. 2004. Somatotopic representation of action words in human motor and premotor cortex. Neuron 41.2: 301–7.
Haxby, J. V., M. I. Gobbini, M. L. Furey, A. Ishai, J. L. Schouten, and P.
Pietrini. 2001. Distributed and overlapping representations of faces
and objects in ventral temporal cortex. Science 293.5539: 2425–30.
Humphries, C., J. R. Binder, D. A. Medler, and E. Liebenthal. 2006. Syntactic
and semantic modulation of neural activity during auditory sentence
comprehension. Journal of Cognitive Neuroscience 18.4: 665–79.

Humphries, C., K. Willard, B. Buchsbaum, and G. Hickok. 2001. Role
of anterior temporal cortex in auditory sentence comprehension: An
fMRI study. NeuroReport 12.8: 1749–52.
Jung-Beeman, M. 2005. Bilateral brain processes for comprehending
natural language. Trends in Cognitive Sciences 9: 512–18.
Koelsch, S. 2005. Neural substrates of processing syntax and semantics
in music. Current Opinion in Neurobiology 15: 207–12.
Kraut, M. A., J. A. Pitcock, V. Calhoun, J. Li, T. Freeman, and J. Hart, Jr.
2006. Neuroanatomic organization of sound memory in humans.
Journal of Cognitive Neuroscience 18.11: 1877–88.
Kuperberg, G. R., B. M. Lakshmanan, D. N. Caplan, and P. J. Holcomb.
2006. Making sense of discourse: An fMRI study of causal inferencing
across sentences. NeuroImage 33: 343–61.
Levitin, D. J., and V. Menon. 2003. Musical structure is processed in
language areas of the brain: A possible role for Brodmann area 47 in
temporal coherence. NeuroImage 20.4: 2142–52.
Maguire, E. A., C. D. Frith, and R. G. M. Morris. 1999. The functional neuroanatomy of comprehension and memory: The importance of prior
knowledge. Brain 122: 1839–50.
Martin, A. 2001. Functional neuroimaging of semantic memory. In
Handbook of Functional Neuroimaging of Cognition, ed. R. Cabeza and
A. Kingstone, 153–86. Cambridge, MA: MIT Press.
———. 2007. The representation of object concepts in the brain. Annual Review of Psychology 58: 25–45.
Mason, R. A., and M. A. Just. 2004. How the brain processes causal inferences in text: A theoretical account of generation and integration component processes utilizing both cerebral hemispheres. Psychological
Science 15.1: 1–7.
Mitchell, J. P., C. S. Dodson, and D. L. Schacter. 2005. fMRI evidence for
the role of recollection in suppressing misattribution errors: The illusory truth effect. Journal of Cognitive Neuroscience 17.5: 800–10.
Rothi, L. J., L. Mack, and K. M. Heilman. 1986. Pantomime agnosia.
Journal of Neurology, Neurosurgery and Psychiatry 49.4: 451–4.
Saygin, A. P., F. Dick, S. M. Wilson, N. F. Dronkers, and E. Bates. 2003.
Neural resources for processing language and environmental
sounds: Evidence from aphasia. Brain 126.4: 928–45.
Speer, N. K., J. M. Zacks, and J. R. Reynolds. 2007. Human brain activity time-locked to narrative event boundaries. Psychological Science
18.5: 449–55.
St. George, M., M. Kutas, A. Martinez, and M. I. Sereno. 1999. Semantic
integration in reading: Engagement of the right hemisphere during
discourse processing. Brain 122: 1317–25.
Tettamanti, M., G. Buccino, M. C. Saccuman, V. Gallese, M. Danna, P.
Scifo, et al. 2005. Listening to action-related sentences activates
fronto-parietal motor circuits. Journal of Cognitive Neuroscience
17.2: 273–81.
Tyler, L. K., E. A. Stamatakis, P. Bright, K. Acres, S. Abdallah, J. M. Rodd, et
al. 2004. Processing objects at different levels of specificity. Journal of
Cognitive Neuroscience 16.3: 351–62.
Van Petten, C., and B. J. Luka. 2006. Neural localization of semantic context effects in electromagnetic and hemodynamic studies. Brain and
Language 97.3: 279–93.
Xu, J., S. Kemeny, G. Park, C. Frattali, and A. Braun. 2005. Language in
context: Emergent features of word, sentence, and narrative comprehension. NeuroImage 25.3: 1002–15.
Yovel, G., and N. Kanwisher. 2004. Face perception: Domain specific,
not process specific. Neuron 44.5: 889–98.

SEMANTICS, UNIVERSALS OF
A semantic universal is any aspect of meaning that is somehow represented in all languages. To first characterize the two terms of this
subject separately, semantics pertains to conceptual material as it is organized by language. It ranges from single linguistically represented concepts to principles of conceptual organization.
Universality has properties varying along three parameters.
The level of a universal is absolute (Joseph Greenberg's
term, e.g., 1963) if it is realized in every individual language (see
absolute and statistical universals). But its level is here
termed abstractive if the universal is part of the language faculty
but not of every language. This second abstractive level can in
turn be realized in the three following ways, each exemplified
later in this entry. An abstractive universal is inventorial if each
language draws its own selection of elements of a certain type
from a relatively closed inventory of such elements. An abstractive universal is typological if languages fall into a typology
on the basis of manifesting one or another out of a relatively
small set of (combinations of) certain elements in an otherwise
common system. An abstractive universal is implicational
(Greenberg's term) if one linguistic feature can appear in a language only if another feature, itself abstractive, also appears there (see implicational universals).
Another parameter of universality is its weighting. At the
extremes of weighting, a universal is either positive or negative.
At such extremes, an absolute universal feature is either present
in or absent from all languages, while an abstractive universal
feature is either part or not part of the language faculty. But while
some features pertaining to an abstractive universal may appear
in all or no languages, other features can range from appearing in most down to only a few languages, behavior that must also be
regarded as part of the language faculty. Traditionally, such nonextremes have been termed tendencies. But to promote the idea
of a gradient hierarchy from positive down to negative, and to
highlight the fact that such tendencies are themselves properties
of the language faculty, such nonextremes will here be termed
positively tending abstractive universals or negatively tending
abstractive universals, with whatever further modifiers it might
take to indicate the degree.
A final parameter of universality is its subject: the particular
linguistic phenomenon that manifests the universal property.
For a semantic universal, this can range from a specific concept,
through a conceptual category or set of such categories, to a conceptual system and can extend as well to principles of conceptual
organization.
All the forms of universality in this framework presumably
have a cognitive basis, and this is addressed where feasible in the
following.
The framework can be illustrated by much within the present author's work. A positive absolute universal is that the morphemes of every language are divided into two subsystems,
the open class, or lexical, and the closed class, or grammatical
(see Talmy 2000a, Chap. 1). Open classes have many members
and can readily add more. They commonly include (the roots
of) nouns, verbs, and adjectives. Closed classes have relatively
few members and are difficult to augment. They include bound
forms (inflections, derivations, and clitics) and such free forms
as prepositions, conjunctions, and determiners. Closed classes
can also be implicit, as with word order patterns, lexical categories, and grammatical relations.
A semantic difference correlates with this formal difference.
The meanings that open-class forms can express are almost unrestricted, whereas those of closed-class forms are highly constrained. This constraint first applies to the conceptual categories
to which they can refer. For example, many languages around
the world have closed-class forms in construction with a noun
that indicate the number of the noun's referent, but no language
has closed-class forms indicating its color. A positive abstractive
universal, accordingly, is that the grammatical morphemes of a
language can represent an approximately closed set of conceptual categories, such as those for number, gender, tense, aspect,
causality, and status. Excluded are color and indefinitely many more, such as food and religion (the corresponding negative absolute universal).
A further semantic constraint on grammatical forms is that,
even within represented conceptual categories, only certain
member concepts can be grammatically expressed and not
others. Thus, a positive abstractive universal is that, within the
category of number, a bound closed-class form can represent
such concepts as singular, dual, plural, and paucal, while a free
closed-class form can also represent such concepts as no, some,
many, and all. But the corresponding negative absolute universal is that no language's closed-class forms represent concepts
such as even, odd, dozen, countable, or any other concept pertaining to number.
Open-class morphemes are not subject to these same semantic constraints on categories or member concepts. This is shown
by the existence of such morphemes as food, even, and odd.
Even open-class morphemes exhibit a few semantic constraints, however. Thus, on the one hand, the meaning of a morphemic verb can incorporate particulars of aspect that can, in
turn, interact with external closed-class aspectual forms. On the
other hand, it is a negative absolute universal that the meaning of
no morphemic verb incorporates a tense that can, in turn, interact with external closed-class tense forms. If not for this exclusion, we might expect to find a verb like (to) went that could be
used in a construction like I am wenting to mean I was going,
or in a construction like I will went to mean I will have gone.
Comparably, proper nouns like Manhattan or Shakespeare
exist that refer to a specific bounded portion of the space-time
continuum. But a negative absolute universal is that there are no
proper verbs or proper adjectives with the same property.
That is, both can be type specific, but they are token neutral. Thus, there is never a verb like (to) Deluge referring uniquely to the so-conceived spatio-temporally bounded event of the biblical flood, as in some sentence like "After it Deluged, Noah landed the ark."
Due to the semantic constraints on the closed-class subsystem, the total set of conceptual categories and their respective sets of member concepts that can ever be represented by
closed-class forms constitutes an approximately closed inventory. This inventory is universally available. No language has
closed-class forms representing all of the conceptual categories
and member concepts in the inventory. Rather, each language
draws in a unique pattern from the inventory for its particular set
of grammatically expressed meanings. Accordingly, this inventory is a positive abstractive universal of the language faculty,
not an absolute universal overtly manifested in all languages.
In turn, the pattern in which individual languages select their
particular sets of grammatically expressed conceptual categories and member concepts from the inventory is governed by a principle of semantic representativeness. No language draws all of its grammatically expressed concepts from one category, say,
from aspect alone, but rather draws them from across the range
of available categories. The specifics of this principle are not yet
clear, but its realization in all languages makes it a positive absolute universal.
Leonard Talmy (2006) observes that closed-class forms representing spatial schemas cross-linguistically draw their conceptual categories and member concepts from only a portion of the
general inventory, hence, from what can be considered a spatially relevant subinventory. For example, out of all the member
concepts within the number category earlier cited as available
to languages for various grammatical specifications, only four
ever play a role in closed-class spatial schemas. These are one,
two, several, and many. Accordingly, the spatially relevant subinventory is also a positive abstractive universal, and is embedded within the general inventory, itself having the same type of
universality.
Although the main closed-class inventory as a whole is
abstractive, that is, with its components merely available for
inclusion in individual languages, some components of it might
well be represented grammatically in all languages, hence may
constitute a positive absolute universal. One candidate might
be the concept negative along with the conceptual category
of polarity to which it belongs. Between positive absolute universals like these and such negative absolute universals as the
exclusion of color from grammar, the components of the inventory lie along a gradient hierarchy. Thus, the category of number
may be a positively tending abstractive universal, represented
in many but perhaps not all languages. And the category of rate
with the member concepts fast and slow is a negatively tending
abstractive universal, represented in only a few languages.
The next issue is what determines the conceptual categories and member concepts included in the inventory, as against
those excluded from it. No single global principle is evident, but
several semantic constraints with broad scope have been found.
One of these, the topology principle, applies to the meanings
or schemas of closed-class forms referring to space, time, or
certain other domains. This principle excludes Euclidean properties such as absolutes of distance, size, shape, or angle from
such schemas, and thus constitutes a negative absolute universal. Instead, these schemas exhibit such topological properties as
magnitude neutrality, shape neutrality, and bulk neutrality, which constitute the corresponding positive absolute universal. To illustrate magnitude neutrality, the spatial schema of the English preposition
across prototypically represents motion along a path from one
edge of a bounded plane perpendicularly to its opposite. But this
schema is abstracted away from magnitude. Hence, the preposition can be used equally well in The ant crawled across my palm,
and in The bus drove across the country. Apparently, no language
has two different closed-class forms whose meanings differ only
with respect to magnitude for this or any other spatial schema.
Another semantic constraint on concepts available in the
inventory pertains to the meanings of conjunctions that head a
subordinate clause in a complex sentence (Talmy 2000a, Chap. 5).
Where such conjunctions relate two clauses whose events are in
temporal sequence and often also in a cause-effect sequence, with one exception, the conjunctions are lexicalized to take the
Semantics, Universals of
earlier (and causal) event in the subordinate clause, leaving
the later (and caused) event in the main clause, and never the
other way around. The exception is that, in addition to an after-type conjunction, which obeys the constraint, languages can
also have a before-type conjunction. Thus, beside We left after
we ate, English has We ate before we left. But all other cases follow a negative absolute universal. Thus, beside a because-type
conjunction that obeys the constraint, as in English We stayed
home because they arrived, English has no inverse conjunction
lexicalized to express the hyphenated phrase in *They arrived
to-the-occasioning-of-the-event-that we stayed home, and seemingly no other language does either. Comparably, beside an
although-type conjunction, as in We went out even though they
arrived, there is never a conjunction lexicalized to represent the
hyphenated phrase in *They arrived in-ineffective-counteracting-of-the-event-that we went out. Talmy (2000a, Chap. 5) bases a cognitive account of this unidirectional lexicalization on an earlier event's natural function as ground for a later event as figure. The same chapter details a similar unidirectionality in conjunctions for
the temporal inclusion of one event in another, the contingency
of one event on another, and the substitution of one event for
another.
Based on their formal and semantic differences, treated so
far, a further major finding is that the two types of form classes exhibit a functional difference. In the conceptual complex
evoked by any portion of discourse, the open-class forms contribute most of the content, while the closed-class forms determine most of the structure. This division of labor in cognitive
function amounts to a positive absolute universal: The open-class subsystem as a whole represents conceptual content and
the closed-class subsystem as a whole represents conceptual
structure (for illustration, see Talmy 2000a, Chap. 1). Although
individual open- and closed-class forms in general or in a particular sentence may perform the opposite functions, the subsystems overall are universally dedicated to their respective content
and structure functions. Further, the function of the closed-class
subsystem to structure conceptual content presumably accounts
for the negative absolute universals that constrain its semantics.
The crucial conclusion, again a positive absolute universal, is
that the closed-class subsystem is perhaps the most fundamental conceptual structuring system of language.
The main focus so far has been on semantic abstractive universality that involves an inventory, but we now switch to a type
that involves a typology. Talmy (2000b, Chaps. 1, 2, 3) examines
such typologies for an extended Motion event (the capitalized
form covers both motion and location). This larger event consists of a Motion event proper and a co-event that usually represents the manner or the cause of the Motion. In turn, the event
of Motion consists of four components: the moving or stationary figure, its state of Motion (moving or being located), its path
(path or site), and the ground that serves as its reference point.
What may be a positive absolute universal is that every language
has coordinated lexicalization patterns and syntactic constructions to represent all the components of an extended Motion
event directly and colloquially over at most two clauses. But the
particular lexical and syntactic categories in which the semantic
components are characteristically represented are not an absolute
universal. Rather, each language has a characteristic pattern of such semantic-formal associations, and, unlike the inventory case where each language is unique, the patterns range over a relatively small set and so constitute a typology. The framing
typology is based on where the Path characteristically appears. It
divides languages into two main types. In a satellite-framed language such as English, the path is characteristically expressed by
a satellite and/or preposition like the into in The bottle floated
into the cave. But in a verb-framed language such as Spanish, it
characteristically appears in the main verb like the entró in La botella entró (flotando) a la cueva: 'The bottle entered (floating) to the cave.'
Finally, we turn to the implicational type of abstractive universality. It in turn has two forms. In the only form Greenberg posited, what might be called other-directed, the presence of
one feature in a language licenses the presence of a certain different feature. It was seen previously that, among subordinating conjunctions that relate two events in time, a before-type
conjunction was unusual in placing the later event in the subordinate clause. A possible implicational universal is that only
if a language has a conjunction expressing after can it have one
expressing before.
An abstractive implicational universal is self-directed if a certain feature that is not an absolute universal but occurs in only
some languages always exhibits a certain characteristic, one from a range of potential characteristics, when it does occur.
This can be seen in the phenomenon of fictivity, where the
meanings of morphemes in a sentence belong literally to one
semantic category, but function systematically to represent the
opposite semantic category (Talmy 2000a, Chap. 2). In the type
called fictive motion, a sentence with morphemes referring literally to motion instead depicts a static scene. In one type of fictive
motion, emanation, the motion formulation evokes the conceptualization of something intangible emerging from a source traveling in a straight line through space, and impinging on a distal
object, where in fact nothing can be perceived as moving. Many
languages, including English, can use a fictive formulation to
represent various types of emanation. For example, English can
express a radiation path, as in Light shone from the sun into the
cave; a shadow path, as in The pole threw its shadow against the
wall, or in The pole's shadow fell on the wall; and a sensory path,
as in I looked into/past the valley.
Not all languages, though, can represent such situations
in terms of fictive motion, so that fictivity here is not an absolute universal. Such languages, instead, tend to use nonfictive
formulations like The sun illuminated the inside of the cave,
The pole's shadow is on the wall, and I regarded the interior
of the valley. But if a language does use fictive motion to depict
an emanation, the path of the emanation is always in the same
single direction. Thus, radiation is never represented as moving
from an object toward the sun, as in *Light shone from my hand
onto the sun; nor as moving outward from a third point, as in *The
light shone onto my hand and the sun from a point between us.
Comparably, a shadow is never represented as moving from the
silhouette to the object, as in *The shadow jumped from the wall
onto the pole. And a sensory path is never represented as moving
from a perceived object to a perceiver acting agentively (though
it can be for an unintentional perceiver), as in *That distant valley
looked into my eyes. This semantic pattern, the unidirectionality of fictive emanation, is thus a positive abstractive universal that is implicationally self-directed.
Further, an active-determinative principle appears to govern this universal direction of emanation. Of the two objects,
the more active or determinative one is conceptualized as the
source. Thus, relative to my hand, the sun is brighter, hence,
more active, and must be treated as the source of radiative emanation. My agency in looking is more active than the inanimate
perceived object, and so I am treated as the source of sensory
emanation. And the pole is more determinative (I can move the pole and the shadow will also move, but I cannot perform the opposite operation of moving the shadow and getting the pole to move), and so the pole is treated as the source of shadow
emanation. In turn, this principle might itself derive from the
unidirectionality of agency, as detailed further in Talmy (2000a,
Chap. 2).
Many further semantic universals can be cited. In fact, most
semantic findings of cognitive linguistics in general and of Talmy
(2000a, 2000b) in particular are universal in character. For example, universals for the representation of force interaction and
causality are set forth in Talmy (2000a, Chaps. 4, 7, 8); of temporal
aspect in Talmy (2000a, Chap. 3); and of figure-ground organization in Talmy (2000a, Chap. 5). The domains treated here were
chosen for this short entry because together they illustrate all the
parameters initially outlined for universality.
To mention one further tradition of universalist semantics, the
natural semantic metalanguage (NSM) of Anna Wierzbicka (e.g., 1996) and Cliff Goddard (e.g., 2001) is prominent and extensively developed. This NSM theory posits that a specific set of fundamental concepts, semantic primes, exists; that it is represented
in every language by specific morphemes of that language; and
that all other morphemically expressed concepts in the language
can be represented by syntactically well-formed combinations of
the morphemic primes. In terms of the initial framework, NSM
as a whole thus represents positive absolute universality.
Leonard Talmy
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Comrie, Bernard. 1981. Language Universals and Linguistic Typology.
Chicago: University of Chicago Press.
Goddard, Cliff. 2001. Lexico-semantic universals: A critical overview.
Linguistic Typology 5.1: 1–65.
Greenberg, Joseph. 1963. Some universals of grammar with particular reference to the order of meaningful elements. In Universals of
Language, ed. Joseph Greenberg. Cambridge, MA: MIT Press.
Greenberg, Joseph H., ed. 1978. Universals of Human Language. Vols.
1–4. Stanford, CA: Stanford University Press.
Talmy, Leonard. 2000a. Toward a Cognitive Semantics. Vol. 1. Concept
Structuring Systems. Cambridge, MA: MIT Press.
Talmy, Leonard. 2000b. Toward a Cognitive Semantics. Vol. 2. Typology and Process
in Concept Structuring. Cambridge, MA: MIT Press.
Talmy, Leonard. 2006. The fundamental system of spatial schemas in language. In
From Perception to Meaning: Image Schemas in Cognitive Linguistics,
ed. Beate Hampe, 199–234. Berlin: Mouton de Gruyter.
Wierzbicka, Anna. 1996. Semantics: Primes and Universals. Oxford: Oxford
University Press.
Zaefferer, Dietmar, ed. 1991. Semantic Universals and Universal
Semantics. Berlin: Foris.

SEMANTICS-PHONOLOGY INTERFACE
Systems theories in widespread use within the cognitive and
neurosciences (see hippocampus) face two major challenges: determining when systems differ and determining
where and how different systems interact. Both problems have
been solved for the phonological and sentential-semantic
systems. The semantics-phonology interface has been established via three complementary criteria (independent activation, connectivity, and error frequency) that may help resolve other
system-boundary disputes. Under these theoretical and empirical criteria, the sentential-semantic versus phonological systems
are language-memory and comprehension-production systems.
That is, the sentential-semantic system contains units for comprehending, storing, retrieving, and producing morphemes,
words, phrases, and propositions, and the phonological
system contains units for comprehending, storing, retrieving,
and producing syllables, phonological compounds, and segments. These are the three criteria for viewing the sentential-semantic versus phonological systems in this way.

The Independent Activation Criterion


Independent activation is a system-differentiation criterion.
Current theory (e.g., MacKay et al. 2007) has used K. S. Lashley's
(1951) distinction between activation versus priming to distinguish between systems (see also spreading activation).
Activated units automatically prime, or prepare, for activation
all units to which they are connected, regardless of the system
that houses the units. However, primed units don't necessarily become activated: Application of a system-specific activating
mechanism is necessary to activate a primed unit. For example,
when a speaker familiar with the noun desk sees a desk, units
in visual systems prime or ready for activation the lexical unit
representing the noun desk in the sentential-semantic system.
However, the speaker seeing a desk doesn't necessarily activate the primed unit representing desk: We don't go through life
naming whatever we see. To produce the noun desk, an activating mechanism specific to the sentential-semantic system must
activate the primed content unit representing desk (see MacKay
1987, 1992).
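The activation/priming contrast can be caricatured in a toy simulation. The Python sketch below is purely illustrative (the class, function, and unit names are assumptions of this entry's prose, not MacKay's implementation): activating a unit primes every connected unit, across system boundaries, but a primed unit becomes activated only when its own system's activating mechanism is applied.

```python
# Illustrative sketch only: names and structure are assumptions,
# not MacKay's actual model or code.

class Unit:
    """A content unit belonging to a particular system."""
    def __init__(self, name, system):
        self.name = name
        self.system = system            # e.g. "visual", "sentential-semantic"
        self.connections = []           # units primed when this unit activates
        self.primed = False
        self.activated = False

    def activate(self):
        """Activate this unit and prime (not activate) connected units."""
        self.activated = True
        for unit in self.connections:
            unit.primed = True          # priming crosses system boundaries

def apply_activating_mechanism(system, units):
    """A system-specific mechanism activates only primed units of its system."""
    for unit in units:
        if unit.system == system and unit.primed:
            unit.activate()

# Seeing a desk: a visual unit activates and primes the lexical unit "desk",
# but seeing alone does not name; the lexical unit stays merely primed.
visual_desk = Unit("desk-percept", "visual")
lexical_desk = Unit("desk", "sentential-semantic")
visual_desk.connections.append(lexical_desk)

visual_desk.activate()
print(lexical_desk.primed, lexical_desk.activated)   # True False

# Naming requires the sentential-semantic system's own activating mechanism.
apply_activating_mechanism("sentential-semantic", [lexical_desk])
print(lexical_desk.activated)                        # True
```

On this sketch, internal speech would correspond to applying the phonological system's mechanism while withholding the muscle-movement system's mechanism.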
Because functionally independent activating mechanisms
activate the representational or content units in different systems, content units in one system can be activated independently
from content units in another system, and content units that are
independently activatable are part of different systems under
the independent activation criterion. The phonological versus
muscle movement systems clearly satisfy this independent activation criterion because we can produce internal speech without
overt muscle movement: Internal speech occurs when we activate phonological units without activating corresponding muscle movement units, indicating that phonological and muscle
movement units occupy separate systems under the independent activation criterion. Similarly, we can produce sequentially
organized thought internally without becoming aware of inner
speech sounds and without overt movement, indicating a third
independently activatable system under the independent activation criterion: the sentential-semantic system. Of course, only
units within the phonological and muscle movement systems become activated when we learn and produce experimentally constructed nonsense syllables, whereas units in all three systems become activated in concert during full-blown sentence
articulation (see MacKay 1992).

The Connectivity Criterion


Connectivity is a systems-interaction criterion. Content units for
perceiving and producing sentences are functionally (but not
structurally) hierarchic (see MacKay 1987, 23; also Jackendoff
2003, 53–4), and differing patterns of connectivity for units at
the highest versus lowest levels in a system indicate how systems interface under the connectivity criterion. In general, the
highest-level units in a system only receive bottom-up connections from within the same system, whereas the lowest-level units
in a system receive bottom-up and lateral connections from
outside the system. For example, syllable units only receive bottom-up connections originating within the phonological system,
whereas lexical units receive bottom-up and lateral connections
from outside the sentential-semantic system. The bottom-up
extrasystemic connections come from orthographic and phonological systems and enable speakers to produce a word such
as apple on the basis of hearing or seeing the word apple. The
lateral extrasystemic connections come from visual and other
sensory systems and enable speakers to produce apple solely
on the basis of seeing, smelling, or tasting an apple (see MacKay
1987, 143–8). The dividing line between phonological versus sentential-semantic systems, therefore, falls between syllables and
lexical/morphemic units under the connectivity criterion, with
syllable units as the highest level in the phonological system and
lexical/morphemic units as the lowest level in the sentential-semantic system.

The Error Frequency Criterion


Error frequencies provide converging evidence for both system
differentiation and boundary determination. Evidence based
on error frequencies has established hundreds of subsystems
known as sequential domains, which are functionally distinct
sets of content units that share the same activating mechanism,
for example, proper nouns (see MacKay 1987, 44–5). Error frequencies also reinforce the syllable-word/morpheme interface
as the dividing line between the phonological versus sentential-semantic systems. For reasons related to the speed-accuracy
trade-off (see MacKay 1987, 61), speech errors are relatively
more common for units at low rather than high levels within a
system. This means that error frequencies can indicate where
one system ends and another begins. Consider substitution
errors involving words versus syllables. Word substitutions
greatly outnumber syllable substitutions in everyday speech, a
quantum jump in error frequency that provides converging evidence for establishing syllables versus words/morphemes as the
boundary between the phonological versus sentential-semantic
systems.
The pressing problem for future research is to resolve boundary disputes afflicting other putative cognitive systems using
principles similar to the independent activation, connectivity,
and error frequency criteria for establishing the phonology versus sentential-semantic interface (see MacKay et al. 2007).
Donald G. MacKay


WORKS CITED AND SUGGESTIONS FOR FURTHER READING


Jackendoff, R. 2003. Foundations of Language: Brain, Meaning, Grammar,
Evolution. Oxford: Oxford University Press.
Lashley, K. S. 1951. The problem of serial order in behavior. In Cerebral
Mechanisms in Behavior, ed. L. A. Jeffress, 112–46. New York: Wiley.
MacKay, D. G. 1987. The Organization of Perception and Action: A Theory
for Language and Other Cognitive Skills. New York: Springer-Verlag.
MacKay, D. G. 1992. Constraints on theories of inner speech. In Auditory Imagery, ed. D. Reisberg, 121–49. Hillsdale, NJ: Erlbaum.
MacKay, D. G., A. Allport, W. Prinz, and E. Scheerer. 1987. Relationships
and modules within language perception-production. In Language
Perception and Production: Relationships Among Listening, Speaking,
Reading and Writing, ed. A. Allport, D. G. MacKay, W. Prinz, and
E. Scheerer, 1–15. London: Academic Press.
MacKay, D. G., L. E. James, J. K. Taylor, and D. E. Marian. 2007. Amnesic
H.M. exhibits parallel deficits and sparing in language and memory: Systems versus binding theory accounts. Language and Cognitive
Processes 22: 377–452.

SEMANTICS-PRAGMATICS INTERACTION
It seems unlikely that there will ever be consensus about the
extent to which we can reliably distinguish semantic phenomena from pragmatic phenomena. But there is now broad agreement that sentence meaning can be given in full only when
a sentence is studied in its natural habitat: as part of an utterance by an agent who intends it to communicate a message.
Here, we document some of the interactions that such study
has uncovered. In every case, in order to achieve even a basic
description, it is necessary to pool semantic information, contextual information, speaker intentions, and general pragmatic
pressures.
Space limitations preclude discussion of presuppositions
and speech-acts, two important classes of phenomena for
which semantics and pragmatics are so thoroughly intertwined that analyses of them invariably draw information from
both domains.
In a broad range of cases, pragmatic information is required
just to obtain complete and accurate meanings for the words and
phrases involved. indexical expressions are clear examples. In
order to determine the proposition that is expressed by an utterance of (1), we must look to the context to fix the speaker:
(1) I am here.

We must also appeal to the context to obtain the intended meaning of here (in this room, in this city, etc.). Which meaning we
select will be shaped by considerations of informativity and
relevance. For example, (1) is likely to be trivially true if here is
construed as picking out planet Earth, and speakers will therefore avoid that interpretation until interplanetary travel becomes
routine.
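The context dependence of (1) can be made concrete with a small sketch. The Python fragment below is an illustrative assumption (the context fields, values, and granularity labels are invented, not a standard semantic formalism): the proposition expressed varies with who speaks and with which region here is taken to pick out.

```python
# Illustrative sketch only: context fields and values are invented.

def proposition_of_1(context):
    """The proposition expressed by an utterance of 'I am here':
    the contextually fixed speaker is at the contextually fixed place,
    at the granularity selected for 'here'."""
    place = context["locations"][context["granularity"]]
    return ("located-at", context["speaker"], place)

context = {
    "speaker": "Ann",
    # 'here' can pick out nested regions; informativity steers the choice:
    # the 'planet' reading is trivially true, so it is normally avoided.
    "locations": {"room": "Room 12", "city": "Storrs", "planet": "Earth"},
    "granularity": "city",
}

print(proposition_of_1(context))   # ('located-at', 'Ann', 'Storrs')
```

Changing only the speaker or the granularity yields a different proposition from the very same sentence, which is the point of the indexical examples above.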
Similar factors influence anaphora resolution. If a speaker
utters (2), his or her addressees will be guided to a referent for
she by their assumptions about intentions as well as discourse
coherence:
(2) She left on a bicycle.

The interpretive steps resemble those for indexicals, but pronouns tend to be more ambiguous in practice, and, perhaps as a result, languages depend on complex systems of conventions to narrow their referents.
This kind of context dependency is so widespread in language
that one might feel hard-pressed to find an example that lacked
it. We do not know what is expressed by It's cloudy until we fix a time and place (and a standard for cloudiness). We can't be sure
of the truth or falsity of the counterfactual If kangaroos had no
tails, they would fall over unless we know which possibilities to
consider. (Suppose they had crutches or jet packs.) If a politician says Many people support my proposal, we need to know
whether there are additional implicit restrictions (people in the
city, people who own cats), and we need to have a sense for the
current numerical standards for many. The sentence Wood is
strong enough can be evaluated only if we can flesh out its meaning so that it specifies what wood is claimed to be strong enough
for (Bach 1994). Comparable examples can be constructed for
just about any area of natural language.
Pragmatic information can enrich a speakers message in
ways that extend far beyond determining its central descriptive
content. The primary meaning classification here is the conversational implicature. The dialogue in (3), based on one
from Grice (1975), illustrates:
(3) A: Does Smith have a new girlfriend?
    B: He's been spending a lot of time in New York lately.

From a semantic perspective, B fails to answer A's question. We do not, though, perceive the answer as irrelevant. We posit conversational implicatures that successfully address A's query. If it
is shared knowledge that B keeps track of Smith's personal life, then B's answer will convey something like "Yes, and she lives in New York." If it is shared knowledge that B never dates New Yorkers, then it might convey "No, his trips prevent him from finding companionship."
As this example indicates, conversational implicatures are
highly malleable. Changes to the context can deliver subtle meaning changes, even as the utterance remains the same and the
core semantic content stays fixed. This malleability is not shared
by semantic meanings, and so it provides a reliable method for
diagnosing a meaning as semantic or pragmatic. Theorists have
occasionally posited conversational implicatures that lack this
malleability (e.g., Sadock 1978), but there remains broad agreement that it is a distinguishing feature.
Conversational implicatures might sometimes play a direct
role in determining semantic content. For instance, it seems generally true that the semantic meaning of and does not impose
a temporal ordering on its conjuncts. We tend to interpret sentences like Ali fell out of bed and woke up as conveying that Ali
first fell out of bed and then woke up, but we can cancel this
meaning (but not in that order!) without contradicting ourselves. This suggests conversational implicature status, as does
our ability to derive the meaning from general pragmatic pressures. What, then, are we to make of (4), in which this pragmatic
meaning must be construed as part of the semantic content?
(4) Driving home and drinking three beers is better than drinking three beers and driving home. (Levinson 2000)

Here again, we seem to require high-level pragmatic information just to determine the denotation of the sentence. (For a variety of responses, see Levinson 2000, Chierchia 2004, Russell 2006, and Bach 2004.)
The facts discussed here are important for the principle of
compositionality, which says that the meaning of a complex
expression is a function of the meanings of its parts and their
mode of composition. In order to fix the meaning of here in an
utterance of (1), for instance, we had to look far beyond the item
itself. We gathered information from the context, and we drew
inferences about the speaker's intentions. Compositionality thus
demands that the context itself be one of the parts involved in
the construction of complex meanings. The example is not especially remarkable; we find intricate context dependency of this
sort throughout the compositional semantic system.
Christopher Potts
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Bach, Kent. 1994. Conversational impliciture. Mind and Language
9: 124–62.
Bach, Kent. 1999. The semantics-pragmatics distinction: What it is and why it matters. In The Semantics-Pragmatics Interface from Different Points of View, ed. Ken Turner, 65–84. Oxford: Elsevier.
Bach, Kent. 2004. Context ex machina. In Semantics vs. Pragmatics, ed. Zoltán Szabó, 15–44. New York: Oxford University Press.
Chierchia, Gennaro. 2004. Scalar implicatures, polarity phenomena,
and the syntax/pragmatics interface. In Structures and Beyond: The
Cartography of Syntactic Structures. Vol. 3. Ed. Adriana Belletti, 39–103.
New York: Oxford University Press.
Grice, H. Paul. 1975. Logic and conversation. In Syntax and Semantics.
Vol. 3: Speech Acts. Ed. Peter Cole and Jerry Morgan, 43–58. New
York: Academic Press.
Horn, Laurence R. 2005. The border wars. In Where Semantics Meets
Pragmatics, ed. Klaus von Heusinger and Ken P. Turner, 21–48.
Oxford: Elsevier.
Kadmon, Nirit. 2001. Formal Pragmatics. Oxford: Blackwell.
Levinson, Stephen C. 2000. Presumptive Meanings: The Theory of
Generalized Conversational Implicature. Cambridge, MA: MIT Press.
Russell, Benjamin. 2006. Against grammatical computation of scalar
implicatures. Journal of Semantics 23: 361–82.
Sadock, Jerrold M. 1978. On testing for conversational implicature.
In Syntax and Semantics. Vol. 9: Pragmatics. Ed. Peter Cole, 281–97.
New York: Academic Press.

SEMIOTICS
This term is remarkable in the history of intellectual life because
it was coined, almost simultaneously, by a French-speaking
Swiss linguist, Ferdinand de Saussure, and an English-speaking
American logician, Charles Sanders Peirce, around the turn of
the twentieth century. In this, the advent of semiotics as a field of
study rivals the almost simultaneous creation of the calculus by
Isaac Newton and Gottfried Leibniz in the late seventeenth century, with the difference that it does not possess the precision of Newton's and Leibniz's mathematical method but remains an area of inquiry that is subject to great contestation. Moreover, while Newton's and Leibniz's calculus was essentially the same,
Saussure and Peirce emphasize different aspects of semiotics. Still, they both aim at a science of meaning, and one aim of
this entry is to weave together their work and legacies. I begin,
then, with Peirce's definition of semiotics as the study of signs.


Peirce noted that semiotic was "the formal doctrine of signs" that is closely related to logic: a sign, Peirce writes, is "something which stands to somebody for something in some respect or capacity" (1931–58, 2: 227, 2: 228). Moreover, signs, like logic,
exist within a system (or systems) that can be discerned and analyzed: Semiotics seeks to discover and describe the ways that
signs give rise to and communicate meaning.

Peirce, Semiotics, and Phenomenology


Nevertheless, a wider sense of semiotics is implicit within the
work of Peirce (and Saussure as well). Peirce is notorious for
multiplying technical terms, but throughout his work he presents triadic descriptions of the sign, which are best captured in
his discussion of the modes of relationship between sign vehicles
and their referents (see icon, index, and symbol). "My view," Peirce wrote early in his career, "is that there are three modes of being, and [I] hold that we can directly observe them in elements of whatever is at any time before the mind in any way. They are the being of positive qualitative possibility, the being of actual fact, and the being of law that will govern the future" (1931–58, 1: 23). The qualitative possibility he describes is the sheer sensation of phenomena, the redness of experience before it is
associated with an object or a meaning: the icon, he says, "excites analogous sensations in the mind" (1931–58, 2: 299). The being
of actual fact is apprehended by the indexical (or indicative)
nature of signs: the fact that signs point or refer to objects in
the world. "No matter of fact," he writes, "can be stated without the use of some sign serving as an index" (1931–58, 2: 305); an index, he says, stands "unequivocally for this or that existing thing" (1931–58, 2: 531); "anything which focuses the attention is an index" (2: 285). The difference between icon and index, then,
is related, I suspect, to Ludwig Wittgenstein's observation that
pointing to the shape of an object is different from pointing
to [its] color (1967, 33). Peirces third modality, law that will
govern the future, is the modality of symbolism, and it nicely
comports with the Danish linguist Louis Hjelmslev's redefinition
of meaning as purport: future-oriented signaling. For Peirce,
a symbol is "a sign which refers to the object that it denotes by virtue of a law, usually an association of general ideas, which operates to cause the symbol to be interpreted as referring to that object" (2: 249). Needless to say, all signs and experience itself
participate in all of these modalities, even if one modality seems
to dominate in any particular case.
With its focus on the being of actual fact, Peirce's semiotics addresses the issue of reference in its definition of the sign in ways that Saussure's semiotics, with its focus on social psychology, does not. (It is for this reason that Peirce has been influential in anthropology and social semiotics [see Merrell 1998 and Daniel 1984].) Rather than Saussure's model of the sign in terms
of dyadic form (signified/signifier), to be examined in a moment,
Peirce offers a triadic model, a "representamen" (which is the form the sign takes), an "interpretant" (which is the sense made of the sign), and an "object" (which is that to which the sign
refers). A sign in the form of a representamen, he writes, "is something which stands to somebody for something in some respect or capacity. It addresses somebody, that is, creates in the mind of that person an equivalent sign, or perhaps a more developed sign. That sign which it creates I call the interpretant of the first sign. The sign stands for something, its object. It stands for that object, not in all respects, but in reference to a sort of idea, which I have sometimes called the ground of the representamen" (1931–58, 2: 228).

Finally, Peirce describes the interaction among the representamen, the interpretant, and the object as "semiosis" (1931–58, 5: 484). Since the interpretant, as Peirce says, is an equivalent
or more developed sign, this model of the sign lends itself to
what Umberto Eco describes as unlimited semiosis (1976, 69).
Both Peirce and Saussure were working in the context of the
emergence of philosophical phenomenology, and it is clear from
these early remarks of Peirce that his starting point (as it was
Saussure's as well) was the phenomenology, the experience of
meaning in human affairs. On first view, the scientific abstractions
and formal systematizations seem at odds with the tracing of
phenomenal experience, but semiotics attempts to discover a
systematic accounting for the everyday experiences of meanings
in the world. Just as it is virtually impossible for a literate
person not to apprehend meanings in the alphabetic markings
on a page, it is virtually impossible for people not to experience
their lives and the world in which they live in terms of meaning.
Peirce's pure sensuousness of qualities, and even the pure whatness
of unequivocal objects or unequivocal attention, almost
always (perhaps always) present themselves in relation to the
future-oriented signaling of signs that will be meaningful for
someone else (including one's momentary future self).
In his posthumous Course in General Linguistics ([1916] 1959),
Saussure coins the term semiology as part of his examination of
the systematic (i.e., structural) comprehension of language
(see structuralism). "It is possible," he writes, "to conceive
of a science which studies the role of signs as a part of social life.
It would form part of social psychology and, hence, of general
psychology. We shall call it semiology (from the Greek sēmeîon,
'sign')." He goes on to say that linguistics is only one branch of
this general science: "The laws which semiology will discover will
be laws applicable in linguistics" ([1916] 1983, 15–16).
Already here in Saussure's exposition, one can see areas of
contest for the science of semiotics: While Saussure claims that
linguistics is a subset of the more general study of the sign,
others, Roland Barthes in particular, claim that linguistics is the
defining case of semiotics insofar as linguistics is the most
systematic organization of signs and, thus, most precisely isolates
the defining features of any sign system and creates a vocabulary
for its understanding. Moreover, in this exposition, the contest of
situating semiotics in relation to language, as Saussure does, or
in relation to logic, as Peirce does, also arises. Peirce, in contrast
to Saussure, conceived of semiotics as the science of representation
of which logic was a branch, though almost coextensive
with it; the aim of semiotics for Peirce was to distinguish the
possible kinds of sign functions (or processes of semiosis),
particularly in mathematics and the sciences. Because of his
interest in empirical sciences (he focused not only on physics and
chemistry but even spent years as a surveyor and cartographer), he
was far more interested in sign functions in relation to reference
than was Saussure: No version of his category of index exists
within Saussure's linguistics.

Semiotics
Any scientific organization isolates the defining features
of any system of understanding and creates a vocabulary for its
comprehension (in phonology, Roman Jakobson and N. S.
Trubetzkoy called them distinctive features); in this way,
systematic science seems the antithesis of phenomenology. More
specifically, the issue of the functioning of a vocabulary or (in
mathematics) a notational system, rather than simple experience, is
closely tied to semiotics insofar as the sign system of a
particular science helps delineate its object(s) of study. Thus,
Bertrand Russell has argued, speaking of the periodic table, that
"it has been found that, when the order derived from the periodic
law differs from that derived from the atomic weight, the order
derived from the periodic law is much more important" (1923,
23; see also Schleifer 2000, 167). That is, Russell, like Peirce a
philosophical mathematician and logician, asserts that the sign
system of the periodic table reveals more truth than empirical
measurement.
In A Theory of Semiotics (1976), Eco offers a related description
of semiotics in relation to lying, rather than truth. Semiotics, he
says,
is concerned with everything that can be taken as a sign. A sign
is everything which can be taken as significantly substituting for
something else. This something else does not necessarily have to
exist or to actually be somewhere at the moment in which a sign
stands in for it. Thus semiotics is in principle the discipline studying everything which can be used in order to lie. If something cannot be used to tell a lie, conversely it cannot be used to tell the
truth: it cannot be used to tell at all. (1976, 7)

With this statement, we can see why semiotics is so useful for
literary studies and the humanities in general, since literature
(and perhaps philosophy and history as well) describes possible
worlds. Yet semiotics aspires to be a science, which is to say,
to account systematically for phenomena in a manner that is, at
once, accurate, simple, and generalizable. Eco seeks such a
complete system when he claims that "a theory of the lie should be
taken as a pretty comprehensive program for a general semiotics"
(1976, 7).

Saussure, Semiotics, and Linguistics


In part, it is because of his work in systematic linguistics that
Saussure, more than Peirce, has been most influential in the
development of semiotic science in the twentieth century. In fact,
it could be argued that Saussure himself is one of the founders
of systematic linguistics, what he called synchronic linguistics
(see synchrony and diachrony) and Jakobson, after him,
called structural linguistics. (Besides continental structural
linguistics, two other schools of linguistics have developed in the
twentieth century: empirical linguistics, growing out of
anthropological work with Native Americans in the United States and
philosophical positivism (see logical positivism) and focusing
on particular languages; and transformational linguistics, growing
out of Noam Chomsky's work in mathematics and based on syntactical
sentential analysis. Neither school has embraced semiotics.)
Saussure's system of linguistics is based upon his definition
of the sign, which is neither a positive fact nor
essentially syntactical, but systematically (and structurally)
relational, creating a working connection between what Claude
Lévi-Strauss calls the tangible and the intelligible. "I have
tried," Lévi-Strauss says,

to transcend the contrast between the tangible and the intelligible
by operating from the outset at the sign level. The function of
signs is, precisely, to express the one by means of the other. Even
when very restricted in number, they lend themselves to rigorously
organized combinations which can translate even the finest
shades of the whole range of sense experience. (1975, 14)

Lévi-Strauss is working in the tradition of the linguistics and
semiotics of Saussure, and, for both Lévi-Strauss and Saussure
(as for Peirce), the complex relationship between perception
and meaning, between the tangible and the intelligible, is
woven closely into everything we do. A striking part of being
human is the ability to discern meaning, to apprehend signs
wherever we look: not only in stories we hear or experiences we
remember but also in things as subtle as the tone of voice, facial
expressions, the accidental ways light might strike a tree, the
color of clothes, sequences of sounds. Saussure describes this by
defining the nature of the linguistic sign as the union of a
concept and a "sound pattern" ([1916] 1983, 66; an earlier
translation calls this a "sound image" [1959, 66]), creating a
relationship between the signified (French: signifié) and the
signifier (signifiant, a verbal noun which could be translated as
"the signifying"). "A linguistic sign," Saussure notes, "is not a
link between a thing and a name, but between a concept and a sound
pattern. The sound pattern is not actually a sound, for a sound is
something physical. A sound pattern is the hearer's psychological
impression of a sound, as given to him by the evidence of his
senses" ([1916] 1983, 66).
For Saussure, the signified is the meaning, but, more importantly,
the signifier in language is not simply some physical phenomenon
(such as the wavelengths of sound, the ink on a page) but, rather,
a sound pattern, itself a psychological phenomenon. This is a
distinguishing difference between linguistics and semiotics: In
linguistic science, the signifiers, which is to say the
combinations of phonemes of a particular language, are themselves
systematically organized and apprehended. Phonemes are the smallest
distinguishing phenomena in a language (in English, we have about
40 that are more or less represented by our phonetic alphabet), and
they are called distinguishing phenomena because they function to
mark differences that are apprehended as meaningful, such as the
distinction between toe and doe. The greatest scientific
achievement of Saussurean linguistics was, in fact, the systematic
description of the phonemes of language and their further analysis
into distinctive features by Jakobson and Trubetzkoy in the 1930s.
Thus, the English phonemes /t/ and /d/ (in toe and doe) share all
their features except one: /d/ is voiced, which means that its
articulation includes engaged vocal cords, while /t/ is unvoiced,
as the vocal cords do not vibrate. Not all features of sound are
distinctive: Thus, in English, whether or not one aspirates the
pronunciation of /t/, adding an explosion of air at the end of its
pronunciation, does not matter in relation to the meaning of a word
(though it does in other languages).
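The analysis of phonemes into distinctive features can be sketched computationally. The following is a minimal illustration only, not a standard phonological inventory: phonemes are modeled as bundles of features, and a minimal pair such as toe/doe is distinguished by exactly one feature. The feature names and values here are simplified assumptions for the sketch.

```python
# Toy model: each phoneme is a bundle of features; minimal pairs
# differ in exactly one distinctive feature. Feature labels below
# are simplified assumptions, not a standard feature inventory.

FEATURES = {
    "t": {"place": "alveolar", "manner": "stop", "voiced": False},
    "d": {"place": "alveolar", "manner": "stop", "voiced": True},
    "s": {"place": "alveolar", "manner": "fricative", "voiced": False},
}

def distinguishing_features(a: str, b: str) -> set:
    """Return the names of the features on which two phonemes differ."""
    fa, fb = FEATURES[a], FEATURES[b]
    return {name for name in fa if fa[name] != fb[name]}

print(distinguishing_features("t", "d"))  # {'voiced'}: the toe/doe contrast
```

On this way of modeling the point, aspiration simply does not appear among the features: in English it marks no difference in meaning, which is what it means for a feature not to be distinctive.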
Such aspiration, however, does convey significance even in
English: In certain social contexts, it may suggest an upper-class
affectation of pronunciation, perhaps even disdain. And it
is in this significance that the difference between linguistics and
semiotics can be discerned. In semiotics, unlike language, the
signifiers are not necessarily systematically developed but found
ready-to-hand in distinctions that can be appropriated to meaning.
As Barthes has argued, "in opposition to human language, in
which the phonic substance [the phoneme] is immediately
significant, and only significant, most semiological systems
probably involve a matter which has another function beside that of
being significant (bread is used to nourish, garments to protect)"
(1967, 68). Barthes here is pursuing the argument that food and
fashion are semiotic systems; in fact, he suggests that a system of
meaning can take up almost anything as its signs. And he is also
demonstrating (in his analogy between phonology and semiotics), as
already mentioned, that linguistics has systematically
developed the scientific distinctions and discourse that must
be applied to semiotics (1985, xi), while Saussure imagined that
semiology would be an encompassing science lending its methods to
linguistics.
In any case, Saussurean linguistics assumes that the basic elements
of language, and of signification more generally, can be
studied only in relation to their function, rather than their cause.
Thus, the signifier implies a signified and vice versa, and such
"reciprocal presupposition" (see A. J. Greimas 1983) means that
the functioning of the sign, rather than any prior cause, governs
its understanding. Such a relational definition allows Saussure to
explain the problem of the identity of units of language and
signification: The reason we can recognize different pronunciations
of the word stop as the same word (even when some aspirate
the /t/) is that the word is not defined by inherent qualities, but
rather it is defined in relation to other elements in a system, the
structural whole of the English language.
This further suggests another governing assumption in
Saussurean linguistics and semiotics, what Saussure calls the
arbitrary nature of the sign. By this, he means that the
relationship between the signifier and signified in language is
never necessary (or "motivated"): One could just as easily find the
sound signifier arbre as the signifier tree to unite with the
concept of tree. But more than this, it means that the signified is
arbitrary as well: One could as easily define the concept tree
by its woody quality (which would exclude palm trees) as by its
size (which excludes the low woody plants we call shrubs). This
relationship is not necessary because it is not based upon inherent
qualities of signifier or signified: The sign itself is governed
by systematic relationships. Moreover, these assumptions about
the nature of language also suggest why reference, the obvious
sense we all have that words refer to things in the world, is simply
excluded from Saussurean semiotics, which remains (indeed, in
many ways initiates) a constructivist understanding of meaning.
Unlike that of Peirce, in the Saussurean tradition of linguistics
and semiotics (which includes the linguistics of Jakobson,
the structuralism of Lévi-Strauss, even the post-structuralism
of Jacques Derrida; see deconstruction), reference remains
a problem.

The Scope of Semiotics


These working definitions of signs have been brought to bear
on a wide range of phenomena: in literature, film, written and
oral discourse, gesture, animal behavior (zoosemiotics); in
particular areas of study, such as olfactory signs, tactile
communication, visual communication; in specifying cultural codes
like those of medicine, music, writing, text theory, aesthetic
texts, codes of taste, and mass communication (see Culler 1981 for
this taxonomy). Even psychoanalysis, in the work of Jacques Lacan,
has been seen as quintessentially a semiotic science. In each case,
writers in these fields begin with the view, shared by Peirce and
Saussure, that a science of the seemingly self-evident phenomena
of meaning is possible and useful.
Ronald Schleifer
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Barthes, Roland. 1967. Elements of Semiology. Trans. Annette Lavers and
Colin Smith. New York: Hill and Wang.
———. 1985. The Fashion System. Trans. Matthew Ward and Richard
Howard. London: Jonathan Cape.
Chandler, Daniel. 1994. Semiotics for Beginners. Available online
at: http://www.aber.ac.uk/media/Documents/S4B/.
Culler, Jonathan. 1981. The Pursuit of Signs. Ithaca, NY: Cornell University
Press.
Daniel, E. Valentine. 1984. Fluid Signs: Being a Person the Tamil Way.
Berkeley: University of California Press.
Derrida, Jacques. 1976. Of Grammatology. Trans. Gayatri Spivak.
Baltimore: Johns Hopkins University Press.
Eco, Umberto. 1976. A Theory of Semiotics. Bloomington: Indiana
University Press.
Greimas, A. J. 1983. Structural Semantics. Trans. Daniele McDowell,
Ronald Schleifer, and Alan Velie. Lincoln: University of Nebraska
Press.
Hjelmslev, Louis. 1961. Prolegomena to a Theory of Language. Trans.
Francis Whitfield. Madison: University of Wisconsin Press.
Lévi-Strauss, Claude. 1975. The Raw and the Cooked. Trans. John
Weightman and Doreen Weightman. New York: Harper.
Merrell, Floyd. 1998. Sensing Semiosis: Toward the Possibility of
Complementary Cultural Logics. New York: St. Martin's.
Peirce, Charles Sanders. 1931–35, 1958. Collected Papers. Vols. 1–6, ed. C.
Hartshorne and P. Weiss; Vols. 7–8, ed. A. Burks. Cambridge: Harvard
University Press.
Russell, Bertrand. 1923. The ABC of Atoms. London: Kegan Paul, Trench,
Trubner.
Sampson, Geoffrey. 1980. Schools of Linguistics. Stanford, CA: Stanford
University Press.
Saussure, Ferdinand de. [1916] 1959. Course in General Linguistics. Trans.
Wade Baskin. New York: McGraw Hill.
———. [1916] 1983. Course in General Linguistics. Trans. Roy Harris.
Chicago: Open Court Press.
Schleifer, Ronald. 1987. A. J. Greimas and the Nature of Meaning: Linguistics, Semiotics, and Discourse Theory. London: Routledge [Croom
Helm]. Lincoln: University of Nebraska Press.
———. 2000. Modernism and Time: The Logic of Abundance in Literature,
Science, and Culture, 1880–1930. Cambridge: Cambridge University
Press.
Sebeok, Thomas. 1991. American Signatures: Semiotic Inquiry and
Method. Norman: University of Oklahoma Press.
Wittgenstein, Ludwig. 1967. Philosophical Investigations. Trans.
G. E. M. Anscombe. Oxford: Oxford University Press.

SENSE AND REFERENCE


Of the many ideas that have been bequeathed to us from the
thought of Gottlob Frege, none has captured the philosophical
imagination so substantially as the distinction between sense and
reference. To a large extent, this is because the distinction has
been seen as the philosophical foundation of a theory of meaning.
The notion that information, encapsulated in sense as a mode of
presentation, underlies in some way our referential use of language
is ubiquitous within contemporary philosophy, being embedded
within a wide range of approaches to reference and meaning. The
distinction, however, does not emerge in Frege's own work in the
context of philosophical reflections on meaning, but rather more
as part of the conceptual foundations of his grand mathematical
project, logicism, the reduction of arithmetic to logic. Central
to bringing this project to fruition was the identification of
arithmetic equalities as identity statements: Identities, Frege
tells us in The Foundations of Arithmetic ([1884] 1950), are "of
all forms of proposition, the most typical of arithmetic." But this
must be accomplished in a way that does not reduce the content of
arithmetic statements to triviality: "2 + 3 = 5" must mean
something different from "5 = 5" if arithmetic propositions are to
have significant mathematical content, and it is in order to
establish this significance that Frege introduces the distinction
between sense and reference.
Although there are identifiable antecedents of this view (Lotze
1888), it is from Frege that the key insight for understanding the
significance of identities emerges; it is the notion that their
significance arises from there being distinct ways of something
being given to us that are judged to present one and the same
thing: "2 + 3 = 5" has greater significance than an instance of the
law of self-identity just because "2 + 3" and "5" each express
different ways of giving the number 5. Each expresses, Frege would
say, a different sense, although each has the same reference;
hence, the truth of "2 + 3 = 5." But now the questions weigh on us
of just what senses are and what relation they bear to references.
Answering these questions requires that we delve into Frege's
view of content, and how he understands the nature of logical
structure. For recall that it is Frege's thesis that arithmetic
propositions are logical propositions, and so arithmetic content,
and hence the significance of arithmetic propositions, are
reflections of logical content and the structure of propositions
of logic.
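The core of Frege's observation can be rendered in a toy computational form. The sketch below is my illustration, not Frege's own formalism: the "sense" of an arithmetic expression is approximated by its syntactic form, and its "reference" by the number it denotes, so that an identity is informative exactly when one reference is presented under two different senses.

```python
# Toy model (an illustration, not Frege's formal apparatus): sense is
# approximated by an expression's syntactic form, reference by the
# number it denotes.

from dataclasses import dataclass

@dataclass(frozen=True)
class Expression:
    form: str  # stands in for the mode of presentation (the sense)

    def reference(self) -> int:
        # Evaluate the arithmetic form; eval is acceptable here only
        # because the inputs are fixed illustrative strings.
        return eval(self.form)

def informative_identity(e1: Expression, e2: Expression) -> bool:
    """True when e1 = e2 is true but not trivial: the same reference
    presented under two different senses."""
    return e1.reference() == e2.reference() and e1.form != e2.form

two_plus_three = Expression("2 + 3")
five = Expression("5")

print(informative_identity(two_plus_three, five))  # True  ("2 + 3 = 5")
print(informative_identity(five, five))            # False ("5 = 5")
```

On this toy picture, "2 + 3 = 5" is informative because the two forms differ while their references coincide, whereas "5 = 5" is a bare instance of self-identity.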
With regard to content, there is a persistent view that runs
through Frege's thinking from beginning to end; it is that we
have no direct cognitive purchase on the structure of content.
Rather, to gain a grip on content, and come to have knowledge
of the truths that obtain of it, there must be some system
through which we are related to content, through which we can
view content in a structured manner. It is in this system that
propositions about content are formed, and their formal structure
will reflect the logical structure of content, its division into
function and argument, of which the distinction of concept and
object is the central instance. What does evolve over the course
of Frege's thinking is his view of the system that imposes this
structure. Initially, in Begriffsschrift, Frege took this system to
be language, function and argument being linguistic notions,
which "have nothing to do with conceptual content; it concerns
only our way of looking at it" ([1879] 1952, 12). Language, on this
view, is an external purveyor of structure for content: It carves
content, imposing a formal structure on it; because we can view
content as so carved, we are provided with logical access to
content, by the truth of propositions, complexes of functions and
argument, formed in the language. But if language is structuring
of content on Frege's earliest view, its role changes on his
considered view, for what specifies the structure of content is to
be no longer taken as an external system, but rather as something
itself internal to content. Thus, Frege says in a well-known
remark that content "has now split for me into what I call thought
and truth-value, as a consequence of distinguishing between
the sense and reference of a sign" ([1893] 1964, 67), where the
role of the former is to specify the structure of the latter.
Thought now takes on the role previously played by language; sense
determines reference, that is, determines that it has a structure
of function and argument.
On Frege's considered view, then, thoughts, the loci of discrimination of structure in content, are themselves inherent
aspects of content. As such, thoughts are complexes, composed
of constituent senses, which present functions and objects standing as their arguments as the elements of content. If each component sense presents a reference, then senses grouped together
to form a thought can present a truth value as a reference: the
true just in case the object presented falls under the concept presented, otherwise the false. Moreover, senses present references
with respect to particular modes of presentation; but while each
sense encapsulates a distinct mode, distinct modes may or may
not present distinct references. There may be more than one
way of picking out the same point in content. Finally, thoughts
are expressed by sentences of a language, forms in a conceptual
notation. By expressing thoughts, sentences of a language represent the structure of content determined by thoughts; language
mirrors thought in this regard, the outer form, the clothing of
thought, which brings thought into the realm of the sensory.
In one of his last essays, entitled "Thoughts" ([1918] 1977),
Frege goes to considerable lengths to distinguish thoughts as
just described from what he labels "ideas." For Frege, ideas are
the contents of our minds, and he was highly skeptical that we
could know the contents of minds (perhaps other than our own);
given the fundamentally private and subjective nature of ideas,
the contents of minds are in a significant way unknowable and,
hence, unsuitable as the objects (propositions) of logical inquiry.
But thoughts are not ideas. Thoughts are not inherently mind
contents but are objective, mind-independent entities, whose
existence can be established by specifying their constitution,
how they are composed as complexes of senses as constituent
parts. But if the demands of logic, the science of thoughts, for
certainty about the existence of its subject matter can be
satisfied by characterizing the logical role of thoughts in terms
of their composition, this is not also a requirement on the
building blocks of thoughts, namely, senses. While senses, or the
modes of presentation they encapsulate, are often given as
identifying descriptions, an expediency that Frege himself appeals
to, he nevertheless does not hold that by doing so we thereby
establish the existence of senses. Rather, with regard to senses,
he proceeds primarily by elucidating the roles they play as
components of thoughts and in determining reference, although he
does give one argument directly for the existence of senses. Since,
for Frege, that which can be established as part of content exists,
if there is a systematic context in which senses are references,
then the existence of senses can be inferred. The context he
observes that has this property is the oblique context of
propositional attitude
ascriptions, wherein an expression's customary sense stands as
its (indirect) reference. And this is enough to establish the existence of senses, independently of how they may be constituted
so as to carry out the function of presenting a reference.
Although Frege is insistent that objective thoughts be clearly
distinguished from subjective ideas, this is not meant to imply
that we do not have cognitive relations to thoughts, or senses
more generally. Indeed, it is central to his view that we do, for
it is in virtue of our grasp of senses and how they combine into
thoughts that we can have cognitive access to the logical
structure of content. To grasp the sense of a word is to understand
the semantic role it plays. If a sense is expressed by a name, it
is that the sense presents a certain object; if by a predicate, it
is that a certain concept is presented. Grasping their combination
is to understand what it would be for the thought expressed by
the resulting sentence to be a true thought, that is, for the
object to fall under the concept. Thus, to grasp a thought is to be
in a cognitive relation to a thought, such that the thought is
causal of being in a certain cognitive state, the state of knowing
truth conditions. From this perspective, then, the sense/reference
distinction is a division between that to which we are in a direct
cognitive relation, that is, grasp, and that to which we have
cognitive access only through that which is grasped. In particular,
we have no cognitive grip on objects aside from their being given
to us via modes of presentation encapsulated within senses.
Senses themselves, however, are the exceptions; even though by
Frege's lights they too are objects (we can speak of the sense of
an expression, referring to a sense), they are not given to us in
this way. Rather, senses as existent, albeit abstract, objects are
given to us in a direct, unmediated fashion; we grasp them.
(Bertrand Russell found this aspect of senses problematic, as he
identified grasp with his notion of direct acquaintance. According
to Russell, we cannot be directly acquainted with objects, only
with sense-data of them; but insofar as senses are reified, it is
as abstract objects, of which there are no sense-data. His response
was to dispense with senses, his denoting-concepts, in favor of
the theory of descriptions [1903, 53–65; 1905].)
For Frege, the components of content are related in two ways.
The first is what we have been discussing thus far, the semantic
relation of presentation of reference by sense. The second is
judgment. Frege describes judgments as "advances from thoughts to
truth value" ([1893] 1952, 65) by which we inwardly recognize
that a thought is true ([1897] 1979, 139). A thought in this regard
is the objective basis for a subjective cognitive act, namely,
judgment, by which we come to know whether the thought grasped is
a true thought or not. The cognitive value of a thought relates to
this role played in judgment; it is a measure of what the thought
itself contributes to the cognitive act of judgment. Thus, a
thought is cognitively valuable to the extent that it can give rise
to a judgment that can extend our knowledge. On this view, "Cicero
is a Roman" and "Tully is a Roman," although they express different
thoughts, nevertheless have the same cognitive value; judging
either one involves the same sort of cognitive act, but based on
different information, the different senses expressed by "Cicero"
and "Tully" (Frege [1891] 1952). This is not true, however, of
"Cicero is Tully" and "Cicero is Cicero." Pairs like these express
different sorts of thoughts, in virtue of which they differ in
cognitive value. A thought for which we must take into account the
particulars of the determination of reference by the constituent
senses (i.e., their modes of presentation) to advance to truth will
have greater cognitive value than thoughts for which this is not
required, and any thoughts that so differ in cognitive value will
consequently enter into distinct sorts of judgments, differing in
whether they can extend our knowledge or not. Thus, since to
judge its truth we do not have to attend to the semantic connection
of sense and reference, the thought expressed by "Cicero is Cicero"
lacks the modicum of cognitive value possessed by "Cicero is
Tully."
The point is completely general; "2 + 3 = 5" and "5 = 5" also
differ in cognitive value in just this way. And so Frege now has
the result he needs: "2 + 3 = 5" and "5 = 5" have different
mathematical content; the difference in their cognitive values
reveals that one has significant mathematical content, while the
other is no more than an expression of the law of self-identity.
Consequently, assuming that arithmetic equalities are a species of
identities is no bar to the reduction of arithmetic to logic, given
the distinction, founded in Frege's understanding of content and
our epistemic access to it, of sense and reference.
Robert May
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Dummett, Michael. 1973. Frege: Philosophy of Language.
London: Duckworth.
———. [1975] 1978. Frege's distinction between sense and reference.
In Truth and Other Enigmas, 116–45. Cambridge: Harvard University
Press.
Frege, Gottlob. [1879] 1952. Begriffsschrift, a Formalized Language of
Pure Thought Modelled upon the Language of Arithmetic. Trans. Peter
Geach. In Translations from the Philosophical Writings of Gottlob Frege,
ed. Peter Geach and Max Black, 1–20. Oxford: Basil Blackwell.
———. [1884] 1950. The Foundations of Arithmetic. Trans. J. L. Austin.
Oxford: Basil Blackwell.
———. [1891] 1952. Function and concept. Trans. Peter Geach. In
Translations from the Philosophical Writings of Gottlob Frege, ed. Peter
Geach and Max Black, 21–41. Oxford: Basil Blackwell.
———. [1893] 1952. On sense and reference. Trans. Max Black. In
Translations from the Philosophical Writings of Gottlob Frege, ed. Peter
Geach and Max Black, 56–78. Oxford: Basil Blackwell.
———. [1893] 1964. The Basic Laws of Arithmetic. Trans. Montgomery
Furth. Berkeley and Los Angeles: University of California Press.
———. [1897] 1979. Logic. Trans. Peter Long and Roger White. In
Posthumous Writings, ed. Hans Hermes, Friedrich Kambartel, and
Friedrich Kaulbach, 126–51. Chicago: University of Chicago Press.
———. [1918] 1977. Thoughts. Trans. P. T. Geach and R. H. Stoothoff. In
Logical Investigations, ed. P. T. Geach, 1–30. Oxford: Basil Blackwell.
Heck, Richard, and Robert May. 2006. Frege's contribution to philosophy
of language. In The Oxford Handbook of Philosophy of Language, ed.
Ernest Lepore and Barry Smith, 3–39. Oxford: Oxford University
Press.
Lotze, Hermann. 1888. Logic. Oxford: Oxford University Press.
Russell, Bertrand. 1903. Principles of Mathematics. London: George Allen
and Unwin.
———. 1905. On Denoting. Mind 14: 479–93.

SENTENCE
This entry starts with analytical issues associated with the concept of sentence and then turns to more general issues.

Traditionally, the sentence is the domain within which purely
syntactic relations, constituent structure, and grammaticality are defined. It is the one independent syntactic entity.
It expresses a predication and is often regarded as the linguistic
vehicle for the expression of a propositional attitude and
thus the performance of a speech-act.
That much would generally be held to be true cross-linguistically, though how it is realized in syntax and morphology
differs from language to language. We focus here on English.
Traditionally, English sentences consist of one expression functioning as subject and another as predicate, the latter centered
on a verb which must be finite (i.e., inflected for tense/subject
agreement). What further expressions must figure in the predicate depends on the type (subcategory) of the verb: none with
intransitive verbs (Tom [laughed]), one with transitives (Tom
[ate the pies]), two with ditransitives (Tom [gave them the
pies]), for example. Sentences are declarative (all of those examples), interrogative (What did Tom eat?), exclamative (What a
long meeting that was!), or imperative (Eat those pies!), where
the subject is not overt because understood as referring to the
addressee.
Sentences may contain sentence-like constituents, traditional subordinate clauses. In these, the subject may be nonovert
because understood (compare Tom wants [Anna to go] with
Tom wants [to go], where Tom is the subject of to go), and the
verb need not be finite (to is the infinitive particle). Subordinate
clauses contribute to the speech-act performable at sentence level but don't in themselves allow for the performance of a speech-act. A subordinate interrogative clause, for example, is not a question (Tom asked [who ate the pies]). Simple sentences consist of one clause (sentence equals clause here). Complex sentences consist of a main clause and any number of subordinate clauses. (See Burton-Roberts 2010, a basic introduction to English sentence analysis, and Huddleston and Pullum 2002, a comprehensive grammar, in which "sentence" is abandoned in favor of "clause.")
In what follows, we trace how sentence has fared in Chomskian
generative grammar (CGG, henceforth). Noam Chomsky
(1957) was traditional in taking S as the symbol to be defined by any grammar: the initial symbol and thus the root node in any phrase structure tree. In defining S for a language L, a grammar was said to generate the sentences of L and thereby to generate L. The definition consisted of successive rewrite rules, the first being [S → NP-VP], where NP (noun phrase) functions as subject and VP (verb phrase) as predicate, though "predicate" is seldom used in this context (see predicate and argument). Chomsky (1965) adopted an alternative initial rule, which informed subsequent developments: [S → NP-AUX-VP].
Here, AUX is the locus of tense/agreement, as in S[NP[Tom]
AUX[has] VP[eaten the pies]]. Subordinate clauses were treated
as embedded sentences. To accommodate clause-introducing
complementiser expressions (e.g. that/whether/when in Anna
knows that/whether/when Tom laughed), an extension of S was introduced: S′ (S-bar). This was defined/expanded by the rule [S′ → Comp-S], designed to capture the fact that [that/whether/when [Tom laughed]] is in some sense sentential (clausal), while distinguishing it from the basic clause itself (Tom laughed). In terminology adopted later, S′ is a projection of S. This projection idea was developed in x-bar theory, with consequences for S as a formal category.
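The successive-rewrite idea can be sketched in a few lines of code. Only the initial rule [S → NP-AUX-VP] comes from the text above; the remaining rules and vocabulary are a toy fragment invented here for illustration:

```python
import random

# Toy rewrite rules; only the initial rule [S -> NP-AUX-VP] is from
# Chomsky (1965) -- the rest is an invented illustrative fragment.
rules = {
    "S":   [["NP", "AUX", "VP"]],
    "NP":  [["Tom"], ["the pies"]],
    "AUX": [["has"]],
    "VP":  [["V", "NP"]],
    "V":   [["eaten"]],
}

def generate(symbol="S"):
    """Rewrite `symbol` until only terminal words remain."""
    if symbol not in rules:                 # terminal: emit the word
        return symbol
    expansion = random.choice(rules[symbol])
    return " ".join(generate(s) for s in expansion)

print(generate())  # e.g. "Tom has eaten the pies"
```

Because NP rewrites freely, this toy grammar also derives strings like "the pies has eaten Tom," a reminder that subcategorization and agreement require machinery beyond bare rewrite rules.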
Pregenerative (Bloomfieldian) linguistics distinguished
between endocentric and exocentric structures. Endocentric
structures are headed, centered on a constituent of the same
category as the whole structure. Thus, Tom's summary of the argument is an NP because centered on the noun summary. In other words, the NP is a projection of its head, N. Now, thought
of as constituted as either [NP-VP] or [NP-AUX-VP], S is a category distinct from the categories of its constituents and, hence,
not a projection of any of them. S would therefore seem to be
headless (exocentric). By contrast, X-bar theory (initiated in
Chomsky 1970) makes the constraining assumption that all categories are endocentric and have the same three-level projective
structure (XP, X′, X). With S treated as the (exocentric) root node,
we miss the parallelism between the above NP and the sentence
Tom summarized the argument. If summary is head of the NP,
the inflected verb summarized should be the head of the corresponding sentence. This suggests that the root node is not S but
a projection either of V or, given NP-AUX-VP, of the tense/agreement inflection.
Sentence is therefore abandoned as a formal category, at least
in CGG. An inflection phrase system (IP, I′, I) is posited, with the traditional subject treated as the specifier of IP (Spec, IP). Furthermore, this Infl system is itself embedded within a complementizer phrase system (CP, C′, C). This goes for all clauses, including main clauses (e.g., CP[who C′[C[did] IP[he marry t]]]).
The root node, therefore, is CP.
Although S(entence) has been abandoned as a formal category in CGG, the term "sentence" is still widely used even in that context, but informally. This is how it is used in what follows,
where we turn to more general questions.
How does the structural notion of sentence (however analyzed) map onto speech-acts and speakers utterance behavior?
This can be seen as a question of competence versus performance, interfacing with pragmatics (see also semantics-pragmatics interaction).
It is uncontroversial that speakers utter words, one after
the other. But do speakers utter sentences? Strings of uttered
words can be structurally ambiguous (He-watched-the-man-with-the-telescope). However, as described, a sentence is indeed a unique structure (generated, we assume, by a mentally represented grammar). A structure, as such, cannot be structurally ambiguous. This suggests that word strings can be structurally ambiguous because they don't, in fact, have
syntactic structure. Arguably, this is why word strings require
parsing. Parsing, on these terms, is a matter of putting a
(structureless) word string into correspondence with a sentential structure. Structural ambiguity in a word string, rather than
being a matter of having more than one structure, occurs
when the string can be put into correspondence with more
than one uniquely structured sentence. This suggests that we
do not utter sentences as such.
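The point about ambiguity can be made concrete with a chart parser. Under the toy context-free grammar below (an invented fragment for illustration, not a claim about English), a CKY chart counts how many distinct sentential structures the telescope string can be put into correspondence with:

```python
from collections import defaultdict

# Toy CNF grammar for the telescope example -- an invented fragment
# for illustration, not a claim about the grammar of English.
LEXICON = {
    "he": {"NP"}, "watched": {"V"}, "the": {"Det"},
    "man": {"N"}, "telescope": {"N"}, "with": {"P"},
}
RULES = [  # (parent, left child, right child)
    ("S", "NP", "VP"), ("VP", "V", "NP"), ("VP", "VP", "PP"),
    ("NP", "Det", "N"), ("NP", "NP", "PP"), ("PP", "P", "NP"),
]

def count_parses(words, root="S"):
    """CKY chart: count the distinct sentential structures that a
    (structureless) word string can be put into correspondence with."""
    n = len(words)
    chart = defaultdict(int)  # (i, j, cat) -> parses of words[i:j] as cat
    for i, w in enumerate(words):
        for cat in LEXICON[w]:
            chart[i, i + 1, cat] = 1
    for span in range(2, n + 1):
        for i in range(n - span + 1):
            j = i + span
            for k in range(i + 1, j):  # split point
                for parent, left, right in RULES:
                    chart[i, j, parent] += chart[i, k, left] * chart[k, j, right]
    return chart[0, n, root]

# One string, two structures: the PP attaches to the VP or to the object NP.
print(count_parses("he watched the man with the telescope".split()))  # 2
```

The string itself is a flat list of words; the two structures live only in the chart, which is one way of picturing parsing as correspondence rather than discovery of structure "in" the string.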
Nevertheless, it is generally assumed, if only informally,
that uttering certain word strings counts as uttering a sentence.
On that assumption, "sentence" is a somewhat ambiguous term,
applicable both to mind-internal structures and to what can be
uttered/heard in speech.


Even allowing that sentences can be uttered, sentence and utterance are not isomorphic. For example, utterances may
include parenthetical elements, some of which are not clearly
constituents of sentence structure (It's – I don't know – let's see now – about twenty miles) and some whose status as sentence constituents is controversial, for example, appositive relative clauses. Notice that in uttering Tom, who eats all the pies, is getting fat, we perform two (assertive) speech-acts. (See Burton-Roberts 2005 on parentheticals.)
Furthermore, speech-acts performable in uttering sentences
are performable by nonsentential utterances. Yes in answer to Did Tom laugh? communicates what's communicated by Tom laughed. Is the utterance of yes the utterance of a sentence? Possibly, if yes is a sentential pro-form (a pro-sentence,
parallel with pronoun and the pro-VP do so). The question arises
more urgently with nonsentential utterances that are clearly not
pro-sentences: Possibly (just used), What a day! Ready?
and Sleep in reply to What did you do today? A syntactic
approach, appealing to ellipsis, would analyze them as utterances of sentential structures (e.g., Are you ready? or What I did
today was sleep) in which words, or at least their phonological
features, are deleted. A nonlinguistic, pragmatic approach would
treat them as utterances of just the words heard, explaining what
they communicate by appeal to distinct conceptual structures
in mentalese (see language of thought). (See Stainton
[2006] for discussion and references.) The issue arises even with
utterances not generally regarded as elliptical. Is Eat! the utterance of an imperative sentence or just a verb? It's raining generally communicates that it's raining here (where the speaker is).
But does the sentence uttered include a covert location variable
whose value is given in the context of utterance, as a matter of
sentence semantics, or is the location necessitated (and supplied) independently by conceptual structure? (See Stanley 2000;
Recanati 2002.)
At issue is the relation of sentences to utterances, on the one
hand, and thoughts, on the other. How "sentence" is understood
depends on how sentences are felt to be related to and distinguished from thoughts and utterances.
Noel Burton-Roberts
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Burton-Roberts, Noel. 2010. Analysing Sentences. Harlow, UK: Pearson.
Burton-Roberts, Noel. 2005. Parentheticals. In Encyclopedia of Language and Linguistics. 2d ed. Oxford: Elsevier.
Chomsky, Noam. 1957. Syntactic Structures. The Hague: Mouton.
Chomsky, Noam. 1965. Aspects of the Theory of Syntax. Cambridge, MA: MIT Press.
Chomsky, Noam. 1970. Remarks on nominalization. In Readings in English Transformational Grammar, ed. R. Jacobs and P. Rosenbaum, 184–221. Waltham, MA: Blaisdell.
Huddleston, Rodney, and G. K. Pullum. 2002. The Cambridge Grammar
of the English Language. Cambridge: Cambridge University Press.
Recanati, François. 2002. Unarticulated constituents. Linguistics and Philosophy 25: 299–345.
Stainton, R. 2006. Words and Thoughts: Subsentences, Ellipsis, and the
Philosophy of Language. Oxford: Clarendon.
Stanley, Jason. 2000. Context and logical form. Linguistics and Philosophy 23: 391–434.


SENTENCE MEANING
This expression occurs in two different contexts: in contrast with "word meaning" or in contrast with "speaker's meaning."

Sentence Meaning Versus Word Meaning


The opposition word meaning–sentence meaning points to the
contrast between lexical semantics and compositional
semantics, or the semantics of syntax. Different approaches to
semantics have different conceptions of what meanings are,
but all must address both the meanings of the smallest parts
and how the meaning of a sentence is built up from the meanings of its parts. Sentence meaning and compositionality
are central concerns of formal semantics. In classical montague grammar, sentence meaning rests on the distinction
between intension and extension (Frege 1892; Tarski 1944).
The intension of a sentence is a function from possible states of
affairs (see possible worlds semantics) to its extension in
that state of affairs, its truth value. Two sentences have the same
intension if they have the same truth value in every possible state
of affairs; equating intensions with meanings gives truth conditional semantics. Many formal semanticists advocate a
more dynamic notion of sentence meaning: Irene Heim ([1982]
1989) and Hans Kamp (1981) analyze sentence meaning as a
function from contexts to contexts, with truth conditions relativized to contexts.
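The classical intension-as-function idea can be sketched directly. In the toy model below, the worlds and the truth-value assignments are invented for illustration: a sentence's intension maps each possible state of affairs to its extension (a truth value), and two sentences are truth-conditionally synonymous iff their intensions coincide:

```python
# Toy possible-worlds model; the worlds and the facts table are
# invented assumptions, not data from this entry.
WORLDS = ["w1", "w2", "w3"]

FACTS = {  # extension of each sentence at each world (its truth value there)
    "it is raining": {"w1": True,  "w2": False, "w3": True},
    "it is pouring": {"w1": True,  "w2": False, "w3": True},
    "it is snowing": {"w1": False, "w2": True,  "w3": False},
}

def intension(sentence):
    """A function from possible states of affairs to truth values,
    represented as a dict over the (finite) set of worlds."""
    return {w: FACTS[sentence][w] for w in WORLDS}

def same_meaning(s1, s2):
    """Truth-conditional synonymy: same truth value in every world."""
    return intension(s1) == intension(s2)

print(same_meaning("it is raining", "it is pouring"))  # True
print(same_meaning("it is raining", "it is snowing"))  # False
```

Note that in this picture, intensions are only as fine-grained as truth conditions: sentences true in exactly the same worlds come out synonymous, which is one motivation for the more dynamic, context-relative notions mentioned above.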
Despite differences, all theories of sentence meaning address
how the same words, put together with different syntactic structures, yield different meanings, as with No one that John dates
likes his mother versus His mother likes that John dates no one.

Sentence Meaning Versus Speakers Meaning


The opposition sentence meaning–speaker's meaning is part of the distinction between meaning and use and is sometimes expressed as semantic meaning versus speaker's meaning. In this context, sentence meaning is understood as the literal meaning of a sentence abstracted away from any particular context, and speaker's meaning is the intended interpretation of a particular utterance of a given sentence. This distinction presupposes
a particular, sometimes disputed, boundary between semantics
and pragmatics (see semantics-pragmatics interaction).
A vivid borderline problem concerns the distinction between
attributive and referential uses of definite descriptions
(Donnellan 1966; Kripke 1977) in Smith's murderer is insane: Are there two different sentence meanings here, one in which Smith's murderer picks out whoever committed the murder (attributive)
and another in which it picks out an intended referent (who may
be innocent)? Or is there just one sentence meaning but two possible speakers meanings? The debate continues.
Barbara H. Partee
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Donnellan, Keith. 1966. Reference and definite descriptions. Philosophical Review 75: 281–304.
Frege, Gottlob. 1892. Über Sinn und Bedeutung. Zeitschrift für Philosophie und philosophische Kritik 100: 25–50.
Heim, Irene. [1982] 1989. The Semantics of Definite and Indefinite Noun
Phrases. New York: Garland.



Kamp, Hans. 1981. A theory of truth and semantic representation. In
Formal Methods in the Study of Language, Mathematical Centre Tracts
135, ed. J. A. G. Groenendijk et al., 277–322. Amsterdam: Mathematical Centre.
Kripke, Saul. 1977. Speaker's reference and semantic reference. In Contemporary Perspectives in the Philosophy of Language, ed. P. A. French et al., 255–76. Minneapolis: University of Minnesota Press.
Tarski, Alfred. 1944. The semantic conception of truth and the foundations of semantics. Philosophy and Phenomenological Research 4: 341–75.

SEXUALITY AND LANGUAGE


The English term sexuality has both a generic and a more restricted meaning. Generically, it refers to the social organization of sexual desires, relationships, and behaviors. In present-day usage, however, it frequently has the narrower sense that the online Oxford English Dictionary glosses as "a person's sexual identity in relation to the gender to which he or she is typically attracted; the fact of being heterosexual, homosexual or bisexual." In Western societies since the nineteenth century, this aspect of sexuality, also known as sexual orientation or sexual preference, has come to be conceptualized as a core aspect of personal/social identity. A similar conceptualization is increasingly diffusing into non-Western societies, too.
The study of sexuality and language can also be thought of
either broadly, as a multifaceted inquiry into the relationship
between language and sex as a domain of human experience, or
more narrowly, as an inquiry into the relationship between language and sexual identity. In practice, the dominant current of
research has adopted the narrower definition, concerning itself
primarily with language use among Western gay men (lesbians,
though in theory of equal interest, have been studied less often).
Recently, however, there has been more work dealing with sexual diversity across cultures (e.g., Leap and Boellstorff 2004);
researchers have also begun to highlight the linguistic construction of heterosexual identity and the part played by language in
reproducing heteronormativity, the privileging of mainstream forms of heterosexuality as natural and normal.
In addition, there have been calls for researchers in this
field to embrace the broader definition of sexuality and move
beyond an exclusive focus on language and identity to
address a greater range of issues. Don Kulick (2000) has called
in particular for more attention to the linguistic construction and
representation of erotic desire and suggested that work on psychoanalysis and language might be one potential source of
insight. This proposal remains controversial, however; for many
researchers, the linguistic expression of minority sexual identity
remains the core issue.
The earliest research dealing with what was then dubbed "the language of homosexuality" was undertaken in the first half of
the twentieth century, and its results most often appeared in the
clinical literature of psychiatry and sexology (e.g., Rosanoff 1927;
Legman 1941). Typically, this research produced glossaries of the
slang terms used within homosexual subcultures. Another linguistic marker of interest to early researchers was the use of feminine gender-marked forms (personal names, third person she and
its variants, descriptors such as girl, queen, auntie) among some
male homosexuals (see gender marking). These linguistic phenomena were presented in clinical sources as shedding light on the sociosexual pathology of the homosexual as a type.
Slightly later researchers (e.g., Cory 1951 [a pseudonym of
the sociologist Edward Sagarin]; Sonenschein 1969) adopted a
more sociological perspective, according to which the distinctive argot of homosexual subcultures was less a reflex of their
sexual pathology than a product of the need of all outcast groups
to organize social dealings among themselves while excluding
outsiders. These socially oriented researchers were more sensitive than their clinically oriented predecessors to the existence of
intragroup variation, pointing out (to put their insight in modern
sociolinguistic terminology) that language use was affected by the strength of a speaker's ties to the group. There were homosexuals who did not know the argot because their circumstances
restricted their contact with the in-group settings in which it was
mainly used.
This may also explain why "the language of homosexuality" was, in effect, the argot of male homosexuals. Gershon Legman (1941) explicitly denied that lesbians had a parallel set of sexual slang terms, attributing this to their tradition of "gentlemanly restraint," but David Sonenschein (1969) noted that in the area
where he carried out research, there were no bars or other social
spaces where lesbians could interact regularly in significant
numbers. Legman (who is not notable for the internal consistency of his arguments) had speculated that lesbian argot
might, after all, be found in women's prisons, a suggestion to which some support would later be given by a (criminological, rather than linguistic) study of a women's prison (Giallombardo
1966).
The study of language and sexual identity had begun as an
inquiry into one aspect of one kind of social deviance carried on
by researchers who either were or presented themselves as out-group members. But this changed with the advent of a militant
gay liberation movement at the end of the 1960s. Some commentators of this period contested the view that gay men constituted a linguistically defined subculture: This was in part a
reaction against the previous treatment of homosexuals as exotic
or pathological specimens, but it was also because liberationists had a political interest in discouraging the continued use of
secret in-group codes (the new norm was "out and proud") or
feminine gender marking among men (gay was not supposed to
mean effeminate).
Before long, though, interest in the language-use characteristic of gay (and, in theory, lesbian) subcultures reemerged within
the community itself. Proposals were made about the nature and
typical uses of what Joseph Hayes (1981) dubbed "Gayspeak," meaning a distinctively gay male way of using language to interact, somewhat analogous to constructs like "women's language" and genderlect that had been debated by feminist linguists
during the previous decade. The search for a discourse style
marking gay identity has continued (e.g., Leap 1996), though it
has never been without its critics (e.g., Darsey 1981). There were
also attempts (e.g., Gaudio 1994) to describe the so-called gay-sounding voice, in other words, the phonetic cues that tend to cause a speaker to be identified by others as a gay man, though
one upshot of this research was to make clear that only some
self-identified gay men sound gay, while conversely some men
who sound gay are not.



The advent of queer theory in the 1990s prompted a shift in
the way many scholars across the humanities and social sciences
thought about sexuality and identity. The philosopher Judith
Butler (1990) argued that identity was not a fixed essence located
in individuals but, in the Austinian sense, performative, an
effect produced by repeated acts that were meaningful within
particular systems of signification. The radical destabilization
of identity categories implied by Butler's argument has been
influential even among scholars who would not otherwise identify their work directly with queer theory, and it has opened up
a number of lines of inquiry in empirical research on language
and sexuality.
Whereas the Gayspeak current tended to focus on subjects
representing what was implicitly imagined as the mainstream
of (usually male) gay culture, researchers influenced by queer
theory became more interested in un- or anticonventional linguistic performances of gendered and sexual identity, like those
of drag performers and telephone sex workers (Barrett 1995; Hall
1995). These cases obviously do not fit with traditional explanations that locate the origins of linguistic identity marking in
childhood socialization; and while flamboyantly queer performances may be extreme, it is assumed within this approach that
they differ only in degree, rather than in kind, from less obviously
stylized performances of identity.
Another important development in recent language and sexuality research has seen more researchers turning their attention
to the work involved in producing normative sexual identities,
such as masculine heterosexual ones (Cameron 1998; Kiesling
2002). Some interesting work along these lines has been produced by researchers whose main intellectual debt is not to
queer theory but to the tradition of conversation analysis
(CA). Celia Kitzinger (2005) has reanalyzed some of the major
data corpora used in CA since the 1960s to show how references to heterosexual status are used routinely in conversation.
Conversationalists frequently invoke their husbands, wives, and
in-laws, their activities as a couple and with other couples, and so
on, not because heterosexuality or marital status is in any way at
issue, but rather in the course of other transactions (e.g., asking
to rent an apartment, declining an invitation, making small talk).
Kitzinger shows that what this accomplishes is the production
of the speaker as ordinary, normal, a regular and respectable
person who is worthy of whatever status he or she is currently
staking a claim to (e.g., a good potential tenant or a considerate
guest). Kitzinger also has data involving lesbian speakers suggesting that this strategy is not available to people who are not, and
do not try to pass as, heterosexual. Outside of lesbian community
settings, a lesbian who mentions her female partner casually will
often come off not as ordinary but as drawing attention to her
deviant status.
Another strand in recent language and sexuality research has
investigated the role played by language and discourse in shaping and reproducing sexual attitudes and behavior. The issue of
sexual consent, which is often central in criminal proceedings for rape and sexual assault, has come under particular scrutiny, with discourse and conversation analysts examining the linguistic strategies used by defendants and their counsel to discredit complainants' accounts of nonconsent to sex (Ehrlich 2001) and
the ways in which popular rape-avoidance advice (e.g., Just say no) is at odds with everyday procedures for performing the speech-act of refusal (Kitzinger and Frith 1999). Some investigators have taken an interest in what language users understand
by various key terms in the domain of sex and sexuality. In the
wake of the Monica Lewinsky/Bill Clinton affair, a study was
published suggesting that a majority of American English speakers would not describe fellatio as "sex" (Sanders and Reinisch
1999).
Research on language and sexuality has become considerably
more visible since the mid-1990s and also more internationalized, both in the sense that a broader range of languages and
cultures are being studied and in the sense that more research
is emanating from outside of Western Anglophone academe.
Recent research has had a significant impact on related areas of
inquiry, notably gender and language and those forms of
sociolinguistics and discourse analysis (see discourse analysis [linguistic] and discourse analysis [foucaultian])
or discursive psychology that are centrally concerned with questions of language and identity.
Yet one might ask how far this research can be thought of
as a cohesive body of work. There remains a great diversity
of purposes and approaches among researchers: As Kulick
(2000) comments, there has been relatively little debate on
how to theorize sexuality as an object of linguistic inquiry,
and there are also many different understandings of language
in evidence among researchers who approach the topic from
different disciplinary perspectives. The multifaceted nature of
its subject matter, which has been seen as falling within the
remit of scholarly enterprises ranging from literary criticism
to medical science, may mean that the study of language and
sexuality, broadly defined, will remain fragmented. But even if
different research communities do not converge on a common
agenda, it is likely that work in many areas will continue to be
enriched by the kind of cross-fertilization we have seen in the
last decade.
Deborah Cameron
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Barrett, Rusty. 1995. Supermodels of the world, unite! Political economy and the language of performance among African-American drag
queens. In Beyond the Lavender Lexicon: Authenticity, Imagination
and Appropriation in Lesbian and Gay Languages, ed. William L. Leap,
207–26. New York: Gordon and Breach.
Butler, Judith. 1990. Gender Trouble: Feminism and the Subversion of
Identity. New York: Routledge.
Cameron, Deborah. 1998. Performing gender identity: Young mens talk
and the construction of heterosexual masculinity. In Language and
Gender: A Reader, ed. Jennifer Coates, 270–84. Oxford: Blackwell.
Cameron, Deborah, and Don Kulick. 2003. Language and Sexuality.
Cambridge: Cambridge University Press.
Cameron, Deborah, and Don Kulick, eds. 2006. The Language and
Sexuality Reader. London: Routledge.
Cory, Donald W. [Edward Sagarin]. 1951. The Homosexual in America: A
Subjective View. New York: Greenberg.
Darsey, James. 1981. Gayspeak: A response. In Gayspeak: Gay
Male and Lesbian Communication, ed. James W. Chesebro, 59–65.
New York: Pilgrim.
Ehrlich, Susan. 2001. Representing Rape: Language and Sexual Consent.
London: Routledge.



Gaudio, Rudolph P. 1994. Sounding gay: Pitch properties in the speech of gay and straight men. American Speech 69: 30–57.
Giallombardo, Rose. 1966. Society of Women: A Study of a Women's Prison. New York: John Wiley and Sons.
Hall, Kira. 1995. Lip service on the fantasy lines. In Gender Articulated: Language and the Socially Constructed Self, ed. Kira Hall and Mary Bucholtz, 183–216. London: Routledge.
Harvey, Keith, and Celia Shalom, eds. 1998. Language and Desire: Encoding
Sex, Romance and Intimacy. London: Routledge.
Hayes, Joseph. 1981. Gayspeak. In Gayspeak: Gay Male and Lesbian
Communication, ed. James W. Chesebro, 45–61. New York: Pilgrim.
Kiesling, Scott. 2002. Playing the straight man: Maintaining and
displaying male heterosexuality in discourse. In Language and
Sexuality: Contesting Meaning in Practice and Theory, ed. Kathryn
Campbell-Kibler, Scott Podesva, Sarah J. Roberts, and Andrew Wong,
249–66. Stanford, CA: CSLI.
Kitzinger, Celia. 2005. Speaking as a heterosexual: (How) does sexuality matter for talk in interaction? Research on Language and
Social Interaction 38.3. Available online at: http://rolsi.uiowa.edu/
archive/2005/vol38no3kitzinger.html.
Kitzinger, Celia, and Hannah Frith. 1999. Just say no? The use of conversation analysis in developing a feminist perspective on sexual refusal. Discourse and Society 10: 293–316.
Kulick, Don. 2000. Gay and lesbian language. Annual Review of Anthropology 29: 243–85.
Leap, William L. 1996. Word's Out: Gay Men's English.
Minneapolis: University of Minnesota Press.
Leap, William L., and Tom Boellstorff, eds. 2004. Speaking in Queer
Tongues: Globalization and Gay Languages. Urbana: University of
Illinois Press.
Legman, Gershon. 1941. The language of homosexuality: An American
glossary. In Sex Variants: A Study of Homosexual Patterns. Vol. 2. Ed.
George W. Henry, 1155–79. New York: Paul B. Hoeber.
Livia, Anna, and Kira Hall, eds. 1997. Queerly Phrased: Language, Gender
and Sexuality. New York: Oxford University Press.
McIlvenny, Paul, ed. 2002. Talking Gender and Sexuality. Amsterdam:
John Benjamins.
Oxford English Dictionary. Available online at: http://www.oed.com.
Rosanoff, Aaron. 1927. Manual of Psychiatry. 6th ed. New York: Wiley.
Sanders, Stephanie A., and June M. Reinisch. 1999. Would you say you had sex if …? JAMA 281: 275–7.
Sonenschein, David. 1969. The homosexual's language. Journal of Sex Research 5: 281–91.

SIGNED LANGUAGES, NEUROBIOLOGY OF


signed languages of the deaf are naturally evolving linguistic
systems exhibiting the full range of linguistic complexity found
in speech. The comparison of signed and spoken languages provides a powerful means of differentiating those properties associated with the modality of linguistic expression (auditory and oral
vs. visual and manual) from the modality-independent properties that uniquely underlie all human languages. Consideration
of neurolinguistic data from deaf signers provides a unique avenue for understanding the biological basis for human language,
movement, and action perception.

Sign Language Aphasia


Deaf signers, like hearing speakers, exhibit language disturbances
when left hemisphere cortical regions are damaged (e.g., Hickok,
Love-Geffen, and Klima 2002; Marshall et al. 2004; Poizner,
Klima, and Bellugi 1987; Corina 1998a, 1998b). neuroimaging studies of healthy deaf subjects provide confirming evidence for the importance of left hemisphere systems in the mediation
of signed language. There is good evidence that within the left
hemisphere, cerebral organization in deaf signers follows the
anterior/posterior dichotomy for language production and comprehension, respectively, which is the same pattern found for
speech.
SIGN LANGUAGE PRODUCTION. In spoken language aphasia,
chronic language production impairments are associated with
left hemisphere lesions that involve the cortical zone encompassing the lower posterior portion of the frontal lobe
(e.g., broca's area and associated white matter tracts [Mohr
et al. 1978; Goodglass 1993] and the anterior insula [Dronkers,
Redfern, and Knight 2000]).
A case study reported in Poizner, Klima, and Bellugi (1987)
provides clues to the role of left hemisphere prefrontal regions
in the mediation of signing. Patient G.D., a deaf signer with a left
anterior frontal lesion encompassing Broca's area, presented with nonfluent signing with intact sign comprehension. G.D.'s
signing was effortful and dysfluent, with output often reduced to
single-sign utterances. The signs she was able to produce were
agrammatic, devoid of the movement modulations that signal
morpho-syntactic contrasts in fluent signing. As with hearing
Broca's aphasics, this signer's comprehension of others' language was undisturbed by her lesion.
SIGN LANGUAGE COMPREHENSION. Fluent aphasias typically
manifest as severely impaired comprehension with fluent, but
often paraphasic (semantic and phonemic) output. These aphasias are associated with lesions to left hemisphere posterior
temporal regions. Chronic Wernicke's aphasia, for example,
often follows from damage to the posterior regions of the left
superior temporal and middle temporal gyri (Dronkers, Redfern,
and Ludy 1995; Dronkers, Redfern, and Knight 2000).
Signers with left hemisphere posterior lesions also evidence fluent sign aphasia. Two cases are reported in Chiarello,
Knight, and Mandel (1982), Poizner, Klima, and Bellugi (1987),
and Corina et al. (1992). These patients presented with severe
sign comprehension difficulties in the face of relatively fluent
but paraphasic sign output. While damage to the left temporal
cortex has been demonstrated to impair sign comprehension
in some patients (Hickok, Love-Geffen, and Klima 2002), the
lesions in these two case studies did not occur in cortical wernicke's area proper, but involved more frontal and inferior
parietal areas. In both cases, lesions extended posteriorly to the
supramarginal gyrus. These two cases suggest that sign language
comprehension may be more dependent than speech on left
hemisphere inferior parietal areas, a difference that may reflect
within-hemisphere reorganization for cortical areas involved
in sign comprehension (Leischner 1943; Chiarello, Knight, and
Mandel 1982; Poizner, Klima, and Bellugi 1987).
SIGN PARAPHASIAS. Language breakdown following left hemisphere damage is not haphazard, but systematically affects
independently motivated linguistic categories. An example of
this systematicity is illustrated by paraphasic errors (Corina
2000). Paraphasia describes the erroneous substitution of an
unexpected word for an intended target. Semantic paraphasias
represent the same part of speech as the target, and have a clear
semantic relationship to it. Phonemic or literal paraphasias
refer to the production of unintended sounds or syllables in
the utterance of a partially recognizable word (Blumstein 1973;
Goodglass 1993). Phonemic sound substitution (in speech) may
result in another real word, related in sound but not in meaning (e.g., telephone becomes television). Also attested are cases in
which the erroneous word shares both sound characteristics and
meaning with the target (e.g., broom becomes brush) (Goodglass
1993).
Examples of American Sign Language (ASL) semantic
paraphasias are well documented. For example, subject P.D.
(Poizner, Klima, and Bellugi 1987) produced clear lexical substitutions: bed for chair, daughter for son, quit for depart, and so on.
The semantic errors of P.D. overlap in meaning and lexical class
with the intended targets.
Concerning phonemic errors, we may a priori expect to
find an equal distribution of phonological errors affecting the
four major formational parameters of ASL phonology: handshape, movement, location, and orientation. However, handshape configuration errors are the most widely reported, while
paraphasias affecting movement, location, and orientation are
infrequent.
INDUCED PARAPHASIAS. Additional insights into the neural control of paraphasic errors have been reported by David Corina and
his colleagues (1999), who investigated sign language production
in a deaf individual undergoing an awake cortical stimulation
mapping (CSM) procedure for the surgical treatment of epilepsy.
During the language mapping portion of the procedure, the participant was required to sign the names of line-drawn pictures.
Disruption of the ability to perform the task during stimulation
was taken as evidence that the cortical region in question was
integral to the language task (see Corina 1998a).
Stimulation to two anatomical sites led to consistent naming
disruption. One of these corresponded to the posterior aspect
of Broca's area. A second site, located in the parietal opercular
region, corresponded to the supramarginal gyrus (SMG). The
nature of these errors was qualitatively different.
Stimulation of Broca's area resulted in errors involving the
motor execution of signs. These errors were characterized by a
laxed articulation of the intended sign, with nonspecific movements (repeated tapping or rubbing) and a reduction in handshape configurations to a laxed-closed fist handshape.
With stimulation of the SMG, patient S.T. produced both formational and semantic errors. Formational errors were characterized by repeated attempts to articulate the intended targets.
Successive phonological approximations of the correct sign were
common. The semantic errors were characterized as semantic substitutions that are formationally similar to the intended
targets.
One conceptualization of these data is that stimulation to
Broca's area has a global effect on the motor output of signing,
whereas stimulation to a parietal opercular site, the SMG, disrupts the correct selection of the linguistic components (including both phonological and semantic elements) required for
naming.


Apraxia and Signed Language


Signed languages provide an important contrast to naturally
occurring gestures and limb actions. Unlike naturalistic movements, linguistic motor behaviors are symbolic, hierarchically
compositional, and conventionalized. The production of signs
nevertheless requires that linguistic forms stored in memory be
planned and executed, respectively, by the same somatotopic
brain regions and manual articulators that mediate physically
comparable nonlinguistic gestures. Thus, deficits in sign language usage following brain damage must be distinguished clinically from apraxic disturbances of learned movement execution.
Dissociations between sign impairments and ideational
apraxias (disturbances in conventionalized gestures and object-centric pantomimes) are evident. Patient W.L., for example,
produced and comprehended pantomime normally, but demonstrated marked sign language production aphasia (Corina
et al. 1992). His gestures were clearly intended to convey symbolic information that he ordinarily would have imparted with
sign language. The case of W.L. emphasizes that sign language
impairments following left hemisphere damage are not simply
attributable to undifferentiated impairments in the motoric
instantiation of symbolic representations, but in fact reflect
disruptions to a manually expressed linguistic system. This
conclusion has recently been supported by Jane Marshall and
her colleagues (2004), who report on the strong dissociation
of sign and gesture in Charles, an aphasic user of British Sign
Language (BSL).

Parkinson's Disease
Studies of deaf signers with Parkinson's disease provide evidence
for the importance of extrapyramidal motor systems and basal
ganglia in the mediation of signed languages (Poizner and Kegl
1992; Brentari, Poizner, and Kegl 1995). Motor deficits observed
in subjects with Parkinson's disease are not specific to language
but are evidenced across the general domain of motor behaviors.
Signers with Parkinson's disease have been described as signing
in a monotonous fashion with a severely restricted range of temporal rates and tension in signing. Accompanying the restrictions
in limb movements are deficits in the motility of facial musculature, which further reduces expressivity in signing.

Neuroimaging Studies
Neuroimaging techniques like positron emission tomography
(PET) and functional magnetic resonance imaging (fMRI) provide unique contributions to our current understanding of the
neurological processing of signs. These studies reaffirm the
importance of left hemisphere anterior and posterior brain
regions for sign language use, and emphasize that some neural
areas appear to participate in language perception and production, regardless of the modality of the language.
LEFT HEMISPHERE. Sign production tasks are especially likely
to recruit the left hemisphere. For example, when signers name
objects (Emmorey et al. 2003), generate verbs to go with nouns
(e.g., chair sit) (McGuire et al. 1997; Petitto et al. 2000; Corina
et al. 2003), or sign whole sentences (Braun et al. 2001), their left
hemispheres show significant increases in blood flow, relative
to control tasks. This heightened blood flow reflects, in part, the activation of motor systems needed for the expression of complex linguistic actions.
Sign language comprehension also recruits the left hemisphere in some studies, for both word-level and sentence-level
tasks. For example, Brocas area has been found to be involved
in sign comprehension when subjects observe single signs
(Levänen et al. 2001; Petitto et al. 2000) and sentences (Neville
et al. 1998; MacSweeney et al. 2002).
This activation is not limited to anterior regions: When signers
of BSL view their language, posterior left hemisphere regions are
activated, including the posterior superior temporal gyrus and
sulcus, and the supramarginal gyrus (MacSweeney et al. 2004).
This heightened activation is relative to complex nonlinguistic
gestures, and does not occur for non-signers.
RIGHT HEMISPHERE. There is growing evidence that right
hemisphere regions may be recruited for aspects of sign language processing in ways that are not required in the processing of spoken languages. When deaf signers were asked to use
complex predicate structures (i.e., classifier constructions) to
describe the relative positions of objects, both left and right hemisphere regions became active (Emmorey et al. 2003). Other evidence suggests that right hemisphere posterior parietal regions
may contribute to the processing of some aspects of sign comprehension (Bavelier et al. 1998; Capek et al. 2004; Corina 1998b;
Newman et al. 2002). For instance, both left and right hemisphere cortical regions are recruited when hearing native signers of ASL passively watch ASL sentences (Newman et al. 2002).
Moreover, right hemisphere involvement may be related to the
age at which the signer first acquired a sign language: One particular structure, the right angular gyrus, was found to be active
only when hearing native users of ASL performed the task. When
hearing signers who learned to sign after puberty performed the
same task, the right angular gyrus failed to be recruited. Thus, the
activation of this neural structure during sign language perception may be a neural signature of signs being acquired during
the critical period for language (Newman et al. 2002).

Human Actions and Sign Language


Recently, attempts have been made to interpret language comprehension and production within a human analog of the mirror neuron system proposed to exist in the macaque monkey
(Rizzolatti and Arbib 1998). Human languages can be conceptualized as repertoires of goal-directed actions shared by all members of a given language community. Thus, linguistic actions are
viable candidates for the types of actions mediated by a neural
circuit for human action understanding. A recent review of the
relevant sign language data (Corina and Knapp 2007) argues for
the importance of a frontal-parietal neural network to sign language production and perception, but finds challenges to conceptualizing sign language representations within a narrowly
construed mirror neuron system.
Specifically, a mirror system whereby perceived actions are
understood predominantly through reference to production routines stored in the frontal cortex does not account for the sparing
of comprehension in aphasic signers with lesions to key frontal regions of the proposed mirror neuron system (e.g., Broca's
area). Second, the dissociations of nonlinguistic gestures and sign abilities in some aphasic signers cannot be easily accounted for by the mirror neuron system literature, which has been relatively agnostic as to whether and how different classes of human
actions are instantiated. Third, it has been proposed that the
SMG might play a critical role in matching action observations
with the motor routines needed to execute them. Mirror neuron
research has argued for the importance of the SMG in nonlinguistic action imitation. However, we have presented evidence
that impairing the SMG spares imitative behaviors in spite of
severely disrupting the formational components of sign production (Corina et al. 1999).
This is not to discount entirely the possibility that sign language is supported by a more general action observation/
production system. Manual languages, in contrast to spoken
languages, may provide a more direct route for engaging a
human-action observation/execution system that has mirror
properties. In spoken language, the object of perception is
an acoustic event that arises through the coordination of oral-muscular sequences, whereas in sign languages, the object of
perception is an articulatory sequence of largely manual movements. Although both of these linguistic objects of perception
can be resolved by the system, the sign system requires less
translation because it falls more centrally within the purview
of a visual-action observation/execution matching system. The
robust activation of the SMG, which is observed for both sign
and human actions (but less for speech), may be one indication
of this more direct engagement.

Conclusion
Aphasic and neuroimaging data converge on a relatively consistent picture of sign language processing: Left hemisphere
regions, such as Broca's and Wernicke's areas, are undisputedly
recruited in a similar fashion for both sign and speech production and comprehension. However, evidence further suggests
that sign comprehension may rely upon neural regions that
extend into the left parietal cortex and may engage the posterior
right hemisphere to an extent not observed for spoken language.
These differences may be related to demands of visual-spatial
processing in signing.
David P. Corina and Heather Patterson
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Bavelier, Daphne, David Corina, Peter Jezzard, Vince Clark, Avi Karni,
Anil Lalwani, Josef P. Rauschecker, Allen Braun, Robert Turner, and
Helen J. Neville. 1998. Hemispheric specialization for English and
ASL: Left invariance-right variability. NeuroReport 9: 1537–42.
Blumstein, Sheila E. 1973. A Phonological Investigation of Aphasic Speech.
The Hague: Mouton.
Braun, A. R., A. Guillemin, L. Hosey, and M. Varga. 2001. The neural organization of discourse: An H₂¹⁵O-PET study of narrative production in English and American sign language. Brain 124: 2028–44.
Brentari, Diane, Howard Poizner, and Judy Kegl. 1995. Aphasic and
Parkinsonian signing: Differences in phonological disruption. Brain
and Language 48: 69–105.
Capek, Cheryl M., Daphne Bavelier, David Corina, Aaron J. Newman,
Peter Jezzard, and Helen J. Neville. 2004. The cortical organization
of audio-visual sentence comprehension: An fMRI study at 4 Tesla.
Cognitive Brain Research 20: 111–19.
Chiarello, Christine, Robert Knight, and Mark Mandel. 1982. Aphasia in
a prelingually deaf woman. Brain 105: 2951.
Corina, David P. 1998a. The processing of sign language: Evidence from aphasia. In Handbook of Neurolinguistics, ed. B. Stemmer and H. Whitaker, 313–29. San Diego, CA: Academic Press.
———. 1998b. Studies of neural processing in deaf signers: Toward a
neurocognitive model of language processing in the deaf. Journal of
Deaf Studies and Deaf Education 3.1: 35–48.
———. 2000. Some observations regarding paraphasia in American Sign Language. In The Signs of Language Revisited: An Anthology to Honor Ursula Bellugi and Edward Klima, ed. K. Emmorey and H. Lane. Mahwah, NJ: Lawrence Erlbaum.
Corina, David P., and Heather Knapp [Patterson]. 2007. Sign language
processing and the mirror neuron system. Cortex 42: 529–39.
Corina, David P., Susan L. McBurney, Carl Dodrill, Kevin Hinshaw, Jim
Brinkley, and George Ojemann. 1999. Functional roles of Brocas
area and SMG: Evidence from cortical stimulation mapping in a deaf
signer. NeuroImage 10: 570–81.
Corina, David P., Howard Poizner, Ursula Bellugi, Todd Feinberg,
Dorothy Dowd, and Lucinda O'Grady-Batch. 1992. Dissociation between linguistic and nonlinguistic gestural systems: A case for compositionality. Brain and Language 43: 414–47.
Corina, David P., Lucila San Jose-Robertson, Andre Guillemin, Julia
High, and Allen R. Braun. 2003. Language lateralization in a bimanual
language. Journal of Cognitive Neuroscience 15: 718–30.
Dronkers, N. F., B. B. Redfern, and C. A. Ludy. 1995. Lesion localization in chronic Wernicke's aphasia. Brain and Language 51: 62–5.
Dronkers, N. F., B. B. Redfern, and R. T. Knight. 2000. The neural architecture of language disorders. In The New Cognitive Neurosciences, ed. M. Gazzaniga. Cambridge, MA: MIT Press.
Emmorey, Karen, Hanna Damasio, Stephen McCullough, Thomas
Grabowski, Laura L. B. Ponto, Richard D. Hichwa, and Ursula Bellugi.
2003. Neural systems underlying lexical retrieval in American Sign
Language. Neuropsychologia 41.1: 85–95.
Goodglass, Harold. 1993. Understanding Aphasia. San Diego,
CA: Academic.
Hickok, Gregory, Tracy Love-Geffen, and Edward S. Klima. 2002. Role
of the left hemisphere in sign language comprehension. Brain and
Language 82: 167–78.
Leischner, Anton. 1943. Die Aphasie der Taubstummen. European
Archives of Psychiatry and Clinical Neuroscience 115: 469–548.
Levänen, Sari, Kimmo Uutela, Stephan Salenius, and Riitta Hari. 2001. Cortical representation of sign language: Comparison of deaf signers and hearing non-signers. Cerebral Cortex 11: 506–12.
MacSweeney, Mairéad, Ruth Campbell, Bencie Woll, Vincent Giampietro, Anthony S. David, Philip K. McGuire, Gemma A. Calvert, and Michael J. Brammer. 2004. Dissociating linguistic and nonlinguistic gestural communication in the brain. NeuroImage 22: 1605–18.
MacSweeney, Mairéad, Bencie Woll, Ruth Campbell, Philip K. McGuire, Anthony S. David, Steven C. R. Williams, John Suckling, Gemma A. Calvert, and Michael J. Brammer. 2002. Neural systems underlying British Sign Language and audiovisual English processing in native users. Brain 125: 1583–93.
Marshall, Jane, Jo Atkinson, Elaine Smulovitch, Alice Thacker, and Bencie
Woll. 2004. Aphasia in a user of British Sign Language: Dissociation
between sign and gesture. Cognitive Neuropsychology 21: 537–54.
McGuire, P. K., D. Robertson, A. Thacker, A. S. David, N. Kitson,
R. S. J. Frackowiak, and C. D. Frith. 1997. Neural correlates of thinking
in sign language. NeuroReport 8: 695–8.
Mohr, J. P., M. S. Pessin, S. Finkelstein, H. H. Funkenstein, G. W. Duncan,
and K. R. Davis. 1978. Broca aphasia: Pathologic and clinical.
Neurology 28: 311–24.
Neville, Helen J., Daphne Bavelier, David Corina, Josef Rauschecker,
Avi Karni, Anil Lalwani, Allen Braun, Vince Clark, Peter Jezzard, and
Robert Turner. 1998. Cerebral organization for language in deaf and
hearing subjects: Biological constraints and effects of experience.
Proceedings of the National Academy of Sciences 95: 922–9.
Newman, Aaron J., Daphne Bavelier, David Corina, Peter Jezzard, and
Helen J. Neville. 2002. A critical period for right hemisphere recruitment in American Sign Language processing. Nature Neuroscience
5: 76–80.
Petitto, Laura Ann, Robert J. Zatorre, Kristine Gauna, E. J. Nikelski,
Deanna Dostie, and Alan C. Evans. 2000. Speech-like cerebral activity
in profoundly deaf people processing signed languages: Implications
for the neural basis of human language. Proceedings of the National
Academy of Sciences of the United States of America 97: 13961–6.
Poizner, Howard, and Judy Kegl. 1992. Neural basis of language and motor behavior: Perspectives from American Sign Language. Aphasiology 6: 219–56.
Poizner, Howard, Edward S. Klima, and Ursula Bellugi. 1987. What the
Hands Reveal about the Brain. Cambridge, MA: MIT Press.
Rizzolatti, Giacomo, and Michael A. Arbib. 1998. Language within our
grasp. Trends in Neurosciences 21: 188–94.

SIGN LANGUAGE, ACQUISITION OF


The great majority of children who are born with a severe or profound hearing loss have hearing parents, and only about 5% are
born into families where parents are deaf. Many deaf parents, though by no means all, use a sign language as their preferred
language, and their children grow up as native signers.
Around the world, many deaf children of hearing parents grow
up with no access to language, and there are well-documented
cases of the development of home signs, a systematic system of
gestures for communicating (Goldin-Meadow and Mylander
1990). These gestures can be used to share topics and communicate needs but are only understood by family members or others
who know the children well. Home signs are limited both in their
degree of abstractness and the extent to which they show systematic patterns of combination (Emmorey 2002). It is interesting
to note that where there is a community of deaf people and no
established sign language, home signs may, across generations,
become systematized into an emergent sign language, as has
been observed in Nicaragua (Senghas and Coppola 2001).

Child-Directed Signing
Child-directed speech, the language that hearing mothers use when talking to young hearing children, has certain characteristic features, including simplicity, brevity, and accuracy (Snow
1977). It is also closely tied to ongoing events and activities in
which a child and a communicative partner are engaged (Harris
1992). Researchers have asked whether child-directed signing
has analogous characteristics. In order to answer this question,
they have focused on the language used by native-signing parents to their deaf children. There are now published studies of
child-directed signing in the United States, the UK, Australia,
and Japan, and their results have led to a remarkably consistent
picture of child-directed signing (Spencer and Harris 2006).
Child-directed signing, like child-directed speech, relates to
ongoing activity and events. There are, however, some important
differences between the two that arise from the fact that signing is perceived primarily through the function of sight, whereas
speech is heard. This has important implications for the way in

Sign Language, Acquisition of


which deaf mothers link their signing to ongoing activity. When
someone speaks, a hearing infant will turn to look in the direction of the speaker. By contrast, mothers of deaf children often
have to actively engage their child's visual attention, and, to this
end, they employ a set of attention-getting and directing strategies that are not often used by mothers of hearing infants (Harris
2001).
When infants are a few months old, mothers often sign in the infants' preexisting focus of attention so that the infants can see what is
being signed without having to redirect their gaze (Harris et al.
1989; Waxman and Spencer 1997). This can result in signs being
produced in a location that would be considered inappropriate
in a conversation with an older child or other adult signer. For
example, signs may be made on the child's body or moved away from the mother's signing space (in front of her own body) into the child's line of sight. As children get older, mothers tend to rely
more on redirecting attention, rather than displacing their own
signing. Thus, by 18 months of age, many signs are produced in
the mother's own signing space (Harris et al. 1989). Mothers use
a variety of strategies to redirect attention before signing. These
include tapping the child, waving a hand in the child's periphery,
banging on the floor or furniture so that the child can feel the
vibration, or moving an object through the infant's visual field. These strategies bring the infant's attention to the mother's face,
at which point she usually signs the object name or provides
other relevant information (Waxman and Spencer 1997; Spencer,
Swisher, and Waxman 2004).
The link between signs and the child's focus of attention remains remarkably constant even though signing strategies
change. Kathryn Meadow-Orlans, P. E. Spencer, and L. Koester
(2004) found that approximately 80 percent of deaf mothers' signing related to the current focus of attention of infants between the
ages of 9 and 18 months. When children begin to show signs of
understanding signs, deaf mothers employ techniques to make
associations between sign and referent even more transparent.
For example, they may point repeatedly to an activity or object,
then sign its name, and then point again (Waxman and Spencer
1997). Repetitions of a single sign, with or without accompanying points, can be extensive, often five to eight times in a single
utterance.
Typically, mothers' signed utterances to young children are short, with the majority being roughly equivalent to one- or two-word spoken utterances. The number of utterances produced
is also low when compared to the typical frequency of spoken
utterances (Harris 1992; Spencer and Lederberg 1997). However,
deaf mothers almost never sign until they have their child's
attention, and so the short length of utterances and relatively low
rate of signing can be seen as an accommodation to immature
visual attention. It is notable that as children mature both in their
ability to shift and maintain visual attention and to produce their
own signs, deaf mothers increase their rate of signing and produce longer utterances.

Comparison of Sign Language and Spoken Language Acquisition

There has been considerable debate about whether first signs appear before first words. In one of the earliest studies of sign acquisition, Michael Orlansky and J. Bonvillian (1985) reported signs being produced before spoken words, and they argued for a sign advantage over spoken language. Other investigators (Petitto 1988; Volterra, Beronesi, and Massoni 1994) have argued that the apparent sign advantage arises because parents overinterpret motor behaviors that are produced by both deaf and hearing infants. For example, many infants open and close their hand or clap their hands together in the first months of life. Since these movements are very similar to signs, it is easy to assume that they are signs. This presents a methodological challenge for sign language researchers that has been resolved by applying strict criteria for identifying sign production that mirror those used for identifying early-acquired words.

A follow-up analysis of the original Orlansky and Bonvillian data (Bonvillian, Orlansky, and Folven [1990] 1994), which adopted just such criteria, led the authors to conclude that there was no sign advantage. In a longitudinal study of infants from 9 to 18 months of age, Patricia Spencer and her colleagues directly compared the age of first signs and spoken words (Meadow-Orlans, Spencer, and Koester 2004). A sign was defined as a hand movement that was not imitated, occurred in a clearly communicative context, and included at least two of the three primary elements of a sign (location, handshape, movement; see sign languages), while making allowances for the fact that, like spoken words, children's early signs may differ from the adult form (see Morgan, Barrett-Jones, and Stoneham 2007). The rate of development for signed and spoken language proved to be remarkably similar, with the majority of both deaf and hearing children producing single-unit linguistic communications at 18 months and only a minority beginning to combine units.

The content of early vocabulary in signed and spoken languages is also similar. Like the early vocabulary of young hearing children acquiring English, that of deaf children of deaf parents acquiring American Sign Language (ASL) refers to objects and events in the everyday environment of infants and their families, such as familiar people, food, clothing, and toys, and is biased toward object names (nouns) rather than action words (verbs). As the children reach their second birthday and vocabulary size increases, a difference does begin to emerge, with the deaf infants beginning to show a greater proportion of verbs than the hearing children (Anderson and Reilly 2002). In the case of spoken languages, children are more likely to acquire verbs in their early vocabulary when these occur relatively frequently in parental speech. Both Korean-speaking (Gopnik and Choi 1995) and Mandarin Chinese-speaking mothers (Tardif 1996) use verbs more frequently than do English-speaking mothers, and this is reflected in the relatively high proportion of verbs that appear in the early vocabularies of their children. It may well be the case that the higher proportion of verbs in early ASL vocabulary is related to a greater use of verbs in child-directed ASL than in child-directed English.

Margaret Harris

WORKS CITED AND SUGGESTIONS FOR FURTHER READING

Anderson, John, and J. Reilly. 2002. The MacArthur communicative development inventory: The normative data from American Sign Language. Journal of Deaf Studies and Deaf Education 7: 83–106.
Bonvillian, John, M. Orlansky, and R. Folven. [1990] 1994. Early sign language acquisition: Implications for theories of language acquisition.
In From Gesture to Language in Hearing and Deaf Children, ed. Virginia
Volterra and C. Erting, 219–32. Berlin: Springer-Verlag. Washington,
DC: Gallaudet University Press.
Emmorey, Karen. 2002. Language, Cognition, and the Brain: Insights from
Sign Language Research. Mahwah, NJ: Lawrence Erlbaum.
Goldin-Meadow, Susan, and C. Mylander. 1990. Beyond the input
given: The childs role in the acquisition of language. Language
66: 323–55.
Gopnik, Alison, and S. Choi. 1995. Names, relational words and cognitive development in English and Korean speakers: Nouns are not
always learned before verbs. In Beyond Names for Things: Young
Children's Acquisition of Verbs, ed. M. Tomasello and W. Merriman, 83–90. Hillsdale, NJ: Lawrence Erlbaum.
Harris, Margaret. 1992. Language Experience and Early Language
Development: From Input to Uptake. Hove, UK: Lawrence Erlbaum.
———. 2001. It's all a matter of timing: Sign visibility and sign reference in deaf and hearing mothers of 18-month-old children. Journal of Deaf Studies and Deaf Education 6: 177–85.
Harris, Margaret, J. Clibbens, J. Chasin, and R. Tibbitts. 1989. The social
context of early sign language development. First Language 9: 81–97.
Meadow-Orlans, Kathryn, P. E. Spencer, and L. Koester. 2004. The World
of Deaf Infants. New York: Oxford University Press.
Morgan, Gary, S. Barrett-Jones, and H. Stoneham. 2007. The first signs
of language: Phonological development in British Sign Language.
Applied Psycholinguistics 28: 3–22.
Orlansky, Michael, and J. Bonvillian. 1985. Sign language acquisition: Language development in children of deaf parents and implications for other populations. Merrill-Palmer Quarterly 32: 127–43.
Petitto, Laura. 1988. Language in the prelinguistic child. In The
Development of Language and Language Researchers: Essays in Honor of
Roger Brown, ed. F. Kessel, 187–221. Hillsdale, NJ: Lawrence Erlbaum.
Senghas, Ann, and M. Coppola. 2001. Children creating language: How
Nicaraguan Sign Language acquired a spatial grammar. Psychological
Science 12.4: 323–8.
Snow, Catherine. 1977. The development of conversation between
mothers and babies. Journal of Child Language 4.1: 1–22.
Spencer, Patricia. 2003. Mother-child interaction. In The Young Deaf
or Hard of Hearing Child: A Family-Centered Approach to Early
Education, ed. B. Bodner-Johnson and M. Sass-Lehrer, 333–71.
Baltimore, MD: Paul H. Brookes.
Spencer, Patricia, and M. Harris. 2006. Patterns and effects of language
input to deaf infants and toddlers from deaf and hearing mothers.
In Advances in the Sign Language Development of Deaf Children, ed.
B. Schick, M. Marschark, and P. Spencer, 71–101. Oxford: Oxford
University Press.
Spencer, Patricia, and A. Lederberg. 1997. Different modes, different
models: Communication and language of young deaf children and their
mothers. In Communication and Language Acquisition: Discoveries
from Atypical Development, ed. L. Adamson and M. Romski, 203–30.
Baltimore: Paul H. Brookes.
Spencer, P., M. V. Swisher, and R. Waxman. 2004. Visual attention: Maturation and specialization. In The World of Deaf Infants,
ed. K. Meadow-Orlans, P. Spencer, and L. Koester, 168–88. New
York: Oxford University Press.
Tardif, Twila. 1996. Nouns are not always learned before verbs: Evidence
from Mandarin speakers early vocabularies. Developmental
Psychology 32: 492–504.
Volterra, Virginia, S. Beronesi, and P. Massoni. 1994. How does gestural
communication become language? In From Gesture to Language in
Hearing and Deaf Children, ed. Virginia Volterra and C. Erting, 205–18.
Washington, DC: Gallaudet University Press.
Waxman, Robyn, and P. Spencer. 1997. What mothers do to support
infant visual attention: Sensitivities to age and hearing status. Journal
of Deaf Studies and Deaf Education 2.2: 104–14.

SIGN LANGUAGES
When deaf people get together in a community, a sign language
emerges. The sign languages of linguistic investigation are these
naturally developing languages. They stand in contrast to the
invented sign systems developed by educators to teach language. The sign systems may be designed to represent, for example, English on the hands, using some vocabulary borrowed from
the natural sign language but following English grammar. The
natural sign languages such as American Sign Language (ASL)
in the United States, or British Sign Language (BSL) in the UK
have distinct grammatical systems. In this light, the fact that sign
languages do not necessarily correspond with spoken languages
(e.g., ASL and BSL are distinct, despite the common spoken language) should not be mysterious.
Some sign languages are related historically, just as spoken
languages have historical ties (see historical linguistics).
One fairly well-known family tree involves French Sign Language
(Langue des Signes Française, LSF), which has descendants in
much of Europe and the United States due to the fact that in the
nineteenth century, graduates of the French National Institution
for the Deaf, who all used a common sign language, were dispersed to a number of countries and helped to establish schools
there. In many cases, there was no common sign language across
groups of deaf people until the schools were formed and attracted
a community. Whatever signs were used previously combined
with LSF, and the national sign language grew out of this connection (Lane 1984).
In part because of these historical relations, a signer of ASL
may have an easier time communicating with a signer of, say,
Swedish SL, than hearing speakers of English and Swedish.
However, it is important to note that the sign languages are distinct, each having its own vocabulary and grammar. There is no
universal sign language, and couldn't be, for the same reasons
that there is no universal spoken language.

History of Sign Language Research


For many years, the communication between deaf persons
was not considered to be a true linguistic system. In the 1960s,
William Stokoe published a short grammar of ASL and a dictionary based on linguistic principles (Stokoe, Casterline, and
Croneberg 1965), since taken to be the groundbreaking works in
arguing for the linguistic status of a sign language. Stokoe, working in a structuralist approach (see structuralism), showed
that signs could be described as combinations of a limited set
of parts akin to the phonology of spoken languages. (He
eschewed the terms phonology and phoneme because of their
auditory bias and coined the term chereme, from Greek cher- "hand"; but researchers since then have used the phon- terms
to emphasize the level of structural analysis.)
Many sign language researchers focused on providing more
evidence for the linguistic status of sign languages, and slowly,
more and more researchers began to treat sign languages linguistically. Starting in the 1970s, Edward Klima and Ursula Bellugi,

Figure 1. MOTHER (ASL); IDEA (ASL).

Figure 2. [The handshape (HC) unit linked across the LML sequence.]
working with a team of researchers at the Salk Institute, studied
ASL from both linguistic and psycholinguistic viewpoints (see,
e.g., Klima and Bellugi 1979). Other linguistic and psycholinguistic studies of various sign languages soon followed. Conferences
on sign language research were held starting in the late 1970s,
and in 1986, the main sign language research conference series,
Theoretical Issues in Sign Language Research (TISLR), began.
This meeting is the major international gathering of sign language
researchers, held in a different location every two to four years.
The growth of this meeting, the establishment of several journals
and many edited books devoted to sign language research, and
the formation of the Sign Language Linguistics Society are indications of the recent surge in growth in the field.

Areas of Sign Language Research and Their Current State


Various natural sign languages have been studied by linguists,
with several goals in mind. Some of the research aims to describe
the lexicon, phonology, morphology, syntax, and semantics of different sign languages. It is descriptive, comparative,
typological, and theoretical in approach. Some of it is in the areas
of psycholinguistics, including studies of language acquisition, parsing (see parsing [human]), and neurolinguistics.
This entry focuses on current theoretical approaches to sign linguistics (cf. sign language, acquisition of and signed languages, neurobiology of). The theoretical work has focused
on analyses of sign language phonology, morphology, and syntax.
PHONOLOGY. At the phonological level, several authors have
proposed different models to represent the structure of signed
words. The models have emphasized phonological representation, rather than phonological processes, although some information about representation has been gleaned from processes,
particularly lexical ones, including compounding and affixation.
The earliest linguistic representation of signs (proposed by
Stokoe) considered them to consist of a simultaneous combination of a specification of handshape, location, and movement. In
a sign like MOTHER, shown in Figure 1, the handshape is open
with all five fingers extended, the location is the chin, and the
movement is tapping the thumb on the chin.
Scott K. Liddell and Robert E. Johnson (1986) showed convincingly that the simultaneous model proposed by Stokoe failed
to capture significant aspects of signs, and that sequentiality is
an important component of signs. Their model analyzed signs as
sequences of movements (M) and holds (H), with information

about location and handshape specified for each M and H segment. Wendy Sandler (1989) advanced the theory of sequentiality by changing the segments to movements and locations (L),
with the typical sign consisting of a sequence of LML (movement from one location to another). In the sign IDEA (see Figure
1), then, the first segment is the temple location, and the sign
moves to a few inches away from the temple. The extended pinky
handshape (HC) is used throughout the sign. On this model, the
handshape is specified in a separate hierarchically structured
unit that connects and spreads across the LML units, as shown
in Figure 2, following the principles of autosegmental phonology. This permits phenomena such as handshape assimilation in
compounding to be efficiently accounted for by the delinking of
one handshape and spreading of another.
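The autosegmental arrangement just described can be pictured as a small data structure: an LML skeleton plus a separate handshape tier that links across every slot, with compound assimilation modeled as delinking one handshape and spreading the other. The sketch below is a toy Python illustration; the class names and handshape labels are invented for exposition, not the notation of any published formalism.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class Sign:
    """An LML skeleton plus an autosegmental handshape tier.

    The handshape is specified once and linked to every skeletal
    slot, rather than being repeated segment by segment."""
    skeleton: List[str]   # e.g. ["L", "M", "L"]
    handshape: str        # single hand configuration for the sign

def compound(first: Sign, second: Sign) -> Sign:
    """Handshape assimilation in compounding: the first member's
    handshape is delinked, and the second member's handshape
    spreads across the whole compound."""
    return Sign(skeleton=first.skeleton + second.skeleton,
                handshape=second.handshape)

a = Sign(["L", "M", "L"], handshape="extended-pinky")
b = Sign(["L", "M", "L"], handshape="flat-5")
merged = compound(a, b)
assert merged.handshape == "flat-5"   # one handshape spreads
```

The point of the sketch is simply that specifying the handshape once, on its own tier, makes the delink-and-spread account of assimilation a one-line operation.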
Diane Brentari's (1998) model starts with different assumptions. On her account, the primary division in a sign's characteristics is between those elements that move (the prosodic
elements) and those that do not (the inherent features). While a
sign typically moves from one location to another, it is also possible for the handshape or orientation to change, either with path
movement or without. Some such prosodic element (path movement and/or handshape/orientation change) is required for a
(monomorphemic) sign to be licit. Brentari captures this requirement for path, handshape, or orientation to change by grouping
these prosodic elements together in the representation.
The models summarized here are based largely on data from
American Sign Language and Israeli Sign Language (ISL), but
they are intended as models of signs more generally. It is clear
that different sign languages have different inventories of phonological primitives, particularly handshapes, but strikingly different patterns of organization or phonological processes have
not been identified.
One major theoretical issue that has been addressed in sign
language phonology is the syllable. The syllable is an important component of spoken language phonology as it is an organizing unit that has both universal and language-particular
aspects. There is also an intuitive component to the syllable, and
it is often taken advantage of in poetry and other areas of language use. So it might be natural to ask whether there is an analog to this important unit in sign languages.
The answer seems to be yes and no. Researchers have identified units that serve a similar function to spoken language syllables with respect to timing, generally consisting of one LML
unit. There are constraints on these units that point out the need
for positing their existence apart from the word or morpheme.
However, sign language syllables are not bound by sonority in
the same way that spoken language syllables are, and there is no
clear equivalent to a sonority cycle. That is, in spoken languages,
the nucleus of the syllable (generally the vowel) is the most sonorous element, while elements further away (in either direction)

become progressively less sonorous. In sign languages, it is not clear that there is a comparable hierarchy of degrees of sonority that limits sequences of segments before and/or after the most sonorous nucleus.

Figure 3. [I-ASK-HER; SHE-ASKS-HIM.]
MORPHOLOGY. The sign languages that have been most intensively studied are clearly morphologically complex. They generally contain a number of morphological processes applying
to predicates, such as subject and object agreement, location
agreement, and aspect, as well as a system of classifier predicates by which verbs of movement and location can express
characteristics of theme, instrument, and agent arguments (cf.
thematic roles) together with the predicate. The morphological processes employed are generally nonconcatenative. For
example, they may alter the movement of the root, or the beginning or ending location, and so on, rather than adding prefixes
or suffixes. This means that a single signed syllable may express
multiple morphemes (frequently 3 to 5, occasionally as many as
10 or 12).
The process commonly known as verb agreement in sign languages is illustrated in Figure 3. Verb agreement makes use of
spatial loci, which are either abstract or determined by actual physical locations of actual objects. In the examples, we understand a location on the signer's right side to be representing one person (female), and a location on the signer's left side another one (male). When the verb (ASK) moves from the location of the signer to the location on the right, it is understood as "I ask her." When it moves from the location on the signer's right to the location on her left, it is understood as "she asks him."
This process has received much attention in the literature
on sign languages. It is intriguing because it does look in many
ways like verb agreement, yet has important differences from
typical agreement systems in spoken languages. One difference
is that there is a good deal of optionality associated with the use
of agreement; another is that it applies only to a subset of verbs
that is largely semantically determined. Recent research has
attempted to provide clear explication of the verbs that do and
do not undergo this process, and to examine the interaction of
agreement with issues of syntactic structure.
While the bulk of sign language morphology is nonconcatenative, there are sequential processes as well. Mark Aronoff,
Irit Meir, and Wendy Sandler (2005) propose that the difference between these types is related to the historical depth of
sign languages. They claim that the nonconcatenative processes
are iconically grounded and that the sequential processes are due to historical development, and therefore found only in sign languages with some historical depth.
SYNTAX. There have been a number of analyses of sign language
syntax within the Chomskyan generative framework, particularly
the principles and parameters theory. Some of these
studies have focused on establishing the basic clausal structure
of ASL or another sign language. In addition to basic word order,
such studies address the order of functional projections and the
movement of constituents within the phrasal structure. Evidence
for structure dependence, hierarchy, and recursion has shown
that sign languages share these basic properties of grammar with
spoken languages.
One conclusion reached by many of these studies is that a
good number of sign languages are discourse oriented, using
changes in word order to convey information structure notions,
such as topic and focus. For example, sign languages tend
to use the sentence-initial position for topics (old information),
and at least some use the sentence-final position for focus (new
information). Constituents in these positions have particular discourse functions, and they are marked syntactically in particular
ways. As do other discourse-oriented languages, sign languages
tend to permit arguments to be nonovert, understood according
to context. The result of these two characteristics is a good deal of
surface variation in sentence structure.
One issue that has received a fair amount of attention is
the analysis of wh-questions in ASL and other sign languages.
Sometimes, the wh-phrase may show up in the sentence-final
position, as in John buy yesterday what? In other examples, the
wh-phrase shows up in the sentence-initial position, as in Why
you leave? Some researchers have argued that ASL has regular
wh-movement to the end of a sentence, making it different from
spoken languages, which, it seems, universally use the sentence-initial position for regular wh-movement. According to the usual
generative assumptions, regular wh-movement moves wh-elements to the position called specifier of complement phrase (Spec,
CP), which is to the left of the rest of the sentence, and so this
view makes sign languages special. However, others have argued
that Spec, CP is on the left in ASL and other sign languages, and
the appearance of wh-words in the sentence-final position is due
to a focus operation. This makes the analysis of wh-questions in
ASL more like that of numerous spoken languages that permit
wh-words to undergo wh-movement or focus movement. On
the surface, wh-words appear in a variety of positions; it is their analysis that causes such debate.

One issue that has come up in connection with this debate is
the role of the nonmanual marking that accompanies wh-questions and other structures. These nonmanual markings are often
described as grammatical, in contrast to emotional facial expressions. Some researchers consider them a part of prosody, akin to
intonation, making them clearly related to syntactic structure
but also not strictly determined by syntax alone (see Sandler and
Lillo-Martin 2006 for review and references).
Aside from the generative approach, there has been a growing
body of research on sign linguistics from a cognitive grammar
perspective. This research attempts to account for the construction of meaning in language using reference to mental spaces,
taking into consideration gradience and optionality as well as systematicity. Since the use of spatial locations is integral to signing,
this approach has been very appealing to some researchers.

Major Research Issues and Questions


Research on sign linguistics is a young field. There are many
questions even in basic description, particularly with respect to
sign languages other than ASL and some of the European sign
languages, which have received the most attention. Linguists
need such information as a full range of phonological processes
and varieties of sentence types to formulate analyses, and are
often stymied when attempting to test predictions because of a
lack of available data.
Nevertheless, some major research questions have emerged.
An important one is the extent to which sign languages form a
group, such that all sign languages have certain characteristics in
common. While acknowledging the lack of data on more than a
handful of sign languages, sign linguists have been struck by the
remarkable similarities across these languages in certain aspects.
An explanation for the existence of these particular characteristics is an important goal of linguistic theory.
For example, all known sign languages productively use nonconcatenative morphology to express multiple morphemes in
what is typically a single syllable. Sign languages also seem to
share the characteristics of discourse-oriented languages, productively employing special processes to organize sentences
and sequences of sentences in accordance with the demands of
information structure.
Another aspect that has shown up in a number of sign language areas concerns optionality. There are a number of ways in which sign languages show a notable ability to choose from multiple options. For example, verb agreement with the subject is
considered optional in many (all?) sign languages; most of the
phonological processes that have been identified are optional,
and several sign languages have been reported to permit either
the option of leaving wh-elements in situ or of moving them.
Although sign languages seem to be more coherently similar
to one another than spoken languages are, it remains an open question whether sign languages show many modality-particular
characteristics. Certainly the tendency for lexical items to be
monosyllabic may be related to the modality, but just where
modality effects are to be found is still a matter of investigation.
Research on sign languages is hindered by a particular sociolinguistic situation that is rarely found in spoken language communities. Only a small percentage (perhaps 5%) of signers have
been exposed to their language by their parents from birth; fewer

still have parents who were themselves exposed from birth. Most
deaf children are born to hearing parents who do not know sign
language or the deaf community. The parents may be advised to
expose their children to sign language, but even if they choose to
do so, there may not be programs available for their children to be
exposed to fluent signers for more than a few hours per week. Deaf
signers begin to learn sign language at a wide variety of ages, and
their input providers themselves have a range of skills. To make
matters more complicated, deaf signers frequently must communicate with others who are not fluent in sign language, and they
have varying degrees of multilingualism, including knowledge of
the dominant spoken language(s) in their community.
This situation leads to a number of questions concerning
how to define a native speaker and which dialects/registers of
the language to use as a model. Disagreements among linguists
about basic facts are not uncommon, and signers frequently discuss varying judgments on particular examples at sign linguistics
conferences.
One response may be to attempt to narrowly define native
signers and contexts of data collection (e.g., having only naturalistic data or having only native signers interview consultants);
another response may be to attempt to explain the range of judgments observed in linguistic terms. In any case, sensitivity to the
difficulty of collecting reliable data is of great importance.
Diane Lillo-Martin
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Aronoff, Mark, Irit Meir, and Wendy Sandler. 2005. The paradox of sign
language morphology. Language 81.2: 301–34.
Brentari, Diane. 1998. A Prosodic Model of Sign Language Phonology.
Cambridge, MA: MIT Press.
Emmorey, Karen. 2002. Language, Cognition, and the Brain: Insights from
Sign Language Research. Mahwah, NJ: Lawrence Erlbaum.
Klima, Edward S., and Ursula Bellugi. 1979. The Signs of Language.
Cambridge: Harvard University Press.
Lane, Harlan. 1984. When the Mind Hears: A History of the Deaf. New
York: Random House.
Liddell, Scott K., and Robert E. Johnson. 1986. American Sign Language
compound formation processes, lexicalization, and phonological remnants. Natural Language and Linguistic Theory 8: 445–513.
Meier, Richard, Kearsy Cormier, and David Quinto-Pozos, eds. 2002.
Modality and Structure in Signed Language and Spoken Language.
Cambridge: Cambridge University Press.
Neidle, Carol, Judy Kegl, Dawn MacLaughlin, Benjamin Bahan, and
Robert G. Lee. 2000. The Syntax of American Sign Language: Functional
Categories and Hierarchical Structure. Cambridge, MA: MIT Press.
Sandler, Wendy. 1989. Phonological Representation of the Sign: Linearity
and Non-linearity in American Sign Language. Dordrecht, the
Netherlands: Foris.
Sandler, Wendy, and Diane Lillo-Martin. 2006. Sign Language and
Linguistic Universals. Cambridge: Cambridge University Press.
Stokoe, William C., Dorothy Casterline, and Carl Croneberg. 1965.
A Dictionary of American Sign Language on Linguistic Principles.
Washington, DC: Gallaudet College Press.

SINEWAVE SYNTHESIS
Sinewave synthesis is a technique for creating digital acoustic signals by computationally combining simulated pure tones of varying frequency and amplitude. Most frequently, sinewave synthesis has been used as a technique for generating stimuli
for experiments to help assess the aspects of the signal that are
important for the perception of speech and non-speech sounds.
The technique now commonly employed to rapidly create
these unusual sounds was originally developed by Philip Rubin at
Haskins Laboratories in 1977, as was the first software sinewave
synthesizer (SWS) (Rubin 1980). This synthesizer built on the pioneering work at Haskins Laboratories in the 1950s that used an
early hardware device, the pattern playback (Cooper, Liberman,
and Borst 1951), to discover the information critical for speech
perception by synthesizing acoustic signals based on spectrographic information. Other influences were numerous, including
sinewave demonstrations by Rod McGuire at Haskins and the
work on syllable recognition by James Cutting and Peter Bailey
and colleagues (Bailey, Summerfield, and Dorman 1977).
The Haskins sinewave synthesis system was designed for
exploring the global, spectral-temporal properties of speech by
removing local detail from the signals while retaining other critical information. Sinewave synthesis of speech involves the generation of simulated pure tones that track the natural resonant
frequencies of the vocal tract, known as the formants. Sinewave
signals lack most of those short-term features that are characteristic of natural speech, such as harmonic structure, broadband
formants, the transient noises seen in fricatives, aspirates, and
consonant bursts, and so on. At issue is whether or not such
minimal information is sufficient to support the identification
of signals as speech. Perception tests have shown that, with a
brief period of training and with appropriately designed stimuli,
speech comprehension is possible. Usually, between two and four time-varying tones occurring simultaneously are sufficient to
create a signal that can be identified as speech and transcribed
fairly accurately by most individuals (Remez et al. 1981).
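The basic procedure, summing pure tones whose instantaneous frequencies follow the formant tracks, can be sketched in a few lines of Python with NumPy. The formant trajectories below are invented for illustration; actual sinewave replicas are derived from formants measured in recorded utterances.

```python
import numpy as np

def sinewave_synthesize(formant_tracks, amp_tracks, fs=16000):
    """Sum pure tones whose instantaneous frequencies follow the
    given formant tracks (one array of Hz values per tone)."""
    out = np.zeros(len(formant_tracks[0]))
    for freqs, amps in zip(formant_tracks, amp_tracks):
        # Integrate instantaneous frequency to obtain phase.
        phase = 2 * np.pi * np.cumsum(freqs) / fs
        out += amps * np.sin(phase)
    return out

fs = 16000
n = fs // 2  # half a second of signal
tracks = [np.linspace(700, 300, n),     # a falling "F1"-like tone
          np.linspace(1200, 2100, n),   # a rising "F2"-like tone
          np.full(n, 2500.0)]           # a steady "F3"-like tone
amps = [np.full(n, 0.5), np.full(n, 0.3), np.full(n, 0.2)]
signal = sinewave_synthesize(tracks, amps, fs)
```

Because only three slowly varying tones are summed, the result has none of the harmonic or noise structure of natural speech, which is precisely what makes such stimuli useful.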
Over the years, sinewave synthesis has been used by a variety
of scientists (e.g., Best, Morrongiello, and Robson 1981; Remez et
al. 1981) to explore a range of issues in the area of speech perception, phonetics, cognitive psychology, and related areas. The
most prominent work in this area has been that of Robert Remez
and his colleagues (Remez et al. 1981; Remez et al. 1994).
Philip Rubin
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Bailey, P., Q. Summerfield, and M. Dorman. 1977. On the identification of sine-wave analogues of certain speech sounds. In Haskins
Laboratories Status Report on Speech Perception SR-51/52, 1–26,
Haskins Laboratories, New Haven, CT.
Best, C. T., B. Morrongiello, and R. Robson. 1981. Perceptual equivalence
of acoustic cues in speech and nonspeech perception. Perception and
Psychophysics 29: 191–211.
Cooper, F. S., A. M. Liberman, and J. M. Borst. 1951. The interconversion
of audible and visible patterns as a basis for research in the perception
of speech. Proceedings of the National Academy of Sciences 37: 318–25.
Rubin, Philip E. 1980. Sinewave synthesis. Internal memorandum,
Haskins Laboratories, New Haven, CT.
Remez, R. E., P. E. Rubin, D. B. Pisoni, and T. D. Carrell. 1981. Speech perception without traditional speech cues. Science 212: 947–50.
Remez, R. E., P. E. Rubin, S. M. Berns, J. S. Pardo, and J. M. Lang. 1994. On the perceptual organization of speech. Psychological Review 101: 129–56.

SITUATION SEMANTICS
Situation semantics was originally developed by Jon Barwise and
John Perry in Situations and Attitudes (1983). Semantics is conceived
structures of a particular language and entities in the world the
language is used to talk about.
A situation is a limited part of reality; situations are the sorts of
things we perceive, reason about, and live in, for example, what is
going on now in this office, what has happened in America since
World War II, or the games of the Chicago Cubs in 1919.
Such situations are classified by the issues provided by a scheme of individuation and classification: space-time locations; properties and (other) relations, including argument roles (which here we assume to have an order); and individuals. The general form of an issue is as follows:
Whether, at space-time location l, the relation R holds among individuals a1, …, an.

For example,
Whether, on July 4 at noon in Las Vegas, Elwood kisses Gertrude.

There are two possible resolutions of an issue, each of which is a state of affairs:
(1) On July 4 at noon in Las Vegas, Elwood does kiss Gertrude.
(2) On July 4 at noon in Las Vegas, Elwood does not kiss Gertrude.
(1) and (2) are opposites.

A state of affairs may be made factual by a situation, thus resolving the issue according to basic principles:
Every issue is resolved by some situation or other;
If some situation or other makes a given state of affairs factual, no situation makes the opposite state of affairs factual;
If one situation is part of another, everything the part makes factual will be made factual by the larger situation, too.
One need not assume that there is a total situation, that is, a situation that resolves all issues.
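These basic principles can be made concrete in a short sketch: a state of affairs as an issue plus a polarity, and a situation as the set of states of affairs it makes factual. This is a toy Python rendering of the entry's definitions, not a formalism from the situation-theory literature.

```python
from dataclasses import dataclass
from typing import Tuple

@dataclass(frozen=True)
class StateOfAffairs:
    """A resolution of an issue: whether, at a location, a
    relation holds among some individuals."""
    location: str
    relation: str
    individuals: Tuple[str, ...]
    holds: bool

    def opposite(self) -> "StateOfAffairs":
        return StateOfAffairs(self.location, self.relation,
                              self.individuals, not self.holds)

class Situation:
    """A limited part of reality, modeled as the set of states
    of affairs it makes factual."""
    def __init__(self, facts):
        self.facts = frozenset(facts)
        # Coherence: no situation makes a state of affairs and
        # its opposite both factual.
        assert not any(f.opposite() in self.facts for f in self.facts)

    def makes_factual(self, soa: StateOfAffairs) -> bool:
        return soa in self.facts

    def part_of(self, other: "Situation") -> bool:
        # Whatever the part makes factual, the whole does too.
        return self.facts <= other.facts

kiss = StateOfAffairs("July 4, noon, Las Vegas", "kiss",
                      ("Elwood", "Gertrude"), True)
office = Situation({kiss})
larger = Situation({kiss,
                    StateOfAffairs("now", "sit", ("Elwood",), True)})
assert office.part_of(larger)
assert not office.makes_factual(kiss.opposite())
```

Note that nothing here requires a "total" situation resolving every issue; a situation simply leaves unlisted issues unresolved.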
Situation theory allows for complex states of affairs and for
types of situations. Types are abstracted from simple and complex
states of affairs with the help of parameters. For example: the type
of situation in which Elwood kisses Gertrude at some location l; the type of situation in which someone x kisses Gertrude in the seminar room at midnight; the type of situation in which someone x kisses someone y at some location l. A situation is of a type if it makes factual a state of affairs that results from anchoring the parameters to appropriate objects. So any situation that makes (1) factual is of all of the types considered in this paragraph. Restricted parameters are built from basic ones: a restricted parameter y [the oldest brother of x] must be anchored to the oldest brother of the anchor of x.
The type of situation in which x kisses y at l involves the type of situation in which x touches y at l. That is, if there is a situation of the first type, then there is a situation of the second type. States of affairs of the form

At l, type T does involve type T′

are constraints.
Meaning is a matter of constraints. Examples:
The fact that the stump has 100 rings means that the tree that
grew from it was 100 years old when cut.
The fact that the bell is ringing means that dinner is ready.

The first example implicitly appeals to constraints about the ways trees grow and what happens when they are cut down. The type of a stump x having 100 rings at a location l involves the type of a tree y [the one from which x resulted] being 100 years old at a location l′ [the location, temporally prior to l, at which y was cut].
The second example involves a conventional constraint, the
convention being that the bell is rung only when dinner is ready.
If a convention is in force in a community (in this case, if people who hear the bell habitually expect, and are expected to expect, that dinner is ready), it may generate meaning even if not strictly factual. Perhaps, occasionally, winds or playful children make the bell ring.
Situation semantics conceives of language as a complex
system of conventional constraints that hold between types of
utterances and types of described situations. The meanings of
declarative sentences are most straightforward:
An utterance of a speaker s at a location l of "I am sitting" describes the type of situation in which s sits at l;
An utterance by s at l of "He kicked Fred" describes the type of situation in which the male demonstrated by s kicked the person named "Fred" to whom s referred.
Because it avails itself of the rich structure of utterances, situation semantics has many resources for dealing with various forms
of context sensitivity. Such phenomena were a main motivation
for the theory. But versions of it have been applied to dealing
with paradox (Barwise and Etchemendy 1987; see also consistency, truth, and paradox), information structure
(Israel and Perry 1990), interrogatives (Ginzburg and Sag 2001),
anaphora (Gawron and Peters 1990), and a number of other
phenomena. Situations and Attitudes, while still a useful guide
to the original philosophical and linguistic motivations behind
the approach, is no longer a reliable guide to its formal development. Devlin (1991) provides a useful handbook; Barwise (1989)
discusses a number of ways of developing the framework.
Situation semantics contrasts in important ways with possible worlds semantics. The contrasts with David Lewis-style
possible worlds semantics are straightforward: There is only one
concrete world in situation semantics; possibility is handled with
constraints and unexemplified types of situations rather than alternative concrete worlds and similarity relations among them. Robert
Stalnaker-style possible worlds semantics also recognizes only one
real concrete world; differences with situation semantics center on
different approaches to handling partiality (see Perry 1986).
John Perry
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Barwise, Jon. 1989. The Situation in Logic. Stanford, CA: CSLI
Publications.

Barwise, Jon, and John Etchemendy. 1987. The Liar. New York: Oxford
University Press.
Barwise, Jon, and John Perry. 1983. Situations and Attitudes. Cambridge,
MA: Bradford/MIT Press. Reprint with new introduction by CSLI
Publications, Stanford, CA, in 1999.
Devlin, Keith J. 1991. Logic and Information. Cambridge: Cambridge
University Press.
Gawron, Jean Mark, and Stanley Peters. 1990. Anaphora and Quantification
in Situation Semantics. Stanford, CA: CSLI Publications.
Ginzburg, Jonathan, and Ivan A. Sag. 2001. Interrogative Investigations. Stanford, CA: CSLI Publications.
Israel, David, and John Perry. 1990. What is information? In Information, Language and Cognition, ed. Philip Hanson, 1–19. Vancouver: University of British Columbia Press.
Perry, John. 1986. From worlds to situations. Journal of Philosophical
Logic 15: 83–107.

SOCIALLY DISTRIBUTED COGNITION


The main storyline of the hominin lineage over the last five million years is the evolution of more and more complex brains. This
aspect of our evolved species biology underlies our remarkable
capacity to learn and transmit accumulated wisdom to subsequent generations. For most of this time, there was relatively little
social differentiation within human groups, such that individuals
of the same age/sex category learned and did virtually the same
things as their peers. Within the last half-million years, however,
our evolved cultural capacity gave rise to larger and more highly
differentiated societies, along with a very distinctive communication system, language. The pace of societal differentiation has
been accelerating ever since.
Today, the cultural information stored extrasomatically (outside one's own body) in human social systems far exceeds the
neurological capacity of any given human. As a result, from an
individual's viewpoint, socially distributed cognition puts a premium on knowing how to find out, rather than actually knowing everything oneself. Each of us has only partial knowledge of
our cultural heritage and is bound to others in complex webs
of reciprocal ignorance and mutual dependence. Indeed, the
degree of ignorance and dependence can be startling, as the following apocryphal story illustrates.
Once there was a young man who was visited by his grandparents. Trying to be a good host, the grandson made a pot of coffee and, while his grandmother went to use the bathroom, he asked, "How do you like your coffee, Grandfather?" To which the old man replied, "I don't know, you'll have to ask your Grandmother."
Over the years, the elderly couple had developed such a strong
division of labor between them that the grandfather no longer
remembered how much sugar and cream he liked in his own
coffee. He had off-loaded that information to his spouse. The
story also highlights the importance of language in such cognitive off-loading. Through language, distributed information can
be accessed from others on an ad hoc, need-to-know basis. In
this sense, language provides much of the social glue that makes
socially distributed cognition (and complex, differentiated societies) possible.
Being able to speak (or read) a language, however, does not
necessarily mean we know what we are talking about. As Emile
Durkheim ([1912] 1965, 483) noted almost a century ago, "Which
of us knows all the words of the language he speaks and the
entire signification of each?" (See also division of linguistic labor.) For example, starting from the casual observation
that people routinely talk about things they cannot recognize
and, conversely, are familiar with things for which they have no
names, John B. Gatewood (1983) asked a group of American college students to list all the kinds of trees they could remember.
Then, the students were asked to indicate whether they could
recognize each of the varieties they had listed. On average, the
students thought they could recognize only about 50% of the
trees they had recalled. For the other 50%, the students knew of
the trees but not much else. Loose talk of this sort, nonetheless,
has an important social function: It is a means for organizing
knowledge lying beyond the grasp of an individual. That is, the
greater the division of labor in society, the less redundant is the
information stored in each individual and, therefore, the greater
the importance of language as a means of accessing what other
people know.
This general way of thinking about human social organization has a considerable history within anthropology. One of
the wellsprings is Durkheim's The Division of Labor in Society
([1893] 1964), in which he distinguished between the collective
conscience and collective representations. As the division of
labor increases, social solidarity derives more and more from the
shared collective representations that enable communication
among different groups and less and less from a shared sense of
morality and motives held in common. Anthony F. C. Wallace
(1961) provided an eloquent argument against the notion that
shared motives or cognitions are a functional prerequisite for
society, concluding that culture rests not on "the replication of
uniformity" but on "the organization of diversity." The implication
of Wallace's reasoning is that individuals can produce a sociocultural system that is "beyond their own comprehension" (1961,
38). Similarly, John M. Roberts (1964) developed the idea of cultural nonsharing as part of his view of cultures as self-organizing information pools. Since 1980, there has been considerable
development, both theoretically and methodologically, with
respect to the study of intracultural variation and the socially
distributed cognition that such variability entails.
For an anthropologist, intracultural variability presents an
interesting epistemological problem. Given the "variable participation of individuals" (Linton 1936) in a culture's information pool (Roberts 1964), how can the anthropologist decide if
there is a common culture or just individual differences? James
Shilts Boster's (1980, 1985) study of Aguaruna manioc identification pioneered a solution to this sort of question. The key to the
approach lies in taking seriously the fact that no one knows all
of his or her group's culture and that agreement is a matter of
degree. By examining the patterning of agreement among informants, Boster suggested that one could detect whether individuals' understandings of a particular domain are uniform, variable
in the form of expertise gradients, variable by subgroup affiliation, or just randomly idiosyncratic.
A. Kimball Romney, Susan C. Weller, and William H.
Batchelder (1986) developed this insight into a cultural consensus theory. Drawing on testing theory in psychology, consensus
theory assumes that "the correspondence between the answers
of any two informants is a function of the extent to which each
is correlated with the truth" (1986, 316). When people know the
same cultural truth, they will converge on the same answer. When
they do not know, they will guess. Informally, this means that the
degree of agreement evidenced in people's responses to a set of
questions can be compared to the amount of agreement expected
just by chance. More technically, if the three assumptions of their
formal model are met (there is a common culture, respondents
answer independently of one another, and the questions are of
equal difficulty), it follows by mathematical necessity that the
eigenvalue of the first factor of a minimum residual factor analysis of a chance-corrected, respondent-by-respondent agreement
matrix to a battery of questions will be substantially larger than
the eigenvalue of the second factor. Conversely, if a particular
data set does not have a large ratio between the first and second
factors, or if the mean of individuals first factor loadings is low,
then the data do not meet at least one of the three assumptions
of the model. In such a case, because the second assumption
can be controlled during data collection and the third is robust
to violations, the usual reason for nonconsensus is a violation of
the common truth assumption. That is, either subcultures exist
or the variation is just random.
In short, consensus analysis provides an inductive, data-reduction technique whereby one can determine the degree of
cultural consensus (or nonconsensus) among individuals for
given sets of questions. Furthermore, each respondent's first
factor loading ("cultural competence," in consensus parlance)
is a composite measure of how well that individual represents
the entire sample's answers. Whether this computed group-representativeness variable is associated with other characteristics of
individuals, such as age, sex, education, kin group, and so on, can
be checked using standard statistical tests. Suppose, for example, there is overall cultural consensus with respect to people's
judgments concerning which liquors can be plausibly combined
with a variety of mixer beverages, such as tonic, orange juice, vermouth, and so on. One could then investigate the extent to which
individuals' cultural competence is correlated with their sex, age,
or experience as a consumer of mixed drinks (Gatewood 1996).
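The logic of the calculation described above can be sketched in code. This is an illustrative simplification, not the published procedure: a plain eigendecomposition of the agreement matrix stands in for minimum residual factor analysis, and the function name, the competence scaling, and the simulated data are my own assumptions.

```python
import numpy as np

def consensus_analysis(answers, n_options):
    """Illustrative sketch of cultural consensus analysis (after Romney,
    Weller, and Batchelder 1986).  `answers` is a respondent-by-question
    array of integer answer choices; `n_options` is the number of answer
    options per question, assumed equal across questions (the model's
    third assumption).
    """
    n = answers.shape[0]
    # Raw proportion of matching answers for every pair of respondents.
    raw = np.array([[np.mean(answers[i] == answers[j]) for j in range(n)]
                    for i in range(n)])
    # Correct for agreement expected by chance among L equally likely
    # options: m* = (m - 1/L) / (1 - 1/L).
    corrected = (raw - 1.0 / n_options) / (1.0 - 1.0 / n_options)
    vals, vecs = np.linalg.eigh(corrected)     # eigenvalues in ascending order
    vals, vecs = vals[::-1], vecs[:, ::-1]     # reorder to descending
    # A first-to-second eigenvalue ratio much larger than 1 (a common
    # rule of thumb is > 3) suggests a single shared cultural "truth."
    ratio = float(vals[0] / max(vals[1], 1e-9))
    v1 = vecs[:, 0] if vecs[:, 0].sum() >= 0 else -vecs[:, 0]
    # First-factor "loadings": each respondent's cultural competence.
    competence = np.sqrt(max(vals[0], 0.0)) * v1
    return ratio, competence
```

With simulated respondents who mostly share one answer key, the ratio is large and the loadings approximate each respondent's share of the common truth; with fully random answers, the ratio stays near 1, signaling nonconsensus.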
In addition to off-loading knowledge for storage in other
people, extrasomatic information can also be stored in artifacts
and our built environments. The key idea here is that although
cognition is usually regarded as pertaining to individuals as
such, cognitive systems may well extend beyond the boundaries of individual organisms, the full extent being determined by
closure of cybernetic loops of information flow. Gregory Bateson
was one of the early proponents of this broader, more inclusive,
and highly contextualized view of cognition. For instance, "we
may say that the mind is immanent in those circuits of the brain
which are complete within the brain. Or that mind is immanent
in circuits which are complete within the system, brain plus
body. Or, finally, that mind is immanent in the larger system,
man plus environment" (1972, 317). Bateson's thinking along
these lines traces back to his first book, Naven (1936), but Mind
and Nature: A Necessary Unity (1979) provides his most complete articulation.
More recently, Edwin Hutchins's Cognition in the Wild
(1995) provides a compelling demonstration of this cognitive
systems view of distributed cognition, including not only the
abstract argument for a more contextualized and distributed
view of cognition but also an extended ethnographic example.
As Hutchins explains, navigating a large ship can be viewed as a
computational problem, but one that involves teams of individuals doing calculations using numerous inanimate instruments
and components of the ship's hardware. By tracing the flow
of information required to solve navigational problems, one
comes to see that the cognitive system encompasses numerous
individuals interacting among themselves, the ship's hardware,
and changing environmental circumstances. The computational system includes all of these parts, human and nonhuman. In a similar vein, Charles M. Keller and Janet Dixon Keller
(1996) detail the ways in which information required for blacksmithing is distributed among the blacksmith, his tools, the
layout of his shop, and the physical changes in the iron being
worked.
In summary, socially distributed cognition is an old and
familiar notion within the social sciences, though generally
known by other names: the division of labor in society, the social
organization of knowledge, or intracultural variation. Only in
the wake of the cognitive revolution, and particularly since
Bateson's (1979) and Hutchins's (1995) books, has the expression "socially distributed cognition" come into vogue. This entry
has sketched some of the concept's antecedents, noted its key
features along with some of the different kinds of anthropological research based on them, and indicated the challenge posed
by the concept for the more familiar, individual-in-isolation
view of cognition. Clearly, the idea of distributed cognition is
not confined to anthropology. For example, Minsky (1986),
Suchman (1987), Dillenbourg and Self (1992), Norman (1993),
Rogers and Ellis (1994), and Hollan, Hutchins, and Kirsh (2000)
discuss the concept from the perspectives of psychology and
computer science.
John B. Gatewood
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Bateson, Gregory. 1936. Naven: A Survey of the Problems Suggested by a
Composite Picture of the Culture of a New Guinea Tribe Drawn from
Three Points of View. New York: Cambridge University Press.
. 1972. Cybernetics of self: A theory of alcoholism. In Steps to an
Ecology of Mind, 309–37. New York: Ballantine Books.
. 1979. Mind and Nature: A Necessary Unity. New York: E. P. Dutton.
Boster, James Shilts. 1980. How the exceptions prove the rule: An analysis
of informant disagreement in Aguaruna manioc identification. Ph.D.
diss., University of California, Berkeley.
. 1985. Requiem for the omniscient informant: There's life in
the old girl yet. In Directions in Cognitive Anthropology, ed. Janet
W. D. Dougherty, 177–97. Urbana: University of Illinois Press.
Boster, James Shilts, ed. 1987. Intracultural variation. American
Behavioral Scientist 31.2 (Theme Issue).
Dillenbourg, Pierre, and John A. Self. 1992. A computational approach
to socially distributed cognition. European Journal of Psychology of
Education 7.4: 352–73.
Durkheim, Emile. [1893] 1964. The Division of Labor in Society. Trans.
George Simpson. New York: Free Press.
. [1912] 1965. The Elementary Forms of the Religious Life. Trans.
Joseph Ward Swain. New York: Free Press.
Gatewood, John B. 1983. Loose talk: Linguistic competence and recognition ability. American Anthropologist 85.2: 378–87.
. 1984. Familiarity, vocabulary size, and recognition ability in four semantic domains. American Ethnologist 11.3: 507–27.
. 1996. Ignorance, knowledge, and dummy categories: Social and
cognitive aspects of expertise. Presented at the 95th Annual Meeting of
the American Anthropological Association, San Francisco. Available
online at: http://www.lehigh.edu/~jbg1/liquors.htm.
Hollan, James, Edwin Hutchins, and David Kirsh. 2000. Distributed
cognition: Toward a new foundation for human-computer interaction research. ACM Transactions on Computer-Human Interaction
7.2: 174–96.
Hutchins, Edwin. 1995. Cognition in the Wild. Cambridge, MA: MIT
Press.
Keller, Charles M., and Janet Dixon Keller. 1996. Cognition and Tool
Use: The Blacksmith at Work. New York: Cambridge University Press.
Linton, Ralph. 1936. The Study of Man. New York: D. Appleton-Century.
Minsky, Marvin. 1986. The Society of Mind. New York: Simon and
Schuster.
Norman, Donald A. 1993. Things That Make Us Smart: Defending Human
Attributes in the Age of the Machine. Reading, MA: Addison-Wesley.
Roberts, John M. 1964. The self-management of cultures. In Explorations
in Cultural Anthropology: Essays in Honor of George Peter Murdock, ed.
Ward Goodenough, 433–54. New York: McGraw-Hill.
Rogers, Yvonne, and Judi Ellis. 1994. Distributed cognition: An alternative framework for analysing and explaining collaborative working.
Journal of Information Technology 9.2: 119–28.
Romney, A. Kimball, Susan C. Weller, and William H. Batchelder. 1986.
Culture as consensus: A theory of culture and informant accuracy.
American Anthropologist 88.2: 313–38.
Suchman, Lucille A. 1987. Plans and Situated Actions: The Problem of
Human-Machine Communication. New York: Cambridge University
Press.
Wallace, Anthony F. C. 1961. Culture and Personality. New York: Random
House.

SOCIOLINGUISTICS
Sociolinguistics is an interdisciplinary research area concerned
with covariance between language and social factors. Since its
beginnings, it has sought to explain how variably speakers use
language and what the patterns of the variation are. It is characterized both by the diversity of its subject matters and by the
heterogeneity of its approaches to them.
Despite their diversity, the different brands of sociolinguistics subscribe to the position that a language is inherently variable, and they reject the assumption that a grammatical system
applies uniformly to all speakers and all settings. The social factors usually invoked to account for the variation include gender
(see gender and language), age (see age groups), level of
formal education, and profession. The settings are distinguished
into categories, such as family gathering, in which vernacular
(i.e., colloquial) speech is the norm; medical appointments, at
which speakers do not hold the same status and do not speak
exactly the same variety; and university lectures, at which the
control of the floor is not equally distributed between instructor
and students, and the language is typically not vernacular. Much
of this boils down to a common distinction between formal and
informal settings, correlated roughly with the opposition standard versus colloquial, casual, and nonstandard speech, among
a host of other distinctions. Another common characteristic
of all the different brands of sociolinguistics is their emphasis
on empirical observation, rather than on elicitation; research
is usually focused on naturally occurring speech in everyday
situations.
In North America, a view of sociolinguistics has emerged that
tends to limit it primarily to variationist sociolinguistics, which
has focused on structural factors that bear on the speakers
choice from a range of competing variants, for instance, when the
copula is likely to be omitted in a vernacular utterance where it is
required in standard English. This tradition has led to a reclassification of approaches to the covariance of language and setting,
also known as the ethnography of communication, into linguistic anthropology. This may also have been encouraged by more
invocations of pragmatics to account for linguistic behavior. In
Europe, in contrast, the sociolinguistic landscape is far more varied. For example, though influenced by the variationist paradigm, James and Lesley Milroy have developed in the UK their
own brand of sociolinguistics, known as social network analysis
(SNA). They explain language change and maintenance through
the study of individuals' social ties within a given network. Networks vary in structure (dense and multiplex) and in the strength of their ties (weak vs. close-knit). Contrary to the received doctrine, SNA
argues that close-knit structure functions as a conservative force
that prevents linguistic change, whereas speakers with weak ties
within a network are most exposed to external pressure and,
therefore, will favor innovations. In France, on the other hand,
sociolinguistic theory has hardly been influenced by the variationist paradigm, although William Labov's books of the early
1970s were translated into French toward the end of the same
decade. A brainchild of the sociolinguistics school is praxematics
(hardly known outside France) developed in the 1970s by Robert
Lafont. It is primarily interested in the production of identity and
meaning, focusing "not only [on] the specific meaning produced
and its implications, but also [on] the conditions and dynamics of
the process that enables a speaker to construct meaning" (Lafont
[1978] 2003, 86). Within this framework, analyses are speaker or
author oriented and combine techniques of discourse analysis
(see discourse analysis [human]) with interactional sociolinguistics. Indeed, they study both oral and written practice.
Another distinction is often made nowadays between quantitative sociolinguistics, another name for variation analysis, and
qualitative sociolinguistics, which focuses almost exclusively on
the correlation among form, function, and sociological factors.
A quick examination of recent textbooks on or dictionaries of
sociolinguistics also reveals that there is no general agreement
on the discipline's boundaries. Classifications of what counts as
sociolinguistics vary according to author and school of thought.
For some, creole studies, discourse analysis, conversation
analysis, and language contact, for instance, belong in
sociolinguistics, whereas for others, they are separate fields of
investigation that happen to discuss topics that overlap with the
concerns of sociolinguistics. Discipline boundaries are often ideological, serving interests that are not always strictly academic.
Part of the fuzziness in the boundaries of sociolinguistics
stems from the history of its development since the 1960s and
the diversity of disciplines from which it has evolved.

Historical Landmarks
As in the history of many disciplines, it is a convergence of several
factors, both intellectual and social, that led to the development
of sociolinguistics as a separate field of inquiry. Nonetheless, one
should distinguish between the emergence of sociolinguistics as
a distinct discipline and linguists' awareness of the interrelation of linguistic and social phenomena. The latter is apparently
very old. According to Paul Kiparsky, the Sanskrit grammarian
Pāṇini (500 b.c.e.) was already aware of linguistic variation in
his description of stylistic preferences among variants (1979,
1). Several attestations of the social character of language can be
found throughout the literature of the late nineteenth and early
twentieth centuries, but they did not lead to any theorizing and
systematic analysis. Ferdinand de Saussure (1857–1913), the
initiator of modern structuralism, already stressed that language is a social fact, but he never pursued this line of research,
nor did his followers Antoine Meillet and Joseph Vendryes, cited
in Labov's early work of the 1960s.
May 1964 is often cited as the birth date of sociolinguistics,
when the UCLA Center for Research in Language and Linguistics
sponsored a conference on Sociolinguistics. Two years later,
the proceedings of the conference were published in a book
titled Sociolinguistics, edited by William Bright. The volume displays a wide range of approaches and lines of interest, including
the correlation of linguistic variation with social and contextual
factors, the functions and significance of language varieties
within a given community, and language attitudes, multilingualism (see bilingualism and multilingualism), and polyglossia. It closes with considerations of possible applications of
sociolinguistic research. Key actors of the early period who later
became the leaders of the field had been trained in different
intellectual traditions: anthropology, linguistics, sociology, and
social psychology. To date, each of these research areas has contributed to the diversity of the field and marked its development
in complementary directions. For instance, Dell Hymes, trained
as an anthropologist, developed the ethnographic approach.
Charles Ferguson and Joshua Fishman shaped what
is called sociology of language. Labov, a former student of Uriel
Weinreich, well known for his work on language contact and dialectology, developed variationist or quantitative sociolinguistics.
One may wonder whether American sociolinguistics would
have emerged and developed the way it did in its early stages if
it had not been for the strong influence of generative grammar in the late 1950s and in the 1960s. By approaching language
as intrinsically variable, sociolinguists were clearly opposing
the structuralists idea of a homogeneous grammatical system
inherited from Saussure and maintained by Noam Chomsky.
The national and international sociopolitical context of the early
1960s also played an undeniable role in the emergence of a discipline concerned with language as a social phenomenon. Crucial
sociopolitical questions involving concrete language issues arose
from the integration of ethnic minorities in American schools. At
the same time, the independence of former European colonies
prompted scholars to address issues of multilingualism and language planning (see language policy).

The Sociology of Language


The distinction between sociolinguistics and the sociology of
language did not exist at the beginning, and both terms were used
interchangeably. The boundary between the two research areas
is not easy to draw either. It replicates the distinction between
macro- versus micro-sociolinguistics, with the sociology of language listed in the first category and today's sociolinguistics in
the second. One may wonder if this distinction is really tenable
since it is based mainly on a matter of scale. Indeed, many studies straddle the two approaches.
Primarily associated with the work of Fishman, the sociology
of language is concerned with the correlation between language
as an entity and social factors at a macrolevel, such as a nation
(see nationalism and language) or ethnic group (see ethnolinguistic identity). Whereas sociolinguistics looks at
language structure and use in order to get a better understanding of society, the sociology of language looks through the lens of
social organization to understand language dynamics. The latter,
therefore, applies sociological research techniques and models
to the study of language. Among the subject matters associated
(nonexclusively) with the sociology of language are language
planning, bilingualism, language maintenance and shift, and
language conflicts.
While sociologists have reproached sociolinguists for their
naive conception of society and their uncritical adoption of disputable social categories (e.g., social class and gender) to explain
language dynamics, sociolinguists have in turn deplored the lack
of linguistic training among sociologists. Both allegations happen to be well founded and are prompting scholars interested
in understanding the correlation of language and social life to
develop a better integrated approach to both disciplines so as
not to reduce language to a caricature of society and vice versa.

Variationist Sociolinguistics
The distinction between quantitative/variationist and qualitative sociolinguistics is now well established, although these
approaches have increasingly been combined in recent work,
paving the way for derived research paradigms. A prominent
concern in variationism has been language variation and
change, also the title of its main journal, focusing on the extent
to which patterns of variation are indicative of the direction
of change in a language community and on the social factors
that bring about the relevant changes. Overall, statistical methods are used to uncover the distribution patterns of linguistic variants across groups of speakers. Extensive descriptions
have shown that linguistic variation is not only conditioned by
some linguistic environments but also correlated with social
parameters such as age, occupation/profession, and gender,
notwithstanding the relevant speakers particular interactional
networks.
Variationist studies also show that social stratification operates within a population through linguistic differentiation (e.g.,
the pronunciation of [r] in "fourth floor"). Labov's (1972) influential study on linguistic patterns and social stratification in New
York City showed that the pronunciation of postvocalic /r/ is a
marker of the highest social group. Some linguistic features not
only reflect membership in particular social classes and/or ethnic groups but are also associated with distinct social values (see
prestige), including stigmatization. The largest body of work
produced within this research paradigm is phonological,
although significant studies were produced on syntax, chiefly
on African American English, Caribbean English creoles, and
white American nonstandard English varieties.
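The basic quantitative move, cross-tabulating a binary variant against a social grouping and testing whether the observed distribution could be due to chance, can be sketched as follows. The counts are invented purely for illustration; they are not Labov's actual figures, and the function names are my own.

```python
import numpy as np

# Hypothetical counts of a binary variant (say, postvocalic [r]
# pronounced vs. absent) for three social groups.
#                 [r] present  [r] absent
counts = np.array([[90, 30],    # highest-status group
                   [60, 60],    # intermediate group
                   [25, 95]])   # lowest-status group

def variant_rates(table):
    """Proportion of the first variant within each group."""
    return table[:, 0] / table.sum(axis=1)

def chi_square(table):
    """Pearson chi-square statistic for independence of group and variant."""
    row = table.sum(axis=1, keepdims=True)
    col = table.sum(axis=0, keepdims=True)
    expected = row * col / table.sum()
    return float(((table - expected) ** 2 / expected).sum())

rates = variant_rates(counts)  # monotonic rates suggest stratification
stat = chi_square(counts)      # large statistic: variant covaries with group
```

Here the rates fall monotonically from the highest- to the lowest-status group, and the chi-square statistic far exceeds the 5 percent critical value for two degrees of freedom (5.99), the kind of pattern a variationist would read as social stratification of the variable. Actual variationist practice typically goes further, fitting logistic-regression-style variable rule analyses over many linguistic and social factor groups at once.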

Variationist analyses are based on vernacular speech, characterized by Labov as a less-controlled way of speaking in informal
settings, such as conversations at home with family or friends.
Preference has been given to nonstandard language varieties,
a tradition that is at variance with the original meaning of the
French word vernaculaire as the primary communication style
of a speaker, regardless of its level of formality. The emphasis on
spontaneous vernacular speech radically contrasts with the introspective and/or elicited data on which the generative school has
relied. Elaborate fieldwork techniques (e.g., informal interviews,
telephone surveys) have been developed to increase the size of
the corpus and maximize the reliability of data, while making
sure that the production of utterances is not influenced by the
presence of the interviewer/fieldworker. The problem addressed
by this latter precaution is termed the observer's paradox: How
can the researcher participate in or observe the production of
data without influencing the process itself? For instance, how
can he or she make sure that the subjects continue to communicate naturally in their vernacular without adjusting to the fact that they are being observed (by an outsider)?
The quantitative paradigm has also come under criticism
especially for assuming a simplistic, unrefined model of society, as well as for positing uncritically such categories as social
class and race/ethnicity, which are considered problematic in
sociology. Language is approached as a mere reflection of social
structures, leaving little space for speakers' personal distinctiveness. While providing extensive and fine-grained descriptions of
language variation, the variationist paradigm has not given an
accurate sociological account of how changes occur in a community, failing especially to distinguish between the initiators
and spreaders of change. These need not be the same individuals, and the factors that introduce change in a population are
not identical with those that spread it. Researchers working
on language variation and change have also been criticized for
not discriminating between normal variation, inherent to any
population of speakers, and changes in patterns of variation due
to the introduction or loss of some variants or caused by demographic changes. Only the latter kind of variation can be correlated with change, not the former.
The quantitative paradigm employed in variationist sociolinguistics focuses primarily on the structural factors that bear on
the distribution of variants and, secondarily, on social factors
that frame the linguistically relevant phenomena. In contrast,
qualitative sociolinguistics (including interactional sociolinguistics and related approaches, such as ethnography of speaking)
focuses mainly on the ways that various social parameters regulate how speakers use language and/or what it means to them.
Statistics play no role in this research paradigm.

Interactional Sociolinguistics
Associated primarily with the work of John Gumperz since the
early 1960s, interactional sociolinguistics seeks an understanding of the interpretive processes at work in face-to-face or group
interactions. One of the questions that it addresses is how conversationalists (i.e., interlocutors) use linguistic resources and
sociolinguistic knowledge in order to produce and interpret
discourse in context. Using direct empirical observations and
naturally occurring speech as their primary data, interactional
sociolinguists also draw on ethnographic and anthropological
research methods.
As argued by Gumperz (1982), the activity of speaking is inherent in that of interpreting. This implies that interpretation
and meaning are approached as collaboratively achieved by the
conversationalists in the course of their exchange. Information
is conveyed through multiple channels of verbal and nonverbal
coding, and the interlocutors resort to contextualization cues
(namely, lexical, grammatical, and prosodic features; code and
style shift; laughter, gaze, and so on; see also gesture), which
signal how a chunk of discourse should be interpreted.
Interactions are analyzed in relation to a multilayered context: a) the immediate context of production (e.g., the interactants, the setting); b) the intraconversational context as rendered
visible by the interactants using contextualization cues; and c) the
broader sociopolitical context in which the interaction is embedded. For instance, the interaction of a Middle Eastern traveler
with an American immigration officer in 2010 is undoubtedly
different from how it was before 2001. Attention to the broader
sociopolitical context is one of the salient trademarks of interactional sociolinguistics, which distinguishes it from conversation
analysis.
Social relations rely heavily on verbal or written interactions,
which depend on shared linguistic and cultural knowledge.
Failure to recognize these interdependencies leads to misunderstandings or breakdowns of the relevant interactions. As amply
documented by Gumperz and his associates, misunderstandings can be socially consequential for a speaker, who may be
denied access to some rights and benefits in society. This is particularly evident in social settings involving asymmetric relationships, for instance, a defendant versus a judge, a doctor versus a
patient, a job seeker versus an employer. An asylum seeker may
be denied asylum simply because his or her story does not meet
the expected narrative schema (see story schemas, scripts,
and prototypes). Although this shared linguistic and social
knowledge is often not explicitly verbalized during the interaction, it plays a crucial role in the way speakers display the social
identities they assume and ascribe new ones to their co-interactants, for example, competent versus noncompetent speakers.
Interactional sociolinguistics provides a useful framework of
analysis that helps interpret situated interactions by connecting them to broader social dynamics. Many scholars engaged in
this field seek to unveil the mechanisms of social inequality
that are partially maintained and, in some cases, even produced
through language.

The Ethnography of Speaking


Also known as the ethnography of communication, the ethnography of speaking is primarily associated with the work of Hymes,
the American anthropologist. Having its intellectual origins
in the interrelation among language, culture, and society, this
research paradigm is often classified within the discipline of
anthropology, although its history and development date back to
the early 1960s and are also closely linked to qualitative sociolinguistics. Five of the contributors to Brights Sociolinguistics had
also participated in Gumperz and Hymess The ethnography of
communication (a special issue of the American Anthropologist
in 1964).


The ethnography of speaking investigates the social functions


and meanings of language use within a speech community: how
speakers use their language resources (different codes and speech
styles) and what they mean to them. The activity of speaking is
approached as a socially situated action, which varies depending
on the social settings and culture in which it occurs. On the basis
of ethnographic accounts of language practice (including written
and historical discourse), the ethnography of speaking privileges
the insider's perspective on people's ways of speaking and the
different functions assumed by the latter. Language, therefore,
is approached primarily from the point of view of what people
make of it, what it means for them, and how it is perceived by
other members of the community.
The basic unit of analysis in the ethnography of speaking is
the speech event, defined as a particular cultural activity set by
specific rules and norms of language use that help distinguish,
for instance, a lecture from a job interview. Hymes identifies
eight major components in a speech event: setting, participants,
ends, acts, keys, instrumentalities, norms, and genre, which are
commonly referred to with the mnemonic acronym SPEAKING.
Speaking of participants, rather than speakers, extends the focus
of analysis from people verbally involved in a linguistic exchange
to all of the people having verbal or visual access to an interaction, including bystanders and overhearers. According to the
ethnography of speaking, even indirect participants play a role
in the construction of the speech event and may, by the mere fact
of being present, affect the unfolding of a linguistic exchange.
For instance, interactants who do not want to be understood
by potential overhearers may choose a specific code from their
repertoire that is not intelligible to bystanders. Hymes's model
of participation benefited greatly from the work of sociologist
Erving Goffman, who refined it by deconstructing the hearer–speaker dyad and by adding new categories of participants,
such as animator, author, and principal.
According to Hymes, language should not be explained from
the point of view of its grammatical structure only. To Chomsky's
notion of linguistic competence, based on an ideal speaker-listener's knowledge of grammatical structure, Hymes adds that
of communicative competence. For Hymes, the ability to speak
a language adequately requires, in addition to the knowledge of
grammatical rules, that of its principles of use: what to say to
whom, and how to say it in which context. Failure to acquire
communicative competence has social implications for the way
that the participants in a speech event perceive and (mis)understand each other.
By focusing on language function and use, the ethnography
of speaking seeks an understanding of the ways that language is
manipulated into social action. What needs to be investigated is
what speakers can(not) do with the language resources available
to them.

Sociolinguistics in the Age of Globalization


New interest in globalization as a research area for sociolinguistics has emerged since the early 2000s (see, e.g., the special issue
of the Journal of Sociolinguistics, edited by Nikolas Coupland
2003). The pervasive social transformations triggered by globalization make it almost inevitable for sociolinguists to want
to analyze its impact on speakers and, therefore, on language, following in the footsteps of the researchers on the growing area of language endangerment (see extinction of languages).
One of the central questions sociolinguists have addressed is how
adequately their traditional approaches can help them account
for sociolinguistic behavior in the new age of globalization. This
entails, among other things, rethinking our homogenizing and
flat model of society, as the world is marked by increasing population mobility through intra- and/or transnational migrations,
and as more sociocultural and political exchanges take place between
interacting individuals. National and ethnic identities are being
redefined, if not blurred, as are gender roles. As it addresses
issues arising from globalization, sociolinguistics faces new challenges likely to reshape accounts of the covariance of language
and social factors.
Cécile B. Vigouroux
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Barbéris, Jeanne-Marie, Jacques Bres, Robert Lafont, and Paul Siblot.
2003. Praxematics: A linguistics of the social production of meaning.
International Journal of the Sociology of Language 160: 81–104.
Bratt Paulston, Christina, and G. R. Tucker, eds. 2003. Sociolinguistics. Malden,
MA: Blackwell.
Coupland, Nikolas, ed. 2003. Sociolinguistics and globalization. Journal
of Sociolinguistics 7.4. (Special Issue).
Gumperz, John. 1982. Discourse Strategies. Cambridge: Cambridge
University Press.
Gumperz, John, and Dell Hymes. 1964. The ethnography of communication. American Anthropologist 66.6 (Special Issue): Part 2.
Hymes, Dell. 1974. Foundations in Sociolinguistics: An Ethnographic
Approach. Philadelphia: University of Pennsylvania Press.
Kiparsky, Paul. 1979. Pāṇini as a Variationist. Cambridge, MA: MIT
Press.
Labov, William. 1972. Sociolinguistic Patterns. Philadelphia: University of
Pennsylvania Press.
. 2001. Principles of Linguistic Change. Vol. 2. Social Factors.
Malden, MA: Blackwell.
Lafont, Robert. 1978. Le travail et la langue. Paris: Flammarion.
Mesthrie, Rajend, Joan Swann, Andrea Deumert, and William L. Leap.
2000. Introducing Sociolinguistics. Edinburgh: Edinburgh University
Press.
Milroy, Lesley. 1987. Language and Social Networks. 2d ed. Oxford: Basil
Blackwell.

SOURCE AND TARGET


This is the mapping of information from one linguistic or conceptual domain (source) onto another domain (target). With
metaphor, these terms, used together, have been employed
as alternatives to the dichotomy between metaphoric tenor and
vehicle introduced by I. A. Richards, typically understood as the
transfer of meaning from one word to another. Application of the
terms source and target, made popular by George Lakoff (see,
for example, Lakoff 1987, 276), restructures metaphoric mapping as a conceptual and not merely a linguistic phenomenon.
As used in translating text, source refers to the original language
one is attempting to faithfully reproduce in a different (target)
language.
Albert Katz

WORKS CITED AND SUGGESTIONS FOR FURTHER READING


Lakoff, George. 1987. Women, Fire, and Dangerous Things: What Categories
Reveal About the Mind. Chicago: University of Chicago Press.
Richards, I. A. 1936. The Philosophy of Rhetoric. London: Oxford
University Press.

SPECIFIC LANGUAGE IMPAIRMENT


Specific language impairment (SLI) applies to individuals with a
significant deficit in spoken language ability without accompanying problems such as hearing impairment, low nonverbal intelligence, or neurological damage. Symptoms are most obvious in
the early years of development. However, many individuals with
SLI continue to exhibit weaknesses in language ability during the
adolescent and early adult years. Along with subtle problems in
spoken language, significant deficits in reading ability are often
experienced by these individuals as well. These lingering difficulties may take their toll over time; the development of friendships
and success in finding employment are often adversely affected.
The prevalence of SLI in adulthood is not known. However, at age five, the figure may be as high as 7 percent, based on an epidemiological study (Tomblin et al. 1997).
For many children, SLI appears to have a genetic basis. For
example, studies of twins indicate that the concordance rate of
SLI is larger in monozygotic (identical) twins than in same-sex
dizygotic twins (Bishop, North, and Donlan 1995). However,
the most recent evidence suggests that any genetic source is
likely to be multifactorial, in which several genes, each with a
relatively small effect, operate in combination to produce SLI.
Furthermore, current evidence reveals unexpected similarities
between the genetic sources of SLI and autism with accompanying language disorder; these similarities are not seen between
SLI and autism without accompanying language impairment
(Tager-Flusberg 2004). Genetic studies also suggest that SLI and
developmental dyslexia are overlapping but not identical conditions (Bishop and Snowling 2004).
Two factors pose challenges to the study of SLI: (1) the considerable heterogeneity, among children acquiring the same language (e.g., English), in the symptoms of the language disorder, and (2) the systematic differences in the language symptoms
of SLI that hold across languages. The within-language heterogeneity has led to attempts to create subtypes within the SLI
population (Leonard 1998). These include pragmatic language
disorder and grammatical SLI. Although such terms accurately
describe the most salient symptoms of many children with specific language impairment, the evidence that they constitute
discrete subtypes, rather than common patterns along a continuum, is not yet firm. Of the common symptoms, the most
frequently encountered appears to be a moderate-to-severe
deficit in morpho-syntax with a milder deficit in semantic
and phonological areas.
The cross-linguistic differences that hold among children
with SLI seem to rule out simplistic explanations for their language difficulties. For example, in inflectionally rich languages
such as Italian and Spanish, children with SLI do not show special difficulties with present tense verb inflections, even though
such difficulties are seen in Germanic languages (Leonard
1998). Thus, blanket proposals that these children fail to grasp grammatical agreement (or number, or person in particular) would seem to be quite incorrect. One general cross-linguistic observation that seems true about SLI is that the most fragile areas of each language, as defined by late age of emergence and attainment of mastery by typically developing children, seem to be especially problematic for children with SLI acquiring that language. That is, the uneven profiles of development seen in any language (such as the telegraphic look of young English-speaking children's sentences, containing very few inflections)
are exaggerated in SLI.
The notion that a language impairment can exist in the absence
of other obvious difficulties has clear implications for theories of
language development, as well as for linguistic theory in general.
For example, if some children acquire language very slowly in
the absence of such factors as hearing impairment, intellectual
deficits, or neurological damage, any theory of language learning
must contain mechanisms that can accommodate this less typical (and less adaptive) manner of acquiring language. In terms
of linguistic theory, a condition such as SLI seems to lend support to the view that language is an autonomous system. There
is currently some debate on this issue, as studies of nonlinguistic
processing have revealed subtle but reliable differences between
individuals with and without SLI, even though language is the
area of greatest concern in this type of disorder (Leonard 1998).
One example of a recent account of SLI centers on the inconsistent use of tense and agreement morphology by these individuals, especially during the preschool years. According to this
proposal, children with SLI understand the notions of tense and
agreement, but they fail to grasp the fact that tense and agreement
are obligatory in main clauses (Rice, Wexler, and Hershberger
1998). Thus, these children might produce Mommy drive to
work every day, as well as Mommy drives to work every day. This
same period of treating tense and agreement as optional is seen
in typically developing young children, but this period is fleeting, rather than protracted as in the case of SLI. According to this
account, the extended period of optional use is attributable to a
maturational principle that is slow to take hold in these children.
Given both the prevalence of specific language impairment and
the theoretical and clinical need for a better understanding of
this disorder, SLI is likely to be the focus of intensive research
for many years.
Laurence B. Leonard
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Bishop, Dorothy, T. North, and C. Donlan. 1995. Genetic basis of specific
language impairment: Evidence from a twin study. Developmental
Medicine and Child Neurology 37: 56–71.
Bishop, Dorothy, and M. Snowling. 2004. Developmental dyslexia
and specific language impairment: Same or different? Psychological
Bulletin 130: 858–86.
Leonard, Laurence. 1998. Children with Specific Language Impairment.
Cambridge, MA: MIT Press. This book provides a thorough review of
SLI research from a variety of theoretical perspectives.
Rice, Mabel, and S. Warren. 2004. Developmental Language
Disorders: From Phenotypes to Etiologies. Mahwah, NJ: Lawrence
Erlbaum. This book offers insight into the possible genetic bases of
SLI and the challenges involved in defining the phenotype of this and
related disorders.


Rice, Mabel, K. Wexler, and S. Hershberger. 1998. Tense over time: The
longitudinal course of tense acquisition in children with specific
language impairment. Journal of Speech, Language, and Hearing
Research 41: 1412–31.
Tager-Flusberg, Helen. 2004. Do autism and specific language impairment represent overlapping disorders? In Developmental Language
Disorders, ed. Mabel Rice and S. Warren, 31–52. Mahwah, NJ: Lawrence
Erlbaum.
Tomblin, J. Bruce, N. Records, P. Buckwalter, X. Zhang, E. Smith, and
M. O'Brien. 1997. The prevalence of specific language impairment
in kindergarten children. Journal of Speech, Language, and Hearing
Research 40: 1245–60.

SPEECH-ACTS
Speech-acts are the basic building blocks of language activity. In
using language publicly to communicate one performs speech-acts.
To think privately, one (arguably) performs (internal) speech-acts.
The minimal moves of conversational exchange are self-standing
speech-acts performed with sentences, known as illocutionary
acts (Austin 1962; Searle 1969). The basic illocutionary acts
are assertions (describing, stating, concluding), directives (orders,
requests, suggestions, questions), commissives (threats, promises),
expressives (apologies, thankings, congratulatings), and declarations (baptisms, marriage pronouncements). There is also what
some theorists call the locutionary or propositional act: the act of
conveying some propositional content with a sentence. An illocutionary act is typically a part of a more encompassing act, which is
the production of some effect in the hearer in virtue of the illocutionary act: convincing or threatening. This is called a perlocutionary act. In addition to these speech-acts, we can also include
subsentential speech-acts: say, acts of referring, using proper
names, or predicates, if we think predicates refer.
In what follows, I consider the distinction between sense and
force, assertion, conditional illocutionary acts, locutionary acts,
the sentential speech-act of conventional implicature (distinguished from conversational implicature), performatives,
and the role of speech-act theory in understanding meaning.
Sense and Force

Fundamental to the orthodox understanding
of speech-acts is the distinction between sense and force (Stenius
1967; Searle 1969; Searle and Vanderveken 1985). On this view,
a sentence uttered in an illocutionary act, say, an assertion,
has two components of content: the proposition and its force.
Propositions are meant to be the following:
(a) The objects of propositional attitudes, as in what
subjects believe, fear, hope, and so on.
(b) The contents of sentences in embedded contexts, as in
either P or Q, where P and Q are not performed in illocutionary acts yet still have content.
(c) The common contents of mood-modified sentences. The
following sentences all have a common content, the thought
that Fred jumps: Fred jumps. Is Fred jumping? Fred, Jump!
(d) The ultimate objects of which ascriptions of truth or falsity are made.
In contrast, force is that aspect of meaning to do with the
deployment of sentences that possess propositional content.

Speech-Acts
Thus, assertoric force is one use of a sentence, say, in which we
intend to express a true proposition. Imperative force is that
where we intend that a proposition be made true by an audience. And so on. The force behind uttered sentences does not
necessarily correspond to their grammatical mood: declarative,
imperative, or interrogative.
What exactly are forces? We can focus that question in relation to assertion. Assertion is one kind of illocutionary act. For
Paul Grice (1958), illocutionary acts are essentially acts of communication: they involve communicative intentions, that
is, intentions directed toward an audience. These intentions are
reflexive in form: the speaker U aims to get an audience H to gain a certain state r (say, a belief, desire, or intention) by attempting to
get the audience to recognize this very intention. (See Grice 1969,
Strawson 1964, and Schiffer 1972 for examination of the structure
of such intentions.) It is not clear how the analysis should go, or
whether we should involve intentions in this way at all.
Assertion

The task, then, for an analysis of assertions according to Grice is to determine which state r U intends H to gain when U asserts something. Grice (1971) toys with two proposals:

a belief that P;
a belief that U believes that P.

But both are problematic. It is quite possible for U to assert that P but to be perfectly indifferent as to audience response. Say that Jan is paid to make announcements but has no concern about the epistemic states of her audience (see Alston 2000). Jan asserts that shoes are on sale this week, but does not intend either that her audience come to believe this proposition or that her audience come to believe that she believes it.

In the light of such difficulties, D. Armstrong (1971), K. Bach and R. M. Harnish (1979), and F. Recanati (1986) suggest that in assertion, the state that U intends his or her audience to possess is

that H has a reason to believe that U believes that P.

But even this is too much, since an assertor may be indifferent as to whether audiences come to have reasons to believe that he or she believes that P.

Grice's assumption that assertion is essentially an act of communication, and so involves intending audiences to have certain kinds of doxastic or epistemic state, is open to dispute. Perhaps assertions are acts of expressing beliefs or aiming to utter truths. But these proposals face the problem of insincerity. One needs to talk, rather, of representing oneself as intending to utter a truth or express belief. That is because assertions can be insincere. The insincere asserter does not express belief or aim at truth. Rather, there is only the semblance thereof. But illuminating representing-oneself-as is difficult (see Pagin 2005). Furthermore, even if we could illuminate this representing-as, there is the question of what expressing or aiming at truth is. All sorts of speech-acts can be said to involve expression of belief. In using the name George Bush in a context in which I am taken to sincerely refer, I express my belief that Bush exists, but I do not assert that he exists. The view that a sincere, clear-headed assertion involves an intention to utter a truth, which amounts to an intention of uttering a truth-apt sentence that is true, leads us to the question of what the speech-act of uttering a truth-apt sentence is. (More of that in the next section.)

Another thought is to invoke normative states. R. Brandom (1983) contends that

U asserts that S if and only if (i) U obligates him- or herself to justify S, if asked to, and (ii) permits speakers to use S as a premise in arguments.

Brandom's analysis, it seems, deals with speakers who are indifferent to the communicative effect of their utterance and who may be insincere. But there are worries. One is circularity: Both (i) and (ii) specify conditions that really make reference to the assertion of S, and it is not entirely obvious how this reference can be removed. Brandom seems to suggest some kind of mutual delineation between inference and assertion, but it is not obvious that this works. Another concern is what a speaker does in order to obligate him- or herself. Merely uttering a sentence with a certain tone of voice does not do that. A theory of what the speaker does might amount, in itself, to a theory of assertion.

Evidently, assertoric force has yet to be convincingly clarified. (See Pagin 2004, 2005 for more discussion of the challenges.)

Conditional Illocutionary Acts


Illocutionary acts are not meant to be embedded in logical compounds (see Dummett 1981). In a silent room, we ask:
(i) If it is raining outside, why can't we hear it?

(i) is not equivalent to a question about a conditional:


(ii) Why is this true: If it is raining outside, we cannot hear it.

The answer to (ii) is that there is no sound of rain. So, does utterance of (i) comprise a question embedded in a conditional? That cannot be. To successfully perform a question with Why can't we hear the rain outside? one must believe that we cannot hear the rain outside, which, in this context, requires that one believe there is rain outside. Rather, in (i) the if-clause provides a proposition from which the presupposition of the question why can't we hear the rain? can be derived. The interrogative sentence is in the scope of if, but no question is performed with the (consequent) sentence.
What speech-act are we performing? The natural thought is that we are performing a conditional question, that is, a conditional illocutionary act. One concept of conditional illocutionary acts is that an illocutionary act A conditional on P is an act that amounts to an illocutionary act A if P, and otherwise is no illocutionary act at all (see Belnap 1973). An alternative idea is that in conditionals like (i), the speaker performs a proto-illocutionary act in the scope of a supposition. Proto-illocutionary acts are kinds of precursors to illocutionary acts that can embed, but are not illocutionary acts in the sense described here (see Barker 2004).

Locutionary Acts and Truth-Apt Sentences


Assertions do not embed in logical compounds like disjunction or negation. According to orthodoxy, in uttering either P or
Q, U performs locutionary acts with P and Q. Locutionary acts
are meant to provide the pure said-content, or sense, separated
from any assertoric force. Such embedded sentences can be

787

Speech-Acts
truth-apt they can be judged true or false even though they
are not asserted. The standard view is that truth-apt sentences
are sentences encoding propositions. By propositions we mean
entities whose intrinsic natures are separate from pragmatic factors, and encoding is that semantic relation binding propositions
to sentences. The problem with this idea is that orders, arguably,
also encode propositions (see Price 1988). So why are orders not
truth-apt?
One idea is that a sentence is truth-apt just in case it encodes a
proposition, is declarative in mood, and embeddable in an unrestricted way in logical compounds. Sentences produced as orders
cannot be antecedents of conditionals or subject to negation, and
so aren't truth-apt (Wright 1992). But many sentences,
say, rhetorical questions, are truth-apt but do not embed and are
not declarative. And some sentences, such as epistemic modals
like There may be life on Mars, do not happily embed as the antecedents of conditionals or in the scope of negation.
Another idea, suggested by W. P. Alston (2000), is that declarative sentences represent, in a constituent-isomorphic way,
the structure of the proposition they encode, whereas imperatives do not. But any such theory would seem to be misguided.
Perfectly meaningful declarative sentences can fail to be truth-apt because they are used to perform orders or performatives
(see later section). Interrogative sentences can be used to make
assertions and be truth-apt. This suggests that truth-aptness is
more a matter of pragmatic features than syntactic features and
the supposed internal composition of propositions. Thus, the
exact speech-act form of utterances of sentences that we call
truth-apt is far from clear.

Conventional Implicature
Conventional implicature introduces the prospect of another
kind of sentential speech-act. In asserting a declarative sentence,
two contents may be produced: what is said, and what is not said
but merely indicated. Moreover, the distinction between what is
said and what is indicated can occur with respect to the conventionally determined content of the sentence. Thus, in uttering He
is an Englishman. He is therefore brave, the speaker says that he is an Englishman and that he is brave, and indicates, through therefore, that his being brave follows from his being an Englishman.
The non-said content in such sentences is conventional implicature (see Grice 1971). Expressions that introduce content of this
second kind include particles like even, but, therefore, and so on.
Implicated content makes no contribution to truth conditions.
Thus, consider the following:
(iii) Even Elvis was famous.
(iv) Even the actual world is actual.

Both sentences, assuming standard background, are slightly weird due to even. Nevertheless, we don't judge them false. But, then, we are not greatly inclined to say that they are true either.
The hesitation to call (iii) and (iv) true is not due to the presence of a truth-value gap brought about by semantic presupposition. That is because the thoughts conveyed by these sentences (that Elvis is famous and that the actual world is actual) do not depend on the content of the implicatures.
Sentences are true if and only if what they say is true. These sentences say something, that something is true, so they are true. Why are we then disinclined to assert that (iii) and (iv) are
true? A good answer is that truth ascriptions conversationally
implicate that the pragmatic presuppositions of the sentences
to which truth is ascribed are met. The conventional implicatures fail in the case of (iii) and (iv), and so we are disinclined
to assert that they are true.
There are those who deny that conventional implicatures
really exist. Bach (1999) argues that implicature-bearing sentences are simply sentences that encode more than one proposition, sentences like (v):
(v) The Dalai Lama is a monk who, like Mother Teresa, is pious.

Bach points out that we hesitate in our truth evaluations of sentences like (vi), in which the subsidiary proposition is judged
false:
(vi) The Dalai Lama is a monk who, unlike Mother Teresa, is pious.

According to Bach, this is, rather, like our hesitation in the face
of sentences like (iii) and (iv). So, implicature-bearing sentences
are just a species of multipropositional sentences. That suggests
that we do not need any special speech-act for conventional
implicature.
The argument does not work (Barker 2003). Consider sentences like:
(vii) If the Dalai Lama is a monk who, unlike Mother Teresa, is pious, then the Dalai Lama is unlike Mother Teresa in being pious.
If the consequent of (vii) is false because Mother Teresa is pious,
we must conclude that the antecedent is false, contrary to Bach's
claim. The same kind of argument cannot run for conventional
implicatures because they do not unpack in this way in conditionals. Witness the oddness of
(viii) If even Granny got drunk, then Granny is a surprising instance of drunkenness.

Its oddness can be traced to the fact that what is implicated in the
antecedent is transferred into said-content in the consequent.
If Bach's claims are false, we are left with the prospect that
conventional implicature is a different kind of sentential speech-act, one whose nature is still not clarified (though see Barker in
press).

Performatives
One of the original topics of speech-act theory was performative utterances. These were described by J. Austin (1962) as
speech-acts in which speakers did things, rather than merely
said things. The following sentences can be used to issue
performatives:
(ix) I order you to leap.
(xi) I declare these games open.

The use of these sentences as performatives is that where the speaker carries out some act, ordering or making a declaration,
through utterance of the sentence. Naturally enough, to perform
such acts successfully, speakers must have certain social and epistemic statuses. There are detailed felicity conditions on performatives relating to these statuses (see Austin 1962 and Searle
and Vanderveken 1985).
Performatives are puzzling to some because they comprise
sentences that describe the very activity undertaken through
utterance of the sentence. In uttering (xi), I declare the games
open and, indeed, describe what I am doing in uttering (xi). That
suggests to some that performatives are truth-evaluable (see
Searle 1989). But one has to say that they do not appear intuitively to be truth-evaluable. Declarative sentences can be used in
nonassertoric ways. For example, one can order someone onto a
boat by uttering You shall be on that boat in two minutes. I am not
performing an assertion with this sentence. My utterance is not
truth-evaluable, even if it describes a future event of your getting
on the boat. The mood of a sentence is not always a guide to the
illocutionary act I perform with the sentence (see Strawson 1964;
Davidson 1984), and mere descriptive adequacy is not sufficient
for truth.

Meaning and Speech-Acts


Why study speech-acts? One reason is that they are simply what
we do with language and so of interest. But there is another
reason: We might think words have meaning because speakers
deploy them in speech-acts. There are distinct ways of understanding this idea. One is that speech-acts are the glue through
which thoughts are bound to words and sentences. The classic defender of this idea is John Locke ([1689] 1975), modern
defenders being Grice (1975) and later Alston (2000) and W.
Davis (2005). Another idea is that speech-acts are somehow
integral to propositional thought. One can see Brandom (1994)
as proposing this implicitly, and S. J. Barker (2004) proposing it
explicitly.
Stephen Barker
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Alston, W. P. 2000. Illocutionary Acts and Sentence Meaning. Ithaca, NY:
Cornell University Press.
Armstrong, D. 1971. Meaning and communication. Philosophical
Review 80: 427–47.
Austin, J. 1962. How to Do Things with Words, ed. J. O. Urmson and
G. J. Warnock. Oxford: Oxford University Press.
Bach, K. 1999. The myth of conventional implicature. Linguistics and
Philosophy 22: 327–66.
Bach, K., and R. M. Harnish. 1979. Linguistic Communication and Speech
Acts. Cambridge, MA: MIT Press.
Barker, S. J. 2003. Truth and conventional implicature. Mind 112: 1–33.
. 2004. Renewing Meaning. Oxford: Clarendon.
. Truth-bearers and the unsaid. In Making Semantics Pragmatic,
ed. Ken P. Turner. Cambridge, UK: Elsevier. In press.
Belnap, N. D. 1973. Restricted quantification and conditional assertion.
In Truth, Syntax and Modality, ed. H. Leblanc, 48–75. Dordrecht, the
Netherlands: North-Holland.
Brandom, R. 1983. Asserting. Noûs 17: 637–50.
. 1994. Making It Explicit. Cambridge: Harvard University Press.
Davidson, D. 1984. Essays on Truth and Interpretation. Oxford: Oxford
University Press.
Davis, W. 2005. Meaning and Expression. Cambridge: Cambridge
University Press.

Dummett, M. 1981. Frege: Philosophy of Language. 2d ed. London:


Duckworth.
Grice, P. 1958. Meaning. Philosophical Review 67: 377–88.
. 1969. Utterer's meaning and intentions. Philosophical Review
78: 147–77.
. 1971. Utterer's meaning, sentence-meaning, and word-meaning. In The Philosophy of Language, ed. J. Searle, 54–70. Oxford: Oxford
University Press.
Locke, J. [1689] 1975. Essay Concerning Human Understanding, ed.
P. Nidditch. Oxford: Oxford University Press.
Pagin, P. 2004. Is assertion social? Journal of Pragmatics 36: 833–59.
. 2005. Assertion. Stanford Encyclopaedia of Philosophy. Available
online at: http://www.plato.stanford.edu/.
Price, H. 1988. Facts and the Function of Truth. Oxford: Basil Blackwell.
Recanati, F. 1986. On defining communicative intentions. Mind and
Language 1: 213–41.
Schiffer, S. 1972. Meaning. Oxford: Clarendon.
Searle, J. 1969. Speech Acts. Cambridge: Cambridge University Press.
. 1989. How performatives work. Linguistics and Philosophy
12: 53558.
Searle, J., and D. Vanderveken. 1985. Foundations of Illocutionary Logic.
Cambridge: Cambridge University Press.
Stenius, E. 1967. Mood and language-game. Synthese 17: 25474.
Strawson, P. 1964. Intention and convention in speech acts.
Philosophical Review 73: 43960.
Wright, C. 1992. Truth and Objectivity. Cambridge, MA, and London:
Harvard University Press.

SPEECH ANATOMY, EVOLUTION OF


Anatomical structures supporting speech include the brain,
vocal tract, and associated tissues: the tongue, hyoid bone (the
U-shaped bone at the base of the tongue), larynx (voicebox),
and trachea (windpipe). To start at the top, identification of so-called language areas (such as Broca's area and Wernicke's
area) in endocasts of australopithecine brains has been used
to support an early evolution of spoken language (Tobias 1971),
although opinions differ as to the identification and significance
of these bumps. Other researchers argue that language is tied to
cerebral asymmetries, brain reorganization, or, more generally,
to enlargement of the brain or one or more of its components
(Holloway 1983).

Anatomy of the Vocal Tract


Although the brain is of primary importance for the production and understanding of spoken language, speech sounds are
filtered through the supralaryngeal vocal tract (SVT), the airway directly above the vocal cords. Rapid fluttering of the vocal
cords converts air into sound that is then filtered in the SVT.
Nonhuman primates possess an SVT configuration in which
the length of the horizontal tube (from the lips to the uvula) far
outstrips that of the vertical tube (from the palate to the vocal
cords), so that the tongue is largely restricted to the oral cavity.
In addition, apes and Australopithecus afarensis, a 3.5-million-year-old hominid (Alemseged et al. 2006), have hyoid bones
scooped out at the back to accommodate laryngeal air sacs. In
apes, these sacs empty into the ventricle of the larynx, hanging down into the neck on either side of the trachea (Hewitt,
MacLarnon, and Jones 2002). Spatial constraints in the neck
make it unlikely that ventricular sacs could be attached to a low
larynx, so that australopithecines likely possessed a hyoid near
the border of the lower jaw.
Living humans have a different SVT configuration, in which
the lengths of the horizontal and vertical tubes of the vocal tract
are equal (in a 1:1 proportion) and at a right angle to each other
(Negus 1949). This configuration develops by six to eight years of
age, by which time the tongue and larynx have descended into
the throat below the lower jaw (D. Lieberman et al. 2001). In
combination with a highly mobile tongue, this 1:1 SVT supports
the production, after childhood, of maximally stable, intelligible
quantal vowels that facilitate efficient spoken language (Buhr
1980; P. Lieberman 2006). The possession of a 1:1 SVT is only one
of the components necessary for spoken language. However,
inferences about the evolution of speech anatomy must take into
account not only the development of a brain capable of producing and understanding speech and language but also the development of an SVT capable of efficiently filtering those sounds.
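The filtering role of the SVT can be illustrated with the standard uniform-tube approximation from acoustic phonetics (a textbook sketch, not a result reported in this entry): a tube closed at the glottis and open at the lips resonates at odd quarter-wavelength frequencies, so an adult vocal tract of roughly 17 cm yields formants near 500, 1,500, and 2,500 Hz, close to those of a neutral, schwa-like vowel.

```python
# Illustrative sketch: formant frequencies of a uniform tube closed at
# the glottis and open at the lips, F_n = (2n - 1) * c / (4 * L).
# The 17 cm tract length and 35,000 cm/s speed of sound are assumed
# textbook values, not measurements taken from this entry.

def tube_formants(length_cm, n_formants=3, speed_cm_s=35_000.0):
    """Return the first n resonant frequencies (Hz) of a uniform tube."""
    return [(2 * n - 1) * speed_cm_s / (4 * length_cm)
            for n in range(1, n_formants + 1)]

print([round(f) for f in tube_formants(17.0)])  # [515, 1544, 2574]
```

A shorter tube, as in the longer horizontal and shorter vertical configuration of nonhuman primates, shifts these resonances upward, which is one way of seeing why tract geometry constrains the vowel space.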
So when did such an SVT arise? Unfortunately, the voicebox
is suspended from the skull by muscles and ligaments, and so its
position in the throat is difficult to reconstruct for fossil hominids
known only from skeletal remains. Nevertheless, research on the
basicranium (roof of the pharynx) suggests that Neanderthals
resembled monkeys, apes, and human newborns, with a hyoid
positioned at the base of the lower jaw (P. Lieberman and Crelin
1971; Laitman, Heimbuch, and Crelin 1979). Conflicting evidence from other anatomical regions suggests that Neanderthals
had a long tongue and low voicebox, like modern humans (e.g.,
Arensburg et al. 1990; Boë, Maeda, and Heim 2001). Recent
research indicates that Neanderthals, other archaic humans,
and the earliest members of our own species could not have fit
a 1:1 SVT capable of supporting quantal speech into their short
necks (McCarthy et al. in press).

Fewer Answers Than Questions


So when did an anatomical configuration capable of supporting speech and language arise? Conjectures based on the fossil
record can be used to support dates ranging between 2.5 million
and ~40,000 years ago. However, quantally based spoken language likely arose recently, within Homo sapiens about 100,000
years ago or later. One question that has generated the most
heated debate is whether Neanderthals possessed the capability
for speech. A definitive answer may be useful not only for determining if the uniquely human 1:1 SVT is an adaptation for quantal
speech but also for explaining why there is a seeming disjunction
between biological and behavioral evolution as recorded in the
archaeological record. Barring new evidence or approaches, we
must follow the advice of Ludwig Wittgenstein: "What we cannot
speak about we must pass over in silence."
Robert C. McCarthy
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Alemseged, Zeresenay, F. Spoor, W. H. Kimbel, R. Bobe, D. Geraads, D. Reed, and J. G. Wynn. 2006. A juvenile early hominin skeleton from Dikika, Ethiopia. Nature 443: 296–301.
Arensburg, Baruch, L. A. Schepartz, A. M. Tillier, B. Vandermeersch, and Y. Rak. 1990. A reappraisal of the anatomical basis for speech in Middle Paleolithic hominids. American Journal of Physical Anthropology 83: 137–46.
Boë, L.-J., S. Maeda, and J.-L. Heim. 2001. Neanderthal man was not morphologically handicapped for speech. Evolution of Communication 3: 49–77.
Buhr, Robert D. 1980. The emergence of vowels in an infant. Journal of Speech and Hearing Research 23: 73–94.
Hewitt, Gwen, A. MacLarnon, and K. E. Jones. 2002. The functions of laryngeal air sacs in primates: A new hypothesis. Folia Primatologica 73: 70–94.
Holloway, Ralph L. 1983. Human paleontological evidence relevant to language behavior. Human Neurobiology 2: 105–14.
Laitman, Jeffrey, R. C. Heimbuch, and E. S. Crelin. 1979. The basicranium of fossil hominids as an indicator of their upper respiratory systems. American Journal of Physical Anthropology 51: 15–34.
Lieberman, Daniel, R. C. McCarthy, K. M. Hiiemae, and J. B. Palmer. 2001. The ontogeny of postnatal hyoid and larynx descent in humans. Archives of Oral Biology 46: 117–28.
Lieberman, Philip. 2006. Toward an Evolutionary Biology of Language. Cambridge, MA: Belknap.
Lieberman, Philip, and E. S. Crelin. 1971. On the speech of Neanderthal man. Linguistic Inquiry 2: 203–22.
McCarthy, Robert C., D. S. Strait, F. Yates, and P. Lieberman. A recent origin for fully modern human speech capabilities. Proceedings of the National Academy of Sciences USA. In press.
Negus, Victor E. 1949. The Comparative Anatomy and Physiology of the Larynx. New York: Hafner.
Tobias, Phillip V. 1971. The Brain in Hominid Evolution. New York: Columbia University Press.

SPEECH-LANGUAGE PATHOLOGY
Speech-language pathology, as a profession, engages in the
research, diagnosis, and treatment of communication and
swallowing disorders. Speech-language pathologists (SLPs)
assess, diagnose, and treat disorders of articulation, voice, fluency, language, and swallowing. Additionally, they have a role
in the prevention of communication and swallowing disorders,
for example, via education, early identification and intervention,
and family counseling. SLPs collaborate with other health-care
professionals, including clinical psychologists, neuropsychologists, audiologists, physical therapists, occupational therapists,
nurses, neurologists, radiologists, and nutritionists.
Communication disorders may result from impairment in
any modality (spoken, written, or manual) and may occur
both receptively and expressively (e.g., Owens, Metz, and Haas
2007). Among the speech and language disorders studied and
treated by SLPs are language impairment in children (ranging, for example, from mild developmental delays to language
impairment secondary to autism spectrum disorders, mental
retardation, and hearing loss); articulation and phonological
problems (such as the inability to produce certain sounds or
the systematic substitutions of sounds); voice disorders (such
as vocal fold nodules and vocal fold paralysis); language and
speech impairment (aphasia, dysarthria) resulting from neurogenic causes (for example, degenerative disease, stroke, traumatic brain injury); disorders of fluency (i.e., stuttering); and
deficits resulting from head and neck cancers (such as those following laryngectomy). Swallowing disorders typically co-occur
along with neurological deficits like stroke or degenerative diseases, such as amyotrophic lateral sclerosis (ALS) or Parkinson's
disease. They also occur with head and neck cancers and with
developmental delays. Whereas hearing loss is diagnosed and
treated by audiologists, SLPs play a vital role in assessing and
treating the concomitant language, voice, and articulation problems. Cognitive (for example, mental retardation) and social
aspects of communication (for example, behavioral problems,
autism), as well as multilingualism (see bilingualism and
multilingualism) and a diverse cultural background, have
an impact on the individual and must be considered in terms of
treatment, diagnosis, and research.
Thus, the profession's scope is broad and covers the entire life
span from infancy to old age. SLPs are employed in a variety of
settings, including day-care centers (adult and child), rehabilitation centers, public and private schools, community and hospital
clinics, health-care agencies, universities, research laboratories,
and private practice.
In the United States, the American Speech-Language-Hearing Association (ASHA) is the professional, scientific,
and credentialing association for more than 123,000 members
nationally and affiliates internationally. Established in 1925,
ASHA maintains close affiliations with similar organizations in
the United Kingdom, Europe, and Asia. ASHA has set standards
for licensing and certification and for accreditation of academic
programs. A master's degree from an ASHA-accredited university program is required for ASHA certification and for a license
to practice in most states.

Assessment and Treatment


Typically, individuals seen by an SLP must first undergo an evaluation to determine the nature of the problem and its possible
causes and areas of strength and weakness. To detect whether
there is a communication problem, SLPs use standardized tests,
developed specifically for the disorder in question. Factors such
as the sensitivity and specificity of the tests, their reliability and
validity, and the population from which the norms were obtained
must be carefully assessed to determine which test is used and
how its results are interpreted (e.g., Peterson and Marquardt
1994; Spreen and Risser 2003). This is especially important when
evaluating individuals from culturally diverse populations to
assure that no bias occurs in the assessment process (Goldstein
2000).
The tests employed by SLPs may be based on a particular
diagnostic approach. For example, child language diagnosis
can be based on nonlanguage skills (such as pragmatic skills,
play skills); on clusters of symptoms that define a group or a
disorder, such as specific language impairment (SLI) or
autism spectrum disorder (ASD); on language skills alone (for
example, phonological, morphological, syntactic skills); or on
etiology, such as hearing impairment (e.g., Seymour and Nober
1998; Shames, Wiig, and Secord 2000). When skills are assessed,
one can look for delays by comparing the child's performance
to developmental norms or milestones, as well as for qualitative
differences, such as an unpredicted sequence of acquisition.
Similarly, in the diagnosis of aphasia, several approaches have
been proposed. In the clinical/neuroanatomical approach (e.g.,
Goodglass 1993), subtypes of aphasia are defined on the basis of
clinical data (for example, fluent vs. nonfluent speech production) and neuroanatomical data (for example, site of lesion).
By contrast, in the psycholinguistic approach (e.g., Caplan

1993; Kay, Lesser, and Coltheart 1996), areas of deficit are identified using linguistic theories of language processing, regardless
of anatomy (for example, disturbances of word meaning, disorders of sentence production). Finally, the functional approach
(e.g., Chapey 2002) emphasizes the effectiveness of communication, the importance of the setting and the interlocutors, and
the levels of intelligibility and functional language use. In the
evaluation process, SLPs consider aspects relevant to the specific population being evaluated (for example, academic skills
for school-age children; functional communication for adults
with aphasia).
Once a problem is detected, SLPs need to define the goals of
intervention. If treatment is warranted, it is initiated and may
include individual therapy, group therapy, or both. The duration of
treatment may vary from a relatively short term to a period of
months or even years, as with aphasia.
In providing speech-language treatment, SLPs strive to
improve the clients communication skills and facilitate the
interaction between speaker and listener. The clinician provides
verbal and visual stimuli (such as a question or a picture) to elicit
language production, and then prompts the desired language
behavior, modeling it if needed, and responding to the communication attempted by the client. SLPs provide clients with
continuous opportunities to communicate, to produce language
forms affected by the communication disorder, to increase the
use of desired language skills, and to decrease undesired language behaviors. Varying approaches to treatment have been
taken, targeting the aforementioned disorders. For example,
when treating articulation disorders, the clinician may adopt
the motor approach (e.g., Van Riper 1972) or the linguistic, phonological approach (e.g., Hodson 1992). Under the motor
approach, the treatment would aim to train the client to perceive
and produce the impaired sounds, whereas under the linguistic
approach, the treatment would address classes of sounds and
phonological rules, with the assumption that the impairment is
at the linguistic conceptual level.

Current State of Research


Research efforts in speech-language pathology have typically
focused on two main issues: the etiology and symptoms of the
disorder under investigation and treatment efficacy. Researchers
have developed and employed a variety of measures designed to
identify those areas of communication skills that are impaired
and those that are intact. Our understanding of a wide range
of communication disorders has thus advanced. For example,
research evidence has contributed to the identification of specific aspects of language (e.g., morphological processing,
semantic organization) that are compromised in children with
specific language impairment (e.g., Leonard 2000). Similarly,
research has contributed to the ability to evaluate the degree of
comprehension impairment of grammatical structures in individuals with nonfluent aphasia (e.g., Berndt 1991). Such efforts
often result in the development of new assessment tools. For
example, the Comprehensive Assessment of Spoken Language
Test (Carrow-Woolfolk 1999) examines semantic, morphological, syntactic, and pragmatic language use in children; the
Philadelphia Comprehension Battery for Aphasia (Saffran et al.
1987) tests the comprehension of varying sentence structures.
The second area of research, focused on the evaluation of
treatment efficacy, examines the extent to which improvement
in communication skills following intervention can be attributed to the treatment received by the individuals. Efficacy and
outcome measures have been defined as an increase in scores
on standardized tests and/or on the experimental measures
generated for the study, or as a measurable change in the client's and interlocutor's evaluation of the client's communication abilities (e.g., Ross and Wertz 1999). Moreover, in addition
to assessing treatment efficacy, the goal of treatment studies is
to generalize the results from the studied sample to the larger
population. This generalizability of the results is a crucial aspect
of the treatment evaluation. Whereas research studies have provided evidence associating SLP treatments with improvement
in communication, some of the studies present methodological
concerns that may render the findings inconclusive (Thompson
2006). Variables that contribute to inconsistent results of treatment efficacy include the specific measures used to demonstrate
the change following treatment, the comparability of the control
group(s) employed, and the influence of extraneous factors (for
example, education, intelligence, motivation, and amount of
language use).

Evaluation of the Research


With the growth of speech-language-pathology research, two
central topics have emerged: the question of group studies
versus studies of individual cases and the concept of evidence-based practice.
GROUP STUDIES VERSUS INDIVIDUAL STUDIES. Early reports of
description and treatment of communication disorders typically
involved case studies (e.g., Broca 1861; Van Riper 1953). Due to
the great interindividual variability among children and adults
who exhibit speech, language, or communication impairments,
case studies were deemed appropriate to the examination of
these deficits. The disadvantage of the case-study approach,
however, is that one cannot generalize results from an individual case to the larger population. With the goal of generalizing
results of assessment and treatment in SLP research and with
the advances of statistical methods, researchers have begun to
conduct studies that examine groups of individuals (e.g., Basso,
Capitani, and Vignolo 1979; German 1984). One major challenge
to group studies is defining the criteria by which individuals
might be grouped together. And indeed, the group comparison
approach has encountered extensive criticism, mainly due to
the consequences of great intragroup variability. For example,
many between-group studies of the efficacy of SLP treatment
have suggested no positive outcomes, potentially due to the heterogeneity of the groups and variability in individuals' responses
to the treatment. In addition, the selection of appropriate control groups poses an added hurdle to group studies. A closely
matched control group would mean individuals with similar
communication disorders. Ethically, however, SLPs would not
opt to withhold treatment from any individual. Furthermore,
individuals who are unable to participate in the treatment program or are uninterested in doing so might differ in crucial ways
from the individuals who are included in the experimental group
(for example, levels of motivation, levels of social interaction).
An alternative approach to the study of individuals with communication disorders is the single-subject experimental design
(e.g., Kearns 2000). Here, studies might include a single participant or several single cases. The participants serve as the subjects
and their own control, as phases of alternative treatment methods are compared and contrasted. Treatment can be tailored to
the individual in question, and the individual's gains under each
treatment program are evaluated. Following several experimental cases, the treatment outcomes for the individual participants
across the treatment phases are integrated and examined. This
approach has proved efficient in assessing treatment methods
by considering data from a number of individuals without compromising the results by collapsing differing individuals into
participant groups. If generalization is found in repeated administration of the treatment with the same individual and across
individuals from a relatively similar population, predictions can
be extended to the larger population.
EVIDENCE-BASED PRACTICE. In the twenty-first century, leaders
in the field of SLP and communication disorders have begun to
emphasize the importance of incorporating research evidence
into the process of making clinical decisions and treatment
choices (ASHA 2005). The goal of evidence-based practice (EBP)
is to integrate evidence from up-to-date, well-executed research
studies with clinical experience and intuitions when making
decisions about the most desirable treatment. EBP, accepted
worldwide, has been put forward with the intention of improving the intervention provided by SLPs. EBP essentially allows
the clinician to discard inefficient or unacceptable treatment
methods, even if endorsed by others, and to select appropriate
and useful intervention techniques that are well supported by
research. Whereas in earlier years experts' opinions and clinical traditions had been highly valued, the new EBP approach
has brought to the forefront the need for systematic research
evidence. Therefore, clinicians who embrace this approach
evaluate and interpret the best-available research data while
considering the client's preferences, environment, culture, and
values.
To maximize the efficiency of EBP, SLPs consider converging
evidence from a number of studies and from a number of cases.
They pay attention to the design of the research studies and to
their experimental control. When assessing the results of treatment studies, SLPs weigh the differing treatment approaches
employed in the study and assess the results by considering variables, such as the type of treatment provided, its length, its frequency, and the degree of feasibility and relevance of the research
evidence to the practice in question. Also useful are meta-analysis techniques, such as best-evidence analysis, which assess all
available evidence but attribute greater weight to the results of
better-designed and better-executed studies. In addition to the
quality of the research published, it is important to consider the
bias in published studies toward significant results (for example,
the finding of a significant difference between two groups is more
likely to be published than the finding of a lack of a difference),
as well as experimental, rather than observational, data. Thus,
important observations and findings might be missing from the
published literature. An increasing number of research studies
that provide data for evidence-based practice have characterized
SLP research in the last two decades (Academy of Neurologic
Communication Disorders and Sciences 2001).
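The weighting principle behind best-evidence analysis can be sketched numerically (the effect sizes and quality ratings below are invented for illustration; they do not come from any study cited in this entry):

```python
# Sketch of quality-weighted pooling in the spirit of best-evidence
# analysis: each study's effect size is weighted by a rating of its
# design quality, so well-executed studies dominate the pooled estimate.
# All numbers here are hypothetical.

studies = [  # (effect size, design-quality weight in [0, 1])
    (0.80, 1.0),   # well-controlled study
    (0.10, 0.3),   # weak design
    (0.65, 0.9),   # well-controlled study
    (0.20, 0.2),   # weak design
]

unweighted = sum(d for d, _ in studies) / len(studies)
weighted = sum(d * w for d, w in studies) / sum(w for _, w in studies)

print(f"unweighted mean effect: {unweighted:.2f}")  # 0.44
print(f"quality-weighted mean:  {weighted:.2f}")    # 0.61
```

Under this scheme a handful of poorly controlled null results cannot swamp stronger positive evidence, and vice versa, which is the intent of weighting by study quality.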
Mira Goral and Joyce West
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Academy of Neurologic Communication Disorders and Sciences. 2001.
Available online at: http://www.ancds.org/practice.html.
ASHA (American Speech-Language-Hearing Association). 2005.
Evidence-Based Practice in Communication Disorders: Position
Statement. Rockville, MD: ASHA. Available online at: http://www.
asha.org.
Basso, A., E. Capitani, and L. A. Vignolo. 1979. Influence of rehabilitation on language skills in aphasic patients: A controlled study. Archives of Neurology 36: 190–6.
Berndt, R. S. 1991. Sentence processing in aphasia. In Acquired Aphasia,
ed. M. T. Sarno, 229–67. San Diego, CA: Academic Press.
Bloodstein, O. 1987. A Handbook on Stuttering. Chicago: National Easter
Seal Society.
Bloom, L., and M. Lahey. 1988. Language Disorders and Language
Development. New York: Macmillan.
Broca, P. 1861. Perte de la parole. Ramollissement chronique et destruction partielle du lobe antérieur gauche du cerveau. Bulletin de la Société d'Anthropologie 2: 235.
Brookshire, R. 2003. Introduction to Neurogenic Communication
Disorders. St. Louis, MO: Mosby.
Caplan, D. 1993. Toward a psycholinguistic approach to acquired neurogenic language disorders. American Journal of Speech-Language
Pathology 2: 59–83.
Carrow-Woolfolk, E. 1999. Comprehensive Assessment of Spoken
Language. Circle Pines, MN: AGS.
Chapey, R. 2002. Functional communication assessment and intervention: Some thoughts on the state of the art. Aphasiology 6: 85–93.
Chapey, R., ed. 2001. Language Intervention Strategies in Aphasia and
Neurogenic Communication Disorders. Baltimore, MD: Lippincott
Williams and Wilkins.
Clark, H. M. 2004. Neuromuscular treatment for speech and swallowing: A tutorial. American Journal of Speech-Language Pathology
12: 400–15.
Davis, A. 2007. Aphasiology: Disorders and Clinical Practice. New
York: Pearson.
Elman, R. J. 2006. Evidence-based practice: What evidence is missing?
Aphasiology 20: 103–9.
German, D. 1984. Diagnosis of word-finding disorders in children with
learning disabilities. Journal of Learning Disabilities 17: 353–9.
Goldstein, B. 2000. Cultural and Linguistic Diversity Resource Guide for
Speech-Language Pathologists. San Diego, CA: Singular.
Goodglass, H. 1993. Understanding Aphasia. San Diego, CA: Academic
Press.
Goodglass, H., and E. Kaplan. 1983. The Assessment of Aphasia and
Related Disorders. Philadelphia: Lea and Febiger.
Hall, K. 2000. Pediatric Dysphagia Resource Guide. San Diego,
CA: Singular.
Hegde, M. N. 2006. Treatment Protocols for Language Disorders in
Children. 2 vols. Protocols Series. San Diego, CA: Plural Publishing.
Hodson, B. 1992. Applied phonology: Constructs, contributions
and issues. Language, Speech, and Hearing Services in Schools 23: 247–53.
Horton, S. 2006. A framework for description and analysis of therapy for
language impairment in aphasia. Aphasiology 20: 528–64.
Justice, L. M., and M. E. Fey. 2004. Evidence-based practice in
schools: Integrating craft and theory with science and data. The ASHA
Leader 4/5 (September 21): 302.

Kay, J., R. Lesser, and M. Coltheart. 1996. Psycholinguistic approach to language processing in aphasia: An introduction. Aphasiology 4: 97–101.
Kearns, K. P. 2000. Single-subject experimental design in aphasia. In Aphasia and Language: Theory to Practice, ed. S. E. Nadeau,
L. J. Gonzalez Rothi, and B. Crosson, 421–41. New York: Guilford.
Lahey, M., ed. 1998. Disorders of Communication: The Science of
Intervention. London: Whurr.
Law, J., Z. Garrett, and C. Nye. 2004. The efficacy of treatment for children with developmental speech and language delay/disorder: A meta-analysis. Journal of Speech, Language, and Hearing Research 47: 924–43.
Leonard, L. B. 2000. Children with Specific Language Impairment.
Cambridge, MA: MIT Press.
Logemann, J. A. 1998. Evaluation and Treatment of Swallowing Disorders.
2d ed. Austin, TX: Pro-Ed.
Owens, R. E., D. E. Metz, and D. E. Haas. 2007. Introduction to
Communication Disorders: A Lifespan Perspective. 3d ed. Boston: Allyn
and Bacon.
Peterson, H. A., and T. P. Marquardt. 1994. Appraisal and Diagnosis of
Speech and Language Disorders. Englewood Cliffs, NJ: Prentice Hall.
Robey, R. 1998. A meta-analysis of clinical outcomes in the treatment of aphasia. Journal of Speech-Language-Hearing Research
41.1: 172–87.
Ross, K. B., and R. T. Wertz. 1999. Comparison of impairment and disability measures for assessing severity of, and improvement in, aphasia. Aphasiology 13: 113–24.
Saffran, E. M., M. F. Schwartz, M. Linebarger, N. Martin, and P. Bochetto.
1987. The Philadelphia Comprehension Battery. Unpublished test
battery.
Seymour, C. M., and E. H. Nober. 1998. Introduction to Communication Disorders: A Multicultural Approach. Newton, MA: Butterworth-Heinemann.
Shames, G. H., E. H. Wiig, and W. Secord, eds. 2000. Human
Communication Disorders: An Introduction. New York: Macmillan.
Slavin, R. E. 1986. Best-evidence synthesis: An alternative to meta-analysis and traditional reviews. Educational Researcher 15: 5–11.
Spreen, O., and A. H. Risser. 2003. Assessment of Aphasia. New York:
Oxford University Press.
Thompson, C. 2006. Single subject controlled experiments in aphasia: The science and the state of the science. Journal of Communication
Disorders 39: 266–91.
Van Riper, C. 1953. Speech Therapy: A Book of Readings. New York:
Prentice-Hall.
. 1972. Speech Correction: Principles and Methods. 5th ed.
Englewood Cliffs, NJ: Prentice-Hall.
Zurif, E., D. Swinney, and J. A. Fodor. 1991. An evaluation of assumptions underlying the single-patient-only position in neuropsychological research: A reply. Brain and Cognition 16: 198–210.

SPEECH PERCEPTION
Speech perception researchers study how language users identify spoken language forms. Perceivers identify them on the basis
of acoustic and, sometimes, visual and lexical information.
Language forms enable speakers to make their linguistic
communications public. They include, among others, consonants and vowels (phones) that compose word forms and word
forms themselves.
Experimental psychologists' interest in speech perception
stemmed from research that began in the 1940s at Haskins
Laboratories, where Alvin Liberman and Franklin Cooper developed a reading machine for the blind (Liberman 1996, 28). It
was not feasible then for a device to read text aloud. Instead,
the machine scanned text and generated a unique non-speech
sound for each orthographic symbol: an acoustic alphabet.
Liberman and Cooper tried many symbol-to-sound mappings,
but the machine consistently failed to provide learnable acoustic
sequences.
One problem was rate. Unless sounds were sequenced so
slowly that the output lacked practical utility, they merged perceptually into a blur. Liberman asked: Why is speech perceivable
at faster rates?
Using tools for displaying speech visually (the sound spectrograph) and for transforming visual patterns to sound (the pattern
playback), Haskins researchers discovered that speech is not an
acoustic alphabet. Moreover, phones, apparently, lack invariant
acoustic signatures. The culprit is co-articulation: Talkers temporally overlap vocal tract (articulatory) gestures for serially nearby
phones.
This exposes how speech differs from acoustic alphabets but
does not explain why speech is easier to perceive. To discover
how speech is perceived, Haskins researchers initiated a research
program to explore listeners perceptions of simplified acoustic
speech patterns.
A provocative finding was categorical perception (Liberman
et al. 1957). An acoustic cue was varied in even steps from that
for one consonant-vowel syllable (in the first experiment, /be/,
"bay") to those for others (/de/ and /ge/). Listeners heard continuum members and identified the consonants, and they heard
three syllables in succession and decided which was different
(discrimination). Although syllables changed in equal acoustic
steps along the continuum, identification responses changed
abruptly. Additionally, discrimination was near chance when
the oddball syllable was identified as sharing its consonant
with the two identical syllables (e.g., all were identified as "bay").
Discrimination improved markedly when the oddball was from
a different category (e.g., "bay" vs. "day").
For Liberman, these findings supported his motor theory
of speech perception, which he had developed on the basis of
earlier, quite different, findings. According to the theory, listeners hear articulatory gestures, not the acoustic cues they
cause, and they achieve that percept because their speech motor
system is active in perception. Categorical perception supported
the theory, because although the acoustic cue varies continuously, the specified articulatory gestures change abruptly, like
the identification responses. A lip gesture produces /b/, a tongue
tip gesture /d/, and a tongue body gesture /g/. Within-category
discrimination is difficult because a given gesture type (e.g., lip)
has to be discriminated from another token of the same type.
Categorical perception intrigued other researchers but not
because they judged the motor theory plausible. In the theory,
consonant production is strictly categorical; accordingly, so
should perception be, but within-category discrimination is typically above chance. Thus, listeners do have perceptual access to
acoustic differences, contrary to the motor theory. For example,
Patricia Kuhl (1991) and Joanne Miller (e.g., Allen and Miller
2001) showed that listeners give distinct goodness ratings to
acoustically different members of the same category.
Another troublesome outcome for motor theorists derived
from research with animals. Kuhl and James Miller (1978) found
categorical-like perception of a speech dimension in chinchilla,
but chinchilla arguably cannot hear human speech gestures.
Findings like these led to an alternative class of speech perception theories, auditory theories, and a new account of categorical
perception. In most auditory accounts (e.g., Sawusch and Gagnon
1995, 635), speech signals are analyzed for their cues, which are
used to identify phones. Identification functions in categorical
perception studies are abrupt in these accounts because evidence from acoustic cues shifts abruptly from being more compatible with one phone to being more compatible with another.
Discrimination results occur because memory for the acoustic
signal is fleeting, and listeners have to depend on their phone categorizations. Across a phone boundary, listeners discriminate well
on the basis of the distinct phones they have identified; within a
category, however, the phones they have identified are the same.
Today, auditory accounts of speech perception continue to be
contrasted with gestural accounts. The motor theory of speech
perception's gestural account (e.g., Galantucci, Fowler, and
Turvey 2006) coexists with a direct realist gestural account that
rejects the motor theory's claim that speech motor activation
underlies gesture perception (Fowler 1986). Evidence favoring
auditory theories is provided by findings on speech perception
by nonhuman animals, such as the chinchilla study already
described, and on comparisons of speech and non-speech
perception.
As to the latter, Virginia Mann (1980) showed that identification of members of a /da/ to /ga/ continuum is affected by syllables /al/ and /ar/, such that more g identifications occur after
/al/. Her interpretation was compensation for coarticulation.
The front gesture for /l/, co-articulating with /g/'s gesture, pulls
it forward; /r/'s back gesture, co-articulating with /d/, pulls /d/
back. When listeners encounter ambiguous continuum members, they identify them as g in the context of /al/ but as d in the
context of /ar/. A problem with this gestural account is that /al/
and /ar/ can be replaced by a high tone matched to the ending
frequency of /al/'s third formant (F3) and a low tone matched
to the ending frequency of /ar/'s F3, and the effects on /da/–/ga/
perception are qualitatively unchanged (Lotto and Kluender
1998, 613–15). This cannot be compensation for co-articulation.
Possibly, it is auditory contrast, a context effect. The high tone or
the ending F3 of /al/ has a contrastive effect on the high F3 onset
of /da/, perceptually lowering it and making it more /ga/-like.
The low tone (or /ar/) has the opposite effect.
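The contrast account can be illustrated with a toy model (all frequencies, the boundary, and the repulsion coefficient are invented; real spectral contrast is not claimed to be linear): the perceived F3 onset of an ambiguous token is shifted away from the frequency of the immediately preceding context, whether that context is a syllable's F3 or a pure tone.

```python
def perceived_f3(actual_onset_hz, context_hz, repulsion=0.15):
    """Shift the percept away from the context frequency (assumed linear form)."""
    return actual_onset_hz + repulsion * (actual_onset_hz - context_hz)

def classify(f3_onset_hz, boundary_hz=2300.0):
    """High F3 onsets cue /da/; low onsets cue /ga/ (hypothetical boundary)."""
    return "da" if f3_onset_hz >= boundary_hz else "ga"

ambiguous = 2300.0   # a token sitting right at the /da/-/ga/ boundary
after_al = classify(perceived_f3(ambiguous, context_hz=2800.0))  # high F3 end of /al/
after_ar = classify(perceived_f3(ambiguous, context_hz=1800.0))  # low F3 end of /ar/
print(after_al, after_ar)
```

Because the shift depends only on the context's frequency, the model behaves identically for /al/ and for a tone at /al/'s F3, which is exactly the pattern the gestural account struggles with.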
Findings favoring gesture theories also include studies of (apparent) compensation for co-articulation (e.g., Fowler
2006). For spectral contrast to occur, an event earlier in time has
to bear an appropriate acoustic relation to an event later in time
(e.g., Lotto, Kluender, and Holt 1997, 1139). However, compensation for co-articulation also occurs under the opposite conditions
(a later event affects perception of an earlier one) and when relevant acoustic consequences of co-articulation are contemporaneous and affect the same acoustic variable (so that there can be
no context effect). Common to all instances is temporal overlap
of gestures for which listeners compensate.
Speech perceivers not only exploit acoustic sources of phonetic information but also use at least two other information
sources: visual and lexical. A seminal finding of Harry McGurk
and John MacDonald (1976) was that visible speech gestures

affect what listeners experience hearing. Researchers had already
shown that, in noise, speech is more intelligible if the listener can
see the speaker's face. However, the McGurk effect shows an
influence of facial speech gestures on perception of discrepant
noise-free acoustic speech signals. For example, acoustic /ba/
dubbed onto visual /va/ is heard as /va/.
This finding has fostered countless follow-ups. In a novel
study, H. Yehia, T. Kuratate, and E. Vatikiotis-Bateson (e.g.,
2002) tracked movements on the face during speech. Their network models learned to link these articulatory motions to the
associated acoustic signal. After learning, the network generated predicted acoustic signals from facial articulations that were
highly correlated with the real acoustic patterns. Therefore, there
is substantial phonetic information in facial speech gestures.
Unknown yet is how much of it is used by perceivers.
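The logic of the Yehia, Kuratate, and Vatikiotis-Bateson result can be sketched with fabricated data (a single facial measure, a linear mapping, and ordinary least squares here stand in for their multichannel recordings and network models): if acoustics are largely a function of facial motion, a mapping learned from paired training frames should predict held-out acoustics well.

```python
import random

random.seed(0)

# Fabricated data: one facial-motion measure per frame, and an acoustic
# measure constructed to be largely (not entirely) a function of it.
face = [random.gauss(0, 1) for _ in range(400)]
acoustic = [2.5 * f + random.gauss(0, 0.3) for f in face]

train_f, train_a = face[:300], acoustic[:300]
test_f, test_a = face[300:], acoustic[300:]

# Ordinary least squares on the training frames: slope = cov(f, a) / var(f).
mf = sum(train_f) / len(train_f)
ma = sum(train_a) / len(train_a)
slope = (sum((f - mf) * (a - ma) for f, a in zip(train_f, train_a))
         / sum((f - mf) ** 2 for f in train_f))
intercept = ma - slope * mf

pred = [slope * f + intercept for f in test_f]

def pearson(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs) ** 0.5
    vy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (vx * vy)

r = pearson(pred, test_a)   # held-out prediction quality
print(f"held-out correlation: {r:.3f}")
```

A high held-out correlation in such a setup shows only that the information is present in facial motion; as the text notes, whether perceivers use all of it is a separate question.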
It is known that viewers are especially sensitive to dynamic
information on the face. Lawrence Rosenblum and his colleagues
(e.g., Rosenblum and Saldana 1996) found audiovisual integration when they presented faces filmed in the dark with point-light patches on them dubbed onto acoustic signals. Motionless
point-light faces were not identifiable as faces. In motion, they
provided phonetic information to listeners. (The acoustic realm
provides an analogue. Robert Remez and his colleagues [e.g.,
Remez et al. 1981] showed that caricatures of acoustic signals
for sentences, with formant trajectories replaced by sinewaves
at the formants' center frequencies, are identifiable. Sinewave
speech lacks most traditional acoustic cues but, like point-light
faces, preserves dynamic phonetic information.)
Another source of information for speech perceivers is lexical. William F. Ganong (1980) published the seminal finding
here. His listeners heard, for example, an acoustic continuum
from giss to kiss. Listeners also heard members of a gift-to-kift
continuum. They identified more of the consonants as k in the
first continuum, that is, when the /k/ end of the continuum was
a word.
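The Ganong pattern can be sketched as a toy Bayesian combination (every number here is invented): acoustic evidence for /g/ versus /k/ is weighed together with a lexical-status prior, so ambiguous tokens are pulled toward whichever endpoint forms a real word.

```python
import math

LEXICON = {"kiss", "gift"}   # the only relevant words in this toy lexicon

def p_k_acoustic(step, boundary=3.5, slope=1.5):
    """Likelihood that the acoustics favor /k/ along a 7-step continuum."""
    return 1.0 / (1.0 + math.exp(-slope * (step - boundary)))

def identify(step, frame, lexical_weight=3.0):
    """Posterior odds for /k/ = acoustic odds times a lexical-status prior ratio."""
    pk = p_k_acoustic(step)
    prior_k = lexical_weight if ("k" + frame) in LEXICON else 1.0
    prior_g = lexical_weight if ("g" + frame) in LEXICON else 1.0
    odds = (pk / (1.0 - pk)) * (prior_k / prior_g)
    return "k" if odds > 1.0 else "g"

k_on_iss = sum(identify(s, "iss") == "k" for s in range(7))  # giss-kiss continuum
k_on_ift = sum(identify(s, "ift") == "k" for s in range(7))  # gift-kift continuum
print(k_on_iss, k_on_ift)
```

The same acoustic steps yield more "k" responses when -iss follows (kiss is a word) than when -ift follows (gift is a word), mirroring the boundary shift Ganong reported. The sketch is deliberately neutral on the architecture question discussed below, since either a top-down prior or a post-perceptual revision could produce this behavior.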
This experiment has spawned many follow-ups, including
a finding by Lawrence Brancazio (2004) that lexicality fosters
McGurk integrations. When acoustic bench is dubbed onto
video dench, the expected integration is dench, a non-word. If
the acoustic signal and video are besk and desk, respectively, the
expected integration is desk, a word. More McGurk integrations
occurred in the latter condition.
Debate has addressed the architecture of the speech perceptual system that allows lexical effects on phonetic identification to
occur. Most theorists assume that phones are identified, and
these identifications then support lexical access. The question
is whether there is also a top-down influence of lexical knowledge on phone perception. Given an initial phone that is ambiguous between /g/ and /k/ followed by iss, there is a word kiss in
the lexicon, but no giss. If there is top-down information flow,
then kiss can augment the evidence favoring /k/. Alternatively,
there need be no top-down flow. Rather, a decision that the consonant is g, but a poor token of it, can be revised to k when the
lexicon is later accessed. Whether there is a top-down flow of
information in speech perception remains subject to intensive
investigation (e.g., Norris, McQueen, and Cutler 2000; Samuel
and Pitt 2003).
Carol A. Fowler

WORKS CITED AND SUGGESTIONS FOR FURTHER READING


Allen, J. S., and J. L. Miller. 2001. Contextual influences on the internal structure of phonetic categories: A distinction between lexical status and speaking rate. Perception & Psychophysics 63: 798–810.
Brancazio, L. 2004. Lexical influences in audiovisual speech perception. Journal of Experimental Psychology: Human Perception and Performance 30: 445–63.
Diehl, R., A. Lotto, and L. L. Holt. 2004. Speech perception. Annual Review of Psychology 55: 149–79. This article provides an overview of speech perception focusing on the strengths of auditory theories and the weaknesses of gestural theories.
Fowler, C. 1986. An event approach to the study of speech perception from a direct-realist perspective. Journal of Phonetics 14: 3–28.
———. 2006. Compensation for coarticulation reflects gesture perception, not spectral contrast. Perception & Psychophysics 68: 161–77.
Galantucci, B., C. Fowler, and M. T. Turvey. 2006. The motor theory of speech perception reviewed. Psychonomic Bulletin & Review 13: 361–77. This paper provides a modern evaluation of Alvin Liberman's controversial motor theory of speech perception.
Ganong, W. F. 1980. Phonetic categorization in auditory word perception. Journal of Experimental Psychology: Human Perception and Performance 6: 110–25.
Kuhl, P. 1991. Human adults and human infants show a "perceptual magnet effect" for the prototypes of speech categories, monkeys do not. Perception & Psychophysics 50: 93–107.
Kuhl, P., and J. D. Miller. 1978. Speech perception by the chinchilla: Identification functions for synthetic VOT stimuli. Journal of the Acoustical Society of America 63: 905–17.
Liberman, A. 1996. Speech: A Special Code. Cambridge, MA: Bradford Books. This book presents a history of the pioneering research on speech perception by Alvin Liberman and his associates.
Liberman, A., F. Cooper, D. Shankweiler, and M. Studdert-Kennedy. 1967. Perception of the speech code. Psychological Review 74: 431–61.
Liberman, A., K. Harris, H. Hoffman, and B. Griffith. 1957. The discrimination of speech sounds within and across phoneme boundaries. Journal of Experimental Psychology 54: 358–68.
Liberman, A., and I. Mattingly. 1985. The motor theory of speech perception revised. Cognition 21: 1–36.
Lotto, A., and K. Kluender. 1998. General contrast effects in speech perception: Effect of preceding liquid on stop consonant identification. Perception & Psychophysics 60: 602–19.
Lotto, A., K. Kluender, and L. Holt. 1997. Perceptual compensation for coarticulation by Japanese quail (Coturnix coturnix japonica). Journal of the Acoustical Society of America 102: 1134–40.
Mann, V. 1980. Influence of preceding liquid on stop-consonant perception. Perception & Psychophysics 28: 407–12.
McGurk, H., and J. MacDonald. 1976. Hearing lips and seeing voices. Nature 264: 746–8.
Norris, D., J. McQueen, and A. Cutler. 2000. Merging information in speech recognition: Feedback is never necessary. Behavioral and Brain Sciences 23: 299–370.
Pisoni, D., and R. Remez, eds. 2005. The Handbook of Speech Perception. Malden, MA: Blackwell. This edited book provides an up-to-date overview of the field.
Remez, R., P. Rubin, T. Carrell, and D. Pisoni. 1981. Speech perception without traditional speech cues. Science 212: 947–50.
Rosenblum, L., and H. Saldana. 1996. An audiovisual test of kinematic primitives for visual speech perception. Journal of Experimental Psychology: Human Perception and Performance 22: 318–31.
Samuel, A., and M. Pitt. 2003. Lexical activation (and other factors) can mediate compensation for coarticulation. Journal of Memory and Language 48: 416–34.
Sawusch, J., and D. Gagnon. 1995. Auditory coding, cues and coherence in phonetic perception. Journal of Experimental Psychology: Human Perception and Performance 21: 635–52.
Yehia, H., T. Kuratate, and E. Vatikiotis-Bateson. 2002. Linking facial animation, head motion and speech acoustics. Journal of Phonetics 30: 555–68.

SPEECH PERCEPTION IN INFANTS


Speech perception presents a challenge to infants. For adults,
speech perception is partly a top-down process of comparing
acoustic input to stored representations (e.g., of known words).
Lacking the extensive experience of adults, infants need to identify the structure of the linguistic input, a task in which they make
rapid progress. At birth, infants' hearing is less acute than that
of adults but much closer to mature than their vision is. Indeed, infants
are sensitive to speech from the earliest moments after birth, and
even respond to speech heard prenatally (DeCasper and Spencer
1986).
Adults' speech perception is shaped by experience with
language: Adults primarily distinguish speech sounds that are
phonemes in their language (i.e., that imply a difference in meaning, like /d/ and /b/ in dog and bog). For example, Japanese
adults have difficulty distinguishing /r/ and /l/, a contrast that is
not phonemic in Japanese (Miyawaki et al. 1975). Young infants
distinguish more speech sounds than adults and are sensitive
to contrasts that are not phonemic in their language. Beginning
around six months, infants lose sensitivity to many contrasts
not used phonemically by their language (Best, McRoberts, and
Sithole 1988; Kuhl et al. 1992). While they lose sensitivity to contrasts not found in their language, their sensitivity to contrasts
their native language does use may increase: 11-month-olds
show stronger event-related potential (ERP) responses to native-language phonemic contrasts than do younger infants (Rivera-Gaxiola, Silva-Pereyra, and Kuhl 2005). Infants' discovery of the
categories of their native language may result from their detection of the distribution of sounds in the language (Maye, Werker,
and Gerken 2002).
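The distributional-learning idea of Maye, Werker, and Gerken can be sketched with invented numbers: exposure to a bimodal distribution along some acoustic dimension supports two categories, whereas a unimodal distribution supports only one. A tiny one-dimensional clustering learner illustrates the difference.

```python
import random
import statistics

random.seed(0)

def two_means(xs, iters=25):
    """Tiny 1-D k-means with k=2; returns the two cluster centers."""
    c1, c2 = min(xs), max(xs)
    for _ in range(iters):
        g1 = [x for x in xs if abs(x - c1) <= abs(x - c2)]
        g2 = [x for x in xs if abs(x - c1) > abs(x - c2)]
        c1, c2 = statistics.mean(g1), statistics.mean(g2)
    return c1, c2

# Exposure distributions along a made-up acoustic dimension (arbitrary units).
bimodal = ([random.gauss(-20, 5) for _ in range(200)]
           + [random.gauss(20, 5) for _ in range(200)])
unimodal = [random.gauss(0, 8) for _ in range(400)]

bi_c = two_means(bimodal)
uni_c = two_means(unimodal)
sep_bi = abs(bi_c[0] - bi_c[1])     # widely separated centers: two categories
sep_uni = abs(uni_c[0] - uni_c[1])  # centers collapse toward one mode
print(f"bimodal separation: {sep_bi:.1f}, unimodal: {sep_uni:.1f}")
```

The learner recovers well-separated centers only from the bimodal input, which is the toy analogue of infants forming a contrast only when their input is distributed bimodally.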
Languages differ in their prosody and in the sequences of
phonemes they allow within a word (phonotactics; for example, /tl/ violates English phonotactics). Infants show preferences for certain prosodic patterns from birth, primarily based
on the pitch and rhythm of speech (Cooper and Aslin 1990;
Nazzi, Bertoncini, and Mehler 1998). By nine months, they
prefer to listen to speech with the prosodic characteristics of
their native language (Jusczyk, Cutler, and Redanz 1993). By
the same age, English-learning infants also prefer words with
English-typical phonotactics over words typical of Dutch;
Dutch-learning infants show the opposite preference (Jusczyk
et al. 1993). Infants' experience with the sound patterns of
speech facilitates learning about other aspects of language,
as seen in their ability to segment words from fluent speech
(Thiessen and Saffran 2003). After English-learning infants
discover that English words predominantly receive primary
stress on their first syllable, they treat stressed syllables
as word onsets (Thiessen and Saffran 2007). Similarly, after
infants are familiar with the phonotactic structure of their language, their word segmentation is influenced by the likelihood
of phoneme sequences: they treat unlikely combinations
(such as /tl/ in English) as cues to a possible word boundary
(Mattys and Jusczyk 2001).
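The boundary-from-unlikely-sequences idea can be sketched with a toy, orthographic mini-lexicon (the words, the bigram statistics, and the threshold are all invented; real phonotactics operates over phonemes, not letters): pairs that never occur word-internally in the familiar vocabulary are treated as likely word boundaries.

```python
from collections import Counter

# Tiny "familiar vocabulary" and the letter pairs found inside those words.
words = ["pretty", "baby", "doggy"]
internal = Counter(w[i:i + 2] for w in words for i in range(len(w) - 1))

def segment(utterance, threshold=1):
    """Insert a boundary wherever the adjacent pair is rare word-internally."""
    out = [utterance[0]]
    for i in range(1, len(utterance)):
        if internal[utterance[i - 1:i + 1]] < threshold:
            out.append("|")
        out.append(utterance[i])
    return "".join(out)

print(segment("prettybaby"))
```

The pair "yb" never occurs inside any familiar word, so the learner posits a boundary there, recovering the two words from the continuous string.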
The speed of infants discovery of phonetic, phonotactic,
and prosodic properties of their native language has prompted
speculation that there may be special learning mechanisms
devoted to speech perception.
Early experimental data were consistent with this possibility, as infants listening to speech showed perceptual abilities
that were considered, at the time, to be unique to language.
For example, early research on categorical perception a phenomenon in which listeners distinguish more easily between
different phonemes (e.g., /b/ and /d/) than between two exemplars of the same phonemic category found that even very
young infants showed categorical perception for speech (e.g.,
Eimas et al. 1971). However, subsequent research has typically found that infants display similar processing for speech
and non-speech input (e.g., Jusczyk et al. 1983). This suggests
that speech is processed by domain-general mechanisms,
rather than mechanisms specific to speech. That said, speech
is a highly salient stimulus for infants: They prefer speech to
a wide variety of non-speech acoustic stimuli (Vouloumanos
and Werker 2004). Thus, even if infants learn about speech
and non-speech sounds using the same mechanisms, learning
about speech is likely to proceed differently because of infants'
attentional biases. Current research continues to explore how
much of infants' facility for speech perception can be attributed
to learning and how much is due to extraexperiential factors
such as attentional biases and constraints on their learning
mechanisms.
Erik D. Thiessen
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Best, C. T., G. W. McRoberts, and N. M. Sithole. 1988. Examination of perceptual reorganization for nonnative speech contrasts: Zulu click discrimination by English-speaking adults and infants. Journal of Experimental Psychology: Human Perception and Performance 14: 345–60.
Cooper, R. P., and R. N. Aslin. 1990. Preference for infant-directed speech in the first month after birth. Child Development 61: 1584–95.
DeCasper, A. J., and M. J. Spencer. 1986. Prenatal maternal speech influences newborns' perception of speech sounds. Infant Behavior and Development 9: 133–50.
Eimas, P. D., E. R. Siqueland, P. W. Jusczyk, and J. Vigorito. 1971. Speech perception in infants. Science 171: 303–6.
Jusczyk, P. W., A. Cutler, and N. J. Redanz. 1993. Infants' preference for the predominant stress patterns of English words. Child Development 64: 675–87.
Jusczyk, P. W., A. D. Friederici, J. Wessels, V. Y. Svenkerud, and A. M. Jusczyk. 1993. Infants' sensitivity to the sound patterns of native language words. Journal of Memory and Language 32: 402–20.
Jusczyk, P. W., D. B. Pisoni, M. A. Reed, A. Fernald, and M. Myers. 1983. Infants' discrimination of the duration of a rapid spectrum change in nonspeech signals. Science 222: 175–7.
Kuhl, P. K., K. A. Williams, F. Lacerda, K. N. Stevens, and B. Lindblom. 1992. Linguistic experience alters phonetic perception in infants by 6 months of age. Science 255: 606–8.
Mattys, S. L., and P. W. Jusczyk. 2001. Phonotactic cues for segmentation of fluent speech by infants. Cognition 78: 91–121.
Maye, J., J. F. Werker, and L. Gerken. 2002. Infant sensitivity to distributional information can affect phonetic discrimination. Cognition 82: B101–11.
Miyawaki, K., W. Strange, R. Verbrugge, A. M. Liberman, and J. J. Jenkins. 1975. An effect of linguistic experience: The discrimination of [r] and [l] by native speakers of Japanese and English. Perception & Psychophysics 18: 331–40.
Nazzi, T., J. Bertoncini, and J. Mehler. 1998. Language discrimination by newborns: Towards an understanding of the role of rhythm. Journal of Experimental Psychology: Human Perception and Performance 24: 756–66.
Rivera-Gaxiola, M., J. Silva-Pereyra, and P. K. Kuhl. 2005. Brain potentials to native and non-native speech contrasts in 7- and 11-month-old American infants. Developmental Science 8: 162–72.
Thiessen, E. D., and J. R. Saffran. 2003. When cues collide: Use of stress and statistical cues to word boundaries by 7- to 9-month-old infants. Developmental Psychology 39: 706–16.
———. 2007. Learning how to learn: Infants' discovery of stress-based strategies for word segmentation. Language Learning and Development 3: 75–102.
Vouloumanos, A., and J. F. Werker. 2004. Tuned to the signal: The privileged status of speech for young infants. Developmental Science 7: 270–6.

SPEECH PRODUCTION
Speech production is the most skillful action most of us perform.
It involves the coordinated actions of articulators of the vocal
tract, larynx, and respiratory system. Research issues focus on
the language units that are produced, the aims (acoustic or
articulatory) of making those units publicly available, and
the extent to which those aims are compromised by the tendency
that speakers have to co-articulate speech gestures (that is, to
overlap articulatory gestures temporally).

Language Forms
Language forms, that is, consonants and vowels and the larger
units they form, are the means that languages provide for making
linguistic utterances publicly available to other language users.
A major theoretical issue is to determine what they are, in two
senses. What kinds of things are they (cognitive, articulatory,
acoustic), and what are the units?
By most accounts, consonants and vowels are categories in
the minds of language users that are defined by their featural
attributes. For example, /b/ is a voiced, bilabial, stop consonant.
/i/ (as in the word keep) is a high, front, unrounded vowel. When
speakers talk, they implement those featural attributes as actions
of the vocal tract. The two lips close for /b/. This achieves the
bilabial place of articulation, and the complete closure at the lips
makes the consonant a stop. The vocal folds of the larynx open
and close to implement voicing. Except that the vocal folds are
abducted (rather than adducted), /p/ is made in the same way.
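The featural description above can be made concrete by representing phonemes as feature bundles (the inventory shown is deliberately simplified for illustration): /b/ and /p/ then differ on exactly one feature, voicing.

```python
# A sketch of phonemes as feature dictionaries (simplified feature inventory).
PHONEMES = {
    "b": {"voiced": True,  "place": "bilabial", "manner": "stop"},
    "p": {"voiced": False, "place": "bilabial", "manner": "stop"},
    "i": {"height": "high", "backness": "front", "rounded": False},
}

def feature_difference(p1, p2):
    """Return the shared features on which two phonemes disagree."""
    shared = PHONEMES[p1].keys() & PHONEMES[p2].keys()
    return {k for k in shared if PHONEMES[p1][k] != PHONEMES[p2][k]}

print(feature_difference("b", "p"))
```

Such a representation also makes sense of the speech-error data discussed below, in which one consonant tends to replace another that shares most of its features.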
A different, albeit controversial, idea is that consonants and
vowels are not categories in the mind; they are the public actions
that talkers engage in when they speak (e.g., Goldstein and
Fowler 2003).
As for the issue of what the units are, an important source of
information has been spontaneous errors of speech production
(e.g., Garrett 1980; Shattuck-Hufnagel 1979). These are errors
produced by people who are capable of errorless speech. They

appear to reveal many of the units proposed by linguists. Accordingly, errors can involve whole words (for example,
from Fromkin 1973, We have a laboratory in our own computer
when the intended utterance was We have a computer in our
own laboratory), but they also can involve individual consonants and vowels (morage in the fountains for forage in the
mountains). Rarely, they involve syllables and single features
(glear plue sky for clear blue sky; Fromkin 1973). However,
these units reveal themselves in other ways. Syllables appear to
serve as frames in speech production planning. When consonants
or vowels move in errors (as in morage in the fountains), they
preserve their position in syllables; that is, syllable-initial consonants move to syllable-initial positions, and final consonants
move to final positions. When one consonant or vowel substitutes for another, it tends to be featurally similar to the consonant or vowel it replaces.
For the most part, speech errors have been collected on the
fly by individuals with a pen and paper who transcribe errors that
they hear. The speech-error waters have recently been muddied, however, by evidence (Mowrey and MacKay 1990; Pouplier
2003) that, when collected in the laboratory by means that allow
articulation to be observed, speech errors look somewhat different, and far less tidy, than the errors collected by transcription. Whereas errors collected by transcription involve shifts in
position or substitutions of whole consonants, vowels, or words,
errors observed in the laboratory can be of components of segments (e.g., muscular activations for components of a consonant;
Mowrey and MacKay 1990) and of intrusions of lesser or greater
articulatory movements suggestive of a segment (Pouplier 2003),
and they do not necessarily preserve the phonotactic constraints
of the language. For example, in research by M. Pouplier (2003),
talkers repeatedly produced such word pairs as top cop, and their
errors often involved intrusion of a tongue body gesture (for the
/k/ in cop) during the tongue tip gesture for the /t/ in top. The
intrusive gesture varied in magnitude; in any case, it created a
consonant cluster (/tk/ or /kt/) that is disallowed in the language
of the participant speakers. Those findings, though important,
may not challenge the idea that consonants, vowels,
and larger units are psychologically real components of planning
for speech production.
In any case, researchers now focus on methods that do not
involve collecting speech errors but may provide converging
evidence for the units that error collections have suggested (e.g.,
Levelt, Roelofs, and Meyer 1999).

The Aims of Speech Production


Talkers produce actions of the vocal tract when they speak; those
actions causally structure air that passes through the vocal tract.
Are talkers aiming to produce gestures of particular kinds about
which the acoustic signal informs listeners, or are they aiming
to produce particular acoustic signals? Theorists disagree on this
issue.
On the one hand, in an account implemented as the Directions Into Velocities of Articulators (DIVA) model of speech production (e.g., Guenther, Hampson, and Johnson 1998), speakers
do not have good proprioceptive information about the positioning of articulators, particularly in the production of vowels. In
addition, different speakers produce /r/ differently but generate

797

Speech Production
similar acoustic signals. Finally, talkers sometimes compensate
for articulatory perturbations by using wholly new ways of preserving the acoustic characteristics of segments. This implies
that what talkers control is the acoustic signal. In this account,
then, acoustic-perceptual targets underlie the control of articulatory movements.
In the DIVA model of speech production, targets of speaking are normalized acoustic signals reflecting resonances of the
vocal tract (formants). The normalization transformations create
formant values that are the same for men, women, and children
despite formant differences in natural speech that reflect differences in vocal tract size. Formants characterize vowels and
sonorant consonants but not, for example, stop or fricative consonants; therefore, the model is restricted to an explanation of
just those classes of phones.
To learn a mapping from articulation to acoustics, the young
DIVA model, like young infants, babbles; that is, it produces consonant-vowel-like sequences (see babbling). Through learning,
the articulation-to-acoustics mapping is inverted so that acoustic-perceptual targets can underlie the control of articulatory
movements.
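The babble-then-invert idea can be sketched in miniature (this is not the actual DIVA model: the forward articulation-to-acoustics function below is invented, and nearest-neighbor lookup stands in for DIVA's learned inverse mapping): random babbles pair articulatory parameters with their acoustic outcomes, and the stored pairs are then used in reverse to find an articulation that hits an acoustic target.

```python
import math
import random

random.seed(0)

def forward(articulation):
    """Assumed articulation-to-acoustics map (a linear stand-in, not real physics)."""
    jaw, tongue = articulation
    f1 = 300 + 500 * jaw
    f2 = 900 + 1200 * tongue - 300 * jaw
    return (f1, f2)

# Babbling phase: explore random articulations and remember what they sound like.
memory = []
for _ in range(2000):
    artic = (random.random(), random.random())   # jaw, tongue in [0, 1]
    memory.append((artic, forward(artic)))

def invert(target_acoustics):
    """Reverse lookup: which remembered articulation sounded closest to the target?"""
    return min(memory, key=lambda m: math.dist(m[1], target_acoustics))[0]

target = (550.0, 1500.0)            # an acoustic-perceptual target (F1, F2 in Hz)
achieved = forward(invert(target))
print(f"target {target}, achieved ({achieved[0]:.0f}, {achieved[1]:.0f})")
```

The learner never stores the inverse function explicitly; dense babbling experience is enough to let acoustic targets drive articulatory selection, which is the core of the acoustic-target account.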
An alternative theoretical account is that articulatory gestures are the primary targets of speech production. This kind
of account is supported, for example, by a finding that talkers
compensate for articulatory perturbations that have no audible
acoustic consequences (Tremblay, Shiller, and Ostry 2003). In
that research, compensation also occurred in a silent speech
condition, which, of course, had no acoustic consequences at all.
These results appear inconsistent with a hypothesis that speech
targets are acoustic.
There is also a more natural speech example of preservation
of inaudible articulations. In an investigation of an X-ray microbeam database, C. Browman and L. Goldstein (1991) found
examples of utterances such as perfect memory in which transcription suggested deletion of the final /t/ of perfect. However,
examination of the tongue tip gesture for the /t/ revealed its
presence. Due to overlap from the bilabial gesture of /m/, however, acoustic consequences of the /t/ constriction gesture were
absent or inaudible.
In the task-dynamic model of speech production (e.g.,
Saltzman and Kelso 1987; Saltzman, Lofqvist, and Mitra 2000;
Goldstein and Fowler 2003), the aims of speakers are to establish
transient dynamical systems in the vocal tract that achieve the
constriction locations (e.g., at the lips for /b/) and degrees (e.g.,
complete closure for /b/) of intended consonants and vowels.
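The flavor of such a transient dynamical system can be conveyed with a critically damped point attractor (parameter values are illustrative only; the published task-dynamic model is considerably more elaborate): a tract variable such as lip aperture is driven toward its constriction target and settles there without oscillating.

```python
def simulate_gesture(target, x0=1.0, stiffness=100.0, dt=0.001, steps=1000):
    """Euler-integrate x'' = -k(x - target) - b x' with critical damping b = 2*sqrt(k)."""
    damping = 2.0 * stiffness ** 0.5
    x, v = x0, 0.0
    for _ in range(steps):
        a = -stiffness * (x - target) - damping * v
        v += a * dt
        x += v * dt
    return x

# Lip aperture starts open (1.0) and is driven to full closure (0.0) for /b/.
final = simulate_gesture(target=0.0)
print(f"final aperture: {final:.4f}")
```

Because the target is an equilibrium of the dynamics rather than a fixed movement trajectory, the same system reaches closure from any starting aperture, which is what gives such models their flexibility under perturbation.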

Co-articulation and Co-articulation Resistance


Talkers co-articulate when they speak. Co-articulation is defined
differently by different researchers, but essentially it is temporal
overlap of the articulatory gestures that implement successive
consonants and vowels. Co-articulation has often been viewed
as a destructive property of speaking (Hockett 1955; Ohala 1981)
because temporal overlap of gestures means that consonants
and vowels are implemented as vocal tract activity in context-sensitive ways.
Research has shown, however, that the context-sensitivity is
limited. For example, although /b/ is produced with different contributions of the jaw and the two lips, depending on the openness

798

of neighboring vowels that co-articulate with it, the lips invariably close when /b/ is produced. Thus, for theoretical accounts
in which speakers goals are articulatory, it appears that gestures
for consonants and vowels are realized by synergies or coordinative structures (e.g., Easton 1972) that achieve the gestures in
flexible, equifinal ways. Accordingly, when the jaw is perturbed
(prevented from raising) during production of /b/, within 20–30
ms of the perturbation, extra activation of the lip muscles occurs
so that the lips can do some of the work of lip closure that the jaw
otherwise would have performed (Kelso et al. 1984).
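Equifinality in such a lip–jaw synergy can be reduced to a toy arithmetic sketch (the contribution values and the jaw's preferred share are invented): closure is the sum of jaw raising and lip movement, so when the jaw is blocked, extra lip activity still achieves the same total closure.

```python
def achieve_closure(required=10.0, jaw_limit=6.0):
    """Let the jaw contribute what it can; the lips absorb the remainder."""
    jaw = min(required * 0.6, jaw_limit)   # jaw's preferred share of the task
    lips = required - jaw                  # lips make up whatever is left
    return {"jaw": jaw, "lips": lips, "closure": jaw + lips}

normal = achieve_closure()                    # jaw free to move
perturbed = achieve_closure(jaw_limit=2.0)    # jaw blocked: lips compensate
print(normal, perturbed)
```

The task-level goal (total closure) is invariant across the two conditions even though the contribution of each articulator differs, which is the defining property of a coordinative structure.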
Co-articulation is limited in another way that prevents it from
destroying the achievement of essential gestural concomitants
of consonants and vowels. Different consonants and vowels
exhibit different degrees of co-articulation resistance (Bladon
and Al-Bamerni 1976; Fowler 2005; Recasens 1984, 1985); that
is, resistance to co-articulatory overlap by neighbors. The degree
of resistance appears to reflect the extent to which the neighbors might prevent the segment's articulatory goals from being
achieved. For example, D. Recasens (1984) found that resistance
by consonants to co-articulatory overlap by vowels was correlated with the extent to which the consonantal constriction
required action of the tongue body, the primary articulator of the
neighboring vowels.
This research on co-articulation suggests that, viewed at an
appropriately coarse-grained level of description, gestures for
consonants and vowels are invariantly achieved despite co-articulation. Speech production is an elegant interweaving of
gestures for successive consonants and vowels that leaves their
essential properties intact.
Carol A. Fowler
WORKS CITED AND SUGGESTIONS FOR FURTHER READING
Bladon, R. A. W., and A. Al-Bamerni. 1976. Coarticulation resistance in English /l/. Journal of Phonetics 4: 137–50.
Browman, C., and L. Goldstein. 1986. Towards an articulatory phonology. Phonology Yearbook 3: 219–52.
———. 1991. Gestural structures: Distinctiveness, phonological processes, and historical change. In Modularity and the Motor Theory of Speech Perception, ed. I. G. Mattingly and M. Studdert-Kennedy, 313–38. Hillsdale, NJ: Lawrence Erlbaum.
Easton, T. 1972. On the normal use of reflexes. American Scientist 60: 591–9.
Fowler, C. A. 2005. Parsing coarticulated speech in perception: Effects of coarticulation resistance. Journal of Phonetics 33: 195–213.
Fromkin, V. 1973. Speech Errors as Linguistic Evidence. The Hague: Mouton.
Garrett, M. 1980. Levels of processing in speech production. In Language Production. Vol. 1: Speech and Talk, ed. B. Butterworth, 177–220. London: Academic Press.
Goldstein, L., and C. A. Fowler. 2003. Articulatory phonology: A phonology for public language use. In Phonetics and Phonology in Language Comprehension and Production: Differences and Similarities, ed. N. O. Schiller and A. Meyer, 159–207. Berlin: Mouton de Gruyter. This chapter summarizes the controversial idea that language forms are public actions of the vocal tract.
Guenther, F., M. Hampson, and D. Johnson. 1998. A theoretical investigation of reference frames for the planning of speech. Psychological Review 105: 611–33. This paper defends the idea that intended language forms are acoustic.
Hockett, C. 1955. A Manual of Phonetics. Bloomington: Indiana University Press.
Kelso, J. A. S., B. Tuller, E. Vatikiotis-Bateson, and C. A. Fowler. 1984. Functionally specific articulatory cooperation following jaw perturbation during speech: Evidence for coordinative structures. Journal of Experimental Psychology: Human Perception and Performance 10: 812–32.
Levelt, W., A. Roelofs, and A. Meyer. 1999. A theory of lexical access in speech production. Behavioral and Brain Sciences 22: 1–75.
Mowrey, R., and I. MacKay. 1990. Phonological primitives: Electromyographic speech error evidence. Journal of the Acoustical Society of America 88: 1299–1312.
Ohala, J. 1981. The listener as a source of sound change. In Papers from the Parasession on Language and Behavior, ed. C. Masek, R. Hendrick, R. Miller, and M. Miller, 178–203. Chicago: Chicago Linguistics Society.
Pouplier, M. 2003. Units of Phonological Encoding: Empirical Evidence. Ph.D. diss., Yale University.
Recasens, D. 1984. V-to-C coarticulation in C
