
Deep Misconceptions and the Myth

of Data-Driven Language Understanding


On Putting Logical Semantics Back to Work

Walid S. Saba
Everything in nature, in the
inanimate as well as in the animate
world, happens according to some
rules, though we do not always
know them

IMMANUEL KANT

I reject the contention that


an important theoretical
difference exists between
formal and natural languages

RICHARD MONTAGUE

One can assume a theory of the world
that is isomorphic to the way we talk
about it; in this case, semantics
becomes very nearly trivial

JERRY HOBBS
a spectre is haunting NLP
Early efforts to find theoretically elegant formal models for
various linguistic phenomena did not result in any noticeable
progress, despite nearly three decades of intensive research (late
1950s through the late 1980s). As the various formal (and in
most cases mere symbol-manipulation) systems seemed to reach
a deadlock, disillusionment with the brittle logical approach to
language processing grew, and a number of researchers
and practitioners in natural language processing (NLP) began to
abandon theoretical elegance in favor of attaining some quick
results using empirical (data-driven) approaches.

All seemed natural and expected. In the absence of theoretically
elegant models that could explain a number of NL phenomena, it
was quite reasonable to find researchers shifting their efforts to
finding practical solutions for urgent problems using empirical
methods. By the mid-1990s, a data-driven statistical revolution
that had been brewing took the field of NLP by storm,
setting aside efforts rooted in over 200 years of
work in logic, metaphysics, grammar, and formal semantics.

We believe, however, that this trend has overstepped the noble
cause of using empirical methods to find reasonably working
solutions for practical problems. In fact, the data-driven
approach to NLP is now believed by many to be a plausible
approach to building systems that can truly understand ordinary
spoken language. This is not only a misguided trend but a very
damaging development that will hinder significant progress in
the field. In this regard, we hope this study will help start a sane,
and overdue, semantic (counter) revolution.
February 7, 2017

Copyright 2017 WALID S. SABA


some initial
clarifications
what this study is not about
Criticisms of the statistical data-driven approach to language
understanding are very often automatically associated with the
Chomskyan school of linguistics. At best, this is a misinformed
judgement (and in many cases it is an ill-informed one). There is
a long history of work in logical semantics (a tradition that
forms the background to the proposals we will make here) that
has very little to do (if anything at all) with Chomskyan
linguistics.

Notwithstanding Chomsky's (in our opinion valid) Poverty of
the Stimulus (POS) argument, an argument that clearly
supports the claim of some kind of innate linguistic abilities, we
believe that Chomskyans put too much emphasis on syntax
and grammar (which, ironically, made their theory vulnerable
to criticism from the statistical and data-driven school).
Instead, we think that syntax and grammar are just the external
artifacts used to express internal, logically coherent, semantic,
and compositionally and productively (i.e., recursively)
constructed thoughts, something perhaps analogous to
Jerry Fodor's Language of Thought (LOT).

Here we should also mention that we agree somewhat with
M. C. Corballis (The Recursive Mind) that it is thought that
brought about the external tool we call language, and not the
other way around.



Another association that criticism of the statistical and data-
driven approaches to NLU often conjures up is that of building
large knowledge bases with brittle rule-based inference engines.
This is perhaps the biggest misunderstanding, held not only by
many in the statistical and data-driven camp, but also by
previously overenthusiastic knowledge engineers who mistakenly
believed at one point that all that was required to crack the NLU
problem was to keep adding more knowledge and more rules.
We do not subscribe to such theories either.

In fact, regarding the above, we agree with an observation once
made by the late John McCarthy (at IJCAI 1995) that building
ad hoc systems by simply adding more knowledge and more rules
will result in systems that we don't even understand.
Ockham's Razor, as well as observing the linguistic skills of
five-year-olds, should tell us that the conceptual structures that
might be needed in language understanding should not, in
principle, require all that complexity.

As will become apparent later in this study, the conceptual
structures that speakers of ordinary spoken language have
access to are not as massive and overwhelming as is commonly
believed. Instead, it will be shown that the key lies in the nature
of that conceptual structure and the computational processes
involved.



FINALLY, our concern here is with introducing a plausible
model for natural language understanding (NLU). If your
concern is natural language processing (NLP), as it is used,
for example, in applications such as these:

word-sense disambiguation (WSD);
entity extraction/named-entity recognition (NER);
spam filtering, categorization, classification;
semantic/topic-based search;
word co-occurrence/concept clustering;
sentiment analysis;
topic identification;
automated tagging;
document clustering;
summarization;
etc.

then it is best if we part ways at this point, since this is not at
all our concern here. There are many NLP and text-processing
systems that already do a reasonable job on such data-level
tasks. In fact, I am part of a team that developed a semantic
technology that does an excellent job on almost all of the above,
but that system (and similar systems) is light years away from
doing anything remotely related to what can be called natural
language understanding (NLU), which is our concern here.



some initial
clarifications
what this study is about
WE WILL ARGUE THAT (1) many language phenomena
are not learnable from data, because (i) in most situations
what is to be learned is not even observable in the data
(or is not explicitly stated but is implicitly assumed as
shared knowledge by a language community); or (ii) in
many situations there's no statistical significance in the
data, as the relevant probabilities are all equal.

WE WILL ARGUE THAT (2) purely data-driven
extensional models that ignore intensionality,
compositionality, and inferential capacities in
natural language are inappropriate, even when
the relevant data is available, since higher-level
reasoning (the kind that's needed in NLU) requires
intensional reasoning beyond simple data values.

WE WILL ARGUE THAT (3) the most plausible
explanation for a number of phenomena in natural
language is rooted in logical semantics, ontology, and
the computational notions of polymorphism, type
unification, and type casting; and we will do this by
proposing solutions to a number of challenging and
well-known problems in language understanding.



some initial
clarifications
more specifically ...
We will propose a plausible model rooted in logical semantics,
ontology, and the computational notions of polymorphism, type
casting, and type unification. Our proposal provides a
framework for modelling various phenomena in natural language,
specifically phenomena that require reasoning beyond the
surface structure (the external data). To give a hint of the kind of
reasoning we have in mind, consider the following sentences:

(1) a. Jon enjoyed the movie
    b. Jon enjoyed watching the movie

(2) a. A small leather suitcase was found unattended
    b. A leather small suitcase was found unattended

(3) a. The ham sandwich wants another beer
    b. The person eating the ham sandwich wants another beer

(4) a. Dr. Spok told Jon he should soon be done with writing the thesis
    b. Dr. Spok told Jon he should soon be done with reading the thesis

Our model will explain why (1a) is understood by all speakers
of ordinary language as (1b); why speakers of many languages
find (2a) more natural to say than (2b); why we all understand
(3a) as (3b); and why we effortlessly resolve 'he' in (4a) to Jon
and 'he' in (4b) to Dr. Spok. Before we do so, however, we will
discuss some serious flaws in proposing a statistical, data-
driven approach to NLU.
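To give a very rough, purely illustrative sense of the type-unification and type-casting machinery just mentioned, consider (1a). One way to sketch the idea in code (this is our own toy rendering, not the author's formalism; the ONTOLOGY table, type names, and the notion of a "salient activity" are hypothetical stand-ins) is to let a verb like enjoy select for an argument of type Activity; when it instead receives an Entity such as a movie, unification fails and a casting step coerces the entity into the salient activity associated with its type, yielding the (1b) reading:

```python
# A toy ontology: each type lists its parent and, for entities, the
# salient activity used for coercion. All names here are hypothetical
# stand-ins chosen for illustration only.
ONTOLOGY = {
    "Movie":    {"parent": "Entity", "salient_activity": "watching"},
    "Sandwich": {"parent": "Entity", "salient_activity": "eating"},
    "Watching": {"parent": "Activity"},
}

def is_subtype(t, expected):
    """Walk the parent chain to test whether t is subsumed by expected."""
    while t is not None:
        if t == expected:
            return True
        t = ONTOLOGY.get(t, {}).get("parent")
    return False

def unify(expected, actual):
    """Return (type, coercion): the actual type if it unifies directly
    with the expected type, otherwise the coerced reading obtained by
    casting the entity to its salient activity."""
    if is_subtype(actual, expected):
        return actual, None          # direct unification, no coercion
    activity = ONTOLOGY.get(actual, {}).get("salient_activity")
    if expected == "Activity" and activity is not None:
        return "Activity", activity  # type casting via salient activity
    raise TypeError(f"cannot unify {actual} with {expected}")

# 'enjoy' selects for an Activity; 'the movie' denotes a Movie, so the
# coerced reading is 'Jon enjoyed [watching] the movie', i.e. (1b).
t, coercion = unify("Activity", "Movie")
print(t, coercion)  # -> Activity watching
```

A similar casting step, run in the other direction (from the sandwich to the person saliently related to it by eating), is what the model uses to take (3a) to (3b); the point of the sketch is only that such readings fall out of the conceptual structure and the type system, not of anything observable in the surface data.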



the framework

link
An updated version of this study will be maintained here

