
Foundations of Knowledge Representation and Reasoning

A Guide to this Volume

Gerhard Lakemeyer¹ and Bernhard Nebel²

¹ Institute of Computer Science, University of Bonn, Römerstr. 164, D-53117 Bonn, Germany, gerhard@cs.uni-bonn.de
² Department of Computer Science, University of Ulm, D-89069 Ulm, Germany, nebel@informatik.uni-ulm.de

1 Introduction

Knowledge representation (KR) is the area of Artificial Intelligence that deals with the problem of representing, maintaining, and manipulating knowledge about an application domain. Since virtually all Artificial Intelligence systems have to address this problem, KR is one of the central subfields of Artificial Intelligence. Main research endeavors in KR are

- representing knowledge about application areas (e.g., medical knowledge, knowledge about time, knowledge about physical systems),
- developing appropriate representation languages,
- specifying and analyzing reasoning over represented knowledge, and
- implementing systems that support the representation of knowledge and reasoning over the represented knowledge.

While knowledge about an application domain may be represented in a variety of forms, e.g., procedurally in form of program code or implicitly as patterns of activation in a neural network, research in the area of knowledge representation assumes an explicit and declarative representation, an assumption that distinguishes KR from research in, e.g., programming languages and neural networks. Explicitness of representation means that the represented knowledge is stored in a knowledge base consisting of a set of formal entities that describe the knowledge in a direct and unambiguous way. Declarativeness means that the (formal) meaning of the representation can be specified without reference to how the knowledge is applied procedurally, implying some sort of logical methodology behind it.

Although the above two points seem to be almost universally accepted by researchers working in KR, this consensus has been achieved only recently. Brachman and Levesque mentioned in the Introduction to a collection of papers in 1985 that the "research area of Knowledge Representation has a long, complex, and as yet non-convergent history," [Brachman and Levesque, 1985, p. xiii] an impression that is indeed confirmed by the papers in this collection. A large portion of the papers contain meta-level discussions arguing about the right methods for representing knowledge, or they present approaches that are completely incompatible with a logical, declarative point of view. Nowadays, however, the picture has completely changed. Logical methods predominate and methodological problems are hardly discussed any longer [Brachman et al., 1989, Allen et al., 1991, Nebel et al., 1992, Brachman, 1990]. Instead, research papers focus on particular technical representation and reasoning problems and address these problems using methods from logic and computer science. While this development indicates that KR has become a mature scientific discipline, it also leads to the situation that research results in KR appear to be less accessible to the rest of the Artificial Intelligence community. As a matter of fact, it is often argued that the foundational results that are achieved in the KR field are not relevant to Artificial Intelligence at all. We concede that a large amount of KR research probably does not have any immediate impact on building Artificial Intelligence systems. However, this is probably asking for too much. Foundational KR research aims at providing the theoretical foundations on which we can build systems that are useful, comprehensible, and reliable, i.e., it aims at providing the logical and computational foundations of knowledge representation formalisms and reasoning processes. Results in foundational KR often "only" provide explanations why a particular approach works or how an approach can be interpreted logically. Additionally, the borderlines of what can be represented are explored, and it is analyzed how efficient a reasoning process can be. While this may not be of central concern when building Artificial Intelligence systems, such results are nevertheless important when we want to understand such systems, and when we want to guarantee their reliability. Perhaps the main motivation and driving force behind most research in KR has been the desire to equip artifacts with commonsense. This is literally true of a paper by John McCarthy, first published in 1958 and republished as [McCarthy, 1968], which started the whole KR enterprise, and it is still true, if only implicitly, of the papers in this book. In fact, work on the foundations of KR can largely be identified with work on the foundations of commonsense reasoning, a point of view which we will follow throughout this brief survey. In the following sections, we touch on some basic logical and computational aspects of commonsense reasoning. The reader is warned that this is not a comprehensive overview of the field, which would be far outside the scope of this book. Instead we confine ourselves mainly to those areas that are actually covered by papers in this book.

2 Logical Foundations of Commonsense Reasoning

As mentioned already in the beginning, the main assumption that distinguishes knowledge-based systems from other approaches is that knowledge is represented declaratively in some logic-like language. This is one part of what Brian Smith has called the knowledge representation hypothesis [Smith, 1982]. The other part postulates that these representations play a causal role in engendering the behavior of the system. While this causal connection is present in one form or another in every knowledge-based system, it is fair to say that so far there are very few, if any, theoretical results that explain this connection. Hence most foundational research in KR, including the work reported in this book, deals with problems that arise from the first part of the KR hypothesis and which can be dealt with independently from the second part. In this context, one can identify three fundamental questions:

1. What is the right representation language?
2. What inferences should be drawn from a knowledge base?
3. How do we incorporate new knowledge?

In the rest of this section, we will address each question in turn with an emphasis on the relevant papers in this book.

2.1 The Right Representation Language

While there is little disagreement any more about the assumption that a representation language is one of logic, where the sentences can be interpreted as propositions about the world,³ designing an adequate language is not an easy task, since the various desirable features are often incompatible. In particular, very expressive languages usually have poor computational properties, an issue that has drawn considerable interest since a seminal paper by Brachman and Levesque [1984] and which is discussed in more detail in the next section. At this point we only mention that computational considerations have led to the development of languages that are far less expressive than full first-order logic, most notably the so-called concept languages or terminological logics. Four of the papers in this collection are devoted to this topic [Baader and Hollunder, 1994, Bettini, 1994, Allgayer and Franconi, 1994, Donini et al., 1994]. From the point of view of expressiveness, it often seems useful to have special epistemological or ontological primitives built into the language. Shoham and Cousins [1994] survey work in AI on a whole range of mental attitudes like beliefs, desires, goals, or intentions. The need for making such notions explicit is probably most convincing in multi-agent settings, where agents need to reason about each other's mental attitudes in order to communicate and cooperate successfully.

[Gottlob, 1994, Kalinski, 1994, Niemelä and Rintanen, 1994] consider the specific case of belief, which, together with knowledge, is probably the best understood among the attitudes. In these papers a very specific aspect of belief is considered, namely the ability to model certain forms of defeasible reasoning by referring explicitly to the system's own epistemic state (see Section 2.2 below). Bettini and Lin [Bettini, 1994, Lin, 1994], on the other hand, are concerned with adding explicit notions of time to the language. While Bettini considers incorporating an existing interval-based concept of time into a temporal logic, Lin proposes a new axiomatization of time, where time instances are defined on the basis of events.

³ Until the late seventies, many so-called representation languages actually violated this fundamental assumption and led to vivid discussions such as [Hayes, 1977, McDermott, 1978].

2.2 The Right Inferences

Having explicit representations of knowledge alone is not very useful in general. Instead one wants to reason about these representations to uncover what is implied by them. After all, we use the term commonsense reasoning and not commonsense representation. Until the early seventies, deduction was the main focus of attention as far as inference mechanisms are concerned. It became clear, however, that a lot of commonsense reasoning is not deductive in nature. In particular, many inferences humans draw all the time are uncertain in some sense and may therefore be defeasible if new information becomes available. The prototypical example is the assumption that birds normally fly: if someone tells me about a bird called Tweety, then, knowing nothing else, I conclude that Tweety flies. Later on, if I find out that Tweety is indeed a penguin, I withdraw my earlier conclusion without hesitation. There are essentially two main research fields that try to formalize such reasoning, one which is based on probability theory (see, for example, [Pearl, 1988]) and another which directly models nonmonotonic reasoning by modifying classical logic in one way or another (see, for example, [Brewka, 1991]). While probabilistic methods are not dealt with at all in this volume, nonmonotonic reasoning receives a fairly broad coverage [Baader and Hollunder, 1994, Gottlob, 1994, Kakas, 1994, Kalinski, 1994, Niemelä and Rintanen, 1994, Weydert, 1994]. Except for McCarthy's [1980] Circumscription, the main formalisms of nonmonotonic reasoning are represented in this volume. Baader and Hollunder [1994] discuss extending terminological logics using Reiter's [1980] Default Logic (DL). Kakas extends DL by applying ideas from abductive logic programming to it. Gottlob [1994] relates DL and Moore's [1985] Autoepistemic Logic (AEL) by showing how to faithfully translate DL theories into AEL theories. Both Kalinski [1994] and Niemelä and Rintanen [1994] are concerned with complexity issues, the former by considering a weaker form of AEL and the latter by considering only AEL theories of a special form (with applications to other nonmonotonic formalisms as well). Finally, Weydert [1994] presents results on nested conditionals. This work is in the tradition of modeling nonmonotonic inferences on the basis of conditional logics such as [Lewis, 1973, Adams, 1975].
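The Tweety pattern is easy to make concrete. The following is a minimal Python sketch of a single defeasible inference step; it is a toy illustration only, not Reiter's Default Logic or any of the formalisms cited above, and the predicate names are invented for the example:

```python
# Toy defeasible reasoning: "birds normally fly" is applied unless the
# exception "penguin" is known. Adding knowledge can retract a conclusion.

def concludes_flies(kb, x):
    if ("penguin", x) in kb:        # a known exception blocks the default
        return False
    return ("bird", x) in kb        # otherwise apply the default rule

kb = {("bird", "tweety")}
print(concludes_flies(kb, "tweety"))   # True: the default fires

kb.add(("penguin", "tweety"))          # new information arrives
print(concludes_flies(kb, "tweety"))   # False: the conclusion is withdrawn
```

The second call exhibits the nonmonotonicity: the set of conclusions shrinks as the knowledge base grows, which is exactly what classical deduction forbids.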

Apart from probabilistic and nonmonotonic reasoning, there are many other forms such as fuzzy, inductive, abductive or analogical reasoning. Of those, the latter two are represented here with one paper each. Console and Dupré [1994] address abduction, which is concerned with finding plausible explanations for a given observation. In particular, they address the problem of finding explanations at different levels of abstraction. Myers and Konolige [1994] discuss reasoning with analogical representations such as maps. They are particularly concerned with integrating both analogical and symbolic (sentential) representations.

2.3 Evolving Knowledge

Since knowledge bases are hardly ever static, devising methods for incorporating new information into a knowledge base is of great importance in KR research. This problem, often referred to as belief revision, is particularly challenging if the new information conflicts with the contents of the old knowledge base. Over the past decade, substantial progress has been made on the topic of belief revision, particularly since the ground-breaking work by Alchourrón, Gärdenfors, and Makinson [1985], who propose postulates which any rational revision operator should obey (now referred to as AGM-postulates). Later, Katsuno and Mendelzon [1991] introduce an important distinction between revising a knowledge base, which refers to incorporating new information about a static world, and updating it, where the new information reflects changes in the world. They also propose a set of rationality postulates for update operators. In this volume, Boutilier [1994] and Nejdl and Banagl [1994] present new results following this line of research. Nejdl and Banagl define subjunctive queries for knowledge bases in the case of both update and revision. In particular, they show that their query semantics for revision and update satisfies precisely the AGM-postulates and the Katsuno-Mendelzon postulates, respectively. Boutilier shows that, in the context of conditional logic, belief revision and nonmonotonic reasoning have precisely the same properties, further substantiating the claim that the two areas are closely related. Witteveen and Jonker [1994] address revision from a somewhat different angle. Here the emphasis is on finding plausible expansions of logic programs which are incoherent under the well-founded semantics, such that the revised programs are no longer incoherent.
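The revision/update distinction is easy to see on a toy propositional example. The sketch below represents a knowledge base by its set of models and measures change by the symmetric difference of truth assignments - one concrete operator choice in the spirit of the AGM and Katsuno-Mendelzon postulates (which constrain operators abstractly rather than mandating this particular distance); all data are illustrative:

```python
# Models are frozensets of true atoms; distance is the number of
# atoms on which two models disagree (symmetric difference).

def dist(m1, m2):
    return len(m1 ^ m2)

def revise(kb, new):
    """Revision: keep the new models globally closest to the old KB."""
    d = min(dist(m, k) for m in new for k in kb)
    return {m for m in new if any(dist(m, k) == d for k in kb)}

def update(kb, new):
    """Update: move each old world individually to its closest new worlds."""
    out = set()
    for k in kb:
        d = min(dist(m, k) for m in new)
        out |= {m for m in new if dist(m, k) == d}
    return out

kb  = {frozenset({"a", "b"}), frozenset()}        # KB: a and b have the same value
new = {frozenset({"a", "b"}), frozenset({"a"})}   # new information: a is true
print(revise(kb, new))   # only {a,b}: new facts about a static world
print(update(kb, new))   # {a,b} and {a}: each possible world changed separately
```

Revision discards the old world in which a was false, while update moves it to its nearest a-world, which is precisely the static-world versus changed-world reading of the two operations.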

3 Commonsense Reasoning as Computation

Once a knowledge representation scheme together with its associated commonsense reasoning task has been formalized logically, we can immediately make use of the computational machinery associated with logic. For instance, once we have identified that a particular representation formalism is "simply" a subset of standard first-order logic, we know that resolution (or any other complete proof method) is a method to compute all the valid consequences of a knowledge base. In other words, in such a case, commonsense reasoning could be reduced to a well-known computation technique. However, this point of view is an over-simplification. First of all, one often deals with non-standard logics, e.g., non-monotonic or modal logics, for which standard techniques do not work.

Secondly, even in the case that one only has a subset of standard first-order logic, it does not make sense to use general proof methods if specialized reasoning techniques, tailored to the restricted language, turn out to be much more efficient. In particular, one might be able to specify methods that always terminate, i.e., inference algorithms. Efficiency is indeed one of the major problems when we turn logical formalization into computation. As is well known, even propositional logic already requires significant computational resources - reasoning in propositional logic is NP-hard.⁴ On the other hand, commonsense reasoning appears to be quite fast when humans perform it, and, moreover, should work reasonably fast on computers if the system is required to be of any use [Levesque, 1988]. In particular, if it is required that the reasoning process is computationally tractable, we are often forced to restrict the expressiveness of the representation language or to give up on the accuracy of the answer [Levesque and Brachman, 1987]. Research questions coming up in this context are:

1. Can we specify an inference algorithm for the reasoning task?
2. What is the computational complexity of the reasoning task?
3. How can we achieve tractability?

⁴ Consult, e.g., [Garey and Johnson, 1979] for an introduction to computational complexity theory.

3.1 Inference Algorithms

As is evident from most papers, the formalization of a commonsense reasoning task as a form of logical inference is usually not overwhelmingly difficult, provided appropriate formal techniques and tools are employed. For instance, the semantics of a terminological logic extended by operators to express collective entities and relations [Allgayer and Franconi, 1994] can be specified in less than half a page. What appears to be much more involved is the specification of an appropriate reasoning technique. As pointed out above, one could employ standard proof techniques if the formalism under consideration is (a notational variant of) a subset of standard first-order logic. However, usually we do not want an arbitrary method, but an algorithm that is as efficient as possible - a problem that is addressed by most of the papers in this volume. Allgayer and Franconi [1994], for instance, show in their paper that it is possible to extend the tableau-based technique introduced by Schmidt-Schauß and Smolka [Schmidt-Schauß and Smolka, 1991] to terminological logics containing operators for collective entities, providing us with a sound, complete, and terminating method for reasoning in this language. Baader and Hollunder [1994] also start with terminological logics, but extend these by incorporating default logic [Reiter, 1980], i.e., in this case it is not possible to use standard first-order logic methods. However, as they are able to show, it is possible to combine the tableau-based reasoning techniques for terminological logics with reasoning techniques developed for default logics [Junker and Konolige, 1990, Schwind and Risch, 1991] in an almost straightforward way, leading to an inference algorithm for the combined formalism.

It should be noted, though, that in order to guarantee decidability, it is necessary to use a somewhat nonstandard interpretation of open defaults. Since, as shown by Baader and Hollunder, the standard interpretation of open defaults not only leads to undecidability but also to counter-intuitive results, giving up this interpretation does not seem to be much of a sacrifice.

3.2 Computational Complexity of Reasoning

An inference algorithm for a particular commonsense reasoning task demonstrates that there is one way to turn this task into computation. However, it does not answer the question whether this is the most efficient way. In order to answer this question, computational complexity theory can be used for analyzing the inherent difficulty of the problem. Such an analysis can guide the search for more efficient algorithms or for a reformulation of the reasoning problem in a way that renders reasoning more efficient. Finally, a computational complexity analysis can be used to compare and contrast different reasoning problems. For instance, Donini et al. [1994] study the extension of terminological logics by an epistemic operator and show that this operator does not increase the computational complexity of reasoning in one of the standard terminological logics (the so-called ALC language [Schmidt-Schauß and Smolka, 1991]). Furthermore, Donini et al. [1994] are able to show that in some relevant special cases the complexity even goes down from co-NP-hardness to polynomial time. Kautz and Selman [1994] analyze the computational problems arising when approximating arbitrary propositional theories by Horn theories. They show that such a Horn theory may sometimes be of exponential size and that it is unlikely that a dense representation can be found in all cases. A final example for the use of computational complexity theory is the paper by Gottlob [1994]. Although this paper is not by itself a paper on computational complexity analysis of commonsense reasoning, it makes use of computational complexity results [Gottlob, 1992] that show that the three main forms of nonmonotonic reasoning all have the same complexity, which implies that there must exist (polynomial) translations between these formalisms. Based on this observation, Gottlob develops a quite interesting translation from default logic to autoepistemic logic.

3.3 The Expressiveness vs. Efficiency Tradeoff

If a reasoning problem can be shown to require time that is not polynomial in the size of the problem description (under the assumption that P ≠ NP), this implies that in the worst case we will not get an answer in tolerable time when the problem description grows beyond a certain (usually moderate) size. Of course, if the problem descriptions are almost always small, such computational complexity results are irrelevant.

However, we usually want to deal with more than 20 concepts or 10 default rules. So we should consider the possibility of worst cases for moderately sized problem descriptions. One way to exclude worst cases is to restrict the expressiveness of the representation language the reasoning task has to deal with. Brachman and Levesque, for example, showed that excluding a particular operator from a terminological logic reduces the complexity of reasoning from NP-hardness to polynomiality [Brachman and Levesque, 1984, Levesque and Brachman, 1987]. Subsequent investigations along this line [Donini et al., 1991] have shown that requiring polynomiality of the inference algorithm leads to a severe restriction on the possible constructs one can use. Although there have been strong arguments about the usefulness of achieving efficiency by restricting the expressiveness [Doyle and Patil, 1991], there seems nevertheless to be a consensus that it is useful to analyze special cases of general reasoning patterns that can be solved more easily than the general problem, provided the special cases are relevant. Moreover, restricting the expressiveness can mean a number of things that are quite different from, for example, excluding a particular operator from a representation language. For instance, instead of considering a representation language with fewer constructs, it sometimes makes sense to use a language with more constructs but with restrictions on the structure of allowed expressions. Donini et al. [1994] show that enlarging a terminological logic with an epistemic operator for building concepts that are used as queries, and restricting the forms of the query, can indeed lead to a more natural reasoning task which is also more efficient. The work by Myers and Konolige [1994] also extends the representational framework (first-order logic) in order to achieve efficiency. In this case, however, the aim is not to guarantee worst-case efficiency in all cases, but to provide special means for representing knowledge about one particular domain - spatial knowledge - that can be more naturally represented and more efficiently reasoned about using analogical representations, which are also much more restricted than general propositional representations. The main problem Myers and Konolige identify and solve is the integration of analogical reasoning with the general framework of reasoning in first-order logic. The paper by Niemelä and Rintanen [1994] aims again at guaranteeing polynomial runtime in all cases by restricting expressive power. As in the cases above, however, they do not restrict the expressive power by disallowing logical operators in AEL theories, but they consider restrictions on the form of the theories. In particular, they show that reasoning in stratified AEL Horn theories can be done in polynomial time.

3.4 The Accuracy vs. Efficiency Tradeoff

If the expressiveness of a representation cannot be restricted, other means for getting timely answers are called for. Usually, one gives up on the quality or accuracy of an answer, for example, by restricting the processing time or by employing incomplete reasoning methods. While this may lead to the desired runtime behavior, it raises the question as to how far we can still trust answers from a representation and reasoning system. In other words, we are seeking a principled description of the reasoning capabilities of an incomplete reasoner.

Kautz and Selman [1991] addressed this problem with a "knowledge compilation" technique. They propose to compute (off-line) Horn theories that approximate the logical contents of a given arbitrary theory. As mentioned above, this approximation can lead to computational problems in itself [Kautz and Selman, 1994]. Kautz and Selman show that the approximating theory can become very large, and although there are sometimes ways around this problem, they can show that it is very unlikely that dense representations of an approximating Horn theory exist in all cases. Nevertheless, their approximation scheme appears to be interesting since instead of general Horn theories one may aim for more restricted forms of such theories which can be polynomially bounded in size. Greiner and Schuurmans [1994] address the multiple extension problem of default reasoning [Reiter, 1987], which is known to be one source of computational complexity in default reasoning [Gottlob, 1992, Nebel, 1991]. They propose to approximate default reasoning by ordering the defaults linearly, where the particular order chosen is intended to be "optimally correct." As they show, it is not possible to compute such an ordering in polynomial time, but they approximate such an ordering by computing a locally optimal ordering. The paper by Witteveen and Jonker [1994] applies a similar method to achieve tractability for revising logic programs. They show that a globally minimal revision cannot be computed in polynomial time, but a locally minimal revision can.
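The computational appeal of Horn compilation is that entailment of an atom from a Horn theory is decidable in polynomial time by simple forward chaining, so queries are first tried against the two precomputed bounds. The Python sketch below illustrates this query protocol for definite Horn theories; the bound theories are written by hand here (computing good bounds is exactly the hard part discussed above), and all names and facts are illustrative:

```python
# Polynomial Horn entailment by forward chaining, plus a two-sided query
# protocol on a precomputed Horn lower bound (logically stronger than the
# original theory) and upper bound (logically weaker than it).

def horn_entails(facts, rules, query):
    """rules: list of (frozenset_of_body_atoms, head_atom) clauses."""
    known, changed = set(facts), True
    while changed:
        changed = False
        for body, head in rules:
            if head not in known and body <= known:
                known.add(head)
                changed = True
    return query in known

def ask(lb, ub, query):
    if horn_entails(*ub, query=query):
        return True    # the weaker theory entails it, so the original does
    if not horn_entails(*lb, query=query):
        return False   # even the stronger theory fails, so the original does
    return None        # bounds disagree: only here is the expensive test needed

ub = ({"bird"}, [(frozenset({"bird"}), "animal")])
lb = ({"bird"}, [(frozenset({"bird"}), "animal"),
                 (frozenset({"bird"}), "flies")])
print(ask(lb, ub, "animal"), ask(lb, ub, "swims"), ask(lb, ub, "flies"))
# True False None
```

Only queries on which the two bounds disagree ever reach the NP-hard test on the original theory, which is the point of compiling the bounds off-line.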

4 Outlook

The collection of papers in this book certainly does not give a complete overview of the research aimed at providing foundations for knowledge representation and reasoning. For instance, probabilistic approaches are not represented at all. Nevertheless, the set of papers in this book covers a wide range of topics in the area of foundational KR&R research and highlights the common research methodology, namely, to analyze representation and reasoning tasks from a logical and computational perspective. As already mentioned in the Introduction, this research methodology most probably does not lead to any immediate benefit in the sense that we can build faster or better reasoning systems. However, by providing the theoretical underpinning for KR&R systems, this research will help us understand where and what the limits of representation and reasoning are and how we can guarantee a reasonable behavior of KR&R systems.

References

[AAAI-90, 1990] Proceedings of the 8th National Conference of the American Association for Artificial Intelligence, Boston, MA, August 1990. MIT Press.
[Adams, 1975] E. W. Adams. The Logic of Conditionals. Reidel, Dordrecht, Holland, 1975.

[Alchourrón et al., 1985] Carlos E. Alchourrón, Peter Gärdenfors, and David Makinson. On the logic of theory change: Partial meet contraction and revision functions. Journal of Symbolic Logic, 50(2):510-530, June 1985.
[Allen et al., 1991] J. A. Allen, R. Fikes, and E. Sandewall, editors. Principles of Knowledge Representation and Reasoning: Proceedings of the 2nd International Conference, Cambridge, MA, April 1991. Morgan Kaufmann.
[Allgayer and Franconi, 1994] Jürgen Allgayer and Enrico Franconi. Collective entities and relations in concept languages. In Lakemeyer and Nebel [1994].
[Baader and Hollunder, 1994] Franz Baader and Bernhard Hollunder. Computing extensions of terminological default theories. In Lakemeyer and Nebel [1994].
[Bettini, 1994] C. Bettini. A formalization of interval-based temporal subsumption in first order logic. In Lakemeyer and Nebel [1994].
[Boutilier, 1994] Craig Boutilier. Normative, subjunctive and autoepistemic defaults. In Lakemeyer and Nebel [1994].
[Brachman and Levesque, 1984] Ronald J. Brachman and Hector J. Levesque. The tractability of subsumption in frame-based description languages. In Proceedings of the 4th National Conference of the American Association for Artificial Intelligence, pages 34-37, Austin, TX, 1984.
[Brachman and Levesque, 1985] Ronald J. Brachman and Hector J. Levesque, editors. Readings in Knowledge Representation. Morgan Kaufmann, Los Altos, CA, 1985.
[Brachman et al., 1989] R. Brachman, H. J. Levesque, and R. Reiter, editors. Principles of Knowledge Representation and Reasoning: Proceedings of the 1st International Conference, Toronto, ON, May 1989. Morgan Kaufmann.
[Brachman, 1990] Ronald J. Brachman. The future of knowledge representation. In AAAI-90 [1990], pages 1082-1092.
[Brewka, 1991] Gerhard Brewka. Nonmonotonic Reasoning: Logical Foundations of Commonsense. Cambridge University Press, Cambridge, UK, 1991.
[Console and Dupré, 1994] Luca Console and Daniele Dupré. Abductive reasoning with abstraction axioms. In Lakemeyer and Nebel [1994].
[Donini et al., 1991] Francesco M. Donini, Maurizio Lenzerini, Daniele Nardi, and Werner Nutt. Tractable concept languages. In Proceedings of the 12th International Joint Conference on Artificial Intelligence, pages 458-465, Sydney, Australia, August 1991. Morgan Kaufmann.
[Donini et al., 1994] Francesco Donini, Maurizio Lenzerini, Daniele Nardi, Andrea Schaerf, and Werner Nutt. Queries, rules and definitions as epistemic sentences in concept languages. In Lakemeyer and Nebel [1994].
[Doyle and Patil, 1991] Jon Doyle and Ramesh S. Patil. Two theses of knowledge representation: Language restrictions, taxonomic classification, and the utility of representation services. Artificial Intelligence, 48(3):261-298, April 1991.
[Garey and Johnson, 1979] Michael R. Garey and David S. Johnson. Computers and Intractability - A Guide to the Theory of NP-Completeness. Freeman, San Francisco, CA, 1979.
[Gottlob, 1992] Georg Gottlob. Complexity results for nonmonotonic logics. Journal of Logic and Computation, 2(3), 1992.
[Gottlob, 1994] Georg Gottlob. The power of beliefs, or translating default logic into standard autoepistemic logic. In Lakemeyer and Nebel [1994].
[Greiner and Schuurmans, 1994] Russell Greiner and Dale Schuurmans. Learning an optimally accurate representation system. In Lakemeyer and Nebel [1994].

[Hayes, 1977] Patrick J. Hayes. In defence of logic. In Proceedings of the 5th International Joint Conference on Artificial Intelligence, pages 559-565, Cambridge, MA, August 1977.
[Junker and Konolige, 1990] Ulrich Junker and Kurt Konolige. Computing extensions of autoepistemic and default logics with a truth maintenance system. In AAAI-90 [1990], pages 278-283.
[Kakas, 1994] A. C. Kakas. Default reasoning via negation as failure. In Lakemeyer and Nebel [1994].
[Kalinski, 1994] Jürgen Kalinski. Weak autoepistemic reasoning and well-founded semantics. In Lakemeyer and Nebel [1994].
[Katsuno and Mendelzon, 1991] Hirofumi Katsuno and Alberto O. Mendelzon. On the difference between updating a knowledge base and revising it. In Allen et al. [1991], pages 387-394.
[Kautz and Selman, 1994] Henry Kautz and Bart Selman. Forming concepts for fast inference. In Lakemeyer and Nebel [1994].
[Lakemeyer and Nebel, 1994] Gerhard Lakemeyer and Bernhard Nebel, editors. Foundations of Knowledge Representation and Reasoning. Springer-Verlag, Berlin, Heidelberg, New York, 1994.
[Levesque and Brachman, 1987] Hector J. Levesque and Ronald J. Brachman. Expressiveness and tractability in knowledge representation and reasoning. Computational Intelligence, 3:78-93, 1987.
[Levesque, 1988] Hector J. Levesque. Logic and the complexity of reasoning. Journal of Philosophical Logic, 17:355-389, 1988.
[Lewis, 1973] David K. Lewis. Counterfactuals. Harvard University Press, Cambridge, MA, 1973.
[Lin, 1994] Yuen Q. Lin. A common-sense theory of time. In Lakemeyer and Nebel [1994].
[McCarthy, 1968] John McCarthy. Programs with common sense. In M. Minsky, editor, Semantic Information Processing, pages 403-418. MIT Press, Cambridge, MA, 1968.
[McCarthy, 1980] John McCarthy. Circumscription - a form of non-monotonic reasoning. Artificial Intelligence, 13(1-2):27-39, 1980.
[McDermott, 1978] Drew V. McDermott. Tarskian semantics, or no notation without denotation! Cognitive Science, 2(3):277-282, July 1978.
[Moore, 1985] Robert C. Moore. Semantical considerations on nonmonotonic logic. Artificial Intelligence, 25:75-94, 1985.
[Myers and Konolige, 1994] Karen Myers and Kurt Konolige. Reasoning with analogical representations. In Lakemeyer and Nebel [1994].
[Nebel et al., 1992] B. Nebel, W. Swartout, and C. Rich, editors. Principles of Knowledge Representation and Reasoning: Proceedings of the 3rd International Conference, Cambridge, MA, October 1992. Morgan Kaufmann.
[Nebel, 1991] Bernhard Nebel. Belief revision and default reasoning: Syntax-based approaches. In Allen et al. [1991], pages 417-428.
[Nejdl and Banagl, 1994] Wolfgang Nejdl and Markus Banagl. Asking about possibilities - revision and update semantics for subjunctive queries (extended report). In Lakemeyer and Nebel [1994].
[Niemelä and Rintanen, 1994] Ilkka Niemelä and Jussi Rintanen. On the impact of stratification on the complexity of nonmonotonic reasoning. In Lakemeyer and Nebel [1994].

[Pearl, 1988] Judea Pearl. Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference. Morgan Kaufmann, San Mateo, CA, 1988.
[Reiter, 1980] Raymond Reiter. A logic for default reasoning. Artificial Intelligence, 13(1-2):81-132, April 1980.
[Reiter, 1987] Raymond Reiter. Nonmonotonic reasoning. Annual Review of Computing Sciences, 2, 1987.
[Schmidt-Schauß and Smolka, 1991] Manfred Schmidt-Schauß and Gert Smolka. Attributive concept descriptions with complements. Artificial Intelligence, 48:1-26, 1991.
[Schwind and Risch, 1991] C. Schwind and V. Risch. A tableau-based characterisation for default logic. In R. Kruse and P. Siegel, editors, Symbolic and Quantitative Approaches to Uncertainty, Proceedings of the European Conference ECSQAU, pages 310-317, Marseilles, France, 1991. Springer-Verlag.
[Selman and Kautz, 1991] Bart Selman and Henry Kautz. Knowledge compilation using Horn approximations. In Proceedings of the 9th National Conference of the American Association for Artificial Intelligence, pages 904-909, Anaheim, CA, July 1991. MIT Press.
[Shoham and Cousins, 1994] Yoav Shoham and Steve B. Cousins. Logics of mental attitudes in AI -- a very preliminary survey. In Lakemeyer and Nebel [1994].

[Smith, 1982] Brian C. Smith. Reflection and Semantics in a Procedural Language. PhD thesis, Massachusetts Institute of Technology, Cambridge, MA, 1982. Report MIT/LCS/TR-272. [Weydert, 1994] Emil Weydert. Hyperrational conditionals - monotonic reasoning about nested default conditionals. In Lakemeyer and Nebel [1994]. [Witteveen and Jonker, 1994] Cees Witteveen and Catholijn M. Jonker. Revision by expansion in logic programs. In Lakemeyer and Nebel [1994].

Collective Entities and Relations in Concept Languages*


Jürgen Allgayer¹ and Enrico Franconi²

¹ IBM Germany, Software Architectures and Technologies, Schloß-Str. 70, D-7000 Stuttgart 1, Germany (allgayer@vnet.ibm.com)
² Istituto per la Ricerca Scientifica e Tecnologica (IRST), I-38050 Povo TN, Italy (franconi@irst.it)

* This paper is a reduced version of a paper to appear in Minds and Machines, special issue on Knowledge Representation for Natural Language Processing. This work has been partially supported by the Italian National Research Council (CNR), project "Sistemi Informatici e Calcolo Parallelo", and by the IRST MAIA project. We would like to thank also Alessandro Artale, Werner Nutt and Achille Varzi for the helpful and incisive discussions we had with them.

Abstract. Collective entities and collective relations play an important role in natural language. In order to capture the full meaning of sentences like "The Beatles sing 'Yesterday'", a knowledge representation language should be able to express and reason about plural entities - like "the Beatles" - and their relationships - like "sing" - with any possible reading (cumulative, distributive or collective). In this paper a way of including collections and collective relations within a concept language, chosen as the formalism for representing the semantics of sentences, is presented. A twofold extension of the ALC concept language is investigated: (1) special relations introduce collective entities either out of their components or out of other collective entities, (2) plural quantifiers on collective relations specify their possible reading. The formal syntax and semantics of the concept language is given, together with a sound and complete algorithm to compute satisfiability and subsumption of concepts, and to compute recognition of individuals. An advantage of this formalism is the possibility of reasoning and stepwise refining in the presence of scoping ambiguities. Moreover, many phenomena covered by the Generalized Quantifiers Theory are easily captured within this framework. In the final part a way to include a theory of parts (mereology) is suggested, allowing for a lattice-theoretical approach to the treatment of plurals.

1 Introduction

In this paper it is shown how a concept language, i.e., a knowledge representation language of the KL-ONE family - also called Frame-Based Description Languages, Term Subsumption Languages, Terminological Logics, Taxonomic Logics or Description Logics [30] - can be extended in order to represent and reason about collective entities or collections [1, 9]. The enriched concept language proposed here is intended to form the semantical and computational means for the representation of plurals and plural quantifiers in natural language; other approaches in the literature include [3, 17, 20, 22, 26, 27]. Although this work has been conceived for concept languages, it can also be applied to other knowledge representation formalisms, such as Conceptual Graphs [26] and the Intensional Propositional Semantic Networks SNePS [24]. An analysis of plurals in natural language leads us to distinguish between two different categories of plural entities: classes and collections. Classes are involved in sentences like "Men are persons", where the NP "men" is represented by means of the class predicate MAN:

∀x. MAN(x) → PERSON(x).


On the other hand, collections are contingent aggregates of objects, and they should be represented as terms instead of predicates, i.e. they should be interpreted, like individuals, as single elements of the domain. For example, the logical form of the sentences "The Beatles are John, Paul, George and Ringo" and "John is the leader of the Beatles" is the following:

∋(beatles, paul), ∋(beatles, john), ∋(beatles, ringo), ∋(beatles, george), LED-BY(beatles, john).
The plural entity Beatles is interpreted as a collection, and in the logical form it does not appear as a predicate, but as a term, at the same level as the objects it is composed of. In order to give a meaning to the terms denoting collections, a weakened form of Set Theory - called Collection Theory - is adopted, and the dangerous leap into a second-order theory is avoided. It turns out that the collection theory is more adequate for representing plurals than set theory, because the extensionality principle does not hold. Within the collection theory, plural quantifiers are introduced, in order to capture the different readings of a relation when applied to a collection. This approach allows for the representation of ambiguous readings, so that in the presence of incomplete information a complete reasoning can still be carried out. Interesting connections with the Generalized Quantifiers Theory [5] can be drawn; for a deep analysis of the relations between collection theory and Generalized Quantifiers Theory please refer to the full paper [10]. In the full paper the use of this formalism within a natural language dialogue system is extensively discussed. A more radical departure from set theory to represent collections is proposed in the last part of this paper, introducing a non-extensional mereology [25]. In this way, a lattice-theoretical approach for the treatment of plurals as in [18] is possible.


The paper is organized as follows. At the beginning the Collection Theory and the Plural Quantifiers are introduced in a generic logical framework. Then it is presented how these theories can be merged into the propositionally complete concept language ALC; many examples will clarify the expressive power of the newly obtained language ALCS. An account of the computational properties of ALCS is given, and a sound and complete decision procedure for a weaker form of the language is devised. Finally, the collection theory is abandoned in favour of a Mereology, i.e. a theory of the part-whole relation.

2 The Collection Theory

In this section a simple formal way to model collections of objects is introduced. A collection is formed by selecting certain objects, called members or elements of the collection. Within this model, both objects and collections of objects are denoted by terms, i.e. they are interpreted as entities of the domain. Like in standard set theory, a primitive membership binary relation (denoted by ∋) is introduced, to relate the collective entities with their elements. For example, the formulas

∋(beatles, paul), ∋(beatles, john), ∋(beatles, ringo), ∋(beatles, george),


are intended to mean that the entities john, paul, ringo and george are elements of the collective entity named beatles; thus, their conjunction represents the meaning of the sentence "The Beatles are John, Paul, George and Ringo". A collection can be related to other collections: it can share some components or it can include all the components of other collections. The sub-collection and overlapping relations between collections are defined by means of the primitive relation:

Definition 1. (Sub-collection and Overlapping)

⊆(a, b) iff ∀x. ∋(a, x) → ∋(b, x)
⊓(a, b) iff ∃x. ∋(a, x) ∧ ∋(b, x).


For example, the formulas

⊆(beatles, appleCharterMembers),

⊓(beatles, mostPopularSingers)

express that each of the Beatles is also a founder of the Apple Records company, and that some of them are among the most popular singers. The ⊆ relation is defined like the usual subset relation of ordinary set theory, whereas the ⊓ relation is defined to be true for all the non-empty set-theoretic intersections ∩ₛ, i.e. ⊓(a, b) holds if and only if a ∩ₛ b ≠ ∅. Since no other axiom for these relations is introduced, it turns out that the ⊆ and ⊓ relations have weaker properties than their counterparts in set theory. For example, it follows from the definition that the ⊆ relation is reflexive and transitive but not anti-symmetric - i.e. it is a quasi-ordering - since the extensionality axiom does not hold.

In set theory, the extensionality axiom says that two sets are equal if and only if they have the same elements. In our framework, two collections having the same elements are not necessarily equal. So, from

⊆(appleCharterMembers, beatles),      ⊆(beatles, appleCharterMembers)

it does not follow that

appleCharterMembers = beatles,
because the entity beatles could have different attributes from the entity representing the group of people who founded the Apple Records company, like, for instance, legal liability or taxes to pay. This simple framework will be referred to as the Collection Theory. A richer theory would also include negated relations, such as non-membership, disjointness and non-inclusion - see [28]; they are needed to properly cover many natural language phenomena. However, it is argued (though not yet proved) that, in the context of concept languages, negated relations may lead to undecidability. As a trivial consequence of this choice, it is worth noting that paradoxes - like the Russell paradox - are avoided and well-foundedness of the Collection Theory is guaranteed, if the negation of the ∋ relation cannot be expressed.
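Over a complete finite interpretation, the defined relations of the Collection Theory can be evaluated directly from the ∋ facts, and non-extensionality falls out because distinct collection names are never identified. The following Python sketch uses the running example; the encoding and the closed, explicitly listed fact set are illustrative assumptions:

```python
# Evaluate ⊆ and ⊓ from an explicitly given, complete set of ∋ facts.

member = {("beatles", x) for x in ("john", "paul", "george", "ringo")}
member |= {("appleCharterMembers", x)
           for x in ("john", "paul", "george", "ringo")}

def elements(c):
    return {x for (a, x) in member if a == c}

def subcollection(a, b):     # ⊆(a,b) iff ∀x. ∋(a,x) → ∋(b,x)
    return elements(a) <= elements(b)

def overlaps(a, b):          # ⊓(a,b) iff ∃x. ∋(a,x) ∧ ∋(b,x)
    return bool(elements(a) & elements(b))

# Mutual sub-collection, yet the two terms remain distinct entities:
print(subcollection("beatles", "appleCharterMembers"))   # True
print(subcollection("appleCharterMembers", "beatles"))   # True
print("beatles" == "appleCharterMembers")                # False: no extensionality
```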

3 Plural Quantifiers

Generic relations which apply to collective entities can be quantified in different ways. So, representational means are introduced to capture the semantics of the (possibly underdetermined) reading variants of NL expressions. As an example, take the possible readings involving the plural subject of the sentence "John, Paul, George and Ringo sing 'Yesterday'". For the collective reading all the men together sing the song; in the case each man sings separately, we speak of the distributive reading; and, finally, the cumulative reading can describe the mixed situations in which, say, one sings alone, and separately the others sing together, with the proviso that all of them are involved in some action of singing. The ◁ and ⊴ (resp. ▷ and ⊵) operators - called plural quantifiers - introduce the left (resp. right) distributive and cumulative readings for generic binary relations having a collection as left (resp. right) argument. Thus, relationships between collections have a more structured semantics than the standard one. A relation holds not only directly between the objects of predication, but may be distributed between the elements of such objects, if they are collections. This approach helps in delaying the decision for any of those variants of a sentence, allowing for the representation of the scope ambiguities in the logical form. The economy of representation makes it unnecessary for the system to compute all the disambiguated interpretations of the sentence before storing its meaning in the knowledge base [21]. In the case of an underdetermined reading, a possible incremental growth of the knowledge base might rule out one or another reading. In order to introduce the formalism, consider the following example:

"John is the leader of the Beatles"

    LED-BY(beatles, john),

"The Beatles were born in Liverpool"

    ◁BORN-IN(beatles, liverpool),

"The Beatles sing 'Yesterday'"

    ⊴SING(beatles, yesterday).
The relation LED-BY has a "collective" reading - i.e. john is the leader of the whole collection beatles. The relation BORN-IN is "left distributive" over the components of the beatles - i.e. each member was born in Liverpool:

BORN-IN(john, liverpool), BORN-IN(paul, liverpool), BORN-IN(george, liverpool), BORN-IN(ringo, liverpool).


The relation SING has a "left cumulative" reading with respect to the beatles - i.e. it is possible that any collection of components of beatles sings 'Yesterday', ranging from single individuals to the entire beatles collection, with the proviso that the union of such collections should include at least all the beatles members. For example, together with the collective interpretation

SING(beatles, yesterday),
and the distributive interpretation

SING(john, yesterday), SING(paul, yesterday), SING(george, yesterday), SING(ringo, yesterday),


a possible valid interpretation for the cumulative reading is the following:

∋(C1, paul), ∋(C1, john), ∋(C1, elvis),
∋(C2, paul), ∋(C2, george), ∋(C2, ringo),
SING(C1, yesterday), SING(C2, yesterday).
In this interpretation, the relation SING holds between the two collective entities C1, C2 and yesterday. The 'inclusion' condition of the cumulative reading is satisfied: each member of the beatles belongs to at least one of the collections participating in the relation. The plural quantifiers are formally defined as follows.

Definition 2. (Plural Quantifiers)

◁R(a, b) iff ∀x. ∋(a, x) → R(x, b)
▷R(a, b) iff ∀x. ∋(b, x) → R(a, x)
⊴R(a, b) iff ∀x. ∋(a, x) → ( R(x, b) ∨ (∃s. ∋(s, x) ∧ R(s, b)) )
⊵R(a, b) iff ∀x. ∋(b, x) → ( R(a, x) ∨ (∃s. ∋(s, x) ∧ R(a, s)) ).
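On a finite interpretation Definition 2 can be checked by direct enumeration. The Python sketch below evaluates the left distributive and left cumulative quantifiers on the singing example (the elvis-containing collection C1 is legitimate because the witness s in the definition is not required to be a sub-collection of beatles); data, names, and encoding are illustrative:

```python
# Direct evaluation of ◁ and ⊴ from Definition 2 on finite relations.

member = {("beatles", x) for x in ("john", "paul", "george", "ringo")}
member |= {("C1", "paul"), ("C1", "john"), ("C1", "elvis"),
           ("C2", "paul"), ("C2", "george"), ("C2", "ringo")}
sing = {("C1", "yesterday"), ("C2", "yesterday")}

def elems(a):
    return {x for (c, x) in member if c == a}

def colls_of(x):
    return {c for (c, y) in member if y == x}

def left_distributive(R, a, b):   # ◁R(a,b)
    return all((x, b) in R for x in elems(a))

def left_cumulative(R, a, b):     # ⊴R(a,b)
    return all((x, b) in R or any((s, b) in R for s in colls_of(x))
               for x in elems(a))

print(left_cumulative(sing, "beatles", "yesterday"))    # True: C1, C2 cover everyone
print(left_distributive(sing, "beatles", "yesterday"))  # False: nobody sings alone
```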
It is easy to check that the cumulative plural quantified expressions subsume the non-quantified - i.e. the collective - expressions and the distributive plural quantified ones:

∀x, y. R(x, y) → ⊴R(x, y),      ∀x, y. R(x, y) → ⊵R(x, y),
∀x, y. ◁R(x, y) → ⊴R(x, y),     ∀x, y. ▷R(x, y) → ⊵R(x, y).

In this way, the non-disambiguated logical form of a sentence with multiple interpretations can be represented: "The Beatles sing 'Yesterday' "

⊴SING(beatles, yesterday).


Information which comes later in the discourse can monotonically refine the knowledge about the quantifier scoping. From the sentence "They sing the song all together" the system is able to conclude that they is anaphoric to the Beatles and the song to 'Yesterday', and will produce an interpretation which specializes the preceding underdetermined one:

SING(b, y),      SONG(y),      b = beatles,      y = yesterday.

On the other hand, if later in the discourse the sentence "Each one of them sings the song" appears, the system will add the formulas

◁SING(b, y),      SONG(y),

which are also a refinement with respect to the preceding cumulative interpretation.

4 The ALCS concept language

The collection theory is now merged into a larger logic, in order to obtain an expressive, but still decidable, description language. With respect to the formal apparatus, we will strictly follow the concept language formalism introduced by [23] and further elaborated for example by [6, 8, 7]. Basic types of a concept language are concepts, roles and individuals. A concept is a description gathering the common properties among a collection of individuals; logically it is a unary predicate ranging over the domain of individuals. Properties are represented by means of roles, which are logically binary relations. In the following, we will consider the language ALCS, which extends the propositionally complete concept language ALC [23] by allowing a richer expressivity for roles: the ∋, ⊆ and ⊓ relations are the basic roles of the collection theory; the ◁, ▷, ⊴ and ⊵ plural quantifiers introduce the distributive and cumulative readings for generic roles. According to the syntax rules of figure 1, ALCS concepts (denoted by the letters C and D) are built out of primitive concepts (denoted by the letter A) and roles (denoted by the letter R); roles are built out of primitive roles (denoted by the letter P).


C, D →  A          (primitive concept)
      |  ⊤          (top)
      |  ⊥          (bottom)
      |  ¬C         (general complement)
      |  C ⊓ D      (conjunction)
      |  C ⊔ D      (disjunction)
      |  ∀R.C       (universal quantifier)
      |  ∃R.C       (existential quantifier)

R →  P              (primitive role)
  |  ∋              (has element relation)
  |  ⊆              (sub-collection relation)
  |  ⊓              (overlaps relation)
  |  ◁R             (left distributive)
  |  ▷R             (right distributive)
  |  ⊴R             (left cumulative)
  |  ⊵R             (right cumulative)

Fig. 1. Syntax rules for the ALCS concept language.

Usually, concept and role expressions are called TBox terms. An interpretation I = (Δ^I, ·^I) consists of a set Δ^I of individuals (the domain of I) and a function ·^I (the interpretation function of I) that maps every concept to a subset of Δ^I and every role to a subset of Δ^I × Δ^I such that the equations of figure 2 are satisfied. An interpretation I is a model for a concept C if C^I ≠ ∅. If a concept has a model, then it is satisfiable, otherwise it is unsatisfiable. A concept C is subsumed by a concept D (written C ⊑ D) if C^I ⊆ D^I for every interpretation I. Subsumption can be reduced to satisfiability, since C is subsumed by D if and only if C ⊓ ¬D is not satisfiable. According to the given semantics, the concept non-empty collection, denoting any collection having at least one element, and the concept empty collection, denoting any entity having no elements, can be defined as follows:

COLL ≐ ∃∋.⊤,      ECOLL ≐ ∀∋.⊥ ≡ ¬COLL.

It is easy to verify the validity of the following statements:

∀x, y. ECOLL(x) → ⊆(x, y),
∀x, y. (COLL(x) ∧ COLL(y)) → (⊆(x, y) → ⊓(x, y)),
¬∃x, y. ⊓(x, y) ∧ (ECOLL(x) ∨ ECOLL(y)).

⊤^I = Δ^I
⊥^I = ∅
(C ⊓ D)^I = C^I ∩ D^I
(C ⊔ D)^I = C^I ∪ D^I
(¬C)^I = Δ^I \ C^I
(∀R.C)^I = { a ∈ Δ^I | ∀b. (a, b) ∈ R^I → b ∈ C^I }
(∃R.C)^I = { a ∈ Δ^I | ∃b. (a, b) ∈ R^I ∧ b ∈ C^I }
⊆^I = { (a, b) ∈ Δ^I × Δ^I | ∀x. (a, x) ∈ ∋^I → (b, x) ∈ ∋^I }
⊓^I = { (a, b) ∈ Δ^I × Δ^I | ∃x. (a, x) ∈ ∋^I ∧ (b, x) ∈ ∋^I }
(◁R)^I = { (a, b) ∈ Δ^I × Δ^I | ∀x. (a, x) ∈ ∋^I → (x, b) ∈ R^I }
(▷R)^I = { (a, b) ∈ Δ^I × Δ^I | ∀x. (b, x) ∈ ∋^I → (a, x) ∈ R^I }
(⊴R)^I = { (a, b) ∈ Δ^I × Δ^I | ∀x. (a, x) ∈ ∋^I → ((∃s. (s, x) ∈ ∋^I ∧ (s, b) ∈ R^I) ∨ (x, b) ∈ R^I) }
(⊵R)^I = { (a, b) ∈ Δ^I × Δ^I | ∀x. (b, x) ∈ ∋^I → ((∃s. (s, x) ∈ ∋^I ∧ (a, s) ∈ R^I) ∨ (a, x) ∈ R^I) }

Fig. 2. The semantic interpretation for concepts and roles in ALCS.

These statements match our intuitions about collections, and reflect the corresponding valid axioms in set theory, where ECOLL is interpreted as the empty set and COLL is interpreted as any non-empty set:

∀y. ⊆ₛ(∅, y),
∀x, y. (x ≠ ∅ ∧ y ≠ ∅) → (⊆ₛ(x, y) → x ∩ₛ y ≠ ∅),
¬∃x, y. x ∩ₛ y ≠ ∅ ∧ (x = ∅ ∨ y = ∅).


Let us now introduce some more complex concept definitions, in order to better understand the expressive power of ALCS. The concept (∀⊓.COLL) denotes any entity which possibly has an overlapping with some non-empty collection; the concept (∀⊆.COLL) denotes any entity which possibly contains fewer elements than some non-empty collection; finally, the concept (∃⊓.⊤) denotes any collection having at least a common element with something else. It can be shown that the first concept is equivalent to the top concept, the second includes in its denotation all the non-empty collections, and the latter denotes only non-empty collections:

∀⊓.COLL ≡ ⊤,      COLL ⊑ ∀⊆.COLL,      ∃⊓.⊤ ⊑ COLL.

It is worth noting that expressions containing the ◁ plural quantifier cannot be reformulated in terms of the ∋ role only. This is somehow counterintuitive, and surprisingly it can be proved that:

∃(◁R).C ≢ ∀∋.(∃R.C).
This can be understood by considering that the concept on the right hand side introduces for each element a possible new relation, whereas the concept on the left hand side introduces the same relation for each element; therefore, the former has a larger denotation. In fact, the best that can be proved is an inclusion relation: ∃(◁R).C ⊑ ∀∋.(∃R.C). Let us now consider assertions, i.e. predications on individual objects; usually, they are referred to as ABox statements. Let O be the alphabet of symbols denoting individuals; an assertion is a statement of the form C(a) or R(a, b), where C is a concept, R is a role, and a and b denote individuals in O. In order to assign a meaning to the assertions, the interpretation function ·^I is extended to individuals, so that a^I ∈ Δ^I for each individual a ∈ O and a^I ≠ b^I if a ≠ b (Unique Name Assumption). The semantics of the assertions is the following: C(a) is satisfied by an interpretation I iff a^I ∈ C^I, and R(a, b) is satisfied by I iff (a^I, b^I) ∈ R^I. A set Σ of assertions is called a knowledge base. An interpretation I is a model of Σ iff every assertion of Σ is satisfied by I. If Σ has a model, then it is satisfiable. Σ logically implies an assertion α (written Σ ⊨ α) if α is satisfied by every model of Σ. Given a knowledge base Σ, an individual a in O is said to be an instance of a concept C if Σ ⊨ C(a). The instance recognition problem, i.e. checking whether Σ ⊨ C(a), can be reduced to satisfiability, since a is an instance of C with respect to a knowledge base Σ if and only if Σ ∪ {¬C(a)} is unsatisfiable [13].
Coming back to our example regarding the Beatles, let us see how the concept representing any pop group can be defined using the ALCS language:

POP-GROUP ≐ ∀∋.PERSON ⊓ ∀LED-BY.PERSON ⊓ ∀◁BORN-IN.CITY ⊓ ∀⊴SING.POP-SONG.

The definition states that a pop group is composed of persons, that the relation "led by a person" is inherently collective with respect to the group, that the relation "born in a city" inherently distributes to the single persons composing the group, and that the relation "sing a pop song" has a cumulative reading for the group. From the definition and from the knowledge acquired during the discourse, the system is able to recognize the individual beatles as an instance of POP-GROUP and is able to classify POP-GROUP as a collection:

POP-GROUP(beatles),      POP-GROUP ⊑ COLL.


5 The calculus for ALCS

In this section an algorithm for deciding satisfiability of ALCS concepts is proposed. This algorithm can also be used to decide subsumption between two concepts and instance recognition between an individual and a concept. The soundness of the algorithm and its completeness with respect to ALCS⁻, which is a weaker variant of ALCS, will be proved. The algorithm is sound but not complete for ALCS. In order to obtain ALCS⁻ from ALCS, the Collection Theory is changed by relaxing the semantics of the ⊆, ⊓, (◁R), (▷R), (⊴R) and (⊵R) collective roles.

Definition 3. (Collective Roles for ALCS⁻)

⊆(a, b) ⇒ ∀x. ∋(a, x) → ∋(b, x)
⊓(a, b) ⇒ ∃x. ∋(a, x) ∧ ∋(b, x)
◁R(a, b) ⇒ ∀x. ∋(a, x) → R(x, b)
▷R(a, b) ⇒ ∀x. ∋(b, x) → R(a, x)
⊴R(a, b) ⇒ ∀x. ∋(a, x) → ( (∃s. ∋(s, x) ∧ R(s, b)) ∨ R(x, b) )
⊵R(a, b) ⇒ ∀x. ∋(b, x) → ( (∃s. ∋(s, x) ∧ R(a, s)) ∨ R(a, x) ).

With respect to the definition of the collective roles for ALCS, the 'iff' ('⇔') has been replaced with a simple implication ('⇒'). In this way, the collective roles become primitive; nonetheless, they still induce a structure between the elements. Even if at first sight this simplification may give the impression that the obtained theory is too weak, we claim that this is not actually the case. There are two reasons: first, in an open world semantics - which is usually adopted for NLP semantics - the lost deductions that can be performed in ALCS from the elements to the collective roles are very few; second, from the natural language point of view such deductions are not the intuitive ones. In the following we will refer to the language ALCS⁻. The rule-based calculus to decide the satisfiability of ALCS⁻ and ALCS concepts operates on constraints [14]. A constraint can be of the type x : C or xRy, where C is a concept, R is a role, and x, y are variables belonging to a predefined alphabet of variable symbols. The interpretation of constraints is defined as follows. Let I be an interpretation of the concept language. An I-assignment α is a function that maps every variable to an element of Δ^I. We say that α satisfies x : C if and only if α(x) ∈ C^I, and α satisfies xRy if and only if (α(x), α(y)) ∈ R^I. A constraint system S is a finite, non-empty set of constraints. An I-assignment α satisfies a constraint system S if α satisfies every constraint in S. A constraint system S is satisfiable if there is an interpretation I and an I-assignment α such that α satisfies S. A clash is a system having one of the forms: {x : ⊥}, {x : A, x : ¬A}.

Proposition 4. (Reduction to a constraint system) An ALCS⁻ concept C is satisfiable if and only if the constraint system {x : C} is satisfiable.

Proof. Follows from the definitions. □

For the calculus, we consider only simple ALCS⁻ concepts. A concept is called simple if it contains only complements of the form ¬A, where A is a primitive concept. An arbitrary ALCS⁻ (ALCS) concept can be transformed in linear time into an equivalent simple concept by means of the following rewriting rules:

¬⊤ → ⊥    ¬⊥ → ⊤    ¬¬C → C
¬(C ⊓ D) → ¬C ⊔ ¬D    ¬(C ⊔ D) → ¬C ⊓ ¬D
¬∀R.C → ∃R.¬C    ¬∃R.C → ∀R.¬C

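These rewriting rules translate directly into a small program. The sketch below is an illustrative rendering (a nested-tuple encoding of concepts, covering only the standard constructors and leaving the collective roles of ALCS untouched), not code from the paper.

```python
# Rewriting a concept into an equivalent simple one (negation normal form).
# Concepts use an illustrative nested-tuple encoding: concept names and the
# constants "TOP"/"BOTTOM" are strings; ("not", C), ("and", C, D),
# ("or", C, D), ("exists", R, C), ("forall", R, C) are compound terms.

def simplify(c):
    if isinstance(c, str):                        # name, TOP or BOTTOM: already simple
        return c
    if c[0] in ("and", "or"):
        return (c[0], simplify(c[1]), simplify(c[2]))
    if c[0] in ("forall", "exists"):
        return (c[0], c[1], simplify(c[2]))
    arg = c[1]                                    # here c = ("not", arg)
    if arg == "TOP":
        return "BOTTOM"                           # ¬⊤ → ⊥
    if arg == "BOTTOM":
        return "TOP"                              # ¬⊥ → ⊤
    if isinstance(arg, str):
        return c                                  # ¬A with A primitive: simple
    if arg[0] == "not":
        return simplify(arg[1])                   # ¬¬C → C
    if arg[0] == "and":                           # ¬(C ⊓ D) → ¬C ⊔ ¬D
        return ("or", simplify(("not", arg[1])), simplify(("not", arg[2])))
    if arg[0] == "or":                            # ¬(C ⊔ D) → ¬C ⊓ ¬D
        return ("and", simplify(("not", arg[1])), simplify(("not", arg[2])))
    if arg[0] == "forall":                        # ¬∀R.C → ∃R.¬C
        return ("exists", arg[1], simplify(("not", arg[2])))
    if arg[0] == "exists":                        # ¬∃R.C → ∀R.¬C
        return ("forall", arg[1], simplify(("not", arg[2])))

# Example: ¬(A ⊓ ∃R.B)  →  ¬A ⊔ ∀R.¬B
print(simplify(("not", ("and", "A", ("exists", "R", "B")))))
```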
Starting from the system S = {x : C}, the propagation rules are applied, until a contradiction is generated or a model of C is explicitly obtained: the propagation rules preserve satisfiability. It is important to remark that they are introduced in order to prove satisfiability, and that they are not intended to be used as deduction rules. We have the following rules:

S →⊓ {x:C1, x:C2} ∪ S
  if x:C1⊓C2 in S, and both x:C1 and x:C2 not in S

S →⊔ {x:D} ∪ S
  if x:C1⊔C2 in S, and neither x:C1 nor x:C2 in S, and D = C1 or D = C2

S →∃ {xRy, y:C} ∪ S
  if x:∃R.C in S, and there is no z s.t. both xRz and z:C in S, and y is a new variable

S →∀ {y:C} ∪ S
  if x:∀R.C and xRy in S, and y:C not in S

S →⊆ {y∋z} ∪ S
  if x⊆y and x∋z in S, and y∋z not in S

S →∩ {x∋k, y∋k} ∪ S
  if x∩y in S, and there is no z s.t. both x∋z and y∋z in S, and k is a new variable

S →◁ {zRy} ∪ S
  if x(◁R)y and x∋z in S, and zRy not in S

S →▷ {xRz} ∪ S
  if x(▷R)y and y∋z in S, and xRz not in S

S →⊴ T ∪ S
  if x(⊴R)y and x∋z in S, and there is no t s.t. both t∋z and tRy in S, and zRy not in S,
  and s is a new variable, and T = {s∋z, sRy} or T = {zRy}

S →⊵ T ∪ S
  if x(⊵R)y and y∋z in S, and there is no t s.t. both t∋z and xRt in S, and xRz not in S,
  and s is a new variable, and T = {s∋z, xRs} or T = {xRz}

Propagation rules are either deterministic - they yield a uniquely determined constraint system - or nondeterministic (→⊔, →⊴, →⊵) - yielding several possible constraint systems.

Proposition 5. (Local soundness and completeness) Let S be a constraint system of ALCS⁻ concepts. If S′ is obtained from S by applying a deterministic propagation rule, then S is satisfiable if and only if S′ is satisfiable. If S′ is obtained from S by applying a nondeterministic propagation rule, then S is satisfiable if S′ is satisfiable; moreover, there is a way to apply the rule to S such that the obtained system is satisfiable if and only if S is satisfiable.

Proof. Easy by translating the constraint systems into logical formulas. □

A constraint system is said to be complete if no propagation rule is applicable to it. Because of the presence of nondeterministic rules, several complete systems can be derived from {x : C}.

Proposition 6. (Termination) Let C be a simple ALCS⁻ concept. Given the constraint system {x : C}, after a finite number of applications of the propagation rules one obtains a finite set of complete constraint systems.

Proof. (sketch) One should prove that the size of each obtained complete constraint system is finite - in particular by checking the number of the newly introduced variables - and that the number of complete constraint systems generated by the propagation rules is finite - by observing that the application of the rules decreases the complexity of the constraint system, i.e. rules add to the system only simpler constraints for which no rule directly applies again. □
Proposition 7. (Satisfiability) A constraint system S = {x : C} is satisfiable if and only if there exists a clash-free complete constraint system which can be derived from S by applying the propagation rules.

Proof. (sketch) First prove that a complete system is satisfiable if and only if it contains no clash. Then the proposition follows from local soundness and completeness and the termination of the propagation rules, and from the independence of the meanings of the basic role expressions. □
Now, it is straightforward to put together the blocks and build up a sound and complete decision procedure to check satisfiability of ALCS⁻ concepts. One should collect all the complete constraint systems derivable from {x : C} by applying the propagation rules. Such systems are, up to variable renaming, finitely many. If at least one of these systems is clash-free, then C is satisfiable, otherwise it is unsatisfiable. So, the following holds:

Theorem 8. (Decidability) Satisfiability, subsumption and instance recognition problems in ALCS⁻ are decidable.
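To make the shape of such a decision procedure concrete, here is a runnable sketch restricted to the plain ALC fragment of the calculus, i.e. the rules →⊓, →⊔, →∃ and →∀ together with clash detection; the rules for the collective roles and the membership structure are omitted, and the encoding is again an illustrative choice, not the paper's.

```python
# Constraints are tuples ("in", x, C) and ("rel", x, R, y); concepts are
# simple and use the nested-tuple encoding of the previous sketch.
# Nondeterminism (rule →⊔) is handled by backtracking over both branches.

import itertools

def satisfiable(concept):
    return expand(frozenset({("in", "x0", concept)}), itertools.count(1))

def has_clash(s):
    return any(t[0] == "in" and
               (t[2] == "BOTTOM" or ("in", t[1], ("not", t[2])) in s)
               for t in s)

def expand(s, fresh):
    if has_clash(s):
        return False
    rels = [t for t in s if t[0] == "rel"]
    for t in s:
        if t[0] != "in" or not isinstance(t[2], tuple):
            continue
        _, x, c = t
        if c[0] == "and":                             # rule →⊓ (deterministic)
            new = {("in", x, c[1]), ("in", x, c[2])}
            if not new <= s:
                return expand(s | new, fresh)
        elif c[0] == "or":                            # rule →⊔ (nondeterministic)
            if ("in", x, c[1]) not in s and ("in", x, c[2]) not in s:
                return (expand(s | {("in", x, c[1])}, fresh)
                        or expand(s | {("in", x, c[2])}, fresh))
        elif c[0] == "exists":                        # rule →∃
            if not any(a == x and r == c[1] and ("in", b, c[2]) in s
                       for (_, a, r, b) in rels):
                y = "x%d" % next(fresh)
                return expand(s | {("rel", x, c[1], y), ("in", y, c[2])}, fresh)
        elif c[0] == "forall":                        # rule →∀
            for (_, a, r, b) in rels:
                if a == x and r == c[1] and ("in", b, c[2]) not in s:
                    return expand(s | {("in", b, c[2])}, fresh)
    return True                                       # complete and clash-free

# ∃R.(A ⊓ ¬A) is unsatisfiable; A ⊔ ¬A is satisfiable:
print(satisfiable(("exists", "R", ("and", "A", ("not", "A")))))   # False
print(satisfiable(("or", "A", ("not", "A"))))                     # True
```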

Corollary 9. (Algorithm for ALCS⁻) The proposed algorithm is a sound and complete decision procedure to check satisfiability, subsumption and instance recognition in ALCS⁻.

Corollary 10. (Algorithm for ALCS) The proposed algorithm is a sound decision procedure to check satisfiability, subsumption and instance recognition in ALCS.

From the computational point of view, the precise complexity of the satisfiability problem still needs to be found. It is easy to see that the problem is PSPACE-hard - such a lower bound comes from ALC. However, it is not known if a PSPACE algorithm for checking the satisfiability of ALCS⁻ concepts exists. Finally, a less incomplete algorithm for ALCS is under study.

A Mereology

In this section the switching of the basic collection-forming operator, from a quasi-ordering relation ⊆ founded on a membership relation ∋ to a primitive part-whole partial-ordering relation ≽ [25], is proposed. The idea is as follows. Let us now have the roles defined according to the syntax rule:

R → P | ≽ | ◁C.R | ⊴C.R | ◁̃C.R | ▷C.R | ⊵C.R | ▷̃C.R

The basic collection-forming relation is the partial-ordering binary relation ≽ (to be read HAS-PART) on the set Δ^I - it is a reflexive, anti-symmetric and transitive relation.

Definition 11. (Part-whole relation)
∀x. ≽(x, x).
∀x, y. ≽(x, y) ∧ ≽(y, x) → x = y.
∀x, y, z. ≽(x, y) ∧ ≽(y, z) → ≽(x, z).


Again, the language embodies plural quantifiers which specify the reading of a binary relation applied to collections. Plural quantifiers are qualified in the sense that the elements of actual predications are selected by a qualification predicate C. The ◁ and ▷ quantifiers specify that the relation necessarily holds for all the parts of a certain type C, i.e. they express the left and right distributive readings. The ⊴ and ⊵ quantifiers specify that the relation necessarily holds for some parts including all the parts of a certain type C, i.e. they express the left and right cumulative readings. Moreover, a new class of plural quantifiers can be easily introduced in the mereological framework: the ◁̃ and ▷̃ quantifiers. They allow us to represent the group reading of a relation - as in [15, 16]. The ◁̃ and ▷̃ quantifiers specify that the relation possibly holds for some part of a certain type C. The semantics of the plural quantifiers is given by the following definition:


Definition 12. (Qualified Plural Quantifiers)
◁C.R(a, b) iff ∀x. (≽(a, x) ∧ C(x)) → R(x, b)
▷C.R(a, b) iff ∀x. (≽(b, x) ∧ C(x)) → R(a, x)
⊴C.R(a, b) iff ∀x. (≽(a, x) ∧ C(x)) → (∃s. ≽(s, x) ∧ R(s, b))
⊵C.R(a, b) iff ∀x. (≽(b, x) ∧ C(x)) → (∃s. ≽(s, x) ∧ R(a, s))
◁̃C.R(a, b) iff ∃x. ≽(a, x) ∧ C(x) ∧ R(x, b)
▷̃C.R(a, b) iff ∃x. ≽(b, x) ∧ C(x) ∧ R(a, x)
As is expected, the cumulative plural quantified expressions are still more general than the unquantified (collective) expressions and the distributive plural quantified ones: ∀x, y. R(x, y) → ⊴C.R(x, y); ∀x, y. R(x, y) → ⊵C.R(x, y); ∀x, y. ◁C.R(x, y) → ⊴C.R(x, y); ∀x, y. ▷C.R(x, y) → ⊵C.R(x, y).
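These clauses can be checked mechanically on a finite model. The following toy sketch evaluates the left distributive and left cumulative quantifiers over an explicitly listed HAS-PART relation; the model and all names are invented for illustration, and only two of the six quantifiers are shown.

```python
# Evaluating qualified plural quantifiers over a finite, explicitly listed
# part-whole relation.  The model and names are illustrative only.

HAS_PART = {("beatles", "beatles"), ("beatles", "john"), ("beatles", "paul"),
            ("john", "john"), ("paul", "paul")}          # reflexive ≽
PERSON = {"john", "paul"}
BORN_IN = {("john", "liverpool"), ("paul", "liverpool")}

def left_distributive(C, R, a, b):
    """◁C.R(a,b): R(x,b) for every part x of a with C(x)."""
    return all((x, b) in R
               for (w, x) in HAS_PART if w == a and x in C)

def left_cumulative(C, R, a, b):
    """⊴C.R(a,b): every C-part x of a has some s ≽ x with R(s,b)."""
    wholes = {w for (w, _) in HAS_PART}
    return all(any((s, x) in HAS_PART and (s, b) in R for s in wholes)
               for (w, x) in HAS_PART if w == a and x in C)

# ◁PERSON.BORN-IN(beatles, liverpool) and, since ≽ is reflexive, also the
# cumulative reading hold in this model:
print(left_distributive(PERSON, BORN_IN, "beatles", "liverpool"))   # True
print(left_cumulative(PERSON, BORN_IN, "beatles", "liverpool"))     # True
```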

In this way, ambiguities in the readings can be preserved by using the cumulative operator. In this mereological framework, a plural operator '*' is defined as follows:

Definition 13. (Plural Operator) *P(a) iff ∀x. ≽(a, x) → (∃y. P(y) ∧ (≽(x, y) ∨ ≽(y, x)))
The purpose of the plural operator is to allow the construction of plural collective entities having singular objects of a certain type as their parts. More specifically, for any subpart x of the plural entity, either x has a part of the given singular type or x is itself part of an object of that type. We can define, for example, the plural predicate PEOPLE from the singular predicate PERSON:

PEOPLE ≐ *PERSON.
In this way, a non-extensional Mereology is reconstructed, which is a generalization of the simple Theory of Collections presented above. It is called non-extensional for the same reasons as in collection theory: the extensionality axiom is not valid within this mereology. Elements of a collection are parts in the partial order: ≽(beatles, john). As observed in [26], the main advantage of this approach is its uniformity in treating elements and collections. The difference is evident if we look at the semantics of the cumulative plural quantifier: the Collection Theory distinguishes an individual x from a singleton collection s whose only element is x, such that ∋(s, x): this is why an explicit disjunction was introduced in the semantics. On the other hand, within mereology the disjunction disappears, since the ≽ relation is reflexive: ≽(john, john). In order to recover the lost distinction between a collection and its elements, qualification on plural quantifiers is needed. The qualification predicate acts like a filter to select the correct level in the mereological partonomy. In this way,

problems with multi-level plural entities [19] can be solved. The examples for distributive and cumulative readings, using qualified plural quantifiers, are expressed in the mereological framework in the following way: "The Beatles are born in Liverpool"

(◁PERSON.BORN-IN)(beatles, liverpool),
"The Beatles sing 'Yesterday' "

(⊴PERSON.SING)(beatles, yesterday).
The qualification predicates for the plural quantifiers are taken from the lexical definition of the involved relations: in this example, the qualification PERSON comes from the lexical semantics entries of the relations BORN-IN and SING. The group reading is a sort of weakened collective reading; for example: "The Beatles played in London"

(◁̃PEOPLE.PLAY-IN)(beatles, london),
represents the case in which it is unknown whether the group which played in London was actually composed of all the members of the Beatles. So, the relation PLAY-IN holds for some collection of persons, i.e. an element of the class PEOPLE, which should be a part of the collective entity beatles. A mereological version of the collection theory has also been applied to model the structure of events and processes [4] in the domain of tense and aspect in natural language, in order to properly account for perfective and imperfective sentences and for habituals by means of plural quantifiers ranging over collections of events [11, 12]. The basic assumption taken into consideration is that verbal morphology plays a crucial role in specifying the temporal meaning of a sentence. Several issues still need to be considered within the mereological approach: the existence of atoms (entities which have no parts), the possible inclusion of several different ≽ᵢ relations [29], the introduction of a mass dissective predicate as in [18], and, last but not least, the calculus. This reformulation of the collection theory can be adopted for the logical analysis of plurals according to the lattice-theoretical approach pursued by [18].

Conclusions

In this paper a Collection Theory has been presented, i.e. a formalism which is intended to provide semantical and computational means for plurals and plural quantifications. This approach allows for complete reasoning even in the presence of scoping ambiguities. The concept language ALCS has been studied, which embeds in a uniform and compositional way the collection theory, and a sound and complete algorithm to decide satisfiability, subsumption and instantiation for a slightly weaker variant of the language has been devised.

Finally, a mereological framework has been suggested, and it has been argued that a theory of the part-whole relation is more expressive and more adequate than an element-based collection theory. However, more work is needed to refine this mereological framework.

References
1. J. Allgayer. SB-ONE+ - dealing with sets efficiently. In Proc. of the 9th ECAI, pages 13-18, Stockholm, Sweden, 1990.
2. J. Allgayer and E. Franconi. A semantic account of plural entities within a hybrid representation system. In Proc. of the 5th International Symposium on Knowledge Engineering, pages 305-312, Seville, Spain, October 1992.
3. E. Bach. The algebra of events. Linguistics and Philosophy, 9:5-16, 1986.
4. J. Barwise and R. Cooper. Generalized quantifiers and natural language. Linguistics and Philosophy, 4:159-219, 1981.
5. F. M. Donini, B. Hollunder, M. Lenzerini, A. Marchetti Spaccamela, D. Nardi, and W. Nutt. The complexity of existential quantification in concept languages. Artificial Intelligence, 53:309-327, 1992.
6. F. M. Donini, M. Lenzerini, D. Nardi, and W. Nutt. The complexity of concept languages. In Proc. of the 2nd International Conference on Principles of Knowledge Representation and Reasoning, pages 151-162, Cambridge, MA, 1991.
7. F. M. Donini, M. Lenzerini, D. Nardi, and W. Nutt. Tractable concept languages. In Proc. of the 12th IJCAI, pages 458-465, Sydney, Australia, August 1991.
8. E. Franconi. Adding constraints inference to ABox reasoning. Abstract 9205-02, IRST, Povo TN, Italy, May 1992.
9. Enrico Franconi. A treatment of plurals and plural quantifications based on a theory of collections. Minds and Machines, special issue on Knowledge Representation for Natural Language Processing (to appear), 1993. A preliminary version appears in the Preprints of the International Workshop on Formal Ontology, N. Guarino and R. Poli (eds.), pages 219-249, Padova, Italy, 1993.
10. Enrico Franconi, Alessandra Giorgi, and Fabio Pianesi. A mereological approach to tense and aspect. In Proc. of the International Conference on Mathematical Linguistics, ICML-93, Barcelona, Spain, April 1993. (Abstract).
11. Enrico Franconi, Alessandra Giorgi, and Fabio Pianesi. Tense and aspect: a mereological approach. In Proc. of the 13th IJCAI, Chambéry, France, August 1993.
12. B. Hollunder. Hybrid inferences in KL-ONE-based knowledge representation systems. In Proc. of the 14th German Workshop on Artificial Intelligence. Springer-Verlag, 1990.
13. B. Hollunder, W. Nutt, and M. Schmidt-Schauß. Subsumption algorithms for concept description languages. In Proc. of the 9th ECAI, pages 348-353, Stockholm, Sweden, 1990.
14. Fred Landman. Groups, I. Linguistics and Philosophy, 12:559-605, 1989.
15. Fred Landman. Groups, II. Linguistics and Philosophy, 12:723-744, 1989.
16. L. Lesmo, M. Berti, and P. Terenziani. A network formalism for representing natural language quantifiers. In Proc. of the 8th ECAI, pages 473-478, Munich, Germany, 1988.
17. Godehard Link. The logical analysis of plurals and mass terms: a lattice-theoretical approach. In R. Bäuerle, C. Schwarze, and A. von Stechow, editors, Meaning, Use and Interpretation of Language, pages 302-323. Walter de Gruyter, 1983.

18. Godehard Link. Algebraic semantics for natural language: some philosophy, some applications. In Nicola Guarino and Roberto Poli, editors, Proc. of the International Workshop on Formal Ontology, pages 19-49, Padova, Italy, March 1993.
19. M. Poesio. Dialog-oriented ABoxing. In Proc. of the 5th International Symposium on Methodologies for Intelligent Systems, pages 277-288, Knoxville, TN, 1990.
20. Massimo Poesio. Relational semantics and scope ambiguity. In J. Barwise, J. M. Gawron, G. Plotkin, and S. Tutiya, editors, Situation Semantics and its Applications, vol. 2, chapter 20, pages 469-497. CSLI, Stanford, CA, 1991.
21. J. Quantz. How to fit generalized quantifiers into terminological logics. In Proc. of the 10th ECAI, pages 543-547, Vienna, Austria, 1992.
22. M. Schmidt-Schauß and G. Smolka. Attributive concept descriptions with complements. Artificial Intelligence, 48(1):1-26, 1991.
23. S. C. Shapiro and W. J. Rapaport. The SNePS family. Computers and Mathematics with Applications, special issue: Semantic Networks in Artificial Intelligence, 23(2-5):243-275, March-May 1992.
24. P. Simons. Parts: A Study in Ontology. Clarendon Press, Oxford, 1987.
25. J. F. Sowa. Toward the expressive power of natural language. In J. F. Sowa, editor, Principles of Semantic Networks, pages 157-189. Morgan Kaufmann, 1991.
26. Bosco S. Tjan, David A. Gardiner, and James R. Slagle. Representing and reasoning with set referents and numerical quantifiers. In T. E. Nagle, J. A. Nagle, L. L. Gerholz, and P. W. Eklund, editors, Conceptual Structures: current research and practice, chapter 2, pages 53-66. Ellis Horwood, 1992.
27. Michael P. Wellman and Reid G. Simmons. Mechanisms for reasoning about sets. In Proc. of AAAI-88, pages 398-402, St. Paul, MN, 1988.
28. Morton E. Winston, Roger Chaffin, and Douglas Herrmann. A taxonomy of part-whole relations. Cognitive Science, 11:417-444, 1987.
29. William A. Woods and James G. Schmolze. The KL-ONE family. Computers and Mathematics with Applications, special issue: Semantic Networks in Artificial Intelligence, 23(2-5):133-177, March-May 1992.

Computing Extensions of Terminological Default Theories


Franz Baader and Bernhard Hollunder

German Research Center for Artificial Intelligence (DFKI), Stuhlsatzenhausweg 3, D-66123 Saarbrücken, Germany, e-mail: (last name)@dfki.uni-sb.de
Abstract. We consider the problem of integrating Reiter's default logic into terminological representation systems. It turns out that such an integration is less straightforward than we expected, considering the fact that the terminological language is a decidable sublanguage of first-order logic. Semantically, one has the unpleasant effect that the consequences of a terminological default theory may be rather unintuitive, and may even vary with the syntactic structure of equivalent concept expressions. This is due to the unsatisfactory treatment of open defaults via Skolemization in Reiter's semantics. On the algorithmic side, this treatment may lead to an undecidable default consequence relation, even though our base language is decidable, and we have only finitely many (open) defaults. Because of these problems, we then consider a restricted semantics for open defaults in our terminological default theories: default rules are only applied to individuals that are explicitly present in the knowledge base. In this semantics it is possible to compute all extensions of a finite terminological default theory, which means that this type of default reasoning is decidable.

1 Introduction

Terminological representation systems are used to represent the taxonomic and conceptual knowledge of a problem domain in a structured and well-formed way. To describe this kind of knowledge, one starts with atomic concepts (unary predicates) and roles (binary predicates), and defines more complex concepts using the operations provided by the concept language of the particular formalism. In addition to this concept description formalism, most of these systems also have an assertional component. One can for example state that an individual is an instance of a concept, or that two individuals are connected by a role. In terminological representation formalisms, the concept descriptions are interpreted as universal statements, which means, unlike frame languages, they do not allow for exceptions. As a consequence, the system can use descriptions to automatically insert concepts at the proper place in the taxonomy (classification), and it can use the facts stated about individuals to deduce to which concepts they must belong (realization). For example, one could define the concept Mammal as an Animal that feeds its young with Milk, where feeds-young-with is used as a role. If the concept Platypus¹ is defined as an Animal that lives-in the Water, feeds its young with Milk, and reproduces with Eggs, then the system will recognize that Platypus is a subconcept of Mammal.

¹ We are taking this as our exceptional animal, in view of the fact that the last IJCAI was in Australia, and not in the Antarctic.

However, commonsense reasoning is often based on assumptions that may ultimately be shown to be false. In our example, one might want to assume by default that Mammals reproduce Viviparously. Only if it is known that a specific mammal reproduces with eggs, should this assumption be cancelled. If one wants to use terminological systems for this kind of commonsense reasoning, one needs a formalism that can handle such default assumptions, but does not destroy the definitional character of concept descriptions - because otherwise the advantage of automatic concept classification, etc., would be lost (see [6]). Besides the general arguments for the importance of reasoning with defaults, which can be found in the nonmonotonic reasoning literature, the need for embedding defaults into terminological representation formalisms is also substantiated by the fact that this is an important item on the wish list of users of terminological representation systems (see e.g. [20]). Several existing terminological systems, such as BACK [19], CLASSIC [7], K-Rep [15], LOOM [18], or SB-ONE [14], have been or will be extended to provide the user with some kind of default reasoning facilities. However, as the designers of these systems themselves point out, these approaches usually have an ad hoc character, and are not equipped with a formal semantics. For example, defaults in the FAME system, which is built using K-Rep, "will not be complete (or even consistent)" ([15], p. 11) unless the user is very careful when using them. In CLASSIC, "a limited form of defaults can be represented with the aid of rules and test functions." However, the user is warned to "use this trick with extreme caution" ([7], pp. 45-46). Our arguments for the importance of default extensions for terminological representation languages so far were given from the viewpoint of the terminological systems community. However, these investigations may also be of interest for research in nonmonotonic reasoning itself. Most nonmonotonic reasoning formalisms (e.g. Reiter's default logic [21], Circumscription [16]) use full first-order predicate logic as their base language. In this general form, the formalisms are usually highly undecidable (see e.g. [21], Theorem 4.9). For this reason, work on decision procedures for decidable subcases was mostly restricted to propositional logic (see e.g. [13]), thus leaving the wide gap between propositional logic and full first-order logic almost unexplored. Since most terminological representation languages can be viewed as decidable subclasses of first-order logic - but are nevertheless much more expressive than propositional logic - they can serve as interesting test cases for nonmonotonic reasoning formalisms. We shall see that this not only applies for algorithmic, but also for semantic considerations. We shall here consider the problem of integrating Reiter's default logic into a terminological representation formalism. This treatment of defaults in terminological systems has already been proposed by Brachman and Schmolze [8], but to

the best of our knowledge, this proposal was never followed up. Reiter's default rule approach seems to fit well into the philosophy of terminological systems because most of them already provide their users with a form of "monotonic" rules. These rules can be considered as special default rules where the justifications - which make the behaviour of default rules nonmonotonic - are absent. At first sight, one might think that, from a semantic point of view, the proposed integration should be unproblematic. In fact, the terminological representation language we shall consider (see Section 2) is a sublanguage of first-order logic, and Reiter's semantics has been formulated for full first-order logic. However, in [3] we have shown that one runs into severe problems, due to the unsatisfactory treatment of open defaults by Skolemization (see also Section 3). A similar problem arises when considering the integration from the algorithmic point of view. In the abstract of their paper on how to compute extensions for default logic, Junker and Konolige [12] write that their method is applicable if the default theory "consists of a finite number of defaults and premises and classical derivability for the base language is decidable." A related formulation can be found in the abstract of Schwind and Risch's paper on the same topic [25]. Since our base language is decidable, and we certainly do not want to have infinitely many default rules, these methods seem to apply in our case. However, a closer look at the papers reveals that by "a finite number of defaults" it is meant "a finite number of closed defaults." But the default rules one wants to consider in terminological default theories are open defaults. In fact, as already pointed out by Reiter ([21], p. 115), "the genuinely interesting cases involve open defaults." In [3] we have shown that, with a (decidable) terminological language as base language, a finite set of premises and open defaults may lead to an undecidable default consequence problem, if the open defaults are treated as proposed by Reiter ([21], Section 7.1). Because of the semantic as well as algorithmic problems posed by Reiter's treatment of open defaults, we shall consider a restricted semantics for open defaults in our integration: default rules are only applied to individuals that are explicitly present in the assertional part (ABox) of the knowledge base. Though one may thus lose some intuitive default inferences, this treatment of default rules is akin to the treatment of the monotonic rules in terminological systems such as CLASSIC. With this restricted semantics, a finite set of open defaults stands for a set of closed defaults that is finite as well. Thus the above-mentioned methods of Schwind and Risch and of Junker and Konolige can be applied to compute extensions (see Section 4). In order to make these methods more efficient, one has to solve certain algorithmic problems for the terminological language. For Junker and Konolige's method one has to find minimal proofs for assertional facts - which can be seen as an abduction problem for ABoxes - and for Schwind and Risch's method one must find maximal consistent sets of assertional facts. In Section 5 we shall point out how the tableaux-based methods for assertional reasoning developed in our group ([11, 2]) can be modified to solve these problems.

2 The Representation Formalisms

First we shall briefly review the terminological language ALC [24] and Reiter's default logic. Then terminological default logic is defined as the specialization of default logic to ALC.
2.1 The terminological language ALC

Terminological knowledge representation formalisms can be used to define the relevant concepts of a problem domain (terminological knowledge), and to describe objects of this domain with respect to their relation to concepts and their interrelation with each other (assertional knowledge). Depending on which constructs are allowed for building concept descriptions we get different terminological languages. In the present paper we restrict our attention to the language ALC.
Definition 1. The terminological part of the language ALC consists of the following concept description formalism. The concept terms of this formalism are built from concept and role names using the constructors conjunction (C ⊓ D), disjunction (C ⊔ D), negation (¬C), exists-restriction (∃R.C), and value-restriction (∀R.C), where C, D stand for concept terms and R for a role name. The assertional part of our language allows us to assert facts concerning particular objects. These objects are referred to by individual names, and we can state that an object belongs to a concept (written C(a)), or that two objects are related by a role (written R(a, b)). Here a, b stand for individual names, C for a concept term, and R for a role name. A finite set of such facts is called an ABox.
The semantics of an ABox can either be given directly by defining interpretations and models, or by a translation into first-order logic. In order to make the fact explicit that we are dealing with a sublanguage of first-order logic, we choose the second option. Concept names are considered as symbols for unary predicates, and role names as symbols for binary predicates. Consequently, concept names A are translated into (atomic) formulae A(x) with one free variable, and role names R into (atomic) formulae R(x, y) with two free variables. Concept terms are also translated into formulae with one free variable. The semantics of conjunction, disjunction, and negation are defined in the obvious way, i.e., (C ⊓ D)(x) := C(x) ∧ D(x), (C ⊔ D)(x) := C(x) ∨ D(x), (¬C)(x) := ¬C(x). For value-restrictions we define (∀R.C)(x) := ∀y: (R(x, y) → C(y)), and the semantics of exists-restrictions is given by (∃R.C)(x) := ∃y: (R(x, y) ∧ C(y)). The individual names of the ABox are considered as constant symbols. In terminological systems one usually has a unique name assumption, which can be expressed by the formulae a ≠ b for all distinct individual names a, b. The formula corresponding to the assertional fact C(a) (resp. R(a, b)) is obtained by replacing the free variable(s) in the formula corresponding to C (resp. R) by a (resp. a, b). To sum up, an ABox is translated into a set of first-order formulae consisting of

the translations of the ABox facts together with the formulae expressing the unique name assumption. The basic inference service for ABoxes is called instantiation. It answers the question of whether (the translation of) a given ABox fact C(a) is a (logical) consequence of (the translation of) a given ABox 𝒜. If the answer is yes we say that a is an instance of C with respect to 𝒜 (𝒜 ⊨ C(a)). Algorithms which solve this inference problem have, for example, been described in [11, 2].
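The translation itself is a short recursion. The sketch below renders the first-order formulae as strings; the nested-tuple encoding of concept terms and the variable-naming scheme are illustrative choices, not part of the paper.

```python
# Translating concept terms into first-order formulae, rendered as strings.
# The tuple encoding and the y0, y1, ... variable scheme are illustrative.

def to_fol(c, x, depth=0):
    if isinstance(c, str):                       # concept name A  ->  A(x)
        return "%s(%s)" % (c, x)
    if c[0] == "not":
        return "~" + to_fol(c[1], x, depth)
    if c[0] == "and":
        return "(%s & %s)" % (to_fol(c[1], x, depth), to_fol(c[2], x, depth))
    if c[0] == "or":
        return "(%s | %s)" % (to_fol(c[1], x, depth), to_fol(c[2], x, depth))
    y = "y%d" % depth                            # fresh variable for the filler
    if c[0] == "forall":                         # (∀R.C)(x) := ∀y (R(x,y) -> C(y))
        return "forall %s (%s(%s,%s) -> %s)" % (
            y, c[1], x, y, to_fol(c[2], y, depth + 1))
    if c[0] == "exists":                         # (∃R.C)(x) := ∃y (R(x,y) & C(y))
        return "exists %s (%s(%s,%s) & %s)" % (
            y, c[1], x, y, to_fol(c[2], y, depth + 1))

# The assertional fact (∀feeds-young-with.Milk)(a):
print(to_fol(("forall", "feeds-young-with", "Milk"), "a"))
```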

2.2 Reiter's default logic

Reiter [21] deals with the problem of how to formalize nonmonotonic reasoning by introducing nonstandard, nonmonotonic inference rules, which he calls default rules. A default rule is any expression of the form
α : β₁, …, βₙ / γ
where α, β₁, …, βₙ, γ are first-order formulae. Here α is called the prerequisite of the rule, β₁, …, βₙ are its justifications, and γ is its consequent. For a set of default rules 𝒟, we denote the sets of formulae occurring as prerequisites, justifications, and consequents in 𝒟 by Pre(𝒟), Jus(𝒟), and Con(𝒟), respectively. A default rule is closed iff α, β₁, …, βₙ, γ do not contain free variables. A default theory is a pair (𝒲, 𝒟) where 𝒲 is a set of closed first-order formulae (the world description) and 𝒟 is a set of default rules. A default theory is closed iff all its default rules are closed. Intuitively, a closed default rule can be applied, i.e., its consequent is added to the current set of beliefs, if its prerequisite is already believed and all its justifications are consistent with the set of beliefs. Formally, the consequences of a closed default theory are defined with reference to the notion of an extension, which is a set of deductively closed first-order formulae defined by a fixed point construction (see [21], p. 89). In general, a default theory may have more than one extension, or even no extension. Depending on whether one wants to employ skeptical or credulous reasoning, a closed formula φ is a consequence of a closed default theory iff it is in all extensions or iff it is in at least one extension of the theory. In general, this consequence relation is not even recursively enumerable (see [21], Theorem 4.9). Reiter also gives an alternative characterization of an extension, which we shall use, in a slightly modified way, as the definition of extension. Here and in the following, Th(Γ) stands for the deductive closure of a set of formulae Γ.

Definition 2. Let E be a set of closed formulae, and (𝒲, 𝒟) be a closed default theory. We define E₀ := 𝒲 and, for all i ≥ 0,

Eᵢ₊₁ := Eᵢ ∪ { γ | α : β₁, …, βₙ / γ ∈ 𝒟, α ∈ Th(Eᵢ), and ¬β₁, …, ¬βₙ ∉ Th(E) }.

Then Th(E) is an extension of (𝒲, 𝒟) iff Th(E) = ⋃_{i=0}^{∞} Th(Eᵢ).
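For a finite closed default theory over a decidable base language, this definition can be read off directly as a test of a candidate extension. The sketch below assumes a hypothetical entailment oracle entails(premises, formula) for the base language and represents a closed default as a triple (prerequisite, justifications, consequent); both conventions are illustrative, not from the paper.

```python
# Testing a candidate extension of a finite closed default theory (W, D).
# `entails(premises, phi)` is a hypothetical decision procedure for the base
# language; a closed default is a triple (prerequisite, justifications,
# consequent) with the justifications given as a tuple.

def neg(phi):
    return ("not", phi)

def deductively_equal(F, G, entails):
    """Th(F) = Th(G) for finite F, G, via mutual entailment."""
    return all(entails(F, g) for g in G) and all(entails(G, f) for f in F)

def is_extension(W, D, E, entails):
    Ei = set(W)                                          # E_0 := W
    while True:                                          # build E_{i+1} from E_i
        new = {c for (a, js, c) in D
               if entails(Ei, a)                         # prerequisite derived
               and all(not entails(E, neg(j)) for j in js)}  # justifications ok w.r.t. E
        if new <= Ei:
            break
        Ei |= new
    return deductively_equal(Ei, set(E), entails)        # Th(E) = union of Th(E_i)?
```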

Note that the extension Th(E) to be constructed by this iteration process occurs in the definition of each iteration step. Since we are only adding consequents of defaults during the iteration, any extension Th(E) of (𝒲, 𝒟) is of the form Th(𝒲 ∪ Con(𝒟′)) for a subset 𝒟′ of 𝒟. Reiter shows ([21], Theorem 2.5) that the set

{ α : β₁, …, βₙ / γ ∈ 𝒟 | α ∈ Th(E) and ¬β₁, …, ¬βₙ ∉ Th(E) }

always satisfies this property. For this reason it is called the set of generating defaults for the extension Th(E). Another easy consequence of Definition 2 is that (𝒲, 𝒟) has an inconsistent extension iff 𝒲 is inconsistent. Reiter defines extensions of arbitrary default theories (𝒲, 𝒟), i.e., default theories with open defaults, as follows. First, the formulae of 𝒲 and the consequents of the defaults are Skolemized (see [21], Section 7). Second, a set 𝒟′ of closed default rules is generated by taking all ground instances (over the initial signature together with the newly introduced Skolem functions) of the defaults of 𝒟. Now E is an extension of (𝒲, 𝒟) iff E is an extension of the closed default theory (𝒲′, 𝒟′), where 𝒲′ is the Skolemized form of 𝒲. The reason for Skolemizing before building ground instances will be explained by an example in the next section.

2.3 Terminological default theories

A terminological default theory is a pair (𝒜, 𝒟) where 𝒜 is an ABox and 𝒟 is a finite set of default rules whose prerequisites, justifications, and consequents are concept terms.² Obviously, since ABoxes can be seen as sets of closed formulae, and since concept terms can be seen as formulae with one free variable, terminological default theories are subsumed by Reiter's notion of an open default theory. However, as for ABox reasoning without defaults, we are not interested in arbitrary formulae as consequences of a terminological default theory (𝒜, 𝒟), but only in assertional facts of the form C(a), where a is an individual name occurring in the original ABox 𝒜.

² The concept terms occurring in one rule are assumed to have identical free variables.

3 Reasons for and Problems Caused by Skolemization

First, we illustrate by an example why Reiter uses Skolemization in his semantics for open default theories. Then we shall argue that this treatment of open defaults is problematic both from a semantic and an algorithmic point of view. The first example shows that intuitively valid consequences would get lost if one did not Skolemize. Suppose that our ABox consists of the fact that Tom has some child who is a doctor, i.e., 𝒜 = {(∃child.doctor)(Tom)}. By default we

want to conclude that doctors usually are rich persons, and usually have children who are doctors. Thus 𝒟 consists of the default rules

doctor : rich-person / rich-person    and    doctor : ∃child.doctor / ∃child.doctor


Skolemization of the world description 𝒜 yields

𝒜′ = {child(Tom, Bill), doctor(Bill)},


where Bill is a new Skolem constant, whereas Skolemization of the consequent of the second default yields a unary Skolem function, say child-of. It is easy to see that the corresponding closed default theory has exactly one extension, and that this extension contains the assertional facts that Tom has a rich child and a grandchild who is a doctor, i.e.,

(∃child.rich-person)(Tom)

and

(∃child.∃child.doctor)(Tom).

Intuitively, this comes from the fact that the closed defaults obtained by instantiating our open defaults with the Skolem constant Bill are applicable. Without these ground instances, the above facts could not have been deduced by default. To deduce by default that the grandchild of Tom is not only a doctor, but also a rich one, the first default has to be instantiated by the term child-of(Bill). Although the treatment of open defaults via Skolemization yields an appropriate behaviour in this example, it is in general problematic. Besides the fact that Skolemization of the world description may lead to counterintuitive consequences of default theories (see Section 3 of [3]), the following example demonstrates that the consequences of a default theory may depend on the syntactic form of the world description, i.e., for identical sets of open defaults, logically equivalent world descriptions may lead to different results. Consider concept terms C₁ := ∃R.(A ⊓ B) and C₂ := ∃R.A where R is a role name and A, B are concept names. Obviously, if we assert that an individual a is in the first term this implies that it is in the second one as well. For this reason, the ABoxes 𝒜₁ := {C₁(a)} and 𝒜₂ := {C₁(a), C₂(a)} are logically equivalent. When Skolemizing the first ABox, we get a single new Skolem constant b which is R-related to a and lies in A ⊓ B, whereas when Skolemizing the second ABox we get two Skolem constants c and d, both R-related to a, but where c lies in A ⊓ B and d lies in A. Now consider the (open) default A : ¬B / ¬B. For the Skolemized version of 𝒜₁, this default is instantiated with a, b, whereas for the Skolemized version of 𝒜₂ it is instantiated with a, c, d. Obviously, the default rule cannot fire for b and c, because their being in A ⊓ B is inconsistent with its justification. On the other hand, this default rule can be applied to d, because being in A is consistent with being in ¬B. For this reason, d is put into ¬B, which shows that the Skolemized version of 𝒜₂ has (∃R.¬B)(a) as a default consequence, whereas this fact cannot be deduced by default from the Skolemized version of 𝒜₁. Technically, the reason for this behaviour is due to the fact that, before the application of the default, the individuals c and d might be identical (which is

the reason why the two ABoxes are logically equivalent) whereas this is no longer possible after the default has been applied. In addition to this semantic problem caused by Skolemization, we have shown that this treatment of open defaults can also lead to an undecidable default consequence relation, even though one employs a decidable base language and a finite set of defaults. In fact, the consequence problem for an open terminological default theory over a language which extends ALC by attributes, i.e., functional roles, and so-called agreements on attribute chains is in general undecidable (cf. Section 4 of [3]).

4 Computing Extensions

Because of the problems caused by Skolemization in Reiter's treatment of open defaults, we now propose a restricted semantics for open default theories: default rules are only applied to individuals that are explicitly mentioned in the ABox.

Definition 3. In the restricted semantics for terminological default theories, an open default of a theory (𝒜, 𝒟) is interpreted as representing the closed defaults obtained by instantiating the free variable by all individual names occurring in 𝒜.

Because the ABox 𝒜 and the set of open defaults 𝒟 are assumed to be finite, we end up with a finite set of closed defaults. Since our terminological language is decidable, the methods of Junker and Konolige, or of Schwind and Risch can be applied to compute all extensions (according to our restricted semantics). In principle, both methods depend on the fact that any extension of a closed default theory (𝒜, 𝒟) is of the form Th(𝒜 ∪ Con(𝒟′)) for a subset 𝒟′ of 𝒟. If 𝒟 is finite, there are only finitely many such subsets, and the only problem is to decide which of these generate an extension. In fact, if the base language is decidable, one could even use for this purpose the iteration process described in the definition of an extension. This is so because decidability of the base language makes each iteration step effective, and the iteration process terminates because there are only finitely many consequents to be added. However, with this method one has to consider all the (exponentially many) subsets of 𝒟. The two methods which we shall describe below try to avoid considering all subsets, thus making the search for (the sets of generating defaults of) all extensions more efficient.
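Under the restricted semantics, the naive subset enumeration just described can be written down directly: instantiate the open defaults with the ABox individuals and test every candidate E = 𝒲 ∪ Con(𝒟′) with the extension check sketched after Definition 2. As before, entails is a hypothetical oracle and the helper names are illustrative.

```python
# Brute-force computation of all extensions under the restricted semantics,
# using `is_extension` from the earlier sketch.

from itertools import combinations

def close_defaults(open_defaults, individuals):
    """Open defaults given as triples of one-argument functions over individuals."""
    return [(a(i), tuple(j(i) for j in js), c(i))
            for (a, js, c) in open_defaults for i in individuals]

def all_extensions(W, D, entails):
    """Every extension has the form Th(W ∪ Con(D')) for some subset D' of D."""
    found = []
    for k in range(len(D) + 1):
        for subset in combinations(D, k):
            E = set(W) | {c for (_, _, c) in subset}
            if is_extension(W, D, E, entails):
                found.append(E)
    return found
```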

4.1 Junker and Konolige's method

Junker and Konolige [12] translate a closed default theory (𝒜, 𝒟) into a Truth Maintenance Network (TMN) à la Doyle [9]. The nodes of the TMN are the consequents C_𝒟, and the prerequisites and negated justifications L_𝒟 of the defaults. A default α : β₁, …, βₙ / γ of 𝒟 is translated into a nonmonotonic justification (in(α), out(¬β₁, …, ¬βₙ) → γ) of the TMN. In order to supply the truth maintenance system with enough information about first-order derivability in the

base language, each prerequisite and negated justification of a default gives rise to several monotonic justifications of the TMN. These justifications are of the form (in(Q) → q) where q ∈ L_𝒟 and Q is a minimal subset of C_𝒟 such that 𝒜 ∪ Q entails q - i.e., 𝒜 ∪ Q ⊨ q but 𝒜 ∪ Q′ ⊭ q for every proper subset Q′ of Q. Junker and Konolige show that there is a 1-1 correspondence between admissible labellings of the TMN thus obtained and extensions of the default theory, and they describe an algorithm which computes all admissible labellings of a TMN. Given such an admissible labelling, the set of generating defaults of the corresponding extension consists of the defaults whose consequents are labelled "in."

In order to make the translation of terminological default theories into TMNs effective, one has to show how to compute the above mentioned monotonic justifications of the TMN. First note that the elements of L_𝒟 ∪ C_𝒟 are admissible assertional facts. This is obvious for the prerequisites and the consequents of our instantiated defaults, and for the negated justifications it follows from the fact that the concept language has negation as an operator. For this reason, 𝒜 ∪ Q for a subset Q of C_𝒟 is an admissible ABox of our language, and the entailment problem 𝒜 ∪ Q ⊨ q for q ∈ L_𝒟 is an ordinary instantiation problem. As mentioned in Section 2, the instantiation problem is decidable for our language. A brute force algorithm could just compute all subsets Q of C_𝒟 such that 𝒜 ∪ Q entails q ∈ L_𝒟, and then, for each q, eliminate the ones which are not minimal. Of course, this simple algorithm is very inefficient, and thus not appropriate for actual implementations. Because 𝒜 ∪ Q entails an assertional fact C(a) iff 𝒜 ∪ Q ∪ {¬C(a)} is inconsistent, we need a solution of the following problem: Let 𝒜, ℬ be ABoxes. Find all minimal subsets Q of ℬ such that 𝒜 ∪ Q is inconsistent. Since a similar algorithmic problem has to be solved for the method obtained from Schwind and Risch's characterization of an extension, we defer the description of a more efficient solution of this problem to a separate section. A characteristic feature of Junker and Konolige's method is that - after the computation of the minimal sets Q - it is completely abstracted from derivability in the base language. This may be advantageous from a conceptual point of view, but it can be problematic from the algorithmic point of view. In fact, one has to compute the corresponding minimal sets for all elements q in L_𝒟, even though this information may not contribute to the computation of an extension.

4.2 A method based on a theorem by Schwind and Risch

Schwind and Risch [25] give a theorem which characterizes those subsets 𝒟′ of 𝒟 which are sets of generating defaults of an extension of a closed default theory (𝒲, 𝒟). They use this characterization for computing extensions of propositional default theories. In this subsection, we shall show how to apply the theorem to computing extensions of terminological default theories. Before we can formulate the theorem we need one more piece of notation.

Definition 4. Let 𝒲 be a set of closed formulae, and 𝒟 be a set of closed defaults. We define 𝒟₀ = ∅ and, for i ≥ 0,

𝒟ᵢ₊₁ = 𝒟ᵢ ∪ { d = α : β₁, …, βₙ / γ | d ∈ 𝒟 and 𝒲 ∪ Con(𝒟ᵢ) ⊨ α }.

Then 𝒟 is called grounded in 𝒲 iff 𝒟 = ⋃_{i=0}^{∞} 𝒟ᵢ. This definition of groundedness differs from the one given in [25], but it is easy to see that both formulations are equivalent. The advantage of our formulation is that it can directly be used as a procedure for deciding groundedness, if 𝒟 is finite and the entailment problem in the base language is decidable. If 𝒟 is not grounded in 𝒲, then ⋃_{i=0}^{∞} 𝒟ᵢ is the largest subset of 𝒟 that is grounded in 𝒲. The iteration process described above corresponds to the iteration in the definition of extensions, with the main difference that it disregards the justifications. The second condition given in the following theorem makes up for this neglect.
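This fixpoint formulation is immediately executable for finite 𝒟. The sketch below again assumes the hypothetical entails oracle and the triple representation of closed defaults (with justifications stored as tuples, so that defaults are hashable).

```python
# Groundedness per Definition 4: iterate D_{i+1} to a fixpoint.

def largest_grounded_subset(W, D, entails):
    Di = set()                                      # D_0 = empty set
    while True:
        facts = set(W) | {c for (_, _, c) in Di}    # W ∪ Con(D_i)
        Dnext = Di | {d for d in D if entails(facts, d[0])}
        if Dnext == Di:
            return Di                               # fixpoint = union of the D_i
        Di = Dnext

def is_grounded(W, D, entails):
    return largest_grounded_subset(W, D, entails) == set(D)
```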

Theorem 5 (Schwind and Risch). Let (𝒲, 𝒟) be a closed default theory. A subset 𝒟′ of 𝒟 is a set of generating defaults of an extension of (𝒲, 𝒟) iff the following two conditions hold:
1. 𝒟′ is grounded in 𝒲.
2. For all d ∈ 𝒟 with d = α : β₁, …, βₙ / γ we have d ∈ 𝒟′ iff 𝒲 ∪ Con(𝒟′) ⊨ α and, for all i, 1 ≤ i ≤ n, 𝒲 ∪ Con(𝒟′) ⊭ ¬βᵢ.
If 𝒟 is finite, and the entailment problem in the base language is decidable, this theorem provides us with an effective test of whether a subset 𝒟′ of 𝒟 is a set of generating defaults of an extension of (𝒲, 𝒟). We shall now describe a method based on this theorem which allows us to compute (the sets of generating defaults of) all extensions without having to consider all subsets of 𝒟. If 𝒲 is inconsistent then there is only one extension, namely the set of all formulae. In the following, we shall without loss of generality assume that 𝒲 is consistent. Now, let 𝒟₀ be the largest subset of 𝒟 that is grounded in 𝒲, and let 𝒟₁, …, 𝒟ₘ be all maximal subsets of 𝒟₀ such that 𝒲 ∪ Con(𝒟ᵢ) is consistent. Since 𝒲 is assumed to be consistent, extensions are consistent as well, which means that a set of generating defaults of an extension is a subset of one of the 𝒟ᵢ. The idea underlying our method is to start with these maximal sets 𝒟ᵢ, and successively eliminate defaults violating the first condition of the theorem, or the "only if" part of the second condition. If no more defaults can be eliminated, the "if" part of the second condition is tested. Figure 1 describes the procedure for computing all extensions of a closed default theory. To show soundness and completeness of the procedure (Theorem 9) we need three lemmas.

Lemma 6. Let (𝒲, 𝒟) be a closed default theory and let 𝒟′ ⊆ 𝒟 be such that 𝒲 ∪ Con(𝒟′) is consistent. Suppose the call Remove-Defaults(𝒲, 𝒟, 𝒟′) returns the list ℒ of sets of defaults. If 𝒟₀ ∈ ℒ then 𝒟₀ is a set of generating defaults for an extension of (𝒲, 𝒟).


Compute-All-Extensions(𝒲, 𝒟)
begin
(1)  if 𝒲 is inconsistent
(2)  then print "Inconsistent world description"
(3)  else for all maximal subsets 𝒟′ of 𝒟₀ such that 𝒲 ∪ Con(𝒟′) is consistent
(4)       do Remove-Defaults(𝒲, 𝒟, 𝒟′);
end

Remove-Defaults(𝒲, 𝒟, 𝒟′)
begin
(1)  let 𝒟′₀ be the largest subset of 𝒟′ that is grounded in 𝒲;
(2)  if 𝒲 ∪ Con(𝒟′₀) ⊨ ¬βᵢ for some justification βᵢ ∈ Jus(𝒟′₀)
(3)  then let d = α : β₁, …, βₙ / γ be the corresponding default;
(4)       Remove-Defaults(𝒲, 𝒟, 𝒟′₀ \ {d});
(5)       for all maximal subsets 𝒟″ of 𝒟′₀ such that d ∈ 𝒟″ and 𝒲 ∪ Con(𝒟″) ⊭ ¬βᵢ
(6)       do Remove-Defaults(𝒲, 𝒟, 𝒟″);
(7)  else if for each α : β₁, …, βₙ / γ ∈ 𝒟 \ 𝒟′₀ either 𝒲 ∪ Con(𝒟′₀) ⊭ α
(8)       or 𝒲 ∪ Con(𝒟′₀) ⊨ ¬βᵢ for some i
(9)  then add 𝒟′₀ to the list of sets of generating defaults;
end

Fig. 1. Procedure for computing the sets of generating defaults of all extensions of the closed default theory (𝒲, 𝒟). Proviso: 𝒟 is finite and entailment in the base language is decidable.
Proof. We prove this lemma by showing that a set 7)o of defaults contained in s satisfies Conditions 1 and 2 of Theorem 5. Suppose that Do is contained in s It is easy to see that Do is a subset of :D' that is grounded in W (because of line (1)), which shows that Condition I of Theorem 5 holds for 790. To show that :Do satisfies the second condition of Theorem 5, first assume that d = a : ill,. 9 fl,~/7 E :Do. Recall that D0 is grounded in W, which implies that WOCon(790) ~ c~. Furthermore, observe that, for all i, 1 < i < n, WUCon(:Do) -~fli (because the condition in line (2) does not hold for :Do). Both facts together show that the "only if" part of Condition 2 holds. Now assume that d - a : i l l , . . . , fin/7 E 19 \790. Then either W U Con(79o) or }4] U Con(Do) ~ ~fli for some i (because the condition in lines (7) and (8) holds for :Do). This shows that the "if" part of Condition 2 is Mso satisfied. []

Lemma 7. Let 𝒟₀ be a set of generating defaults for an extension of a closed default theory (𝒲, 𝒟), and let 𝒟′ be a subset of 𝒟 such that 𝒟₀ ⊆ 𝒟′ and 𝒲 ∪ Con(𝒟′) is consistent. If Remove-Defaults(𝒲, 𝒟, 𝒟′) recursively calls Remove-Defaults then there is a call with arguments 𝒲, 𝒟, 𝒟″ where 𝒟₀ ⊆ 𝒟″ ⊂ 𝒟′.

Proof. Let 𝒟₀ ⊆ 𝒟′ be sets of defaults satisfying the assumptions of the lemma. Suppose Remove-Defaults is called with arguments 𝒲, 𝒟, 𝒟′. Let 𝒟′₀ be the largest subset of 𝒟′ that is grounded in 𝒲. Then 𝒟₀ ⊆ 𝒟′₀ because every set of generating defaults for an extension of (𝒲, 𝒟) is grounded in 𝒲. If the condition in line (2) does not hold for 𝒟′₀, Remove-Defaults is obviously not called recursively, and nothing has to be shown. Thus assume that the condition in line (2) holds for 𝒟′₀. This means that there is a default d = α : β₁, …, βₙ / γ ∈ 𝒟′₀ such that 𝒲 ∪ Con(𝒟′₀) ⊨ ¬βᵢ for some i, 1 ≤ i ≤ n. If d ∉ 𝒟₀ we have 𝒟₀ ⊆ 𝒟′₀ \ {d} ⊂ 𝒟′ and the call of Remove-Defaults with input (𝒲, 𝒟, 𝒟′₀ \ {d}) (cf. line (4)) satisfies the required property. Now assume that d ∈ 𝒟₀. Since 𝒟₀ is a set of generating defaults for an extension we know that 𝒲 ∪ Con(𝒟₀) ⊭ ¬βᵢ. Thus there is a maximal subset 𝒟″ of 𝒟′₀ with 𝒲 ∪ Con(𝒟″) ⊭ ¬βᵢ that contains 𝒟₀, and this means that the call Remove-Defaults(𝒲, 𝒟, 𝒟″) has the required property (cf. lines (5) and (6)). □
Lemma 8. Let 𝒟₀ be a set of generating defaults for an extension of a closed default theory (𝒲, 𝒟), and let 𝒟′ be a subset of 𝒟 such that 𝒟₀ ⊆ 𝒟′ and 𝒲 ∪ Con(𝒟′) is consistent. Suppose Remove-Defaults is called with arguments 𝒲, 𝒟, 𝒟′. Then

- there is a recursive call of Remove-Defaults, or
- 𝒟₀ is added to the list of sets of generating defaults.

Proof. Let 𝒟₀ ⊆ 𝒟′ be sets of defaults satisfying the assumptions of the lemma. Suppose the call Remove-Defaults(𝒲, 𝒟, 𝒟′) does not recursively call Remove-Defaults. This means that the condition in line (2) does not hold for 𝒟′₀, where 𝒟′₀ is the largest subset of 𝒟′ that is grounded in 𝒲. We show that 𝒟′₀ = 𝒟₀. Since 𝒟₀ is grounded in 𝒲, we get 𝒟₀ ⊆ 𝒟′₀, and thus we only have to show 𝒟′₀ ⊆ 𝒟₀. Assume to the contrary that 𝒟′₀ \ 𝒟₀ ≠ ∅. First we show that 𝒲 ∪ Con(𝒟₀) ⊨ α for some default α : β₁, …, βₙ / γ ∈ 𝒟′₀ \ 𝒟₀. To see this, recall that 𝒟′₀ is grounded in 𝒲. This means that there is a sequence d₁; d₂; … of defaults in 𝒟′₀ such that 𝒲 ∪ Con({d₁, …, dₖ₋₁}) ⊨ αₖ, where αₖ is the prerequisite of the k-th default. Let l be the smallest number such that dₗ ∈ 𝒟′₀ \ 𝒟₀. Thus dⱼ ∈ 𝒟₀ for all j, 1 ≤ j < l, which shows that 𝒲 ∪ Con(𝒟₀) ⊨ αₗ. Second, we have 𝒲 ∪ Con(𝒟′₀) ⊭ ¬βᵢ for all justifications βᵢ ∈ Jus(𝒟′₀) because the condition in line (2) does not hold for 𝒟′₀. Since 𝒟₀ ⊆ 𝒟′₀ we especially know that 𝒲 ∪ Con(𝒟₀) ⊭ ¬βᵢ for all justifications βᵢ ∈ Jus(𝒟′₀). Thus, we have shown that there is some default d = α : β₁, …, βₙ / γ ∈ 𝒟′₀ \ 𝒟₀ such that 𝒲 ∪ Con(𝒟₀) ⊨ α and 𝒲 ∪ Con(𝒟₀) ⊭ ¬βᵢ for all i, 1 ≤ i ≤ n. Because of Theorem 5 this is a contradiction with our assumption that 𝒟₀ is a

set of generating defaults. Therefore the assumption 𝒟′₀ \ 𝒟₀ ≠ ∅ is falsified, and we can conclude that 𝒟′₀ = 𝒟₀. Since 𝒟₀ is a set of generating defaults, the condition in lines (7), (8) holds for 𝒟₀ (cf. Condition 2 of Theorem 5). Thus 𝒟₀ is added to the list of sets of generating defaults. □

Now we are ready to prove soundness and completeness of our algorithm. First we observe that every set of defaults computed by the algorithm is in fact a set of generating defaults for an extension of a closed default theory (𝒲, 𝒟) (cf. Lemma 6). Now assume that 𝒟₀ is a set of generating defaults for an extension of (𝒲, 𝒟). Recall that 𝒲 ∪ Con(𝒟₀) is consistent. Thus there is a maximal subset 𝒟′ of 𝒟 such that 𝒲 ∪ Con(𝒟′) is consistent and 𝒟′ contains 𝒟₀. This shows that Compute-All-Extensions(𝒲, 𝒟) generates a call of Remove-Defaults with arguments 𝒲, 𝒟, 𝒟′ (cf. lines (3) and (4) in the function Compute-All-Extensions) for some subset 𝒟′ of 𝒟 with 𝒟₀ ⊆ 𝒟′. If the call Remove-Defaults(𝒲, 𝒟, 𝒟′) returns the list ℒ of sets of defaults then 𝒟₀ is contained in ℒ. This result is an immediate consequence of the previous two lemmas. In fact, Lemma 7 shows that there is a sequence of calls of Remove-Defaults such that 𝒲, 𝒟, Cᵢ are the arguments of the i-th call where C₁ = 𝒟′, Cᵢ₊₁ ⊂ Cᵢ, and 𝒟₀ ⊆ Cᵢ for all i. Since 𝒟 is assumed to be finite and the Cᵢ's are decreasing, there is some m ≥ 1 such that Remove-Defaults(𝒲, 𝒟, Cₘ) does not generate a recursive call of Remove-Defaults. In this case 𝒟₀ is added to the list ℒ of sets of defaults (Lemma 8).

Theorem 9. The call Compute-All-Extensions(𝒲, 𝒟) computes sets of generating defaults for all extensions of the closed default theory (𝒲, 𝒟).

The functions Compute-All-Extensions and Remove-Defaults use the following subprocedures which have not explicitly been described:

- Decide whether 𝒲 is consistent.
- Compute all maximal subsets 𝒟′ of 𝒟 such that 𝒲 ∪ Con(𝒟′) is consistent.
- Compute the largest subset 𝒟′₀ of 𝒟′ that is grounded in 𝒲.
- Compute all maximal subsets 𝒟″ of 𝒟′₀ such that 𝒲 ∪ Con(𝒟″) ⊭ ¬βᵢ.

The first subprocedure is a direct application of the decision algorithm for entailment in the base language. The third subprocedure is simply obtained by implementing the definition of groundedness. The other two procedures depend on an algorithm for the following problem, which will be considered in the next section: Let 𝒜, ℬ be ABoxes. Compute all maximal subsets Q of ℬ such that 𝒜 ∪ Q is consistent. In fact, the second subprocedure is a direct application of such an algorithm. For the fourth subprocedure, note that 𝒲 ∪ Con(𝒟″) ⊭ ¬βᵢ iff 𝒲 ∪ Con(𝒟″) ∪ {βᵢ} is consistent.
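Both subset problems at least admit the obvious brute-force search over subsets of ℬ, relative to a hypothetical consistency oracle for the base language; the tableaux-based method of the next section is a more efficient replacement for this sketch.

```python
# Brute-force solutions of the two subset problems, using a hypothetical
# `consistent(abox)` oracle for the base language.

from itertools import combinations

def maximal_consistent_subsets(A, B, consistent):
    cands = [set(q) for k in range(len(B) + 1)
             for q in combinations(B, k) if consistent(set(A) | set(q))]
    return [q for q in cands if not any(q < r for r in cands)]

def minimal_inconsistent_subsets(A, B, consistent):
    cands = [set(q) for k in range(len(B) + 1)
             for q in combinations(B, k) if not consistent(set(A) | set(q))]
    return [q for q in cands if not any(r < q for r in cands)]
```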

5 Computing Minimal Inconsistent and Maximal Consistent ABoxes

This section is concerned with the following algorithmic problems: Given two ABoxes 𝒜, ℬ, find all minimal (resp. maximal) subsets Q of ℬ such that 𝒜 ∪ Q is inconsistent (resp. consistent). Since consistency of ABoxes in ALC is decidable, there is the obvious "brute-force" solution which tests consistency of 𝒜 ∪ Q for all subsets Q of ℬ, and then takes the minimal inconsistent (maximal consistent) ones. In the following we shall describe a more efficient method of finding these minimal (maximal) sets. The method is an extension of the tableaux-based consistency algorithms for ABoxes described in [1, 11]. The idea of employing tableaux-based methods for such purposes was already used in [17, 25], but these papers restricted themselves to propositional logic, which is a much easier case. In order to decide whether an ABox 𝒜 is consistent, the tableaux-based consistency algorithm tries to generate a finite model of 𝒜. In principle, it starts with 𝒜, and adds new assertional facts with the help of certain rules until the obtained ABox is "complete," i.e., one can apply no more rules. Because of the presence of disjunction in our language, a given ABox must sometimes be transformed into two different new ABoxes, with the intended meaning that the original ABox is consistent iff one of the new ABoxes is consistent. Formally, this means that one is working with sets of ABoxes instead of a single ABox. Figure 2 describes the transformation rules of the tableaux-based consistency algorithm for ALC. Without loss of generality we assume that the concept terms occurring in 𝒜₀ are in negation normal form, i.e., negation occurs only directly in front of concept names. Negation normal forms can be generated using the fact that the following pairs of concept terms are equivalent: ¬¬C and C, ¬(C ⊓ D) and ¬C ⊔ ¬D, ¬(C ⊔ D) and ¬C ⊓ ¬D, ¬(∃R.C) and ∀R.¬C, as well as ¬(∀R.C) and ∃R.¬C. The following facts make clear why the rules of Figure 2 provide us with a decision procedure for consistency of ABoxes of ALC (see [11, 1] for a proof).

Proposition 10.
1. If 𝒜₁ is obtained from 𝒜₀ by application of the conjunction, exists-restriction, or value-restriction rule then 𝒜₀ is consistent iff 𝒜₁ is consistent.
2. If 𝒜₁, 𝒜₂ are obtained from 𝒜₀ by application of the disjunction rule then 𝒜₀ is consistent iff 𝒜₁ or 𝒜₂ is consistent.
3. A complete ABox, i.e., an ABox to which no more rules apply, is consistent iff it does not contain an obvious contradiction, i.e., facts A(b), ¬A(b) for an individual name b and a concept name A.
4. The transformation process always terminates.

An obvious contradiction of the form A(b), ¬A(b) will also be called "clash" in the following.


Let M be a finite set of ABoxes, and let A_0 be an element of M. The following rules replace A_0 by an ABox A_1 or by two ABoxes A_1 and A_2.

The conjunction rule. Assume that (C ⊓ D)(a) is in A_0, and that A_0 does not contain both assertions C(a) and D(a). The ABox A_1 is obtained from A_0 by adding C(a) and D(a).

The disjunction rule. Assume that (C ⊔ D)(a) is in A_0, and that A_0 contains neither C(a) nor D(a). The ABox A_1 is obtained from A_0 by adding C(a), and the ABox A_2 is obtained from A_0 by adding D(a).

The exists-restriction rule. Assume that (∃R.C)(a) is in A_0, and that A_0 does not contain assertions R(a, c) and C(c) for some individual c. One generates a new individual name b, and obtains A_1 from A_0 by adding R(a, b) and C(b).

The value-restriction rule. Assume that (∀R.C)(a) and R(a, b) are in A_0, and that A_0 does not contain the assertion C(b). The ABox A_1 is obtained from A_0 by adding C(b).

Fig. 2. Transformation rules of the consistency algorithm for ALC
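To make the rules of Figure 2 concrete, here is a naive executable sketch built on the same tuple encoding, with assertions ("inst", a, C) for C(a) and ("role", R, a, b) for R(a, b); the encoding and function names are our assumptions. For pure ALC ABoxes the rules terminate without further machinery, and this `consistent` can serve as the oracle assumed in the brute-force sketch above.

```python
from itertools import count

_fresh = count()          # supply of new individual names for the ∃-rule

def step(abox):
    """Apply one rule of Figure 2; return the successor ABoxes,
    or None if the ABox (a frozenset of facts) is complete."""
    for fact in abox:
        if fact[0] != "inst":
            continue
        _, a, c = fact
        if c[0] == "and" and not {("inst", a, c[1]), ("inst", a, c[2])} <= abox:
            return [abox | {("inst", a, c[1]), ("inst", a, c[2])}]
        if (c[0] == "or" and ("inst", a, c[1]) not in abox
                and ("inst", a, c[2]) not in abox):
            return [abox | {("inst", a, c[1])}, abox | {("inst", a, c[2])}]
        if c[0] == "exists":
            if not any(f[0] == "role" and f[1] == c[1] and f[2] == a
                       and ("inst", f[3], c[2]) in abox for f in abox):
                b = f"x{next(_fresh)}"
                return [abox | {("role", c[1], a, b), ("inst", b, c[2])}]
        if c[0] == "forall":
            new = {("inst", f[3], c[2]) for f in abox
                   if f[0] == "role" and f[1] == c[1] and f[2] == a} - abox
            if new:
                return [abox | new]
    return None

def clash(abox):
    """Obvious contradiction A(b), ¬A(b) (Proposition 10.3)."""
    return any(f[0] == "inst" and f[2][0] == "name"
               and ("inst", f[1], ("not", f[2])) in abox for f in abox)

def consistent(abox):
    """An ABox is consistent iff some derived complete ABox is clash-free."""
    todo = [frozenset(abox)]
    while todo:
        ab = todo.pop()
        succ = step(ab)
        if succ is None:
            if not clash(ab):
                return True
        else:
            todo.extend(succ)
    return False

# Example: (C ⊔ D)(a) together with ¬C(a) and ¬D(a) is inconsistent.
C, D = ("name", "C"), ("name", "D")
print(consistent({("inst", "a", ("or", C, D)),
                  ("inst", "a", ("not", C)),
                  ("inst", "a", ("not", D))}))      # → False
```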

To check whether a given ABox A is consistent one thus starts with {A}, and applies transformation rules (in arbitrary order) as long as possible. Eventually, this yields a finite set M of complete ABoxes with the property that A is consistent iff one of the ABoxes in M is consistent. Since the elements of M are complete, their consistency can simply be decided by looking for an obvious contradiction.

Now assume that A, B are ABoxes, and we want to find all minimal (resp. maximal) subsets Q of B such that A ∪ Q is inconsistent (resp. consistent). We start by applying the tableaux-based consistency algorithm to A ∪ B. Let A_1, ..., A_m be the complete ABoxes obtained this way. If one of these is not obviously contradictory, A ∪ B is consistent, and there are no minimal inconsistent sets to compute (resp. B is the maximal consistent set). Otherwise, we want to know which elements of B can be dispensed with without destroying the property that all complete ABoxes contain an obvious contradiction (resp. which elements of B have to be removed to get at least one complete ABox without obvious contradiction). For this reason, it is important to know which facts in B contribute to a particular obvious contradiction.

To this purpose we introduce a propositional variable for each element of B, and label assertional facts with "monotonic" boolean formulae built from these variables, i.e., propositional formulae built from the variables by using conjunction and disjunction only. In the original ABox A ∪ B, the elements of A are labelled with "true," and the elements of B are labelled with the corresponding propositional variable.
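The labelling can be kept in a map from assertional facts to their monotonic boolean formulae. The helper below implements the bookkeeping the labelled rules of Figure 3 rely on, including the disjunctive join used when the same fact arises in more than one way (as described next); the encoding of formulae as nested tuples is again our own illustrative choice.

```python
TRUE = ("true",)                    # label of the "hard" facts in A

def init_labels(A, B):
    """Label each fact of A with "true" and each refutable fact
    b ∈ B with its own propositional variable ("var", b)."""
    ind = {fact: TRUE for fact in A}
    ind.update((fact, ("var", fact)) for fact in B)
    return ind

def extend(ind, fact, phi):
    """Extend a labelled ABox by `fact` with index phi: add the fact
    if it is new, otherwise disjoin phi onto its existing index."""
    ind[fact] = phi if fact not in ind else ("or", ind[fact], phi)
```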

45

If, during the consistency test, n assertional facts with labels φ_1, ..., φ_n give rise to a new fact, the new one is labelled by φ_1 ∧ ... ∧ φ_n. Since the same assertional fact may arise in more than one way, we also get disjunctions in labels. Again, we end up with complete ABoxes A_1, ..., A_m, but now all assertional facts occurring in these ABoxes have labels.

More formally, we shall now describe a labelled consistency algorithm for ABoxes A ∪ B consisting of "hard" facts A and of "refutable" facts B. Without loss of generality we assume that the concept terms occurring in A ∪ B are in negation normal form. Initially, the elements of A ∪ B are labelled with monotonic boolean formulae as described above. We shall refer to the label of an assertional fact α by ind(α). Starting with the singleton set {A ∪ B}, the transformation rules of Figure 3 are applied as long as possible. As for the unlabelled consistency algorithm, there cannot be an infinite chain of rule applications. This can, for example, be shown by a straightforward adaptation to the labelled case of the termination ordering used in [1]. Thus the labelled consistency algorithm also terminates with a finite set of complete ABoxes, i.e., labelled ABoxes to which no rules apply. The labels occurring in these ABoxes can be used to describe which of the original facts in B are responsible for the obvious contradictions.

Definition 11 (Clash formula). Let A_1, ..., A_m be the complete ABoxes obtained by applying the labelled consistency algorithm to A ∪ B. A particular clash A(a), ¬A(a) ∈ A_i is expressed by the propositional formula ind(A(a)) ∧ ind(¬A(a)). Now let φ_{i,1}, ..., φ_{i,k_i} be the formulae expressing all the clashes in A_i. The clash formula associated with A ∪ B is

    ⋀_{i=1}^{m} ⋁_{j=1}^{k_i} φ_{i,j}.
We have used conjunction when expressing a single clash because both assertional facts are necessary for the contradiction. Now recall that we need at least one clash in each of the complete ABoxes to have inconsistency. This explains why disjunction is used to combine the formulae expressing the clashes of one complete ABox, and why the formulae corresponding to the different complete ABoxes are combined with the help of conjunction.

Proposition 12. Let φ be the clash formula associated with A ∪ B, let Q ⊆ B, and let ω be the valuation which replaces the propositional variables corresponding to elements of Q by "true" and the others by "false." Then A ∪ Q is inconsistent iff φ evaluates to "true" under ω.
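Definition 11 and Proposition 12 translate directly into the tuple encoding used above; the following sketch (function names ours) builds the clash formula from complete labelled ABoxes, each represented as an `ind` map, and tests a candidate Q by evaluation.

```python
def clash_formula(complete_aboxes):
    """⋀ over the complete ABoxes of ⋁ over their clash formulae
    (Definition 11); assumes every complete ABox contains a clash,
    since otherwise A ∪ B is consistent and no formula is needed."""
    conjuncts = []
    for ind in complete_aboxes:
        clashes = [("and", ind[f], ind[("inst", f[1], ("not", f[2]))])
                   for f in ind if f[0] == "inst" and f[2][0] == "name"
                   and ("inst", f[1], ("not", f[2])) in ind]
        disj = clashes[0]
        for c in clashes[1:]:
            disj = ("or", disj, c)
        conjuncts.append(disj)
    phi = conjuncts[0]
    for c in conjuncts[1:]:
        phi = ("and", phi, c)
    return phi

def evaluate(phi, true_vars):
    """Evaluate a monotonic formula under the valuation ω that makes
    exactly the variables in true_vars true (cf. Proposition 12)."""
    tag = phi[0]
    if tag == "true":
        return True
    if tag == "false":
        return False
    if tag == "var":
        return phi[1] in true_vars
    left, right = evaluate(phi[1], true_vars), evaluate(phi[2], true_vars)
    return left and right if tag == "and" else left or right

# By Proposition 12: A ∪ Q is inconsistent iff evaluate(phi, set(Q)) is True.
```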
Before proving this proposition we point out how the clash formula can be used to find minimal (resp. maximal) subsets Q of B such that A ∪ Q is inconsistent (resp. consistent). By Proposition 12, such minimal (resp. maximal) sets directly correspond to minimal (resp. maximal) valuations making the clash formula φ "true" (resp. "false").

46

Let M be a finite set of labelled ABoxes, and let A_0 be an element of M. The following rules replace A_0 by an ABox A_1 or by two ABoxes A_1 and A_2. These new ABoxes either contain additional assertional facts, or the indices of existing assertional facts are changed. In order to avoid having to distinguish between these two cases in the formulation of the rules, we introduce a new notation. That an ABox is extended by an assertional fact with index φ means the following: If this fact is already present with index ψ, we just change its index to ψ ∨ φ. Otherwise, it is added to the ABox and gets index φ.

The conjunction rule. Assume that (C ⊓ D)(a) is in A_0, and that A_0 does not contain assertions C(a) and D(a) whose indices are both implied by ind((C ⊓ D)(a)). The ABox A_1 is obtained by extending A_0 by C(a) with index ind((C ⊓ D)(a)) and by D(a) with index ind((C ⊓ D)(a)).

The disjunction rule. Assume that (C ⊔ D)(a) is in A_0, and that A_0 does not contain C(a) or D(a) whose index is implied by ind((C ⊔ D)(a)). The ABox A_1 is obtained by extending A_0 by C(a) with index ind((C ⊔ D)(a)), and the ABox A_2 is obtained by extending A_0 by D(a) with index ind((C ⊔ D)(a)).

The exists-restriction rule. Assume that (∃R.C)(a) is in A_0, and that A_0 does not contain assertions R(a, c) and C(c) whose indices are both implied by ind((∃R.C)(a)). One generates a new individual name b, and obtains A_1 from A_0 by adding R(a, b) and C(b), both with index ind((∃R.C)(a)).

The value-restriction rule. Assume that (∀R.C)(a) and R(a, b) are in A_0, and that A_0 does not contain an assertion C(b) whose index is implied by ind((∀R.C)(a)) ∧ ind(R(a, b)). The ABox A_1 is obtained by extending A_0 by C(b) with index ind((∀R.C)(a)) ∧ ind(R(a, b)).

Fig. 3. Transformation rules of the labelled consistency algorithm for ALC

Here "minimal" and "maximal" for valuations is meant with respect to the partial ordering ω_1 ≤ ω_2 iff ω_1(P_i) ≤ ω_2(P_i) for all propositional variables P_i, where we assume that "false" is smaller than "true." It is easy to see that the problem of finding maximal valuations making a monotonic boolean formula "false" can be reduced to the problem of finding minimal valuations making a monotonic boolean formula "true." In fact, for a given monotonic boolean formula φ and a valuation ω, let φ^d denote the formula obtained from φ by replacing conjunction by disjunction and vice versa, and let ω^d denote the valuation obtained from ω by replacing "true" by "false" and vice versa. Then ω is a maximal valuation making φ "false" iff ω^d is a minimal valuation making φ^d "true."
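Both the dualization and a naive, exponential search for minimal valuations are easy to write down in the same encoding. The sketch below is ours and reuses `evaluate` from above; by the reduction just described, the maximal valuations making φ "false" are obtained by dualizing the minimal valuations that make φ^d "true."

```python
from itertools import combinations

def dualize(phi):
    """Swap conjunction and disjunction in a monotonic formula (φ^d)."""
    tag = phi[0]
    if tag == "true":
        return ("false",)        # the dual of the constant ⊤ is ⊥
    if tag == "false":
        return ("true",)
    if tag == "var":
        return phi
    return ("or" if tag == "and" else "and",
            dualize(phi[1]), dualize(phi[2]))

def variables(phi):
    if phi[0] == "var":
        return {phi[1]}
    if phi[0] in ("true", "false"):
        return set()
    return variables(phi[1]) | variables(phi[2])

def minimal_true_valuations(phi):
    """All minimal sets of variables whose truth makes phi "true";
    monotonicity guarantees supersets of a solution are solutions."""
    vs = list(variables(phi))
    minimal = []
    for k in range(len(vs) + 1):          # smallest candidates first
        for Q in map(set, combinations(vs, k)):
            if not any(M <= Q for M in minimal) and evaluate(phi, Q):
                minimal.append(Q)
    return minimal
```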

It should be noted that the problem of finding minimal valuations that make a monotonic boolean formula φ "true" is NP-complete. In fact, if φ is in conjunctive normal form, this is just the well-known problem of finding minimal hitting sets [22, 10]. On the other hand, if φ is in disjunctive normal form, the minimal valuations can be found in polynomial time. However, transforming a given monotonic boolean formula into disjunctive normal form may cause an exponential blow-up. To optimize the search for minimal valuations one can use the method described in [23].

The rules of the labelled consistency algorithm as described have the unpleasant property that deciding whether or not a rule is applicable is an NP-hard problem. In fact, the preconditions of the rules include an entailment test for monotonic boolean formulae, which is NP-hard. However, one can weaken the precondition by testing a necessary condition for entailment (e.g., occurrence of the index in the top-level disjunction) without destroying termination and the property stated in Proposition 12. In this case, the rules will in general produce longer formulae occurring as indices, but the test whether a rule applies becomes tractable.

Proof of Proposition 12. First we shall explain the connection between application of rules of the labelled consistency algorithm, starting with A ∪ B, on the one hand, and application of rules of the unlabelled algorithm, starting with A ∪ Q for Q ⊆ B, on the other hand.

Definition 13. Let A_0 be a labelled ABox, and let ω be a valuation. The ω-projection of A_0 (for short, ω(A_0)) is obtained from A_0 by removing all facts whose labels evaluate to "false."

Let Q be a subset of B. In the following, the valuation ω is assumed to be such that it replaces the variables corresponding to elements of Q by "true" and the others by "false." Obviously, this means that ω(A ∪ B) = A ∪ Q. Now we shall show how application of a rule of the labelled consistency algorithm to a labelled ABox A_0 corresponds to application of a rule of the unlabelled algorithm to ω(A_0). To get this correspondence, the conditions on applicability of the disjunction and the exists-restriction rules have to be weakened for the unlabelled algorithm:

The modified disjunction rule. Assume that (C ⊔ D)(a) is in A_0, and that A_0 does not contain both C(a) and D(a). The ABox A_1 is obtained from A_0 by adding C(a), and the ABox A_2 is obtained from A_0 by adding D(a).

The modified exists-restriction rule. Assume that (∃R.C)(a) is in A_0. One generates a new individual name b, and obtains A_1 from A_0 by adding R(a, b) and C(b).

Since the modified exists-restriction rule can be applied infinitely often to the same fact (∃R.C)(a), the modified set of rules need no longer terminate. But it is easy to see that the first two properties stated in Proposition 10 still hold. This will be sufficient for our purposes.
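In the running encoding, Definition 13 is a one-liner; the sketch below (ours) keeps exactly the facts whose labels hold under ω. With ω induced by Q as above, projecting the initial labelling of A ∪ B indeed yields A ∪ Q.

```python
def project(ind, true_vars):
    """ω-projection of a labelled ABox (Definition 13): keep the facts
    whose labels evaluate to "true" under the valuation true_vars."""
    return {fact for fact, label in ind.items()
            if evaluate(label, true_vars)}
```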

48

Lemma 14. Let A_0, A_1 be labelled ABoxes such that A_1 is obtained from A_0 by application of the conjunction (resp. exists-restriction, value-restriction) rule. Then we either have ω(A_1) = ω(A_0), or ω(A_1) is obtained from ω(A_0) by application of the (unlabelled) conjunction (resp. modified exists-restriction, value-restriction) rule.

Proof. (1) Assume that the conjunction rule is applied to the assertional fact (C ⊓ D)(a), and that this fact has index φ in A_0. First, consider the case where ω(φ) = false. In this case, we have ω(A_1) = ω(A_0). In fact, if C(a) (resp. D(a)) is not in A_0 then this fact has index φ in A_1. Since ω(φ) = false this means that C(a) (resp. D(a)) is not in ω(A_1). If C(a) (resp. D(a)) is an element of A_0 with index ψ then C(a) (resp. D(a)) has index ψ ∨ φ in A_1. Since ω(φ) = false we have ω(ψ ∨ φ) = ω(ψ), which shows that C(a) (resp. D(a)) is an element of ω(A_1) iff it is an element of ω(A_0). Now assume that ω(φ) = true. Thus (C ⊓ D)(a) is an element of ω(A_0). Since A_1 is obtained by extending A_0 by C(a) and D(a), both with index φ, we also know that C(a) and D(a) are contained in ω(A_1). If both facts are already present in ω(A_0) we have ω(A_1) = ω(A_0). Otherwise, ω(A_1) can be obtained from ω(A_0) by applying the conjunction rule to (C ⊓ D)(a).

(2) Assume that the value-restriction rule is applied to the assertional facts (∀R.C)(a) and R(a, b), and that these facts respectively have index φ and ψ in A_0. As for the conjunction rule, ω(φ ∧ ψ) = false implies ω(A_1) = ω(A_0). Thus assume that ω(φ ∧ ψ) = true. Then (∀R.C)(a) and R(a, b) are contained in ω(A_0). Since A_1 is obtained by extending A_0 by C(b) with index φ ∧ ψ, we know that C(b) is an element of ω(A_1). If this assertional fact is already present in ω(A_0) then ω(A_1) = ω(A_0). Otherwise, ω(A_1) can be obtained from ω(A_0) by applying the value-restriction rule to (∀R.C)(a) and R(a, b).

(3) Assume that the exists-restriction rule is applied to the assertional fact (∃R.C)(a), and that this fact has index φ in A_0. The case where ω(φ) = false is again trivial. Thus assume that ω(φ) = true. Then (∃R.C)(a) is an element of ω(A_0). The labelled ABox A_1 is obtained from A_0 by generating a new individual b, and adding C(b) and R(a, b) to A_0, both with index φ. For this reason, C(b) and R(a, b) are contained in ω(A_1). We can obtain ω(A_1) from ω(A_0) by applying the modified exists-restriction rule to (∃R.C)(a) (without loss of generality we may assume that the newly generated individual is called b). It should be noted that the (unmodified) exists-restriction rule need not be applicable since ω(A_0) may well contain an individual c and assertions C(c) and R(a, c). □

For the disjunction rule, we have a similar lemma.
Lemma 15. Let A_0, A_1, A_2 be labelled ABoxes such that A_1, A_2 are obtained from A_0 by application of the disjunction rule. Then we either have ω(A_1) = ω(A_0) = ω(A_2), or ω(A_1), ω(A_2) are obtained from ω(A_0) by application of the (unlabelled) modified disjunction rule.

49

Proof. Assume that the disjunction rule is applied to the assertional fact (C ⊔ D)(a), and that this fact has index φ in A_0. If ω(φ) = false then ω(A_1) = ω(A_0) = ω(A_2). This can be shown as in the corresponding cases in the proof of Lemma 14. Thus assume that ω(φ) = true. Then (C ⊔ D)(a) is an element of ω(A_0). In addition, we know that C(a) is contained in ω(A_1), and that D(a) is contained in ω(A_2). If both C(a) and D(a) are already present in ω(A_0) then ω(A_1) = ω(A_0) = ω(A_2). Otherwise, we can obtain ω(A_1), ω(A_2) from ω(A_0) by applying the modified disjunction rule to (C ⊔ D)(a). It should be noted that the (unmodified) disjunction rule need not be applicable since ω(A_0) may well contain one of C(a) and D(a), but not both. □
Now assume that we have obtained the complete ABoxes A_1, ..., A_m by starting with A ∪ B, and applying the rules of the labelled consistency algorithm as long as possible. By Lemmas 14 and 15, and since the (modified) rules of the unlabelled consistency algorithm preserve consistency, we know that ω(A ∪ B) = A ∪ Q is consistent iff one of ω(A_1), ..., ω(A_m) is consistent. The next lemma implies that these projected ABoxes are also complete.

Lemma 16. Let A_0 be a labelled ABox to which none of the rules of the labelled consistency algorithm applies. Then none of the (unmodified) rules of the unlabelled consistency algorithm applies to ω(A_0).

Proof. We consider an assertional fact (C ⊓ D)(a) in ω(A_0), and show that the conjunction rule cannot be applied to this fact in ω(A_0). (The other cases can be treated similarly.) Since (C ⊓ D)(a) is present in ω(A_0), its index φ in A_0 satisfies ω(φ) = true. Completeness of A_0 implies that the (labelled) conjunction rule is not applicable to (C ⊓ D)(a) in A_0. For this reason, A_0 contains the assertional facts C(a) and D(a), and their indices (say ψ_1, ψ_2) are implied by φ. But then ω(φ) = true implies ω(ψ_1) = true = ω(ψ_2). Thus C(a) and D(a) are contained in ω(A_0), which shows that the conjunction rule is not applicable to (C ⊓ D)(a) in ω(A_0). □
Since A_1, ..., A_m are complete we thus know that ω(A_1), ..., ω(A_m) are complete as well. Now Proposition 10 implies that ω(A_i) is inconsistent iff it contains a clash. A particular clash A(a), ¬A(a) ∈ A_i is still present in ω(A_i) iff ω evaluates ind(A(a)) ∧ ind(¬A(a)) to "true." Now let φ_{i,1}, ..., φ_{i,k_i} be the formulae expressing all the clashes in A_i. Obviously, ω(A_i) contains a clash iff ω evaluates ⋁_{j=1}^{k_i} φ_{i,j} to "true." For this reason, all the ABoxes ω(A_1), ..., ω(A_m) contain a clash iff ω evaluates to "true" the clash formula

    ⋀_{i=1}^{m} ⋁_{j=1}^{k_i} φ_{i,j}

computed by the labelled consistency algorithm. This concludes the proof of Proposition 12. □

To sum up, we have thus presented a solution of the two algorithmic problems described at the beginning of this section, and have proved the correctness of this solution. Together with the methods of Section 4, this gives us effective procedures to compute all extensions of terminological default theories.
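As a summary of Sections 4 and 5 in code, the whole reduction might be composed as follows. This is a hypothetical composition of the sketches above: `labelled_tableaux` stands for an implementation of the rules of Figure 3, which we have not spelled out in full.

```python
def minimal_inconsistent_subsets(A, B, labelled_tableaux):
    """All minimal Q ⊆ B with A ∪ Q inconsistent, via Proposition 12.
    `labelled_tableaux(A, B)` is assumed to return the complete
    labelled ABoxes (ind maps) for A ∪ B."""
    complete = labelled_tableaux(A, B)
    if any(not has_clash(ind) for ind in complete):
        return []      # A ∪ B is consistent, so no subset is inconsistent
    phi = clash_formula(complete)
    return minimal_true_valuations(phi)   # sets of refutable facts from B

def has_clash(ind):
    """A labelled ABox contains a clash A(a), ¬A(a)."""
    return any(f[0] == "inst" and f[2][0] == "name"
               and ("inst", f[1], ("not", f[2])) in ind for f in ind)
```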

6 Conclusion

We have investigated the integration of Reiter's default logic into a terminological representation formalism. We have shown that the treatment of open defaults by Skolemization is problematic, both from a semantic and an algorithmic point of view (see also [3]). For this reason, we have considered a restricted semantics where default rules are only applied to individuals explicitly present in the knowledge base. This treatment of default rules is similar to the treatment of monotonic rules in many terminological systems, which means that users of such systems are already familiar with the effects this restriction to explicit individuals has. However, because of the nonmonotonic character of default rules, this restriction may sometimes lead to more consequences than would have been obtained without it.

With respect to the restricted semantics, the methods of Junker and Konolige and of Schwind and Risch for computing all extensions of a default theory can be applied. We have shown how the algorithmic requirements of Junker and Konolige's method (i.e., the computation of minimal inconsistent sets of assertional facts) and of an optimized algorithm based on a theorem of Schwind and Risch (i.e., the computation of maximal consistent sets of assertional facts) can be met by an extension of the tableaux-based algorithm for assertional reasoning.

As an alternative to the pragmatic solution described in the present paper, [5] proposes a new semantics for open defaults, in which defaults are also applied to implicit individuals. To make this possible without encountering the problems pointed out in Section 3, open defaults are not viewed as schemata for certain instantiated defaults. Instead, they are used to define a preference relation on models, which is then treated with a modified preferential approach. According to Reiter's semantics, the specificity of prerequisites of rules has no influence on the order in which default rules are supposed to fire. In [4] we describe a modification of terminological default logic in which more specific defaults are preferred.

Acknowledgements. We should like to thank Bernhard Nebel and Peter Patel-Schneider for helpful comments, and Thomas Trentz for implementing the procedure for computing extensions described in Subsection 4.2. This work has been supported by the German Ministry for Research and Technology (BMFT) under research contract ITW 92 01.


References
1. F. Baader and P. Hanschke. A Scheme for Integrating Concrete Domains into Concept Languages. Research Report RR-91-10, DFKI Kaiserslautern, 1991.
2. F. Baader and P. Hanschke. A scheme for integrating concrete domains into concept languages. In Proceedings of the 12th International Joint Conference on Artificial Intelligence, Sydney, Australia, 1991.
3. F. Baader and B. Hollunder. Embedding defaults into terminological knowledge representation formalisms. In Proceedings of the 3rd International Conference on Knowledge Representation and Reasoning, Cambridge, Mass., 1992.
4. F. Baader and B. Hollunder. How to prefer more specific defaults in terminological default logic. Research Report RR-92-58, DFKI Saarbrücken, 1992. Also to appear in Proceedings of the 13th International Joint Conference on Artificial Intelligence, Chambéry, France, 1993.
5. F. Baader and K. Schlechta. A semantics for open normal defaults via a modified preferential approach. Research Report RR-93-13, DFKI Saarbrücken, 1993.
6. R. J. Brachman. 'I lied about the trees' or, defaults and definitions in knowledge representation. The AI Magazine, 6(3):80-93, 1985.
7. R. J. Brachman, D. L. McGuinness, P. F. Patel-Schneider, L. A. Resnick, and A. Borgida. Living with CLASSIC: When and how to use a KL-ONE-like language. In J. Sowa, editor, Principles of Semantic Networks, pages 401-456. Morgan Kaufmann, San Mateo, Calif., 1991.
8. R. J. Brachman and J. G. Schmolze. An overview of the KL-ONE knowledge representation system. Cognitive Science, 9(2):171-216, 1985.
9. J. Doyle. A truth maintenance system. Artificial Intelligence, 12:231-272, 1979.
10. M. Garey and D. Johnson. Computers and Intractability: A Guide to the Theory of NP-Completeness. Freeman, San Francisco, Calif., 1979.
11. B. Hollunder. Hybrid inferences in KL-ONE-based knowledge representation systems. In 14th German Workshop on Artificial Intelligence, volume 251 of Informatik-Fachberichte, pages 38-47, Eringerfeld, Germany, 1990. Springer.
12. U. Junker and K. Konolige. Computing extensions of autoepistemic and default logics with a truth maintenance system. In Proceedings of the 8th National Conference on Artificial Intelligence, pages 278-283, Boston, Mass., 1990.
13. H. A. Kautz and B. Selman. Hard problems for simple defaults. In Proceedings of the 1st International Conference on Principles of Knowledge Representation and Reasoning, pages 189-197, Toronto, Ont., 1989.
14. A. Kobsa. The SB-ONE knowledge representation workbench. In Preprints of the Workshop on Formal Aspects of Semantic Networks, Two Harbors, Calif., 1989.
15. E. Mays and B. Dionne. Making KR systems useful. In Terminological Logic Users Workshop - Proceedings, pages 11-12, KIT-Report 95, TU Berlin, 1991.
16. J. McCarthy. Circumscription - a form of non-monotonic reasoning. Artificial Intelligence, 13:27-39, 1980.
17. D. McDermott and J. Doyle. Non-monotonic logic I. Artificial Intelligence, 13:41-72, 1980.
18. R. McGregor. Statement of interest. In K. von Luck, B. Nebel, and C. Peltason, editors, Statement of Interest for the 2nd International Workshop on Terminological Logics. Document D-91-13, DFKI Kaiserslautern, 1991.
19. taBACK. System presentation. In Terminological Logic Users Workshop - Proceedings, page 186, KIT-Report 95, TU Berlin, 1991.

20. C. Peltason, K. von Luck, and C. Kindermann (Org.). Terminological logic users workshop - Proceedings. KIT-Report 95, TU Berlin, 1991.
21. R. Reiter. A logic for default reasoning. Artificial Intelligence, 13(1-2):81-132, 1980.
22. R. Reiter. A theory of diagnosis from first principles. Artificial Intelligence, 32:57-95, 1987.
23. R. Rymon. Search through systematic set enumeration. In Proceedings of the 3rd International Conference on Knowledge Representation and Reasoning, Cambridge, Mass., 1992.
24. M. Schmidt-Schauß and G. Smolka. Attributive concept descriptions with complements. Artificial Intelligence, 48(1):1-26, 1991.
25. C. Schwind and V. Risch. A tableau-based characterisation for default logic. In Proceedings of the 1st European Conference on Symbolic and Quantitative Approaches for Uncertainty, pages 310-317, Marseille, France, 1991.
