
Deductive Databases

CS 95 Advanced Database Systems Handout 6

Deductive Databases
An area at the intersection of databases, logic, and artificial intelligence (knowledge bases). A deductive database system is a database system that includes capabilities to define (deductive) rules, which can deduce or infer additional information from the facts that are stored in a database. Because part of the theoretical foundation for some deductive database systems is mathematical logic, such systems are often referred to as logic databases. They may also be referred to as intelligent databases, expert database systems, or knowledge-based systems. These systems also incorporate reasoning and inferencing capabilities, using techniques that were developed in the field of artificial intelligence.

Knowledge-based Systems vs Deductive Database Systems


Knowledge-based expert systems have traditionally assumed that the data needed resides in main memory; hence secondary storage management is not an issue. Deductive database systems attempt to change this restriction so that either a DBMS is enhanced to handle an expert system interface or an expert system is enhanced to handle secondary storage resident data. The knowledge in an expert or knowledge-based system is extracted from application experts and refers to an application domain rather than to knowledge inherent in the data.

Deductive Databases Terminology


A deductive database system is a database system that includes capabilities to define (deductive) rules, which can deduce or infer additional information from the facts that are stored in a database. Rules are specified using a declarative language, that is, a language in which we specify what to achieve rather than how to achieve it. An inference engine (or deduction mechanism) within the system can deduce new facts from the database by interpreting these rules. The model used for deductive databases is closely related to the relational model, and particularly to the domain relational calculus formalism.

Deductive Databases Terminology (contd)


Deductive databases are also related to the field of logic programming and the Prolog language. Deductive database work based on logic has used Prolog as a starting point. Datalog is a variation of Prolog that is used to define rules declaratively in conjunction with an existing set of relations, which are themselves treated as literals in the language. Although the language structure of Datalog resembles that of Prolog, its operational semantics (that is, how a Datalog program is to be executed) is still a topic of active research.

A deductive database uses two main types of specifications: facts and rules. Facts are specified in a manner similar to the ways relations are specified, except that it is not necessary to include attribute names.

Deductive Databases: Facts and Rules

Recall that a tuple in a relation describes some real-world fact whose meaning is partly determined by the attribute names. In a deductive database, the meaning of an attribute value in a tuple is determined solely by its position within the tuple.

Rules are somewhat similar to relational views. They specify virtual relations that are not actually stored but can be formed from the facts by applying inference mechanisms based on the rule specifications. The main difference between rules and views is that rules may involve recursion and hence may yield virtual relations that cannot be defined in terms of standard relational views.
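The distinction between positional facts, view-like rules and recursive rules can be sketched in Prolog notation. The predicates below (employee/3, well_paid/1, part_of/2, component/2) and their argument layouts are hypothetical, invented only for this illustration.

```prolog
% Facts: no attribute names, so meaning depends on argument position.
% Assumed layout (hypothetical): employee(Name, Department, Salary).
employee(john, research, 30000).
employee(franklin, research, 40000).
employee(jennifer, administration, 43000).

% A non-recursive rule behaves like a relational view
% (a selection on Salary followed by a projection on Name).
well_paid(Name) :-
    employee(Name, _Department, Salary),
    Salary >= 40000.

% Assumed layout (hypothetical): part_of(Part, Whole), direct containment only.
part_of(spoke, wheel).
part_of(wheel, bicycle).

% A recursive rule: component/2 is the transitive closure of part_of/2,
% the kind of virtual relation the text says standard views cannot define.
component(Part, Whole) :- part_of(Part, Whole).
component(Part, Whole) :- part_of(Part, Sub), component(Sub, Whole).
```

With these clauses loaded, the query well_paid(N) yields franklin and then jennifer, and component(spoke, W) yields wheel and then bicycle.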

Deductive Databases: Evaluation of Prolog Programs


The evaluation of Prolog programs is based on a technique called backward chaining, which involves a top-down evaluation of goals. A goal in Prolog is equivalent to a query in a relational database system. In deductive databases that use Datalog, attention has been devoted to handling large volumes of data stored in a relational database. Hence, evaluation techniques have been devised that resemble bottom-up evaluation (forward chaining). Prolog suffers from the limitation that the order of specification of facts and rules is significant in evaluation; moreover, the order of literals within a rule is significant. The execution techniques for Datalog programs attempt to circumvent these problems.
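The following is a minimal sketch of why literal order matters under Prolog's top-down evaluation. The predicates edge/2, path/2 and path_lr/2 are hypothetical names chosen for this example; the left-recursive variant is left commented out because it would not terminate.

```prolog
% A tiny hypothetical graph.
edge(a, b).
edge(b, c).

% Safe ordering: the base clause comes first, and in the recursive clause
% the stored relation edge/2 is consulted before the recursive call.
path(X, Y) :- edge(X, Y).
path(X, Y) :- edge(X, Z), path(Z, Y).

% Same logical meaning, but the recursive literal comes first. Under
% top-down, left-to-right evaluation the goal path_lr(a, Y) immediately
% calls path_lr(a, Z) again and never reaches an edge/2 fact.
% path_lr(X, Y) :- path_lr(X, Z), edge(Z, Y).
% path_lr(X, Y) :- edge(X, Y).
```

The query path(a, Y) returns b and then c. A naive bottom-up evaluator, which repeatedly applies the rules to the known facts until nothing new is derived, would compute the same two answers for either ordering, which is one reason Datalog systems favour that strategy.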

Prolog Programming System


Prolog is a logic programming system that is based on a resolution theorem prover. The system consists of two main components: the Prolog database and the inference engine. The Prolog database contains the sequence of Horn clauses that defines the logic program. The Prolog inference engine provides the control mechanism for proof construction using a theorem proving algorithm based on unification and backtracking. Prolog is not a pure logic programming language but rather a practical and partial implementation of logic programming. Apart from Horn clause logic, Prolog also incorporates evaluable predicates that have only a procedural interpretation and second-order predicate logic features which allow

[Figure: Components of the Prolog System. The Prolog System consists of the Prolog Database (PDB) and the Prolog Inference Engine (PIE); users submit queries and receive results.]

Prolog Programming System (contd)


The data objects of Prolog, called terms, can be either a constant, a variable, a structure or a list. Prolog is a function-free language; functional expressions are not valid terms, but structures are allowed, which can be used to the same effect as functional expressions. Each type of term is briefly described below:

Constants include integers (e.g. 0, 1, 10), reals (e.g. 1.45, 10.04), strings (e.g. "Hello") and atoms (e.g. like, john, 'New York'), which normally begin with a lowercase letter or are enclosed in single quotation marks. Some special symbol combinations are also considered atoms (e.g. ?-, :-, >). The special underline character '_' may be inserted in the middle of an atom to improve its legibility.

Variables are similar to atoms except that they begin with a capital letter or an underline character '_' (e.g. X, Name, _address). The underline character '_' on its own also denotes an anonymous variable, whose instances are always unique within the Prolog system.

Prolog Programming System (contd)

Structures are more complex data objects. A structure comprises a functor and a sequence of one or more terms called arguments. A functor is characterized by its name, which is an atom, and its arity or number of arguments. In contrast to functional expressions, structures are not evaluated when used as arguments. However, the use of structures as arguments allows metaprogramming in Prolog, since a structure can both be manipulated as a datum when used as an argument and evaluated as a procedure when taken independently as a predicate. For example, the structure point/3 with arguments X, Y and Z, which is written as point(X, Y, Z), can be used as an argument to line/2 as follows: line(point(X1, Y1, Z1), point(X2, Y2, Z2)).

Lists are concatenations of Prolog terms that have the form .(a,.(b,.(c,[]))) or simply [a,b,c].
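A short sketch of how structures and lists behave in practice. The line/2 and point/3 structures follow the example in the text; the coordinate values and the helper predicates second_point/2, stored_line/2, unevaluated/1, evaluated/1 and head_tail/3 are hypothetical names introduced only for illustration.

```prolog
% A fact whose arguments are point/3 structures, as in the line/2 example.
line(point(0, 0, 0), point(1, 2, 3)).

% Unification takes structures apart: here a line/2 term is treated as
% data and its second argument is extracted.
second_point(line(_, P), P).

% The same structure can be data first and a goal later: the term
% line(A, B) is built as a datum and then run as a query with call/1.
stored_line(A, B) :-
    Goal = line(A, B),
    call(Goal).

% Structures are not evaluated when used as arguments: 1 + 2 remains the
% structure +(1, 2) until is/2 is asked to evaluate it.
unevaluated(X) :- X = 1 + 2.
evaluated(X)   :- X is 1 + 2.

% List notation: [a, b, c] abbreviates nested head/tail cells, which the
% [Head|Tail] pattern takes apart.
head_tail([Head|Tail], Head, Tail).
```

For example, stored_line(A, B) binds A to point(0,0,0) and B to point(1,2,3); unevaluated(X) binds X to the structure 1+2 while evaluated(X) binds X to 3; and head_tail([a,b,c], H, T) binds H to a and T to [b,c]. The bracket notation is used here because some systems (SWI-Prolog 7 and later, for instance) no longer use '.' as the list constructor.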

Similar to logic programs, a clause in Prolog can be either a fact, a rule or a query. In Prolog, ':-' is used instead of '←' as the implication symbol, and the naming convention for atoms and variables is the reverse of that of the standard logic program notation; that is, atoms start with a lowercase character and variables start with an uppercase character. Prolog has a declarative and procedural semantics which is

Prolog Comparison Predicates

Prolog Representation of Entity-Relationship Database Schemes

Prolog Representation of Entity-Relationship Database Relations


Prolog Evaluation Strategy

As mentioned above, the Prolog inference engine (PIE) is based on a resolution theorem prover that relies on unification and backtracking. Briefly, resolution is an inference pattern that permits the taking of arbitrarily large inference steps, which require very considerable computational effort to carry out (Robinson, 1992); unification is the process of matching a subgoal with the head of a clause; and backtracking is a nondeterministic process of reviewing the goals which have been satisfied and attempting to resatisfy these goals by finding alternative solutions. The Prolog goal evaluation strategy is by default top-down and proceeds from left to right.

Prolog Evaluation Strategy


(a) Database

p(a,b).    q(b,d).
p(a,c).    q(c,f).
r(A,B,C) :- p(A,B), q(B,C).

(b) Query

:- r(X,Y,Z).

(c) Evaluation trace

(1) 0 CALL: r(X,Y,Z)?
(2) 1 CALL: p(X,Y)?
(2) 1 EXIT: p(a,b)
(3) 1 CALL: q(b,Z)?
(3) 1 EXIT: q(b,d)
(1) 0 EXIT: r(a,b,d)
(1) 0 REDO: r(a,b,d)?
(3) 1 REDO: q(b,d)?
(3) 1 FAIL: q(b,Z)
(2) 1 REDO: p(a,b)?
(2) 1 EXIT: p(a,c)
(4) 1 CALL: q(c,Z)?
(4) 1 EXIT: q(c,f)
(1) 0 EXIT: r(a,c,f)
(1) 0 REDO: r(a,c,f)?
(4) 1 REDO: q(c,f)?
(4) 1 FAIL: q(c,Z)
(2) 1 REDO: p(a,c)?
(2) 1 FAIL: p(X,Y)
(1) 0 FAIL: r(X,Y,Z)

Prolog Evaluation: (a) Database, (b) Query, (c) Evaluation Trace
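The database and query from the trace can be loaded into a Prolog system as follows. This is a sketch rather than the handout's own code: all_r/1 is a hypothetical helper added here to collect every answer that backtracking produces.

```prolog
% Database (a) from the evaluation example.
p(a, b).
p(a, c).
q(b, d).
q(c, f).

% r/3 joins p/2 and q/2 on their shared middle value B.
r(A, B, C) :- p(A, B), q(B, C).

% Hypothetical helper: run the query r(X, Y, Z) to exhaustion and
% collect the solutions in the order backtracking finds them.
all_r(Solutions) :-
    findall(r(X, Y, Z), r(X, Y, Z), Solutions).
```

The query all_r(S) binds S to [r(a,b,d), r(a,c,f)], matching the two EXIT lines for r in the trace. Many Prolog systems can print a comparable CALL/EXIT/REDO/FAIL listing through their built-in tracer, although the exact layout differs from the one shown above.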

Prolog/Datalog Notation
Notation is based on providing predicates with unique names. A predicate has an implicit meaning, which is suggested by the predicate name, and a fixed number of arguments. If the arguments are all constant values, the predicate simply states that a certain fact is true. If the predicate has variables as arguments, it is considered either as a query or as part of a rule or constraint. Prolog convention: all constant values in a predicate are either numeric or character strings; they are represented as identifiers (or names) starting with lowercase letters only, whereas variable names always start with an uppercase letter.

Prolog/Datalog: Example
(a) Prolog notation

Facts
supervise(franklin,john).
supervise(franklin,ramesh).
supervise(franklin,joyce).
supervise(jennifer,alicia).
supervise(jennifer,ahmad).
supervise(james,franklin).
supervise(james,jennifer).

(b) The supervisory tree

james
    franklin
        john
        ramesh
        joyce
    jennifer
        alicia
        ahmad

Rules
superior(X,Y) :- supervise(X,Y).
superior(X,Y) :- supervise(X,Z), superior(Z,Y).
subordinate(X,Y) :- superior(Y,X).

Queries
superior(james,Y)?
superior(james,joyce)?
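The example can be run as an ordinary Prolog program. The sketch below restates the facts and rules (using the name ramesh, as in the supervisory tree) and adds two hypothetical helpers, superiors_of_james/1 and bosses_of/2, which are not part of the handout and only collect the answers to the queries.

```prolog
% supervise(Supervisor, Employee): direct supervision only.
supervise(franklin, john).
supervise(franklin, ramesh).
supervise(franklin, joyce).
supervise(jennifer, alicia).
supervise(jennifer, ahmad).
supervise(james, franklin).
supervise(james, jennifer).

% superior/2 is the transitive closure of supervise/2.
superior(X, Y) :- supervise(X, Y).
superior(X, Y) :- supervise(X, Z), superior(Z, Y).

% subordinate/2 is superior/2 with its arguments swapped.
subordinate(X, Y) :- superior(Y, X).

% Hypothetical helper: every Y that answers the query superior(james, Y).
superiors_of_james(Ys) :- findall(Y, superior(james, Y), Ys).

% Hypothetical helper: everyone to whom Employee is subordinate.
bosses_of(Employee, Bosses) :- findall(B, subordinate(Employee, B), Bosses).
```

The first query, superior(james,Y)?, corresponds to superiors_of_james(Ys), which binds Ys to [franklin, jennifer, john, ramesh, joyce, alicia, ahmad]: the direct reports come first, then the answers found through the recursive rule. The second query, superior(james,joyce)?, has no variables and simply succeeds. bosses_of(joyce, Bs) binds Bs to [franklin, james].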

Deductive Databases Summary


A deductive database stores knowledge together with the database. Different methods of storing that knowledge give rise to different terms:

KBMS or expert databases use expert-system rules of the IF..THEN..ELSE type.
Deductive or logic-based databases often use Prolog-type rules.

Expert databases generally incorporate knowledge extracted from experts in the field to provide reasoning and inferencing capabilities. Logic-based databases use axioms (logic theory) to store the data and deductive axioms (rules) to extend that information.

For example, to store the facts that Anne is the parent of Betty and that Betty is the parent of Cameron, use:
parent(anne, betty).
parent(betty, cameron).
A grandparent can then be defined by the rule:
grandparent(X, Z) :- parent(X, Y), parent(Y, Z).

Many forms of deductive databases exist, including Deductive Object-Oriented Databases.

Applications include:

Enterprise modelling
Hypothesis testing
Electronic commerce
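Returning to the parent and grandparent example in the summary, the sketch below writes it in the Prolog convention described earlier, on the assumption that Anne, Betty and Cameron are meant as constants and are therefore written in lowercase.

```prolog
% Lowercase atoms name specific individuals: parent(Parent, Child).
parent(anne, betty).
parent(betty, cameron).

% X is a grandparent of Z if X is a parent of some Y who is a parent of Z.
grandparent(X, Z) :- parent(X, Y), parent(Y, Z).
```

The query grandparent(anne, Who) binds Who to cameron. Had the facts been written with capitalized names, Anne and Betty would be variables, and the fact would hold for every pair of arguments rather than for one particular parent and child.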
