
11s1: COMP9417 Machine Learning and Data Mining

Acknowledgement: Material derived from slides for the book Machine Learning, Tom M. Mitchell, McGraw-Hill, 1997 http://www-2.cs.cmu.edu/~tom/mlbook.html and the book Inductive Logic Programming: Techniques and Applications by N. Lavrac and S. Dzeroski, Ellis Horwood, New York, 1994 (available at http://www-ai.ijs.si/SasoDzeroski/ILPBook/) and the paper by A. Cootes, S.H. Muggleton, and M.J.E. Sternberg The automatic discovery of structural principles describing protein fold space. Journal of Molecular Biology, 2003. (available at http://www.doc.ic.ac.uk/~shm/jnl.html) and the book Data Mining (2e), Ian H. Witten and Eibe Frank, Morgan Kaufmann, 2005. http://www.cs.waikato.ac.nz/ml/weka

Learning and Logic


April 19, 2010

Aims
This lecture will introduce you to theoretical and applied aspects of representing hypotheses for machine learning in first-order logic. Following it you should be able to:

outline the key differences between propositional and first-order learning
describe the problem of learning relations and some applications
reproduce the basic FOIL algorithm and its use of information gain
outline the problem of induction in terms of inverse deduction
describe inverse resolution and least general generalisation
define the θ-subsumption generality ordering for clauses

[Recommended reading: Mitchell, Chapter 10]
[Recommended exercises: 10.5, 10.7, (10.8)]

Relevant programs
Progol   http://www.doc.ic.ac.uk/~shm/progol.html
Aleph    http://web.comlab.ox.ac.uk/oucl/research/areas/machlearn/Aleph
FOIL     http://www.rulequest.com/Personal/
iProlog  http://www.cse.unsw.edu.au/~claude/research/software/
Golem    http://www.doc.ic.ac.uk/~shm/golem.html
See also: http://www-ai.ijs.si/~ilpnet2/systems/


Representation in Propositional Logic


Propositional variables: P, Q, R, ...
Negation: ¬S, ¬T, ...
Logical connectives: ∧, ∨, →, ↔
Well-formed formulae: P ∧ Q, (R ∨ S) → T, etc.
Inference rules (enable sound or valid inference):
  modus ponens: given A and A → B, infer B
  modus tollens: given ¬B and A → B, infer ¬A

Meaning in Propositional Logic


Propositional variables stand for declarative sentences (properties):
  P: the paper is red
  Q: the solution is acid

Potentially useful inferences:
  P → Q: if the paper is red then the solution is acid

The meaning of such formulae can be understood with a truth table:

  P   Q   P → Q
  T   T     T
  T   F     F
  F   T     T
  F   F     T

Representation in First-Order Predicate Logic


We have a richer language for developing formulae:
  constant symbols: Fred, Jane, Copper, Manganese, ...
  function symbols: Cons, Succ, ...
  variable symbols: x, y, z, ...
  predicate symbols: Parent, Likes, Binds, ...
We still have:
  Negation: ¬Likes(Bob, Footy), ...
  Logical connectives: ∧, ∨, →, ↔
but we also have quantification: ∀x Likes(x, Fred), ∃y Binds(Copper, y)
And we still have well-formed formulae and inference rules ...

Meaning in First-Order Logic


Same basic idea as propositional logic, but more complicated. We give meaning to first-order logic formulae by interpretation with respect to a given domain D, associating:
  each constant symbol with some element of D
  each n-ary function symbol with some function from D^n to D
  each n-ary predicate symbol with some relation on D^n
For variables, we essentially consider associating all or some domain elements with the variables in the formula, depending on quantification. An interpretation is the association of a formula with a truth-valued statement about the domain.

Learning First Order Rules


Why do that?
  trees and rules so far have allowed only comparisons of a variable with a constant value (e.g. sky = sunny, temperature < 45)
  these are propositional representations: they have the same expressive power as propositional logic
  to express more powerful concepts, say involving relationships between example objects, propositional representations are insufficient, and we need a more expressive representation
  e.g. to classify X depending on its relation R to another object Y (see the sketch below)
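
A minimal Prolog sketch of such a relational rule, with all predicate names and facts invented for illustration: the body quantifies over a second object Y, which a fixed set of propositional attributes cannot express directly.

  % Hypothetical relational concept: X is suspicious if it links
  % to some blacklisted object Y (Y is existentially quantified).
  suspicious(X) :- links_to(X, Y), blacklisted(Y).

  % Sample facts, for illustration only:
  links_to(page1, page2).
  blacklisted(page2).

  % ?- suspicious(page1).
  % true.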

Learning First Order Rules

How can we learn concepts about nodes in a graph?

We cannot use a fixed set of attributes where each attribute describes a linked node (how many attributes would we need?), and we cannot use a fixed set of attributes to learn connectivity concepts ...


Learning First Order Rules

BUT in first-order logic, sets of rules can represent graph concepts such as:

  Ancestor(x, y) ← Parent(x, y)
  Ancestor(x, y) ← Parent(x, z) ∧ Ancestor(z, y)

The declarative programming language Prolog is based on the Horn clause subset of first-order logic, a form of Logic Programming:
  Prolog is a general-purpose programming language: logic programs are sets of first-order rules
  pure Prolog is Turing complete, i.e. it can simulate a Universal Turing Machine (compute every computable function)
  learning in this representation is called Inductive Logic Programming (ILP)

Prolog definitions for relational concepts

Some Prolog syntax:
  all predicate and constant names begin with a lower-case letter
    predicate (relation) names, e.g. uncle, adjacent
    constant names, e.g. fred, banana
  all variable names begin with an upper-case letter: X, Y, Head, Tail
  a predicate is specified by its name and arity (number of arguments), e.g.
    male/1 means the predicate male with one argument
    sister/2 means the predicate "sister of" with two arguments

Prolog definitions for relational concepts

predicates are defined by sets of clauses, each with that predicate in its head, e.g. the recursive definition of ancestor/2:

  ancestor(X,Y) :- parent(X,Y).
  ancestor(X,Y) :- parent(X,Z), ancestor(Z,Y).

the clause head, e.g. ancestor(X,Y), is to the left of the ":-"
the clause body, e.g. parent(X,Z), ancestor(Z,Y), is to the right of the ":-"

each instance of a relation name in a clause is called a literal
a definite clause has exactly one literal in the clause head
a Horn clause has at most one literal in the clause head
Prolog programs are sets of Horn clauses
Prolog is one of many approaches to logic programming, related to SQL, functional programming, ...
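
A quick check of the recursive definition at the Prolog top level, with a couple of hypothetical parent/2 facts added for illustration:

  % Hypothetical facts:
  parent(alice, tom).
  parent(tom, mary).

  ancestor(X,Y) :- parent(X,Y).
  ancestor(X,Y) :- parent(X,Z), ancestor(Z,Y).

  % ?- ancestor(alice, mary).    % alice -> tom -> mary
  % true.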


Induction as Inverted Deduction

Induction is finding h such that

  (∀⟨xi, f(xi)⟩ ∈ D)  B ∧ h ∧ xi ⊢ f(xi)

where
  xi is the ith training instance
  f(xi) is the target function value for xi
  B is other background knowledge

So let's design inductive algorithms by inverting operators for automated deduction!

Induction as Inverted Deduction

Example: learn Child(u, v), the pairs of people ⟨u, v⟩ such that the child of u is v, where

  f(xi):  Child(Bob, Sharon)
  xi:     Male(Bob), Female(Sharon), Father(Sharon, Bob)
  B:      Parent(u, v) ← Father(u, v)

What satisfies (∀⟨xi, f(xi)⟩ ∈ D) B ∧ h ∧ xi ⊢ f(xi)?

  h1: Child(u, v) ← Father(v, u)
  h2: Child(u, v) ← Parent(v, u)
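
A quick Prolog check that h2 together with B and the instance description entails f(xi), rendered in lower-case Prolog syntax:

  father(sharon, bob).              % from xi
  parent(U, V) :- father(U, V).     % B
  child(U, V)  :- parent(V, U).     % h2

  % ?- child(bob, sharon).          % f(xi)
  % true.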

Induction as Inverted Deduction

"Induction is, in fact, the inverse operation of deduction, and cannot be conceived to exist without the corresponding operation, so that the question of relative importance cannot arise. Who thinks of asking whether addition or subtraction is the more important process in arithmetic? But at the same time much difference in difficulty may exist between a direct and inverse operation; ... it must be allowed that inductive investigations are of a far higher degree of difficulty and complexity than any questions of deduction. ..." (W.S. Jevons, 1874)

Induction as Inverted Deduction

A photograph of the Logic Piano invented by William Stanley Jevons, taken at the Sydney Powerhouse Museum on March 5, 2006. This item is part of the collection of the Museum of the History of Science, Oxford, and was on loan to the Powerhouse Museum. [From: http://commons.wikimedia.org/wiki/File:William_Stanley_Jevons_Logic_Piano.jpg, April 19, 2010]

Induction as Inverted Deduction

We have mechanical deductive operators F(A, B) = C, where A ∧ B ⊢ C.
We need inductive operators O(B, D) = h, where

  (∀⟨xi, f(xi)⟩ ∈ D)  (B ∧ h ∧ xi) ⊢ f(xi)

Induction as Inverted Deduction

Positives:
  Subsumes the earlier idea of finding h that fits the training data
  Domain theory B helps define the meaning of "fitting" the data: B ∧ h ∧ xi ⊢ f(xi)
  Suggests algorithms that search H guided by B

Induction as Inverted Deduction

Negatives:
  Doesn't allow for noisy data. Consider

    (∀⟨xi, f(xi)⟩ ∈ D)  (B ∧ h ∧ xi) ⊢ f(xi)

  First-order logic gives a huge hypothesis space H:
    overfitting ...
    intractability of calculating all acceptable h's

Deduction: Resolution Rule

   P ∨ L
  ¬L ∨ R
  -------
   P ∨ R

1. Given initial clauses C1 and C2, find a literal L from clause C1 such that ¬L occurs in clause C2.
2. Form the resolvent C by including all literals from C1 and C2, except for L and ¬L. More precisely, the set of literals occurring in the conclusion C is

  C = (C1 - {L}) ∪ (C2 - {¬L})

where ∪ denotes set union and - denotes set difference.

Inverting Resolution

[Figure: resolution "V" diagram. C1 = PassExam ∨ ¬KnowMaterial and C2 = KnowMaterial ∨ ¬Study resolve to C = PassExam ∨ ¬Study; inverse resolution reconstructs C2 from C1 and C.]

Inverting Resolution (Propositional)

1. Given initial clauses C1 and C, find a literal L that occurs in clause C1 but not in clause C.
2. Form the second clause C2 by including the following literals:

  C2 = (C - (C1 - {L})) ∪ {¬L}

3. Alternatively, given initial clauses C2 and C, find a literal L that occurs in clause C2 but not in clause C.
4. Form the first clause C1 by including the following literals:

  C1 = (C - (C2 - {L})) ∪ {¬L}

Duce operators

Each operator is read as: preconditions on the left, postconditions on the right.

Same head:
  Identification:        p ← A, B        q ← B
                         p ← A, q        p ← A, q

  Intra-construction:    p ← A, B1       w ← B1
                         p ← A, B2       w ← B2
                                         p ← A, w

Different head:
  Absorption:            p ← A, B        p ← q, B
                         q ← A           q ← A

  Inter-construction:    p1 ← A, B1      p1 ← w, B1
                         p2 ← A, B2      p2 ← w, B2
                                         w ← A

First order resolution

1. Find a literal L1 from clause C1, a literal L2 from clause C2, and a substitution θ such that L1θ = ¬L2θ.
2. Form the resolvent C by including all literals from C1θ and C2θ, except for L1θ and L2θ. More precisely, the set of literals occurring in the conclusion C is

  C = (C1 - {L1})θ ∪ (C2 - {L2})θ

Inverting First order resolution

Factor the unifying substitution into θ = θ1θ2, where θ1 contains the variable bindings for C1 and θ2 those for C2. By the definition of resolution,

  C = (C1 - {L1})θ1 ∪ (C2 - {L2})θ2

If C2 has no literals in common with (C1 - {L1})θ1, then

  C - (C1 - {L1})θ1 = (C2 - {L2})θ2

By the definition of resolution, L2 = ¬L1θ1θ2⁻¹, and hence

  C2 = (C - (C1 - {L1})θ1)θ2⁻¹ ∪ {¬L1θ1θ2⁻¹}

Cigol

[Figure: inverse resolution run bottom-up.]

  GrandChild(Bob, Shannon)
    with Father(Shannon, Tom) under {Shannon/x}
      yields GrandChild(Bob, x) ∨ ¬Father(x, Tom)
    with Father(Tom, Bob) under {Bob/y, Tom/z}
      yields GrandChild(y, x) ∨ ¬Father(x, z) ∨ ¬Father(z, y)

Subsumption and Generality


θ-subsumption: C θ-subsumes D if there is a substitution θ such that Cθ ⊆ D.
C is at least as general as D (C ≥ D) if C θ-subsumes D.
If C θ-subsumes D then C logically entails D (but not the reverse).
θ-subsumption is a partial order, and thus generates a lattice in which any two clauses have a least upper bound and a greatest lower bound.
The least general generalisation (LGG) of two clauses is their least upper bound in the θ-subsumption lattice.
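
A minimal Prolog sketch of the θ-subsumption test, assuming clauses are lists of literals and D is ground; theta_subsumes/2 and all_members/2 are invented names. copy_term/2 takes a fresh variant of C, unification in member/2 plays the role of θ, and the double negation keeps bindings from leaking out:

  % theta_subsumes(C, D): some substitution maps every literal
  % of C to a literal of D.  Assumes D is ground.
  theta_subsumes(C, D) :-
      \+ \+ ( copy_term(C, C1),
              all_members(C1, D) ).

  all_members([], _).
  all_members([L|Ls], D) :-
      member(L, D),
      all_members(Ls, D).

  % ?- theta_subsumes([p(X, Y)], [p(a, b), q(a)]).
  % true.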

LGG
Plotkin (1969, 1970, 1971), Reynolds (1969)
The LGG of clauses is based on LGGs of literals.
The LGG of literals is based on LGGs of terms, i.e. constants and variables.
The LGG of two different constants is a variable, i.e. a minimal generalisation.


LGG of atoms
Two atoms are compatible if they have the same predicate symbol and arity (number of arguments).
  lgg(a, b) for different constants, or for functions with different function symbols, is a variable X
  lgg(f(a1, ..., an), f(b1, ..., bn)) is f(lgg(a1, b1), ..., lgg(an, bn))
  lgg(Y1, Y2) for variables Y1, Y2 is a variable X
Note:
  1. the same pair of differing subterms must be generalised to the same variable everywhere it occurs in the atoms
  2. introduced variables must appear nowhere in the original atoms
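
A minimal Prolog sketch of LGG by anti-unification, following the rules above; lgg/5, lgg_args/5 and seen/4 are names invented for this illustration. The accumulator S records which pairs of differing subterms already map to which variable, enforcing note 1; fresh variables satisfy note 2:

  lgg(T1, T2, T1, S, S) :-
      T1 == T2, !.                      % identical terms: keep
  lgg(T1, T2, V, S, S) :-
      seen(S, T1, T2, V), !.            % pair already generalised
  lgg(T1, T2, G, S0, S) :-
      nonvar(T1), nonvar(T2),
      T1 =.. [F|As], T2 =.. [F|Bs],
      length(As, N), length(Bs, N), !,  % same functor and arity
      lgg_args(As, Bs, Gs, S0, S),
      G =.. [F|Gs].
  lgg(T1, T2, V, S, [m(T1, T2, V)|S]).  % mismatch: new variable

  lgg_args([], [], [], S, S).
  lgg_args([A|As], [B|Bs], [G|Gs], S0, S) :-
      lgg(A, B, G, S0, S1),
      lgg_args(As, Bs, Gs, S1, S).

  seen([m(X, Y, V)|_], T1, T2, V) :- X == T1, Y == T2, !.
  seen([_|S], T1, T2, V) :- seen(S, T1, T2, V).

  % ?- lgg(parent(tom, ann), parent(tom, bob), G, [], _).
  % G = parent(tom, _A).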

LGG of clauses
The LGG of two clauses C1 and C2 is formed by taking the LGGs of each literal in C1 with every (compatible) literal in C2.
Clauses form a subsumption lattice, with the LGG as least upper bound and the MGI (most general instance) as greatest lower bound.
This lifts the concept-learning lattice to a first-order logic representation, and leads to relative LGGs with respect to background knowledge.


Subsumption lattice

[Figure: the subsumption lattice over atoms a and b, with lgg(a, b) as least upper bound and mgi(a, b) as greatest lower bound.]

RLGG

RLGG = LGG relative to background knowledge. Example from Quinlan (1991).

Given two ground instances of a target predicate Q/k, Q(c1, c2, ..., ck) and Q(d1, d2, ..., dk), plus other logical relations representing background knowledge that may be relevant to the target concept, the relative least general generalisation (RLGG) of these two instances is:

  Q(lgg(c1, d1), lgg(c2, d2), ..., lgg(ck, dk)) ← ⋀ lgg(r1, r2)

taken over every pair r1, r2 of ground instances from each relation in the background knowledge.

RLGG Example

[Figure: two scenes, s1 and s2.] The scenes may be described by the predicates Scene/1, On/3, Left-of/2, Circle/1, Square/1 and Triangle/1.

  Predicate   Ground instances (tuples)
  Scene       {<s1>, <s2>}
  On          {<s1, a, b>, <s2, f, e>}
  Left-of     {<s1, b, c>, <s2, d, e>}
  Circle      {<a>, <f>}
  Square      {<b>, <d>}
  Triangle    {<c>, <e>}

RLGG Example

To compute the RLGG of the two scenes, generate the clause:

  Scene(lgg(s1, s2)) ← On(lgg(s1, s2), lgg(a, f), lgg(b, e)),
                       Left-of(lgg(s1, s2), lgg(b, d), lgg(c, e)),
                       Circle(lgg(a, f)), Square(lgg(b, d)), Triangle(lgg(c, e))

Then compute the LGGs to introduce variables into the final clause:

  Scene(A) ← On(A, B, C), Left-of(A, D, E), Circle(B), Square(D), Triangle(E)
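
Running the hypothetical lgg/5 sketch from the "LGG of atoms" section on the pair of On atoms reproduces this variable introduction:

  % ?- lgg(on(s1, a, b), on(s2, f, e), G, [], S).
  % G = on(A, B, C),
  % S = [m(b, e, C), m(a, f, B), m(s1, s2, A)].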

Example: First Order Rule for Classifying Web Pages

[Slattery, 1997]

  course(A) ← has-word(A, instructor), not has-word(A, good),
              link-from(A, B), has-word(B, assign), not link-from(B, C)

Train: 31/31, Test: 31/34. Such learners can learn graph-type representations.

Learning First Order Rules

to learn logic programs we can adapt propositional rule learning methods:
  the target relation is the clause head, e.g. ancestor/2 (think of this as the consequent)
  the clause body is constructed using predicates from background knowledge (think of this as the antecedent)
  unlike propositional rules, first-order rules can have: variables, tests on more than one variable at a time, and recursion

Learning is set up as a search through the hypothesis space of first-order rules.

FOIL(Target_predicate, Predicates, Examples)
  Pos := positive Examples
  Neg := negative Examples
  Learned_rules := {}
  while Pos is not empty do
    // Learn a NewRule
    NewRule := most general rule possible
    NewRuleNeg := Neg
    while NewRuleNeg is not empty do
      // Add a new literal to specialize NewRule
      Candidate_literals := generate candidates
      Best_literal := argmax over L in Candidate_literals of Foil_Gain(L, NewRule)
      add Best_literal to NewRule preconditions
      NewRuleNeg := subset of NewRuleNeg that satisfies NewRule preconditions
    Learned_rules := Learned_rules + NewRule
    Pos := Pos - {members of Pos covered by NewRule}
  return Learned_rules

Specializing Rules in FOIL


Learning rule: P(x1, x2, ..., xk) ← L1 ... Ln
Candidate specializations add a new literal of one of the forms:
  Q(v1, ..., vr), where at least one of the vi in the created literal must already exist as a variable in the rule
  Equal(xj, xk), where xj and xk are variables already present in the rule
  the negation of either of the above forms of literals

Completeness and Consistency (Correctness)

[Figure not included.]

Variable Bindings
A substitution θ replaces variables by terms. A substitution θ applied to a literal L is written Lθ.
  If θ = {x/3, y/z} and L = P(x, y), then Lθ = P(3, z).
FOIL bindings are substitutions mapping each variable to a constant. E.g. for GrandDaughter(x, y), with 4 constants in our examples we have 16 possible bindings:
  {x/Victor, y/Sharon}, {x/Victor, y/Bob}, ...
With 1 positive example, GrandDaughter(Victor, Sharon), the other 15 bindings are negative.
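
In Prolog itself, applying a substitution corresponds to binding variables by unification; a top-level illustration of θ = {x/3, y/z}:

  % ?- L = p(X, Y), X = 3, Y = Z.
  % L = p(3, Z).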

Information Gain in FOIL

  Foil_Gain(L, R) ≡ t ( log2( p1 / (p1 + n1) ) - log2( p0 / (p0 + n0) ) )

where
  L is the candidate literal to add to rule R
  p0 = number of positive bindings of R
  n0 = number of negative bindings of R
  p1 = number of positive bindings of R + L
  n1 = number of negative bindings of R + L
  t  = number of positive bindings of R also covered by R + L
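
As a direct transcription of the formula, a small Prolog helper (foil_gain/6 is a hypothetical name; assumes all counts are positive so the divisions and logarithms are defined):

  % foil_gain(T, P0, N0, P1, N1, Gain)
  %   P0, N0: positive/negative bindings of R
  %   P1, N1: positive/negative bindings of R + L
  %   T: positive bindings of R still covered by R + L
  foil_gain(T, P0, N0, P1, N1, Gain) :-
      Gain is T * ( log(P1 / (P1 + N1)) / log(2.0)
                  - log(P0 / (P0 + N0)) / log(2.0) ).

  % ?- foil_gain(8, 8, 8, 8, 2, G).
  % G = 5.4245...   % each of the 8 positives needs ~0.678 fewer bits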

Information Gain in FOIL

Note:
  -log2( p0 / (p0 + n0) ) is the minimum number of bits needed to identify an arbitrary positive binding among the bindings of R
  -log2( p1 / (p1 + n1) ) is the minimum number of bits needed to identify an arbitrary positive binding among the bindings of R + L
  Foil_Gain(L, R) measures the reduction due to L in the total number of bits needed to encode the classification of all positive bindings of R

Learning with FOIL

Background knowledge: a family tree, given as parent/2 facts.

[Figure: family tree over the couples Alice-Tom, Fred-Mary, John-Barb, Bob-Cindy and Ann-Frank, plus Ted and Carol.]

Target predicate: ancestor/2

FOIL Example

  New clause: ancestor(X,Y) :- .
    Best antecedent: parent(X,Y)      Gain: 31.02
  Learned clause: ancestor(X,Y) :- parent(X,Y).

  New clause: ancestor(X,Y) :- .
    Best antecedent: parent(Z,Y)      Gain: 13.65
    Best antecedent: ancestor(X,Z)    Gain: 27.86
  Learned clause: ancestor(X,Y) :- parent(Z,Y), ancestor(X,Z).

  Definition:
    ancestor(X,Y) :- parent(X,Y).
    ancestor(X,Y) :- parent(Z,Y), ancestor(X,Z).

[Figure: a directed graph on nodes 0-8; an arrow from node x to node y represents LinkedTo(x, y).]

FOIL Example

Instances: pairs of nodes, e.g. <1, 5>, with the graph described by literals LinkedTo(0,1), LinkedTo(0,8), etc.
Target function: CanReach(x, y), true iff there is a directed path from x to y.
Hypothesis space: each h ∈ H is a set of Horn clauses using the predicates LinkedTo (and CanReach); a Prolog version of the target program is sketched after the next slide.

FOIL as a propositional learner

the target predicate takes the usual form of class value and attribute values: Class1(V1, V2, ..., Vm), Class2(V1, V2, ..., Vm), ...
literals are restricted to those in typical propositional learners: Vi = const, Vi > num, Vi ≤ num, plus the extended set Vi = Vj, Vi > Vj
FOIL results vs C4.5:
  accuracy competitive, especially with the extended literal set
  FOIL required longer computation
  C4.5 more compact, i.e. better pruning
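
Returning to the graph example above: the target concept corresponds to this small Prolog program (a sketch; the linked_to/2 facts are a hypothetical fragment of the figure's edges, and the program assumes an acyclic graph, since cycles would need loop-checking or tabling):

  % Hypothetical sample edges:
  linked_to(0, 1).
  linked_to(0, 8).
  linked_to(1, 2).

  % can_reach(X, Y): there is a directed path from X to Y.
  can_reach(X, Y) :- linked_to(X, Y).
  can_reach(X, Y) :- linked_to(X, Z), can_reach(Z, Y).

  % ?- can_reach(0, 2).
  % true.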

FOIL learns Prolog programs from examples

from I. Bratko's book Prolog Programming for Artificial Intelligence:
  introductory list-programming problems
  training sets generated by randomly sampling from a universe of 3- and 4-element lists
  FOIL learned most predicates completely and correctly
  some predicates were learned in restricted form, some in more complex form than in the book
  most were learned in a few seconds, some took much longer

Determinate Literals

adding a new literal Q(X, Y), where Y is the unique value for X, will result in zero gain!
FOIL gives a small positive gain to literals introducing a new variable
BUT there may be many such literals

Determinate Literals

When refining a clause A ← L1, L2, ..., Lm-1, a new literal Lm is determinate if:
  Lm introduces new variable(s)
  there is exactly one extension of each positive tuple that satisfies Lm
  there is no more than one extension of each negative tuple that satisfies Lm
So Lm preserves all positive tuples and does not increase the set of bindings. Determinate literals allow growing the clause to overcome the myopia of greedy search without blowing up the search space.

Identifying document components

Problem: learn rules to locate logical components of documents:
  documents have varying numbers of components
  relationships (e.g. alignment) hold between pairs of components
  an inherently relational task
  target relations identify sender, receiver, date, reference, logo

Identifying document components

background knowledge:
  20 single-page documents, 244 components
  57 relations specifying component type (text or picture), position on the page, and alignment with other components
test-set error from 0% to 4%

Identifying document components

[Figure not included.]

Text applications of first-order logic in learning

Q: when should first-order logic be used in machine learning?
A: when relations are important.

Representation for text


Example: text categorization, i.e. assigning a document to one of a finite set of categories.
Propositional learners use a bag-of-words representation, often with frequency-based measures. This disregards word order, e.g. it treats as equivalent:
  "That's true, I did not do it"
  "That's not true, I did do it"
First-order learners can use word-order predicates in the background knowledge:
  has_word(Doc, Word, Pos)
  Pos1 < Pos2
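
A minimal sketch of how such background predicates let a first-order rule separate the two sentences above; all facts and the rule are hypothetical, with word positions counted from 1:

  has_word(d1, true, 2).   % "that's true, I did not do it"
  has_word(d1, not, 5).
  has_word(d2, not, 2).    % "that's not true, I did do it"
  has_word(d2, true, 3).

  % The negation precedes "true" only in the second sentence:
  negated_claim(Doc) :-
      has_word(Doc, not, P1),
      has_word(Doc, true, P2),
      P1 < P2.

  % ?- negated_claim(d2).   % true
  % ?- negated_claim(d1).   % false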

Learning information extraction rules


What is information extraction? Filling a pre-defined template from a given text; a partial approach to finding the meaning of documents.
  Given: examples of texts and filled templates
  Learn: rules for filling template slots based on the text

Sample Job Posting

  Subject: US-TN-SOFTWARE PROGRAMMER
  Date: 17 Nov 1996 17:37:29 GMT
  Organization: Reference.Com Posting Service
  Message-ID: 56nigp$mrs@bilbo.reference.com

  SOFTWARE PROGRAMMER
  Position available for Software Programmer experienced in generating software for PC-Based Voice Mail systems. Experienced in C Programming. Must be familiar with communicating with and controlling voice cards; preferable Dialogic, however, experience with others such as Rhetorix and Natural Microsystems is okay. Prefer 5 years or more experience with PC Based Voice Mail, but will consider as little as 2 years. Need to find a Senior level person who can come on board and pick up code with very little training. Present Operating System is DOS. May go to OS-2 or UNIX in future.

  Please reply to:
  Kim Anderson
  AdNET
  (901) 458-2888 fax
  kimander@memphisonline.com

Example filled template

  id: 56nigp$mrs@bilbo.reference.com
  title: SOFTWARE PROGRAMMER
  salary:
  company:
  recruiter:
  state: TN
  city:
  country: US
  language: C
  platform: PC | DOS | OS-2 | UNIX
  application:
  area: Voice Mail
  req years experience: 2
  desired years experience: 5
  req degree:
  desired degree:
  post date: 17 Nov 1996

A learning method for Information Extraction


Rapier (Califf and Mooney, 2002) is an ILP-based approach which learns information extraction rules based on regular-expression-type patterns:
  Pre-Filler Patterns: what must match before the filler
  Filler Patterns: what the filler pattern is
  Post-Filler Patterns: what must match after the filler
The algorithm uses a combined bottom-up (specific-to-general) and top-down (general-to-specific) approach to generalise rules.
  syntactic analysis: Brill part-of-speech tagger
  semantic analysis: WordNet (Miller, 1993)

A learning method for Information Extraction

Example rules from text to fill the city slot in a job template:

  ... located in Atlanta, Georgia ...
  ... offices in Kansas City, Missouri ...

  Pre-Filler Pattern:   1) word: in    tag: in
  Filler Pattern:       1) list: max length: 2    tag: nnp
  Post-Filler Pattern:  1) word: ,     tag: ,
                        2) tag: nnp    semantic: state

where nnp denotes a proper noun (syntax) and state is a general label from the WordNet ontology (semantics).

Progol

Progol: reduce combinatorial explosion by generating the most specific acceptable h as a lower bound on the search space.
1. The user specifies H by stating the predicates, functions, and forms of arguments allowed for each.
2. Progol uses a sequential covering algorithm. For each ⟨xi, f(xi)⟩, find the most specific hypothesis hi such that B ∧ hi ∧ xi ⊢ f(xi) (actually, it considers only k-step entailment).
3. Conduct a general-to-specific search bounded below by the most specific hypothesis hi, choosing the hypothesis with minimum description length.

Protein structure

[Figure: secondary-structure diagrams of two proteins, 1omd (EF-Hand) and 2mhr (Four-helical up-and-down bundle), with helices H:n and strands E:n labelled by residue intervals.]

  fold('Four-helical up-and-down bundle', P) :-
      helix(P, H1),
      length(H1, hi),
      position(P, H1, Pos),
      interval(1 <= Pos <= 3),
      adjacent(P, H1, H2),
      helix(P, H2).

The protein P has fold class "Four-helical up-and-down bundle" if it contains a long helix H1 at a secondary-structure position between 1 and 3, and H1 is followed by a second helix H2.

Protein structure classification

Protein structure determination has largely been driven by careful inspection of experimental data by human experts. Structural-genomics projects now produce protein structures rapidly. This work developed a machine-learning strategy that automatically determines structural principles describing 45 classes of fold; the rules learnt were both statistically significant and meaningful to protein experts.

A. Cootes, S.H. Muggleton, and M.J.E. Sternberg. The automatic discovery of structural principles describing protein fold space. Journal of Molecular Biology, 2003. Available at http://www.doc.ic.ac.uk/~shm/jnl.html

Immunoglobulin: has antiparallel sheets B and C; B has 3 strands, topology 123; C has 4 strands, topology 2134.
TIM barrel: has between 5 and 9 helices; has a parallel sheet of 8 strands.
SH3: has an antiparallel sheet B; C and D are the 1st and 4th strands in sheet B respectively; C and D are the end strands of B and are 4.360 (+/- 2.18) angstroms apart; D contains a proline in the C-terminal end.

Summary

ILP can be viewed as an extended approach to rule learning, BUT it is much more:
  learning in a general-purpose programming language
  use of rich background knowledge
  incorporate arbitrary program elements into clauses (rules)
  background knowledge can grow as a result of learning
  control search with declarative bias
  learning probabilistic logic programs
