Semantic Matching
Examination Number: 5858947
MSc in Speech and Language Processing
THE UNIVERSITY OF EDINBURGH
2010
I have read and understood The University of Edinburgh guidelines on Plagiarism and declare that this written dissertation is all my own work except where I indicate otherwise by proper use of quotes and references.
Acknowledgements
I would like to express my sincerest gratitude to my supervisor, Fiona McNeill, for her continuous support and encouragement throughout this project. She spent hours reading drafts and extending the pre-existing system to accommodate my newly created module. A special thanks goes to Alan Bundy, my second supervisor, who never refused to devote some of his precious time to me and whose every positive comment motivated me to work even harder. I am deeply grateful to my friend Dimitris Kartsaklis, who guided me through my first steps in programming and who was always there every time I needed someone to talk to. Many thanks to my friend Julien Eychenne for the help he offered to me during a difficult time. Last but not least, I would like to thank my parents for offering their love and support to me from miles away.
Abstract
INTRODUCTION
METHODOLOGY
1.1 Ontologies
1.2 Ontology mismatch
1.2.1 Ontology Repair System
Summary of chapter 1
2.1 Our problem
2.2 Previous work
2.3 Challenges for online semantic matching
2.3.1 Implementation challenges
2.3.2 Theoretical challenges
2.3.3 The proposed solution
Summary of chapter 2
CHAPTER 3 Implementation
3.1 The Semantic Matcher
3.1.1 Building a search engine
3.1.1.1 Training the Text Acquisition model
3.1.1.2 Sense creation and Term Weighting
3.1.1.3 Query processing
3.2 Evaluation & Analysis of Results
3.2.1 Effectiveness
3.2.2 Efficiency
3.3 Integration with ORS
Summary of chapter 3
CHAPTER 4 Discussion
4.1 Theoretical justification
4.2 Implications for Ontology Engineering
4.3 Implications for Ontology Matching
Summary of chapter 4
CONCLUDING REMARKS
REFERENCES
APPENDIX
A.1 Glossary
A.2 Output of evaluation module
A.3 Additions to PA's ontology
A.4 ORS output
Automated Ontology Evolution: Semantic Matching   Examination Number: 5858947
INTRODUCTION
We are entering an era where the amount of information produced and stored (e.g. in text, audio, images) makes our access to knowledge a complicated and time-consuming task. Representing knowledge in such a way that it can be handled by machines, and authorising intelligent agents to perform actions on behalf of humans using this knowledge, is desirable. However, ambitious efforts to satisfy such a demand at a large scale (e.g. the Semantic Web) seem to have reached a bottleneck because of the lack of agreement on a shared ontology, that is, a common representation of the world. Attempts to match different ontologies and update them to represent belief change using ontology matching techniques are of limited use in Semantic Web technologies and have had mixed success, because they are still largely laborious 'offline' procedures: they usually require human expertise and take place before agent interaction. Matching ontologies in advance is fruitless in an online environment where service-requesting agents discover service-providing agents automatically and may interact with them only once. Even the seemingly ideal scenario, that is establishing a universal ontology to which every agent conforms, would still be faced with challenges such as accommodating different opinions (e.g. should 'tomato' be classified as a fruit or as a vegetable?), satisfying the ontology engineer's need for flexibility (e.g. what if 'hasPrice' is a two-place predicate but the engineer needs to add a third argument?) or adapting to new knowledge (e.g. how do we ensure that all ontologies on the web are updated simultaneously?). Moreover, a universal ontology that encodes information from all domains of knowledge would be too big to be usable in practical applications. Insisting on a shared ontology is therefore not only unrealistic but also undesirable, as it imposes constraints on how agents can represent their knowledge.
facilitated. This idea was first introduced by Fiona McNeill (McNeill 2006), who built the Ontology Repair System (ORS), a system which tries to diagnose and repair ontology mismatches automatically and on the fly, that is during agent interaction, for the purposes of the current communication needs. Mismatches can be of many types. For example, agents can use different words to express the same idea (e.g. loves(?X, ?Y) vs. likes(?X, ?Y) or capital(UK, London) vs. capital(UnitedKingdom, London)), predicates with different arities (e.g. hasPrice(ThisCD, 12) vs. hasPrice(ThisCD, 12, GBPounds)) and many other cases mentioned in the paper.
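The mismatch types in these examples can be told apart at the surface level with a few comparisons. The following is a minimal sketch, not the ORS diagnostic algorithm: the `Atom` class and the `classify_mismatch` function are hypothetical names introduced only for illustration, and the check is purely syntactic (it reports that two names differ, not whether they are synonymous).

```python
from dataclasses import dataclass
from typing import Set, Tuple


@dataclass(frozen=True)
class Atom:
    """A predicate applied to arguments, e.g. loves(?X, ?Y). Hypothetical helper."""
    predicate: str
    args: Tuple[str, ...]


def classify_mismatch(expected: Atom, received: Atom) -> Set[str]:
    """Return the surface-level differences between two atoms."""
    kinds = set()
    if expected.predicate != received.predicate:
        kinds.add("predicate name")
    if len(expected.args) != len(received.args):
        kinds.add("arity")
    elif any(a != b for a, b in zip(expected.args, received.args)):
        # Same number of arguments, but at least one position uses another word.
        kinds.add("argument name")
    return kinds


# The three examples from the text.
print(classify_mismatch(Atom("loves", ("?X", "?Y")), Atom("likes", ("?X", "?Y"))))
print(classify_mismatch(Atom("capital", ("UK", "London")),
                        Atom("capital", ("UnitedKingdom", "London"))))
print(classify_mismatch(Atom("hasPrice", ("ThisCD", "12")),
                        Atom("hasPrice", ("ThisCD", "12", "GBPounds"))))
```

Deciding whether 'loves' and 'likes' (or 'UK' and 'UnitedKingdom') actually mean the same thing is the semantic matching problem itself, which a comparison like this cannot settle.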
The first type of heterogeneity, that is the use of different words for the same meaning, is dealt with in this project. In this paper I present the Semantic Matcher, a new ORS module that helps agents measure the semantic similarity of mismatched terms and negotiate meaning.
Throughout the study I test the hypothesis that combining formal ontologies with folksonomies (i.e. informal, 'folk' taxonomies) allows for efficient and effective matching in cases where relying on ontologies alone would lead to failed or poor matching. In the end I will demonstrate that incorporating matching tools based on these ideas into ORS extends the abilities of this system to allow agents to successfully interact even where their ontologies use different words.
METHODOLOGY
The Semantic Matcher is written in Python and was incorporated into the pre-existing Ontology Repair System, written in Sicstus Prolog [3].
Throughout the paper I make some assumptions and explain on what basis they are justified. I also illustrate my main points using tables, diagrams and formulas.
1 The original name as used in (McNeill 2006) is 'Dynamic Ontology Refinement'
2 http://sigmakee.cvs.sourceforge.net/viewvc/sigmakee/KBs/
3 The complete code is available through Fiona McNeill (f.j.mcneill@ed.ac.uk)
This thesis is organised as follows:
In Chapter 2 I explain the problem to be dealt with in this project, summarise some previous work in the area and discuss the theoretical and implementation challenges that our system has to meet.
In Chapter 4 I show how my implementation decisions are justified by philosophical and cognitive theories of conceptual structure and explain how the system's design can provide an argument for the combination of ontologies and folksonomies and improve ontology matching in multi-agent environments.
The Appendix contains a glossary of basic terms, output from the evaluation module and from the Ontology Repair System and all the information that was added to the SUMO ontologies before demonstrating an agent communication scenario.
1.1 Ontologies
In this section I briefly define what we mean by ontology in Artificial Intelligence, I present the different types of ontologies according to their domain specificity, expressivity and ability to act on the environment, and I describe some of their characteristics that will facilitate our understanding of later sections.
The term ontology originates from philosophy and can be referred to as the study of what there is (Hofweber 2004), that is the study of how the world around us is organised. The attempt to categorise existence dates back to Aristotle's Metaphysics, and from then on many philosophers attempted to understand reality in this way (Buchholz 2006; Hofweber 2004). In the context of Knowledge Representation, an ontology is a (usually) machine-understandable [4] model of the world or a particular domain of interest. Gruber (1993) defines an ontology as 'an explicit specification of a conceptualization' that consists of terms (i.e. words for objects) and the relations between them. Ontologies are useful because they clarify how knowledge is structured (Chandrasekaran et al. 1999), so they support complex querying and inference, and enable agents to reason and perform tasks. This is the main tool for realising the vision of the Semantic Web (Berners-Lee et al. 2001): a web in which information is semantically explicit and can be searched intelligently [5]. In the authors'
4 Although in theory, an ontology can be specified in different languages (either formal or natural), the utility of the Semantic Web relates primarily to formal ontologies which are machine interpretable. (Fortier and Kassel 2006: 747; my emphasis)
5 For example, suppose that a student wants to find a 1-year postgraduate programme in Neurobiology at a North American university, which can provide full funding or pay for travel expenses. To find such a programme the student has to spend time looking at different websites and risks not exploiting all the options. The ideal situation would be to submit a query as complex as the student's information need and wait for a list of answers that satisfy this constraint. With the current data web this is not possible because machines cannot read natural language text and perform inference. In a Semantic Web, where all the knowledge is represented with ontologies, this will be achievable.
words:
The Semantic Web will bring structure to the meaningful content of Web pages, creating an environment where software agents roaming from page to page can readily carry out sophisticated tasks for users.
(Berners-Lee et al. 2001)
Ontologies can model either specific domains (e.g. medicine, law, physics, fashion etc.) or general knowledge about the world. Examples of the former, called domain ontologies, are the Gene Ontology [6], the Enterprise Ontology [7], the Foundational Model of Anatomy [8] and others. Examples of the latter, called upper-level ontologies, are SUMO (Pease et al. 2002) [9], Cyc [10], DOLCE [11], WordNet (Miller et al. 1993) [12] and others.
Different ontologies can have different expressive power (i.e. ability to describe the world). The simplest case is taxonomies, that is graph structures which define a class hierarchy. For example: isSubclassOf(Place, Thing), isSubclassOf(Country,
6 http://www.geneontology.org/
7 http://www.aiai.ed.ac.uk/project/enterprise/enterprise/ontology.html
8 http://sig.biostr.washington.edu/projects/fm/
9 http://www.ontologyportal.org/
10 http://www.cyc.com/
11 http://www.loa-cnr.it/DOLCE.html
12 This qualifies as an ontology under our definition of the term, which is broad enough to include taxonomies (see later discussion).
use of binary relations only (i.e. relations that take two arguments) and are therefore not appropriate to represent facts like 'Mary eats fish every week', which typically require a ternary predicate (e.g. eats(Mary, fish, week)) [14]. If we allow arity (i.e. number of arguments) to be greater than two and also write axioms (rules that support inference) in quantified formulas [15], then we have a fully-fledged first-order ontology. A good example of a first-order ontology is the Suggested Upper Merged Ontology (SUMO) (Pease et al. 2002) and the domain ontologies that extend it. Much of the work presented in this paper focuses on SUMO and its sub-ontologies.
As mentioned above, ontologies can differ in terms of the domains they model and in terms of their expressivity. Another distinction that we can make is between static
13 Subsumption (i.e. set inclusion) relations are called 'is-a' and can appear in different forms (e.g. is-a, isa, subclass, subclassOf, hasSubclass (inverse) etc.). Set membership relations can be represented as instanceOf, type, hasType etc. Part-of relations are also called 'has-a'.
14 This does not mean that it is impossible to model such facts in binary relations. For example, we could say eatsFish(Mary, week). But this is not a good enough solution since a predicate like eatsBread would be considered different from eatsFish, and would not answer the question 'What things does Mary eat?'. Another workaround could be: argument1(Mary, eats), argument2(fish, eats), argument3(week, eats), but this makes querying more difficult.
15 e.g. There is a student who likes all books: $\exists x\,(\mathit{student}(x) \land \forall y\,(\mathit{book}(y) \rightarrow \mathit{likes}(x, y)))$
16 http://www.w3.org/TR/owl-features/
and dynamic ontologies. Most existing ontologies are static, that is they declaratively represent facts that can be decided as true (if the fact is part of the ontology or can be inferred from other facts and axioms) or false. However, the truth or falsity of a fact is only a statement and does not do anything to the world. In multi-agent systems, ontologies serve as knowledge bases for agents, who are able to change the world with their actions (Baral 2010). For example, an agent can be authorised by a human to purchase a book on the internet; when it adds the fact, say, hasBought(agent, thisParticularBook) to its ontology (and therefore the fact becomes true) this is not just a statement but a transformation of reality, since now the human is the owner of the book and his/her credit card balance is lower [17]. Dynamic ontologies are of increasing importance in multi-agent systems, which are quite often based on a variant of the BDI model (Beliefs, Desires, Intentions) (see Wooldridge 2009): agents have beliefs (i.e. facts in their knowledge base), desires (i.e. goals; facts that they would like to make true, i.e. to add among their beliefs) and intentions (i.e. actions whose effects amount to them fulfilling their desires). In order to bring about a desired state of affairs, agents need to follow a course of action, which is decided by planning [18]. Planning is an important notion to remember since it is central to the design of the Ontology Repair System (section 1.2.1). Furthermore, the need of agents to perform actions in order to fulfil their desires necessitates the existence of dynamic ontologies and not just static representations of the world.
17 It might sound counterintuitive that ontology languages, which are declarative, can perform actions, but take Prolog as an example: if we open a Prolog interpreter and type write('hello world'), our 'fact' will be evaluated to true but this will also produce a side effect, namely the printing of 'hello world'. So now we have not just asked if write('hello world') is true, but we have also asked Prolog to do something for us. Of course, one obvious question is whether languages without an interpreter (e.g. OWL, KIF etc.; for definitions, see later discussion) can have the same effect. The answer is 'no' but there is a way to represent actions and their consequences in totally declarative ontologies so that they can affect the environment when translated into a language like Prolog. For example, PDDL (Planning Domain Definition Language) (Ghallab et al. 1998) represents actions in simple text files in terms of preconditions (i.e. state of the world before the action) and effects (i.e. state of the world after the action).
18 For example, if my goal (desire) is to be in Amsterdam in a week, I plan ahead to perform a number of actions (intentions), e.g. buy a ticket so that the state of affairs of me having a ticket is true (belief), then go to the airport so that my being at the airport is true and so on. Planning ahead is known as 'practical reason' or 'practical reasoning' and was first introduced in Philosophy by Michael Bratman (Bratman 1987); see (Wallace 2008) for an overview.
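The relationship between beliefs, desires and actions described above, and the precondition/effect view of actions mentioned in the footnotes, can be sketched in a few lines. This is an illustrative sketch only: the `Action` class, the `execute` function and the fact names (`has_money`, `has_ticket` and so on) are assumptions made for this example and are taken neither from ORS nor from any PDDL tool.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, Optional


@dataclass(frozen=True)
class Action:
    """An action described declaratively by preconditions and effects."""
    name: str
    preconditions: FrozenSet[str]
    add_effects: FrozenSet[str]
    delete_effects: FrozenSet[str] = frozenset()


def execute(plan: Iterable[Action], beliefs: FrozenSet[str]) -> Optional[FrozenSet[str]]:
    """Apply each action in turn; return the final beliefs, or None if a step fails."""
    state = frozenset(beliefs)
    for action in plan:
        if not action.preconditions <= state:
            return None  # a precondition is not believed to hold, so the plan fails
        state = (state - action.delete_effects) | action.add_effects
    return state


buy_ticket = Action("buy_ticket", frozenset({"has_money"}),
                    frozenset({"has_ticket"}), frozenset({"has_money"}))
go_to_airport = Action("go_to_airport", frozenset({"has_ticket"}),
                       frozenset({"at_airport"}))
fly = Action("fly_to_amsterdam", frozenset({"has_ticket", "at_airport"}),
             frozenset({"in_amsterdam"}), frozenset({"at_airport"}))

beliefs = frozenset({"has_money"})   # what the agent currently holds true
desire = frozenset({"in_amsterdam"})  # the goal it wants to make true

final = execute([buy_ticket, go_to_airport, fly], beliefs)
print(final is not None and desire <= final)
```

A failing precondition check is the point at which, in an agent setting, a mismatch between the planner's ontology and the service provider's expectations would surface.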
Another thing to know about ontologies is that they are written in ontology languages. Some notable examples are OWL-DL (Web Ontology Language, Description Logic), a W3C recommendation for the Semantic Web [19], which supports binary relations and formal definitions of concepts; RDFS (Resource Description Framework Schema) [20], similar to OWL-DL but without formal definitions, and therefore without support for complex inference; and KIF (Knowledge Interchange Format) [21] (Genesereth and Fikes 1992), a first-order language which includes arities below and above 2 as well as quantification. It also supports non-monotonic reasoning. 'SUO-KIF was derived from KIF [...] to support the definition of the Suggested Upper Merged Ontology' (Pease 2009). This is the language of the ontologies used in this project. For an overview of ontology languages see (Corcho and Gómez-Pérez 2000).
Finally, we should briefly discuss how words and formulas in first-order ontologies get their meanings, as this will be used in section 2.3.2 where I show that ontologies in multi-agent systems are vulnerable to symbol grounding problems. The example ontology language I will be using here is KIF. An ontology language has a syntax (i.e. formation rules, some of them recursive, that specify what kind of formal patterns can be generated from this language) and a formal semantics through which patterns (i.e. strings of symbols) are related to the objective world. Individuals are objects in the 'universe of discourse', that is a conceptualisation of the things that exist in the world (Pease 2009), therefore they are already grounded in the real world [22]. Relations are sets of individuals (this is true of unary relations, i.e. classes) or sets of tuples of individuals. For example, the unary relation actress/1 [23], which is just a string of characters, therefore a symbol, achieves semantic grounding by pointing to its denotation (i.e. set of individuals that make the relation true) through an interpretation function. Under, say, interpretation 1, actress/1 means {Mary,
19 http://www.w3.org/TR/owl-guide/
20 http://www.w3.org/TR/rdf-schema/
21 http://www-ksl.stanford.edu/knowledge-sharing/kif/
22 But as we will see later, grounding is not successful when ontologies are used in agent interaction.
23 The number after the slash represents arity.
MissBrown, SaraG, Paula}. The binary relation loves/2 gets its meaning by pointing to a set of pairs that satisfy it (e.g. loves/2 can mean {<Mary, John>, <George, Alice>, <Sophie, Mark>} under a particular interpretation). Formulas are connected to the world by pointing to truth values. For example, loves(George, Alice) returns 'true' because we can look at the already grounded relation loves/2 and confirm that <George, Alice> is a tuple in its denotation. This is all we need to know for the moment.
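The idea of an interpretation function can be made concrete with a small sketch. The dictionary below plays the role of one particular interpretation, mapping each relation symbol (with its arity) to its denotation; the `holds` function and the treatment of unknown symbols as false are assumptions made for this illustration, not part of KIF.

```python
from typing import Dict, Set, Tuple

# One interpretation: (symbol, arity) -> denotation, i.e. the set of tuples
# of individuals that make the relation true.
Interpretation = Dict[Tuple[str, int], Set[Tuple[str, ...]]]

interpretation_1: Interpretation = {
    ("actress", 1): {("Mary",), ("MissBrown",), ("SaraG",), ("Paula",)},
    ("loves", 2): {("Mary", "John"), ("George", "Alice"), ("Sophie", "Mark")},
}


def holds(interpretation: Interpretation, predicate: str, *args: str) -> bool:
    """Evaluate an atomic formula: true iff its argument tuple is in the denotation.

    Symbols with no denotation are treated as false here (an assumption).
    """
    denotation = interpretation.get((predicate, len(args)), set())
    return tuple(args) in denotation


print(holds(interpretation_1, "loves", "George", "Alice"))   # tuple is in the denotation
print(holds(interpretation_1, "loves", "Alice", "George"))   # order matters
print(holds(interpretation_1, "likes", "George", "Alice"))   # symbol has no denotation
```

The last call hints at the problem studied in this work: an agent whose interpretation only covers 'loves' has nothing to look up when another agent asks about 'likes'.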
Now that we have seen what ontologies look like, we can proceed to our next section, which describes what happens when agents with disparate ontologies try to communicate.
The Semantic Web, as imagined by Tim Berners-Lee (Berners-Lee et al. 2001), involves intelligent agents who are able to manipulate semantically explicit information and interoperate with other agents to carry out complex tasks on behalf of humans. But what happens when the interacting agents have different representations of the world? How is communication possible? The original view within the World Wide Web Consortium (W3C) was that agents somehow have to conform to a common ontology. As Heflin (2003) puts it on the W3C website: 'Ontologies should be publicly available and different data sources should be able to commit to the same ontology for shared meaning.' However, as discussed in our introduction, this is not only unattainable (Finkelstein et al. 1993) but also undesirable, since it imposes constraints on the ontology engineer's view of the world.
The problem of heterogeneous ontologies is addressed within the field of Ontology Matching (see (Euzenat and Shvaiko 2007) for an extensive overview). There are
many different techniques that are currently being developed for the purpose of finding correspondences between parts of different ontologies (e.g. Matching, Merging, Mapping, Alignment, Translation etc.). In all of them, the idea is that one ontology is linked to another or two ontologies are linked to a central one with relations holding between their parts. Another option is to merge two ontologies into one. However, these techniques generally presuppose access to the ontologies of both agents involved, making them unsuitable for communicating agents. As mentioned in the introduction, this is not realistic since service-providing agents might not be willing to reveal parts of their knowledge base to agents who request their services [24]. Another reason why these approaches are not appropriate for multi-agent systems is that they are usually too slow for the requirements of online agent communication.
A solution to this problem was proposed by McNeill (McNeill 2006) (see also (McNeill 2007)), who built the Ontology Repair System, the subject of our next section.
The Ontology Repair System (henceforth ORS) is a system designed to facilitate agent communication at runtime (i.e. during their interaction), not by establishing correspondences between the ontologies of the agents involved but by being available as a tool for one agent (known as the Planning Agent; henceforth PA), whose representation of the world is 'repaired' to match that of a Service Providing Agent (henceforth SPA). The PA tries to fulfil its goal (e.g. book a plane ticket) by forming plans and asking one or more SPAs to perform actions. ORS can be seen as a plug-in to PA, which helps the agent fix parts of its ontology if they cause communication problems. This is a novel approach to ontology mismatch as it proposes meaning negotiation 'on the fly', even between agents that have never contacted each other
24 For example, if an agent wants to buy a CD from Amazon, then it would be unreasonable to assume that it will have access to private information which is known to Amazon.
before:
ORS is the first example of a new breed of dynamic, automatic ontology repair mechanisms, which we believe will be essential to realise the vision of autonomous, interacting agents, such as envisaged in the Semantic Web.
(McNeill and Bundy 2007)
The main characteristics of ORS can be summarised below:
ORS:
- does not presuppose shared ontological representations among agents
- deals with agents who interoperate in a planning context, which is necessary for semantic web services
- is plugged into the Planning Agent (PA), and has access to its ontology
- has no access to any Service Providing Agent's (SPA) ontology other than what is revealed during communication
- works with first-order ontologies, which are rich enough to support planning
- performs not just belief revision (i.e. changes in the PA's facts), but also signature repairs (i.e. structural changes in the PA's ontology; changes to the class hierarchy, axioms etc.)
- is minimal (fixes only the parts that inhibit communication at a particular interaction)
- is dynamic and fully automated (no human involvement is required, which is after all the whole idea behind multi-agent Semantic Web systems)
- is written in Sicstus Prolog [25] and adapted to Ontolingua portable ontologies [26] (written in KIF); see (Gruber 1992). With this project, it was also made possible for ORS to support the SUMO ontology and its sub-ontologies [27] (written in SUO-KIF).
- consists of a Translation system (which translates the PA's ontology from KIF
25 Currently running on Sicstus version 4; see (Carlsson et al. 2010)
26 http://ksl.stanford.edu/software/ontolingua/
27 http://www.ontologyportal.org/
ORS is able to solve different kinds of mismatches (for example, existence vs. non-existence of facts, arguments appearing in different places etc.). In this project I am dealing with the most frequently occurring type of mismatches, namely semantic mismatches [28], until recently unsolvable by ORS.
In another MSc project, Akinsola listed different kinds of mismatches that are expected to occur in multi-agent systems in the future (Akinsola 2008: 36-42). This list was based on an observation of the changes that occur when ontologies are updated in one of the largest repositories of ontologies, Sigmakee, which lists versions of SUMO and its sub-ontologies (i.e. MILO [29] & domain ontologies). The idea was that the changes occurring from one version to another can provide us with an intuition as to what kind of mismatches to expect between the ontologies of different agents. For example, if we see that changing a predicate's arity is a common ontology revision practice, we can assume that in future multi-agent systems, different arity will be a common kind of mismatch. An adapted list can be seen in Table 1:
28 i.e. classes, individuals or relations having different names, e.g. loves vs. likes
29 Mid-Level Ontology
No. | REVISIONS IN SUMO REPOSITORY (from version x to version x+1) | EXPECTED ONTOLOGY MISMATCHES (between ontologies of PA and SPA)
--- | --- | ---
1 | Changes in names for classes, relations or individuals | Different names for classes, relations or individuals (semantic mismatches)
2 | Addition of facts | Existence vs. non-existence of facts
3 | Removal of facts | Existence vs. non-existence of facts
4 | Recategorisation of classes | Classes with different place in the hierarchy
5 | Redefinitions of relations | Relations with different type restrictions
6 | Changes in arguments for relations | Relations with different arguments
7 | Hierarchical recategorisation of classes | Classes that are subclasses vs. classes that are instances of another class [30]
8 | Addition of conjunction in axioms | Axioms with vs. axioms without conjunction
9 | Removal of conjunction in axioms | Axioms with vs. axioms without conjunction
10 | Removal of universal quantifiers in axioms | Axioms with vs. axioms without universal quantification
11 | Addition of universal quantifiers in axioms | Axioms with vs. axioms without universal quantification
12 | Removal of existential quantifiers in axioms | Axioms with vs. axioms without existential quantification
13 | Addition of existential quantifiers in axioms | Axioms with vs. axioms without existential quantification
14 | Changes in variable names | (not a mismatch; variables are just 'containers')
15 | Changes in type restrictions for variables within axioms | Axioms with different type restrictions for variables
16 | Removal of facts or arguments from axioms | Axioms with vs. axioms without particular facts or axioms
17 | Fixing typos | Differences in wording
18 | Refinement of inference rules | Individuals are members of different classes
19 | Swapping of arguments in relations | Relations with different positions in arguments
20 | Addition of subclasses to classes | Classes with vs. classes without particular subclasses
21 | Implication sign (=>) changes to equality sign (<=>) | Axioms with conditionals vs. axioms with bi-conditionals
Table 1: Ontology mismatches expected to be common in multi-agent systems
30 The latter is a second-order relation.
Currently ORS has the infrastructure for dealing with approximately 1/3 of the mismatches presented earlier. Semantic mismatches, which were addressed in this work, are described in the rest of the paper.
Summary of chapter 1
In this chapter I explained what ontologies are and described some of their basic characteristics. I also addressed the problem of ontology mismatch and presented ORS, an online ontology repair system, which I have extended for the purposes of this project. After having set the scene for our discussion, we can proceed to our next section, which looks at how semantic mismatches are to be dealt with.
In the context of this paper, semantic matching is defined as determining synonymy between one or more words from the PA's ontology and one or more words from the SPA's ontology [31]. These words are strings of characters (linguistic symbols) that represent classes, individuals or relations. Throughout this study I refer to them using the cover term 'lexeme' [32].
31 For example, how can we determine that 'Cat' and 'FelisCatus' refer to the same class of things in the world?
32 The term 'lexeme' is used by Pease (2009) to refer to strings of characters separated by whitespace. This definition allows variable names (e.g. ?X ?Y), quantifiers ('forall', 'exists') and operators ('and', 'or', '=>' etc.) to be lexemes. My definition is, therefore, more restrictive.
Semantic matching in the context of online agent communication can be of use in two situations, which I call the surprising lexeme situation and the needed lexeme situation. In the former case, the PA receives a query from the SPA [33] (e.g. in the form of checking preconditions for an action it has been asked to perform) and one of the lexemes in the query, be it a relation, an individual or a class, is unknown to the PA. The PA's plan to achieve the goal by asking the SPA to execute the action will fail. Then the ORS Diagnostic System will be consulted, which can determine either that the lexeme is missing from the PA's ontology, or that there needs to be a correction (e.g. a typo) in order for the PA to recognise the lexeme, or that there is another lexeme which has the same meaning but a different symbol, in which case semantic matching has to be performed. In the needed lexeme situation, the PA wants the SPA to perform an action but the plan fails because the PA uses different words from the ones the SPA expects. If the needed lexeme is a class name or the name of an individual, the PA can submit a query to the SPA to retrieve the relevant information. This of course presupposes that the name of the predicate is shared. But if it is the predicate that needs to be matched, first-order logic is not enough, so we need to make some assumptions, for instance that SPAs support higher-order logics [34] or that the SPA is helpful enough to do the search for the PA and suggest a matching predicate. The needed lexeme situation is outside my scope in this study, as is complex diagnosis [35]. The problem I am dealing with is: Given that the PA receives a surprising question and the diagnostic algorithm determines that there is a semantic
33 To simplify our work, we assume that the PA contacts only one SPA in order to get an action performed. However, in real-life situations it is expected that the PA will achieve its goals by requesting services from various agents, according to their ability to perform certain actions. This assumption does not affect the results of our work on semantic matching; it just helps us avoid complex scenarios which are unnecessary for our task.
34 Implementation-wise this is possible in Prolog. For some ideas on how to retrieve predicates using second-order logic in Prolog, see (Sterling and Shapiro 1994), chapter 16.
35 Diagnosing semantic mismatches can be a very complex task, and might need to involve probabilistic techniques. For the purposes of this project, the Diagnostic Algorithm is simplified to determine in advance that Semantic Matching is required; however, as part of the ORS Diagnostic System, the Semantic Matcher diagnoses where exactly (i.e. in which lexemes) the problem lies.
mismatch, can we perform semantic matching (i.e. search the PA's ontology and find the
synonymous lexeme)? After this stage, the Refinement module will replace the old
lexeme with the SPA's word and replan until the plan succeeds or no plan can be
found.
Before we discuss how this problem can be dealt with, let's look at some previous
work done on semantic matching and some extra challenges that a multi-agent
environment poses.
The term semantic matching was first introduced by Giunchiglia and Shvaiko
(Giunchiglia and Shvaiko 2003), who described it as a process in which a 'match'
operator takes two graph-like structures (e.g. database schemas or ontologies) and
produces a mapping between elements of the two graphs that correspond
semantically to each other[36]. Semantic matching is contrasted with what they call
syntactic matching, which matches nodes of a graph by looking at 'labels', that is their
string similarity (e.g. 'phone' vs. 'telephone'). What the authors propose instead is a
matching on the basis of meaning, which is encoded in the graph structure (i.e. by
looking at a node's 'ancestors', 'children' and 'sisters' in the tree structure). This
would enable nodes like 'Europe' and 'pictures' to be matched as equivalent
(synonymous) if their ancestors are 'Images' and 'Europe' respectively, given that both
nodes mean 'pictures of Europe'[37]. S-Match (Giunchiglia et al. 2004) was the first
system that implemented semantic matching as described above. Its input is two
graphs and its output is pairs of nodes from the two graphs and their relations
(equivalence (=), more general (⊒), less general (⊑), mismatch (⊥), overlapping (⊓)).
[36] Prior to this, the term 'semantic matching' was used in a broader way.
[37] We should also note that relations (i.e. arcs in the graph) are not labelled and that synonyms are found in
WordNet (discussed in section 3.1.1.1)
My definition of 'semantic matching' at the beginning of this chapter is narrower
with respect to the system's desired output, as it is only concerned with finding
equivalence (=) relations between entities in two ontologies, and broader with
respect to the techniques that the system can exploit, as manipulating the 'label'
itself[38] is also permissible. Semantic matching, as seen in this study, can benefit from
what Euzenat and Shvaiko call 'name-based' techniques for ontology matching
(Euzenat and Shvaiko 2007), which compare lexemes (i.e. words) on the basis of their
form and/or the entities they denote. These techniques are subdivided into
string-based methods and language-based methods. The former are very close to what
Giunchiglia and Shvaiko called 'syntactic' (as opposed to 'semantic') matching since
they try to compute correspondences between lexemes disregarding the meaning
behind them (Giunchiglia and Shvaiko 2003). For example, using such a method, it is
easier to match 'bold' to 'bald' than 'bold' to 'fearless'. String-based matching
systems typically involve a number of steps before the lexemes are compared:
normalisation (e.g. case normalisation ('CD' → 'cd'), diacritics suppression
('crêpe' → 'crepe'), blank normalisation ('world\tcup' → 'world cup')[39], link stripping
('easy-going' → 'easy going'), digit suppression ('cat2144' → 'cat') and punctuation
elimination ('C.D.' → 'CD')), string equality (i.e. checking if two strings are equal
('catch' == 'cat'? False)), substring test (checking if one string is part of the other
('cat' in 'catch'? True)), edit distance (e.g. (Levenshtein 1965); ('cat' and 'catch' have
distance 2)), path comparison (applicable to schemas or directories, e.g.
'Images/Europe/Italy' vs. 'Europe/Pictures/Italy') and token-based distances
(discussed later). Language-based methods, on the other hand, regard a string not as
a series of symbols (characters) but as little texts (e.g. 'goodwill ambassador' means
something like 'ambassador who has goodwill' or 'ambassador because of goodwill'
or 'ambassador' + 'good' + 'will'[40]). These methods can involve linguistic normalisation
[38] e.g. segmenting the lexeme (i.e. the entity's 'label') into words ('BalletDancer' → ['ballet', 'dancer'])
[39] \t represents tabulation. Other 'empty' characters used in blank normalisation are \r (carriage return), \n
(newline) etc.
[40] I intend this as a concatenation of meanings and not strings.
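The string-based steps listed above can be sketched in a few lines of Python. This is only an illustrative sketch, not the implementation of any system cited here: the function names `normalise` and `edit_distance` are hypothetical, and the order in which the normalisation steps are applied is an assumption.

```python
import re
import unicodedata


def normalise(label: str) -> str:
    """Apply the normalisation steps described above (order is an assumption)."""
    s = label.lower()                                   # case normalisation
    s = unicodedata.normalize("NFKD", s)                # split letters from accents
    s = "".join(ch for ch in s if not unicodedata.combining(ch))  # diacritics suppression
    s = re.sub(r"\d", "", s)                            # digit suppression
    s = re.sub(r"[-_]", " ", s)                         # link stripping
    s = re.sub(r"[^\w\s]", "", s)                       # punctuation elimination
    return re.sub(r"\s+", " ", s).strip()               # blank normalisation


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance computed row by row with dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                 # deletion
                           cur[j - 1] + 1,              # insertion
                           prev[j - 1] + (ca != cb)))   # substitution
        prev = cur
    return prev[-1]


print(normalise("world\tcup"), "|", normalise("easy-going"))
print("catch" == "cat", "cat" in "catch", edit_distance("cat", "catch"))
```

On such a measure 'bold' is closer to 'bald' than to 'fearless', which is exactly why string-based methods alone cannot find synonyms.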
Examples of systems that make use of 'name-based techniques' are too many to
discuss here. A list can be found in (Euzenat and Shvaiko 2007: 187-192) and brief
descriptions of their main components can be seen on pages 153-187. What is
important to note is that all of these systems support graph-like structures (e.g. XML,
RDF etc.) or ontologies based on Description Logics (e.g. OWL), which are less
expressive than first-order ontologies like KIF, which is used in ORS. Moreover, none
of them is designed for multi-agent systems, and therefore they are not directly
applicable to the problem addressed in this paper. In chapter 3 we will see that
although the methods I use are language-based (because they compare meanings and
not strings of characters), my system also benefits from string-based techniques such
as normalisation and token-based distances, but diverges significantly from their
traditional use (see chapter 3 for implementation details).
One system, built by Qu and his colleagues (Qu et al. 2006), is worth mentioning,
since it provided the inspiration for the Semantic Matcher, which I am going to
present later. This system computes correspondences between nodes of RDF graphs
(which are words)[43] by constructing 'virtual documents' for each one of them. The
term 'document' comes from Information Retrieval and means a 'bag' of unordered
words that represent a web page. In the system described, 'virtual documents'
[41] i.e. natural language definitions
[42] e.g. 'animal' is a hypernym of 'koala'
[43] In fact, both nodes and arcs in RDF graphs are Uniform Resource Identifiers (URIs; discussed later)
representing nodes are compared for similarity using the vector space model (see
section 3.1.1.3). The bags are generated from the tokenised lexeme (i.e. name of node)
but also from 'neighbouring information', that is names of other nodes connected to it
in the graph. The researchers avoid the use of external linguistic resources such as
WordNet on the basis that it is too computationally expensive, which is reasonable
since their system compares all virtual documents of one graph against all virtual
documents of another graph. The design of the Semantic Matcher is influenced by
this system, but as we will see it has major differences (e.g. information to be thrown
in the bags is aggregated from all possible sources, WordNet is used because we
don't need to compare whole graphs etc.). Euzenat and Shvaiko classify this
approach under 'token-based distances' because it treats a lexeme as a 'bag of words'
(i.e. a set of tokens) (Euzenat and Shvaiko 2007). It is important to note that the
authors regard it as a 'string-based' method on the basis that similarity comparison
between bags is in fact string comparison between words in the bag. However, as I
will show in the rest of the study, creating appropriate bags for lexemes allows us to
predict similarity of meaning even between pairs such as 'corn' and 'maize' and
therefore it deserves to be viewed as a proper semantic as opposed to syntactic[44] or
string-based matching technique.
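The idea of a 'virtual document' built from a tokenised node name plus neighbouring names can be illustrated with a minimal Python sketch. It is not the algorithm of Qu et al.: the helper names, the tokenisation rule and the toy graph below are all assumptions made for illustration.

```python
import re
from collections import Counter


def tokenise_label(label: str) -> list[str]:
    """Split a label on camelCase boundaries, underscores, hyphens and blanks."""
    spaced = re.sub(r"(?<=[a-z0-9])(?=[A-Z])", " ", label)
    return [t.lower() for t in re.split(r"[\s_\-]+", spaced) if t]


def virtual_document(node: str, neighbours: dict[str, list[str]]) -> Counter:
    """Bag of words from the node's own name and the names of connected nodes."""
    bag = Counter(tokenise_label(node))
    for other in neighbours.get(node, []):
        bag.update(tokenise_label(other))
    return bag


# Hypothetical graph fragment: node name -> names of connected nodes.
graph = {"BalletDancer": ["Dancer", "performs_in_Ballet"]}
doc = virtual_document("BalletDancer", graph)
print(doc)
```

Words that recur across the node and its neighbours (here 'ballet' and 'dancer') receive a higher count, which is what later allows bags to be compared by frequency.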
Now that we have seen some basic semantic matching techniques, we can look at
why designing a system for meaning negotiation between agents cannot follow all
these methods to the letter.
[44] according to Giunchiglia and Shvaiko's use of the term 'syntactic', because in other contexts 'syntactic
matching' is matching of structures (Giunchiglia and Shvaiko 2003)
Semantic matching in ORS cannot be performed by strictly following any of the ways
described above. As discussed earlier, ORS operates in a realistic environment, that is
it tries to resolve agent communication conflicts that will inevitably occur in the
future while taking into account all the limitations that this involves: i) time pressure,
ii) agents' privacy, iii) ontologies have first-order expressivity, because planning
can't be successfully supported by less expressive ontologies, but this automatically
raises the complexity of the representation[45], iv) ontologies are dynamic as opposed to
static[46]. These considerations will place some constraints on our choice of semantic
matching technique. In our project, the need for efficiency, combined with the
semi-decidability of First-Order Logic (henceforth FOL), will discourage us from
employing complex reasoning (that is, trying to infer the word's meaning by means
of the relations it is involved in or the axioms that encode facts about the entity
referred to by the word). The expressive power of the ontology will also prevent us
from thinking in terms of graph-like constructs. Graphs are tree structures (usually
classifications, taxonomies) where all relations are necessarily binary: every arc is a
relation[47] and its ends are two nodes, the only two arguments it can take. However, in
FOL we can have higher arities (e.g. likes(john, squash, summer)) or lower (i.e.
unary relations like philosopher(socrates)). Another issue is that many of the
mismatches are at the predicate level (section 1.2), which means that we need to
match not only classes and individuals (i.e. 'nodes' in a graph) but also relations.
Finally, even if we think of PA's ontology as beyond tree structures, we have another
constraint: there is no other ontology with which to find correspondences. Given PA's
[45] although currently less expressive ontologies are more widespread
[46] i.e. they include action concepts with preconditions and effects, with which agents act on the environment
and change the world.
[47] e.g. is-a, part-of, instance-of, or perhaps ancestor-of in family trees, higher-than in military hierarchies etc.
(and ORS's) limited access to the SPA's ontology because of privacy issues, we cannot
take for granted that the latter's ontological representation will be available to us. The
most likely scenario is that the PA has only seen a few words from the other agent's
ontology: the ones that the SPA was willing to reveal during the interaction[48]. Hence,
there is no issue of matching two ontologies, but rather, matching lexical items
against an ontology. This means that in a semantic matching situation, we only have
one lexeme and we try to find its synonym inside PA's ontology.
The theoretical challenge that our system has to meet is to tackle the root of the evil,
that is to uncover and solve the problem that causes semantic mismatch. This section
attempts to explain what happens in an agent's mind, that is how the agent
represents the world and what the world is anyway. Our discussion will expose the
problem of symbol grounding, which, as I claim, is the hidden cause of semantic
mismatch, and will provide the motivation for the design of a semantic matching
system that is theoretically founded on a more robust notion of meaning. This
section will also provide the background for a better understanding of the final
chapter, where the nature of this meaning is explained. But, before we proceed, we
have to answer this question: What does meaning mean?
[48] This is also useful as matching two (potentially large) ontologies can be expensive and largely unnecessary.
[49] Note that the word is spelt with an 's' and is not to be confused with 'intention'
[50] Of course, meaning is not only a property of words, but of everything composed out of words, including
sentences. For example, the intensional aspect of sentence meaning is the proposition, that is the 'thought'
encoded in the sentence, and the extensional aspect is the set of situations which make it true.
[51] From now on I'm going to use the pairs intension-extension and sense-reference interchangeably
[52] or tuples of individuals for arity greater than 1
[53] A further problem will be exposed later, when we talk about agent communication.
The triangle of reference (also known as 'meaning triangle'), introduced in Ogden
and Richards' classic book The Meaning of Meaning (Ogden and Richards 1923), is a
schematic representation of how words are related to entities in the world. As can be
seen in diagram 1, the linguistic symbol[54] cat on the left stands for the set of
entities that the word extends to (i.e. all the cats in the world), but this relation is only
indirect (hence the dotted line) because what mediates is the 'thought', that is the
sense CAT of the word.
Diagram 1: Human mental representations
Apart from solving philosophical problems, as discussed above, intensional meaning
also accounts for what we call 'understanding' of the word. As we will see later, even
if agents' ontologies have a perfectly sound semantics for lexemes, it is useless for
[54] i.e. string of characters or sequence of speech sounds that represent a lexical unit. To be precise, this is not a
symbol but a sign in that it is conventionally associated with its referent (need citation). In the context of
this study I will use the two terms interchangeably.
semantic matching because the agents don't understand any of the meaning. Now that
we have seen how words get their meaning in human language, let's see what
happens in the case of agents. Imagine an agent having the lexeme cat in its
ontology, say, as a class. Usually, ontologies have extensional semantics (e.g. SUO-KIF
ontologies (Pease 2009)), that is, words contained in them are assigned meaning with
the help of an interpretation function (see chapter 1), which maps them to their
denotation (i.e. set of individuals X from the domain of discourse for which cat(X) is
true). Therefore, linguistic symbols get their meaning by pointing to individuals in
the domain of discourse. That makes a very good semantics for ontologies, but there
is something wrong: the domain of discourse is not the objective world[55] itself but a
symbolic world written with other symbols inside the ontology. For example, if the
predicate cat/1 returns 'true' when instantiated in the facts cat(Fluffy), cat(Kitty) and
cat(Cat2563), then the meaning of cat will be the set {Fluffy, Kitty, Cat2563}. But
these individuals are still lexemes in the agent's knowledge base, which are not
grounded in the real world:
    When the knowledge of a domain is represented in a declarative formalism,
    the set of objects that can be represented is called the universe of discourse.
    This set of objects, and the describable relationships among them, are
    reflected in the representational vocabulary with which a knowledge-based
    program represents knowledge.
    (Gruber 1993; my emphasis)
So, the following situation seems to be the case:
[55] our world; what we would call the 'contingent' world in modal philosophy
Diagram 2: Agent 'mental' representations (the situation now)
As shown in the above diagram, the lexeme cat fails to refer to the extension of cats
in our world but instead designates a set of other uninterpretable lexemes. Even
worse, this set is in the agent's ontology, that is in its mind. What seems to be
happening here is a conflation of sense and reference: if semantics is extensional,
why are the referents in the agent's mind? And if it is intensional, why doesn't the
mental representation mediate to establish the connection between cat and the set
of cats in our world?
However, one could raise the objection that the world does not need to be anything
more than a virtual world in the ontology and that semantics in ontologies has
nothing to do with the contingent world. This rescues formal semantics and it is
perfectly acceptable, but useless if agents have to interact with each other. For
example, let's say we have two agents, Jerry and Tom. Jerry knows that cat/1 means
{Fluffy, Kitty, Cat2563} and Tom knows that felisCatus/1 means {MyLovelyCat, Kitty,
Max, Smokey, Kitten4}. How can the agents determine that cat and felisCatus are
synonymous? They can't, because in extensional semantics synonymous words
should be coreferential, that is, they should point to the very same set in the world.
But in Jerry and Tom's case, sets are ontology-internal, therefore agent-specific,
therefore unable to refer to the 'very same' set. Even if the wording is the same and
sets appear to be identical, synonymy cannot be established because names are not
grounded[56]. Here it seems to be the case that each agent has its own private
language (to use Wittgenstein's terminology (Wittgenstein 1953)) which is
comprehensible only to its single originator because the things which define its
vocabulary are necessarily inaccessible to others (Candlish 2008)[57]. What we need in
order to achieve agent communication is a connection between the agent's lexemes
and the real world. But as we saw, this has to be achieved through intension. In other
words, we should build a sense for every lexeme in the agent's ontology: a mental
representation that can help the agent understand the words' meanings and mediate
in order to designate the set of cats in the real world. As shown in the diagram below,
what we need to achieve is to create a sense CAT[58] in the agent's mind.
Diagram 3: Agent 'mental' representations (what we would like to have)
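The Jerry and Tom example above can be made concrete with a small Python sketch. The dictionaries below are a hypothetical encoding of the two agents' ontologies, with each extension stored as a set of ontology-internal names (plain strings); the function name `coreferential` is likewise an assumption.

```python
# Each agent's "world" is just a set of its own symbols.
jerry = {"cat": frozenset({"Fluffy", "Kitty", "Cat2563"})}
tom = {"felisCatus": frozenset({"MyLovelyCat", "Kitty", "Max", "Smokey", "Kitten4"})}


def coreferential(extension_a: frozenset, extension_b: frozenset) -> bool:
    """Extensional synonymy test: do the two lexemes denote the very same set?"""
    return extension_a == extension_b


same_set = coreferential(jerry["cat"], tom["felisCatus"])
shared_names = jerry["cat"] & tom["felisCatus"]
print(same_set, shared_names)
```

The test fails, and the single shared name 'Kitty' proves nothing either: string equality between two ungrounded symbols does not show that they pick out the same animal, which is the point of the argument above.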
The nature of this mental representation will be discussed in the last chapter, where
[56] One could say that the ontology engineers of the two agents assign meaning to 'cat' and 'felisCatus'
respectively, so through the engineer's mind the language-world connection we require is established, but that
would only be useful if there was communication between the engineers, therefore human and not agent
interaction.
[57] To avoid confusion, it is important to mention that human language includes both syntactic rules and words,
whose combination generates sentences (Chomsky 1957). However, ontology languages such as KIF contain
only syntactic rules, basic operators and quantifiers; words are generated and their meaning is assigned
pseudo-extensionally, as I claimed. That explains why lexemes and their meanings are not already shared
among agents who have a common ontology language.
[58] I capitalise words for mental representations according to the convention in Philosophy of Language and
Cognitive Science.
we will see how the Semantic Matcher is compatible with this notion of meaning and
what consequences this has for ontology engineering.
Below is a table of synonyms for symbols, mental representations and entities in the
world, as found in famous works in Philosophy of Language (Frege 1892; Carnap
1947), Semiotics[59] (Saussure 1916; Peirce 1931-1958, collected works) and Linguistics
(Ogden and Richards 1923). The terms that I will mainly be using are lexeme, sense
and reference respectively.
Table 2: Triangle of reference (terminology)
[59] i.e. the study of signs
(henceforth IR) task: given a surprising lexeme, the system has to search through all
the PA's 'candidate' lexemes and return the ones that are semantically close to it.
Candidate lexemes include names of classes, individuals or relations and exclude
variable names (e.g. ?X ?Y), quantifiers ('forall', 'exists') and operators ('and', 'or', '=>'
etc). It will hopefully turn out in the end that this is an adequate way of meeting both
the implementational and theoretical challenges discussed above.
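The selection of candidate lexemes described above can be sketched as a simple token filter. This is a minimal sketch under stated assumptions, not the thesis implementation: the function name `candidate_lexemes` is hypothetical, the operator set extends the listed 'and', 'or', '=>' with 'not' and '<=>' as a guess at the 'etc', and the example axiom is invented.

```python
import re

QUANTIFIERS = {"forall", "exists"}
OPERATORS = {"and", "or", "not", "=>", "<=>"}  # 'not' and '<=>' are assumed


def candidate_lexemes(kif_text: str) -> list[str]:
    """Return class, individual and relation names in order of first appearance."""
    without_strings = re.sub(r'"[^"]*"', " ", kif_text)   # ignore quoted text
    tokens = re.findall(r"[^\s()]+", without_strings)
    candidates: list[str] = []
    for token in tokens:
        if token.startswith("?"):                          # variable name
            continue
        if token in QUANTIFIERS or token in OPERATORS:
            continue
        if token not in candidates:
            candidates.append(token)
    return candidates


axiom = "(forall (?X) (=> (cat ?X) (and (animal ?X) (likes ?X Milk))))"
print(candidate_lexemes(axiom))
```

For the invented axiom, only cat, animal, likes and Milk survive the filter, so these would be the lexemes eligible for matching.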
The system which computes semantic similarity, called Semantic Matcher, is a search
engine whose queries and documents are intensional meanings (senses). As a
thought experiment imagine a Google of meanings, where we can input a mental
representation and get as output a ranked list of similar mental representations. In
the Semantic Matcher these meanings are simulated by bags of words (that is
multisets[60] of words (Lewis 1998: 6)). The bag of words model (often abbreviated to
bow) is a popular assumption in Information Retrieval whereby documents (i.e.
representations of web pages after text processing) are sets of index terms (words)
which can retain topical meaning without retaining the original order. The frequency
of each word in the bag determines its importance in the document. For example, a
web page which talks about the song The rain in Spain (see diagram 8) can be
represented as the bag [where, soggy, plain, spain, spain, rain, spain,
term from Folksonomy (Vander Wal 2007). Folksonomy is a folk taxonomy that
emerges out of collaborative tagging of information resources. For example, users of
Delicious[61] can label bookmarked web pages with keywords that they consider
relevant to the page's content. Words that are representative of the page tend to be
used more often and gradually a bottom-up, not centrally controlled classification of
bookmarks will emerge. A tag cloud is the depiction of a bag of words created by
users for a particular resource and consists of a collection of keywords whose size
[60] unordered lists whose members (words in our case) can appear more than once; also known as weighted set
(Blizard 1988). For example, [scissors, pen, ink, pencil] is a set while [scissors, pen, pen,
pen, ink, pencil, pencil] is a multiset.
[61] http://www.delicious.com/
differs according to their importance. In the following two pictures we can see what a
bag of words and a tag cloud for the word bachelor can look like.
[bachelor, bachelor, unmarried, male, man, single, young, enjoy, nightclubs,
man, wife, whiskey, bachelor, marry, women, life, women, unmarried,
single, outgoing, male]
Picture 1: Bag of words for lexeme 'bachelor'
[Tag cloud containing the words: whiskey, wife, marry, single, nightclubs, bachelor,
young, unmarried, man, outgoing]
Picture 2: Tag cloud for lexeme 'bachelor'
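The relation between a bag of words and a tag cloud can be illustrated with Python's `collections.Counter`, which behaves as a multiset. The word list is the bag from Picture 1; the function `tag_cloud_sizes` and its linear scaling between 10 and 30 points are assumptions made purely for illustration.

```python
from collections import Counter

# The bag of words shown in Picture 1, stored as a multiset.
bachelor_bag = Counter([
    "bachelor", "bachelor", "unmarried", "male", "man", "single", "young",
    "enjoy", "nightclubs", "man", "wife", "whiskey", "bachelor", "marry",
    "women", "life", "women", "unmarried", "single", "outgoing", "male",
])


def tag_cloud_sizes(bag: Counter, min_pt: float = 10, max_pt: float = 30) -> dict[str, float]:
    """Map each word to a font size that grows linearly with its frequency."""
    low, high = min(bag.values()), max(bag.values())
    span = (high - low) or 1          # avoid division by zero for flat bags
    return {word: min_pt + (max_pt - min_pt) * (count - low) / span
            for word, count in bag.items()}


sizes = tag_cloud_sizes(bachelor_bag)
print(bachelor_bag.most_common(3))
```

The most frequent word, 'bachelor', gets the largest size, mirroring how the frequency of a word in the bag stands for its importance.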
Folksonomy will be discussed further in chapter 4 while the bag of words model in
IR will be shown to work in practice in chapter 3. The Semantic Matcher constructs
senses by collecting information, which is parsed to render a bag of words. In the
next chapter, the system's architecture will be described but it is only later that I will
show why I choose to create mental representations in this way.
Summary of chapter 2
In this chapter I explained the problem that our system has to deal with and
summarised some previous work done in the area of semantic matching. Then I
discussed the challenges that online semantic matching has to meet and introduced
the principles that guided the design of the Semantic Matcher. Now we can proceed
to the details of the system's implementation.
CHAPTER 3 Implementation
The Semantic Matcher is a search engine that tries to find the 'best match' for the
SPA's lexeme among the candidate lexemes in the PA's ontology. The SPA's lexeme is
expanded into a 'bag of words' that make up its intensional meaning (i.e. sense). This
bag will serve as a query to the search engine[62]. The PA's candidate lexemes are also
bags of words, acting as a collection of documents which will be ranked from the most
to the least relevant[63] after the search has been performed.
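Ranking document bags against a query bag can be sketched as follows. Cosine similarity over raw word counts is used here as one common vector space measure; it is an assumption of this sketch rather than a statement of the exact weighting used by the Semantic Matcher, and the bag contents for 'maize', 'corn' and 'car' are invented.

```python
import math
from collections import Counter


def cosine(a: Counter, b: Counter) -> float:
    """Cosine of the angle between two bags viewed as word-count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank(query: Counter, documents: dict[str, Counter]) -> list[tuple[str, float]]:
    """Return (lexeme, score) pairs from the most to the least similar."""
    scored = [(name, cosine(query, bag)) for name, bag in documents.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical bags: the SPA's lexeme 'maize' as query, two PA candidates.
query = Counter(["maize", "corn", "cereal", "grain", "plant"])
candidates = {
    "car": Counter(["car", "vehicle", "wheel"]),
    "corn": Counter(["corn", "cereal", "grain", "crop"]),
}
ranking = rank(query, candidates)
print(ranking)
```

With these invented bags 'corn' is ranked above 'car' because it shares three words with the query, while 'car' shares none.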
Throwing unordered sets of words inside an ontology, some of which might fail to
capture the meaning of the lexeme they are supposed to describe, might seem to be
outside the spirit of ontology engineering and formal reasoning, where definitions
have to be precise. However, as discussed earlier, the formal approach has limitations
as to semantic assignment and the approach proposed in this study has its own
merits. First, it produces accurate and fast results (see section 3.2), as is required in
an agent communication situation. Second, it establishes a notion of meaning (in
particular, a combination of formal definitions of lexemes from the ontology and
informal knowledge associated with them) which is compatible with current research
in Philosophy of Language and Cognitive Science (see section 4.1). The third and
most important contribution of such a model of semantic matching is that it brings
together ontologies with folksonomies[64], and points to an interesting research
direction in Ontology Engineering, where lexemes can achieve some semantic
grounding (i.e. relation between the words in the ontology and the objective world),
which can in turn i) compensate for the potential absence of Uniform Resource
Identifiers (URIs)[65], ii) partly solve the symbol grounding problem associated with
[62] e.g. something like what we would submit as input to Google; not to be confused with a formal query written
in SQL, SPARQL etc
[63] Relevance is a fundamental notion in Information Retrieval, which we could think of in two different ways: 1)
as a measure of how well the search engine satisfies the user's information needs (known as user relevance)
and 2) as a measure of how similar the query's content is to the document's content (topical relevance) (Croft
et al. 2010: 4). In real user environments, the first kind of relevance is crucial for a ranking algorithm to
predict, but in many other cases, including the Semantic Matcher, all we care about is topical relevance.
Therefore, in our context 'relevance' is another word for 'similarity', and this will become more evident later
when we talk about the Vector Space Model, which computes the similarity between a query and every
document in the collection.
[64] i.e. informal and imprecise models of the world that are created in a way similar to collaborative tagging (the
folk 'definition' that emerges out of the tags given by users to e.g. photographs on Flickr)
[65] URIs are used in RDF and OWL as a means to refer to a 'resource', that is an entity in the world, including a
web page, in which case a URI is a Uniform Resource Locator (URL) (Berners-Lee et al. 1998). URIs are
better than lexemes at achieving symbol grounding, although as we will see in chapter 4, they also have
limitations. In the context of first-order ontologies, it is reasonable to expect lexemes to be annotated with
URIs, and this will become useful later in this study.
the formal semantics of ontologies (see section 4.2) and iii) provide a theoretical
framework on which to judge semantic proximity as opposed to semantic identity,
which might be crucial in agent communication with heterogeneous ontologies
(section 4.3).
In what follows I will describe the Semantic Matcher architecture and draw the
analogy between my implementation and Information Retrieval methodologies. It is
only in the final chapter that I will demonstrate how this is supported by research in
other fields and what implications it has for the idea of 'formal ontologies'.
For the Semantic Matcher some of the above stages are useful, some not applicable,
while some others (e.g. parsing WordNet and SUMO ontologies) had to be added.
The rough architecture of the system is shown in the following diagram:
[Diagram labels recoverable from the figure: all_words; mappings; sumo_wordnet;
docum_; subcl_inst; SUMO parser; SUMO, MILO, Domain ontol.; Stop words list;
INPUT; Bag1, Bag3, ..., Bagn; Compute similarity; threshold]
Diagram 8: Semantic Matcher architecture
The whole purpose of the process is to take the PA's ontology, find the 'candidate'
lexemes, that is lexemes eligible for matching (see section 2.3.3), aggregate
information from different databases in order to create a 'bag of words' associated
with each one of them, then compute a bag of words for the SPA's lexeme (but in a
different way as we will see), compare the SPA's lexeme with all of the PA's candidate
lexemes for similarity and return a ranking of best possible matches. Four databases[68]
are used in total. The first three are parsed, resulting in the creation of another five
databases[69] which have been created once and don't have to be recomputed. The role
of the latter is to hold the relevant content extracted from the original databases in an
easily processable format. The next phase in the diagram is the creation of the bags
for the PA's candidate lexemes. The process ends with the Semantic Matcher
returning the ranking given the 'query' (i.e. the SPA's bag).
The system can be broken down into the following components:
1) Training the Text Acquisition Model (diagram 5).
   This is not an actual stage in Information Retrieval, but is useful for our
   purposes (see section 3.1.1.1).
2) Sense creation and Term Weighting (diagram 6).
   This is equivalent to the indexing process in IR but with significant
   differences in the subprocesses involved.
3) Query Processing (diagram 7).
   This is similar to the query process in IR but without user involvement.
[68] shown as green cylinders in the diagram
[69] yellow cylinders in the diagram
Diagram 5: Training the Text Acquisition model
Diagram 6: Text Acquisition & Text Transformation
Diagram 7: Query Processing
(Diagrams 5-7: the 3 stages of the Semantic Matcher)
These stages are described in more detail below.
The Text Acquisition Model is a set of databases created by the WordNet and SUMO
parsers taking as input the lexical resource WordNet (Miller et al. 1998)[70], a database of
SUMO-WordNet mappings[71] (Niles and Pease 2003) and a collection of 645 ontology files
(different versions of 38 ontologies that extend SUMO)[72]. WordNet (WN) is an
electronic lexical reference system for English, designed in accordance with
psycholinguistic theories of the organization of human lexical memory (Miller et al.
1988). It can be described as a general thesaurus in which words are grouped into
four parts of speech, namely noun, verb, adjective and adverb. Within each part of
speech words are organised into synsets, that is sets of cognitive synonyms, which
collectively represent a sense, or 'meaning'. For example, the noun 'bat' is a member
of 5 different synsets, two of them being {bat, chiropteran} and {squash_racket, bat},
referring to the animal and the racket respectively. Some of the synonym sets where
the noun 'home' belongs are {family, household, house, home, menage}, {base, home},
{home, nursing_home, rest_home}, referring to the social unit, location and
institution respectively. The verb 'hold' is part of the sense representations {hold,
support, sustain, hold_up} and {keep, maintain, hold}, among others. WordNet
captures a very important aspect of Lexical Semantics: the fact that relations such as
synonymy, antonymy (semantic opposition) and hypernymy (subsumption) hold
between senses and not between words[73]. For instance, we can only claim that the
noun 'head' is synonymous to the noun 'principal' with respect to one of their senses
(i.e. the meaning they share) and WordNet captures this notion of synonymy by
listing these two lexical entries in the same synset, namely {principal,
school_principal, head_teacher, head}[74]. Synsets are organised in a complex hierarchy
with relations such as IS-A and PART-OF; therefore, they can have other such sense
groupings as hyponyms, hypernyms and meronyms[75]. Synonym sets usually have a
[70] Depending on the focus, WordNet can be described as a thesaurus (because of its use of synonyms for lexical
entries) or as a lightweight ontology (because of its taxonomical information, i.e. hypernyms, hyponyms,
meronyms etc). WordNet is downloadable from http://wordnet.princeton.edu.
[71] http://sigmakee.cvs.sourceforge.net/viewvc/sigmakee/KBs/WordNetMappings/
[72] found in the SUMO repository (http://sigmakee.cvs.sourceforge.net/viewvc/sigmakee/KBs/)
[73] Since a semantic relation is a relation between meanings, and since meanings can be represented by synsets,
it is natural to think of semantic relations as pointers between synsets (Miller et al. 1993: 6)
[74] Of course, the word 'head' is a member of 32 other synsets and 'principal' belongs to another 5 synsets.
[75] e.g. 'car' is a hyponym of 'vehicle', 'vehicle' is a hypernym of 'car', 'wheel' is a meronym of 'car'
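The claim that synonymy holds between senses rather than words can be illustrated with a toy model of synsets in Python. The four synsets are copied from the examples above; the list is a tiny hand-made stand-in for WordNet, not its actual data format, and the function names are hypothetical.

```python
# A toy synset inventory built from the examples in the text.
SYNSETS = [
    frozenset({"bat", "chiropteran"}),
    frozenset({"squash_racket", "bat"}),
    frozenset({"principal", "school_principal", "head_teacher", "head"}),
    frozenset({"base", "home"}),
]


def senses(word: str) -> list[frozenset]:
    """All synsets (senses) that the word is a member of."""
    return [synset for synset in SYNSETS if word in synset]


def shared_senses(word_a: str, word_b: str) -> list[frozenset]:
    """Two words are synonymous only with respect to the senses they share."""
    return [synset for synset in senses(word_a) if word_b in synset]


print(shared_senses("head", "principal"))
print(shared_senses("bat", "head"))
```

In this toy inventory 'head' and 'principal' share exactly one sense, while 'bat' has two unrelated senses and shares none with 'head'.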
The above databases were parsed to create a set of new easily readable databases.
WordNet and SUMO-WordNet mappings were scanned with regular expressions
using the WordNet parser and the ontology files were parsed with the SUMO parser.
The second parser was more sophisticated because of two restrictions: 1) SUMO and
its sub-ontologies (MILO and the various domain ontologies) are written in SUO-KIF
with balanced parentheses. This precluded the possibility of extracting the nested
[76] For more information on the various uses of WordNet see (Fellbaum 1998).
[77] Niles and Pease (2003) call this relation 'hypernymy' (i.e. the lexeme is a hypernym of the WN synset). In this
context I avoid this term because it might mislead us into thinking that we extract hypernyms of the lexeme,
but this is not true since no relation between lexeme and synset has been established.
[78] Set inclusion (⊆) and set membership (∈) respectively
[79] http://www.ontologyportal.org/SUMOhistory/SUMO1.22.txt
For example, documentation of a lexeme in the Geography sub-ontology version 4 might be found in the
CountriesAndRegions ontology version 6.
structures with regular expressions (regex), given that languages with balanced bracketing have context-free expressivity (see (Russel et al. 1995: 656; Manna 1974)) because their syntactic rules support recursion [80], while regular expressions describe regular languages, which are lower in the Chomsky hierarchy (Chomsky and Schützenberger 1963). While regular languages can be generated and recognised (and therefore matched with patterns) using Finite State Automata (FSA), context-free strings can only be parsed with Push-Down Automata (PDA). Therefore, the SUMO parser was essentially a PDA with a counter that kept a record of how many brackets were open and how many were closed. 2) A second complication was that a small number of ontology files contained errors such as unbalanced parentheses or unbalanced quotes (i.e. ......), which either caused the program to exit or made it produce wrong predictions. Therefore the parser was designed to be robust to failures and to solve these problems in a more sophisticated way.
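The counter idea can be illustrated with a minimal sketch. It is not the SUMO parser itself: the function name `top_level_forms` and the recovery policy (skip stray closing brackets, drop an unfinished final form) are assumptions made for illustration, and escaped quotes and comments are ignored.

```python
def top_level_forms(text):
    """Return every balanced top-level parenthesised form found in `text`."""
    depth = 0          # number of currently open parentheses
    start = None       # index where the current top-level form began
    in_string = False  # True while inside a double-quoted string
    forms = []
    for i, ch in enumerate(text):
        if ch == '"':
            in_string = not in_string
        elif in_string:
            continue  # brackets inside quoted documentation are ignored
        elif ch == '(':
            if depth == 0:
                start = i
            depth += 1
        elif ch == ')':
            if depth == 0:
                continue  # stray closing bracket: skip it instead of failing
            depth -= 1
            if depth == 0:
                forms.append(text[start:i + 1])
    # A form that is still open at the end of the input is silently dropped.
    return forms


sample = '(subclass Car Vehicle) ) (documentation Car "a (road) vehicle") (instance'
print(top_level_forms(sample))
```

The depth counter plays the role of the stack of a push-down automaton; a single integer suffices because only one kind of bracket is tracked.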
The databases created are summarised in Table 3:
80 This is true of the SUO-KIF syntax (see Pease 2009).
            FULL NAME    INFORMATION CONTAINED
all_words   All words    A set of all words found in WordNet
Table 3: Databases created by WordNet parser and SUMO parser
These databases form what I call the Text Acquisition Model, that is, a collection of resources (later read as lookup tables; dictionaries) that will determine what information can enter the bag for each lexeme in the PA's ontology. These files are created once and only have to be recomputed if the ontologies or the WordNet version are updated. Their format is very easy to process, so as to decrease the computational time of the next stage.
This module takes as input the PA's ontology and the databases created in the previous stage and returns a collection of bags of words, each one of which represents the intensional meaning ('sense') of a candidate lexeme. Each bag contains weighted words, that is, words with a coefficient of importance.
Sense creation is my cover term for Text Acquisition and Text Transformation, the two subprocesses of creating the bags of words. Both of these phases are actual components of an Information Retrieval task. The main difference is that in the Semantic Matcher text is not acquired by parsing HTML pages and extracting their content, but by aggregating information from different databases [81]. Furthermore, in the Semantic Matcher acquisition and transformation take place many times for every lexeme, since bags are filled incrementally. A comparison of document creation in IR and sense creation here can be seen in the following two diagrams:
[Diagram: the raw text 'And where's that soggy plain? In Spain! In Spain! The rain in Spain stays mainly in the plain!' is turned into a list of tokens, then into a stopword-free list (where's, soggy, plain, spain, spain, rain, spain, stays, mainly, plain) and finally into a stemmed list (where, soggy, plain, spain, spain, rain, spain, stay, mainly, plain).]
Diagram 8: Document creation in Information Retrieval
81 An exception is computing the SPA's sense. We will come to this later.
[Diagram: text acquired for the lexeme 'VisaCard' (<raw text>) is converted into <tokenised text> and then into a <stopword-free multiset> such as card, bank, money, card, visa, visa, financial, card, currency.]
Diagram 9: Sense creation in the Semantic Matcher
Before we see what the sense creation algorithm looks like, we need to discuss how text is transformed into words after it is acquired from the databases. This process is generally the same as the one used in Information Retrieval (i.e. it involves tokenisation, stopping and stemming) (Manning 2008; Croft et al. 2010), with some additions that I discuss below.
removed beforehand [82]. An extra feature that has been added, however, is resolving word sequences from strings in camel case or with underscores (e.g. CreditCard, credit_card → credit card), which are very common in SUMO and other ontologies [83].
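A minimal sketch of this kind of identifier splitting is given below. The function name `split_identifier` and the exact regular expression are assumptions made for illustration, not the tokeniser used in the Semantic Matcher.

```python
import re


def split_identifier(name):
    """Split a camel-case or underscore-separated identifier into lowercase words."""
    spaced = name.replace('_', ' ')
    # Insert a space wherever a lowercase letter or digit is followed by a capital.
    spaced = re.sub(r'(?<=[a-z0-9])(?=[A-Z])', ' ', spaced)
    return spaced.lower().split()


print(split_identifier('CreditCard'))
print(split_identifier('credit_card'))
```

Acronyms written in consecutive capitals (e.g. 'USDollar') would need an additional rule, which is left out of this sketch.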
The next stage is stopping, where function words (e.g. 'and', 'the', 'of' etc.), which are considered semantically vacuous, are prevented from entering any bag. This is done in order to reduce storage space and make search during the query processing phase more efficient. Another reason is that such words are found in almost every document ('bag of words'), so they do not help us discriminate between the different options to rank. As we will see later, rare words (i.e. words that appear in only a few documents) are more important than common words; terms as frequently used as 'the' are negligible. In the Semantic Matcher the list of tokens generated after tokenisation is checked against a stopword list [84] and blacklisted words are removed. After that, the tokens that remain in the list go through a stemming function, as described below.
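Stopping amounts to a membership test against a fixed set. The sketch below uses a tiny, purely illustrative stopword set (`STOPWORDS` is a placeholder, not the Pedersen list mentioned in the text) and a hypothetical function name.

```python
# Illustrative placeholder only; a real list would contain several hundred entries.
STOPWORDS = frozenset({'and', 'the', 'of', 'in', 'a', 'is'})


def remove_stopwords(tokens, stopwords=STOPWORDS):
    """Keep only the tokens that are not on the stopword list."""
    return [t for t in tokens if t.lower() not in stopwords]


print(remove_stopwords(['the', 'rain', 'in', 'spain']))
```

Storing the stopwords in a set rather than a list keeps each lookup at constant expected time, which matters when every token of every bag is checked.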
Stemming (also known as conflation) is the morphological analysis performed on a word (e.g. 'providing') in an attempt to reduce it to its stem (e.g. 'provid') or base form (e.g. 'provide'). The reason for incorporating a stemmer into a search engine is that words of the same class but with different inflections (e.g. drop, dropping, dropped) are essentially the same word appearing more than once but disguised under morphological variation; therefore, they should be conflated. If no stemming were involved, the document would be inaccurately represented, which would in turn affect term weighting. Generally speaking, words that appear many times are more important or more 'representative' of the document than words that appear less
82 The only reason I use a lightweight tokeniser is that anything more complex would be too ambitious for the timescale of this project. It would be interesting to see in the future whether a more sophisticated text segmentation module would render the Semantic Matcher more effective and whether this would have any noticeable negative effect on computational time.
83 One thing that can be implemented in the future is abbreviation expansion to produce meaningful tokens out of
84 I use Pedersen's WordNet stopword list (available on http://www.d.umn.edu/~tpederse/Group01/WordNet/wordnetstoplist.html), to which I made some additions.
often. Without stemming, the picture of the document is altered. Stemmers typically deal with inflection, that is, morphological variation that does not change the word class (i.e. the part of speech). Examples of inflectional suffixation are pluralisation (-s, -ies, -es etc.), past tense (-ed, -ied), progressive form (-ing) and others. Derivational morphology (e.g. -ion) creates a word that has a different class (e.g. create → creation) and consequently a different meaning. This means that words with derivational variation are not supposed to be conflated. For the Semantic Matcher I implemented a slightly altered version of the Krovetz stemming algorithm (Krovetz 1993), which strips inflectional suffixes and brings the word to its root form. The main advantage of the Krovetz stemmer over others (e.g. the Lovins Stemmer (Lovins 1968) or the Porter Stemmer (Porter 1980)) is that it is based on dictionary lookup [85]; therefore it produces actual words rather than stems (e.g. 'describe' instead of 'describ' from the word 'describing') [86]. For the same reason it can also handle irregular plurals or past tenses. One of its disadvantages is that, since it performs deeper analysis and might include a number of dictionary lookups for a single word, it can slow down the indexing process in large-scale search engines. However, in the context of the Semantic Matcher, where only a few thousand words are stemmed, this was not an issue.
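The dictionary-lookup idea can be sketched as follows. This is a toy illustration in the spirit of a lookup-based stemmer, not the Krovetz algorithm or the altered version used in the Semantic Matcher: the lexicon, the rule table `RULES` and the function name `stem` are all assumptions.

```python
# Toy dictionary; a real stemmer would consult a full lexicon.
LEXICON = {'describe', 'drop', 'provide', 'card', 'country'}

# (suffix to strip, endings to try appending), most specific suffix first.
RULES = [('ies', ['y']), ('ing', ['', 'e']), ('ed', ['', 'e']), ('es', ['']), ('s', [''])]


def stem(word, lexicon=LEXICON):
    """Strip an inflectional suffix if doing so yields a word found in the lexicon."""
    word = word.lower()
    if word in lexicon:
        return word
    for suffix, endings in RULES:
        if word.endswith(suffix) and len(word) > len(suffix) + 1:
            base = word[:-len(suffix)]
            candidates = [base + e for e in endings]
            # Undo consonant doubling, e.g. 'dropp' -> 'drop'.
            if len(base) > 2 and base[-1] == base[-2]:
                candidates.append(base[:-1])
            for cand in candidates:
                if cand in lexicon:
                    return cand
    return word  # no rule produced a dictionary word: leave the word unchanged


for w in ['describing', 'dropping', 'dropped', 'countries', 'creation']:
    print(w, '->', stem(w))
```

Because a candidate is accepted only when it appears in the lexicon, the output is always an actual word, and derivational forms such as 'creation' are left untouched.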
As described above, the Semantic Matcher creates bags of words for lexemes in two phases: text acquisition, where some text relevant to the lexeme's meaning is extracted from the databases, and text transformation, where this text is converted into words (called 'index terms' in IR). The process of filling the bags with words is roughly described by the SenseCreation algorithm:
85 e.g. it removes a suffix and adds letters while repeatedly checking against a dictionary until the base form is found or no more rules apply.
86 For a comparison of different stemmers see (Fuller and Zobel 1998).
SenseCreation algorithm
Transform the lexeme and throw the resulting word(s) in the bag,
Transform all the strings extracted and throw them in the bag,
Extract the i.d. of the most frequently occurring synset for this word,
Transform all the strings extracted and throw them in the bag
Transform all the strings extracted and throw them in the bag,
If there are comments in the ontology that contain the lexeme, then:
Now that the senses for our candidate lexemes have been constructed, we can proceed to term weighting, that is, assigning to every word in every bag a number which measures how well that word represents its bag. This will be essential for the retrieval model used in the next stage (section 3.1.1.3) and can be precomputed before the agents start to communicate.
et al. 2010: 22) explain, there are many variations of these weights, but they are all based on a combination of the frequency or count of index term occurrences in a document (the term frequency, or tf) [i.e. number of times a word occurs in a bag] and the frequency of index term occurrences over the entire collection of documents (inverse document frequency, or idf) [i.e. number of bags that contain this word] (authors' emphasis). Below I will briefly explain the intuition behind the version I am using in the Semantic Matcher (adapted from (Lavrenko 2009)).
Imagine that we have a collection of terms (i.e. all words in all bags) and for every bag we want to determine the weight of each one of these terms. This will be done on the basis of 6 observations about how well words represent their bag:
1) Presence or absence of a word is the main criterion. If a word is not in the bag, its weight should be 0; otherwise, it should be greater than 0 (e.g. 1).
2) Frequency of a word in the bag might indicate that this word is a 'keyword' (i.e. indicative of the bag's topic). Therefore, more frequent words should be given higher weights. Observations 1 and 2 explain the role of the numerator in formulas 1.1 and 1.2.
3) If a word is often repeated in a bag, it may only be because the bag is too big and not because the word is a keyword. Thus, all other things being equal, words in longer bags should have smaller weights. This explains the existence of |D| (or |PB_c|) in the denominator.
4) Rare words (i.e. words that do not appear in many bags) can differentiate between bags better and therefore carry more meaning (ibid). This is what Spärck Jones (Spärck Jones 1972) calls 'term specificity', the rationale behind inverse document frequency: the fewer the bags that contain this word, the more powerful the word is in its own bag. This explains the existence of the denominator df_w in the formulas below.
5) The first occurrence of a word in a bag is more important than its subsequent occurrences. For example, if we see the word 'prosopagnosia' in a text, it is very likely that this text talks about prosopagnosia or perception disorders or something similar, but if we see the word again, this does not add as much meaning to its bag as the first time it was encountered. Hence, we need some correction so that subsequent encounters of the term are less and less important. This explains the addition of tf_{w,D} (or tf_{w,PB_c}) in the denominator.
6) In longer documents, repetitions are more important; therefore, the longer the bag, the weaker the above correction should be; hence the existence of the denominator avg.doc.len (or avg.bag.len).
\[
\mathit{tfidf}_{w,D} = \frac{tf_{w,D}}{tf_{w,D} + \dfrac{k\,|D|}{\mathit{avg.doc.len}}} \times \log\frac{|C|}{df_w}
\]
where:
w is a word,
D is a document,
tfidf_{w,D} is the term-frequency/inverse-document-frequency weight of the word in the document,
tf_{w,D} is the frequency of the word in the document,
k is a constant (usually set to small values, e.g. 0.1),
|D| is the length of the document (i.e. how many words it contains),
avg.doc.len is the average document length in the collection,
|C| is the length of the collection (i.e. how many documents are available to the search engine),
df_w is the document frequency of the word (i.e. how many documents from the collection contain this word)
Formula 1.1: tf-idf term weighting (Information Retrieval notation)
\[
\mathit{tfidf}_{w,PB_c} = \frac{tf_{w,PB_c}}{tf_{w,PB_c} + \dfrac{k\,|PB_c|}{\mathit{avg.bag.len}}} \times \log\frac{|C|}{df_w}
\]
where:
w is a word,
PB_c is the Planning Agent's bag of words for a particular lexeme,
tfidf_{w,PB_c} is the term-frequency/inverse-document-frequency weight of the word in the PA's bag for this particular lexeme,
tf_{w,PB_c} is the frequency of the word in the PA's bag for this particular lexeme,
k is a constant (usually set to small values, e.g. 0.1),
|PB_c| is the length of the PA's bag for this lexeme,
avg.bag.len is the average bag length in the PA's ontology,
|C| is the length of the collection (i.e. how many bags there are in the PA's ontology),
df_w is the document frequency of the word (i.e. how many bags from the collection contain this word)
Formula 1.2: tf-idf term weighting (Semantic Matcher notation)
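The weighting scheme can be sketched in a few lines. This is an illustration under stated assumptions rather than the Semantic Matcher's code: the function name `tfidf_weights`, the toy bags, and the use of the natural logarithm (the base is not specified in the text) are all assumptions.

```python
import math
from collections import Counter


def tfidf_weights(bags, k=0.1):
    """bags: dict mapping a lexeme to its list of words.
    Returns a dict mapping each lexeme to {word: tf-idf weight}."""
    n_bags = len(bags)                                        # |C|
    avg_len = sum(len(b) for b in bags.values()) / n_bags     # avg.bag.len
    df = Counter(w for b in bags.values() for w in set(b))    # df_w
    weights = {}
    for lexeme, bag in bags.items():
        tf = Counter(bag)                                     # tf_{w,PB_c}
        weights[lexeme] = {
            w: f / (f + k * len(bag) / avg_len) * math.log(n_bags / df[w])
            for w, f in tf.items()
        }
    return weights


toy_bags = {                      # hypothetical bags, for illustration only
    'D1': ['go', 'sun', 'sun'],
    'D2': ['California', 'battery', 'sun'],
    'D3': ['yoghurt'],
}
toy_weights = tfidf_weights(toy_bags)
print(toy_weights['D1'])
```

Words absent from a bag simply have no entry, which corresponds to a weight of 0, and a word that occurs in every bag receives weight 0 because log(|C| / df_w) = log 1 = 0.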
The next stage in our system's architecture (highlighted in diagram 7 above) is query processing (also known as query execution). This phase occurs 'online', that is, during agent interaction: when the PA receives a surprising lexeme from the SPA, plan execution fails. Then the PA sends the relevant surprising queries to ORS, which identifies a word in the query which is not in the PA's ontology and asks the Semantic Matcher to search through all of the PA's candidate lexemes for the one which is most likely to be synonymous with the surprising lexeme. As can be seen in diagram 7, the input to this stage is the PA's bags of weighted words and the SPA's bag. However, what we have not explained yet is how the SPA's lexeme acquires a sense, since the Sense Creation and Term Weighting phase has already been completed.
The Semantic Matcher is part of ORS, which is plugged into the Planning Agent to facilitate its interaction with Service Providing Agents. At the moment, ORS supports ontologies written in KIF, while its newly created semantic matching module is suitable for PAs that use SUMO subontologies in particular [87] (written in SUO-KIF). However, as mentioned in the previous two chapters, ORS cannot make any assumptions about the SPA's ontology, nor does it have access to any parts of it unless the SPA itself is willing to reveal part of its representation during the interaction. All that ORS knows is that SPAs can be queried (and this is done using Prolog in the current implementation). This means that if the Semantic Matcher has to create a sense for the SPA's lexeme, it cannot follow the same process as the one described in the Sense Creation and Term Weighting stage, even if the SPA happens to have a SUMO ontology. Moreover, given the limited access to the SPA's ontology, we are left without context from which to build a mental representation (sense) for the lexeme. It seems to me that if powerful semantic matching systems are to be constructed in the future, they should be provided with some clues as to the lexeme's meaning, which the system can enrich with synonyms and other related words in order to achieve sense creation. Some ideas are:
87 Even though ORS is not expected to know parts of the PA's ontology beforehand, this is not an ad hoc design decision given how widespread SUMO is. That means that the version of ORS that supports semantic matching is in fact a SUMO plugin where the knowledge of predicates such as 'subclass', 'instance' or 'documentation' can be safely presupposed since they occur in all the SUMO ontologies. Furthermore, it is not hard to imagine a system which actually guesses the words used for such predicates. For example, a system can try words that have a very small Levenshtein distance (Levenshtein 1965) to 'subclass', 'isa', 'is-a', 'subClassOf' or 'instance', 'type' and others.
• Engineering practice that I suggest in the last chapter: engineers can tag their lexemes with keywords or use URLs (URIs that point to retrievable digital resources) which direct the system to collaboratively created 'tag clouds' (in fact 'broad folksonomies' (see section 4.2)).
• Some context is made available to the semantic matching system through the use of URLs pointing to natural language definitions, related texts, WordNet synsets (or similar groupings of 'senses' in different thesauri) or perhaps other formal ontologies (e.g. OWL, RDF(S) and others) [88].
In the Semantic Matcher, I have restricted the implementation to one of these options, namely lexemes being annotated with URLs which point to natural language text [89]. When the ORS Diagnosis System determines that there is a semantic mismatch, it queries the SPA to obtain the URL for this particular lexeme [90]. The URL is then given to the Semantic Matcher, which extracts the content from the web page (i.e. removes 'noise' such as scripts, tags, advertisements etc.) by using regular expressions and the Tag Plateau optimisation algorithm (Finn et al. 2001). This algorithm is based on the observation that the content of a web page contains fewer HTML tags than other parts of the page (e.g. advertisements). We represent the page as a list of 'tokens' (which are either tags or non-tags) and try to find the token positions i and j that maximise the following objective function:
\[
\sum_{n=0}^{i-1} b_n \;+\; \sum_{n=i}^{j} (1 - b_n) \;+\; \sum_{n=j+1}^{N-1} b_n
\]
where:
b_n is the nth token in the web page; b_n = 1 if the nth token is a tag and b_n = 0 otherwise,
i and j are the two values (token positions in the page) which are assumed to enclose the page's content if they maximise the above objective function,
N is the number of tokens in the web page
Objective function for finding tag plateau
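A brute-force maximisation of this objective can be sketched as follows. The function name `tag_plateau` and the 0/1 input encoding are assumptions; the sketch evaluates every pair (i, j) using prefix sums and makes no claim about how the Semantic Matcher performs the search.

```python
def tag_plateau(is_tag):
    """is_tag: list of 0/1 flags, 1 if the n-th token is a tag.
    Returns the pair (i, j) that maximises
    tags before i + non-tags in i..j + tags after j."""
    n = len(is_tag)
    prefix = [0]
    for b in is_tag:
        prefix.append(prefix[-1] + b)   # prefix[m] = number of tags in tokens 0..m-1
    total = prefix[n]
    best, best_ij = -1, (0, 0)
    for i in range(n):
        for j in range(i, n):
            tags_inside = prefix[j + 1] - prefix[i]
            score = prefix[i] + (j - i + 1 - tags_inside) + (total - prefix[j + 1])
            if score > best:
                best, best_ij = score, (i, j)
    return best_ij


# Three tags, a mostly textual stretch, then two closing tags.
flags = [1, 1, 1, 0, 0, 1, 0, 0, 0, 1, 1]
print(tag_plateau(flags))
```

The double loop is quadratic in the number of tokens, which is acceptable for a single page; on ties the first maximising pair is returned.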
This technique is used in Information Retrieval in the Text Acquisition phase, before the creation of documents. Here I use it in a somewhat unorthodox way, in order to extract text for the 'query' (i.e. the SPA's bag of words) [91]. Once the content is extracted, the list of words acquired undergoes stopping and stemming, and the resulting bag is the sense of the SPA's lexeme [92]. As we will see later, term weighting is very simple in queries (just the frequency of the word in the query, since there is no notion of a 'collection' of queries or an 'average' query length etc.), so it does not need to be stored at this stage.
Now it is time for query processing: the PA's bags and the SPA's bag enter the compute_similarity module, which returns a ranking of the best candidate lexemes. Our ranking algorithm is based on the Vector Space Model (Salton et al. 1975; Salton and McGill 1983) (see (Raghavan and Wong 1986) and (Lee et al. 1997) for an overview), a retrieval model which treats documents and queries as vectors in a high-dimensional space, where each word in the collection is a dimension. Each vector has a particular position in the Vector Space according to what values it gets in each dimension. These values are nothing but term weights (see previous section). For
91 Of course, in real IR situations, queries are input by the user and don't have to be created (although they can be expanded (Manning et al. 2008, chapter 10)).
92 In this project I did not attempt to expand the sense with synonyms or hypernyms from WordNet, but this is something that can be implemented in the future and might produce even better results.
example, suppose that there are 5 words in the collection ('go', 'California', 'sun', 'yoghurt' and 'battery') and 3 documents D1 = {go, sun, sun}, D2 = {California, battery, sun} and D3 = {yoghurt}. The vectors of the three documents would be 5-tuples (i.e. would have 5 coordinates in the space), which, given a simplistic tf weighting scheme, would look like this: D1 = (1, 0, 2, 0, 0), D2 = (0, 1, 1, 0, 1), D3 = (0, 0, 0, 1, 0) [93]. With the more sophisticated tf-idf weighting scheme that we described, the vectors would have the following coordinates: D1 = (0.61934, 0, 0.65675, 0, 0), D2 = (0, 0.61934, 0.61934, 0, 0.35261), D3 = (0, 0, 0, 0.67025, 0). Diagram 10 shows how two documents and one query could be positioned in a 3-dimensional space. Diagram 11 illustrates the same in the context of the Semantic Matcher.
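The simple tf vectors of this example can be reproduced with a short sketch. The names `VOCAB` and `tf_vector` are assumptions, and the vocabulary order is the order in which the five words are listed in the text.

```python
from collections import Counter

# One dimension per word in the collection, in the order given in the text.
VOCAB = ['go', 'California', 'sun', 'yoghurt', 'battery']


def tf_vector(doc, vocab=VOCAB):
    """Map a document (list of words) to its raw term-frequency vector."""
    counts = Counter(doc)
    return tuple(counts[w] for w in vocab)


D1 = tf_vector(['go', 'sun', 'sun'])
D2 = tf_vector(['California', 'battery', 'sun'])
D3 = tf_vector(['yoghurt'])
print(D1, D2, D3)
```

Each vector has exactly as many coordinates as there are words in the collection, so all three documents are 5-tuples.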
93 This is the weighting scheme used for queries, as shown in formula ...
Diagram 10: Vector Space Model with 3 dimensions and 3 vectors (Information Retrieval notation)
Diagram 11: Vector Space Model with 3 dimensions and 3 vectors (Semantic Matcher notation)
Similarity between a query and a document is determined by their proximity in the multidimensional space (i.e. the smaller the distance between them, the more similar they are) and can be measured in many ways (e.g. cosine coefficient, Dice product, Jaccard coefficient etc.) [94]. In the Semantic Matcher I use the tf-idf weighted sum, which is shown in Formulas 2.1 and 2.2. Formulas 3.1 and 3.2 show the same in simpler notation.
\[
s(D, Q) = \sum_{i=1}^{|Q|} \sum_{j=1}^{|D|} tf_{w_i,Q} \cdot \underbrace{\mathit{tfidf}_{w_j,D}}_{\text{precomputed}}
\]
where:
s(D, Q) is the similarity of the document to the query,
tf_{w_i,Q} is the frequency of word i in the query,
tfidf_{w_j,D} is the term-frequency/inverse-document-frequency weight of word j in the document,
|Q| is the query length (i.e. how many words it contains),
|D| is the document length,
Formula 2.1: tf-idf Weighted Sum (Information Retrieval notation)
94 For an introduction to these methods see (Hillenmeyer 2005).
\[
s(PB_c, SB) = \sum_{i=1}^{|SB|} \sum_{j=1}^{|PB_c|} tf_{w_i,SB} \cdot \underbrace{\mathit{tfidf}_{w_j,PB_c}}_{\text{precomputed}}
\]
where:
s(PB_c, SB) is the similarity of the Planning Agent's bag for a particular lexeme to the Service Providing Agent's bag,
tf_{w_i,SB} is the frequency of word i in the Service Providing Agent's bag,
tfidf_{w_j,PB_c} is the term-frequency/inverse-document-frequency weight of word j in the Planning Agent's bag for a particular lexeme,
|SB| is the length of the Service Providing Agent's bag (i.e. how many words it contains),
|PB_c| is the length of the Planning Agent's bag for a particular lexeme,
Formula 2.2: tf-idf Weighted Sum (Semantic Matcher notation)
where:
s(D, Q) is the similarity of the document to the query,
tf_{w_i,Q} is the frequency of word i in the query,
tfidf_{w_j,D} is the term-frequency/inverse-document-frequency weight of word j in the document,
|Q| is the query length (i.e. how many words it contains),
|D| is the document length,
Formula 3.1: tf-idf Weighted Sum (easier Information Retrieval notation)
where:
s(PB_c, SB) is the similarity of the Planning Agent's bag for a particular lexeme to the Service Providing Agent's bag,
tf_{w_i,SB} is the frequency of word i in the Service Providing Agent's bag,
tfidf_{w_j,PB_c} is the term-frequency/inverse-document-frequency weight of word j in the Planning Agent's bag for a particular lexeme,
|SB| is the length of the Service Providing Agent's bag (i.e. how many words it contains),
|PB_c| is the length of the Planning Agent's bag for a particular lexeme,
Formula 3.2: tf-idf Weighted Sum (easier Semantic Matcher notation)
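The ranking step can be sketched as below, under the assumption that the weighted sum only accumulates over words that occur in both bags (a query word missing from a candidate's bag contributes 0). The function names and all weights in the toy data are hypothetical.

```python
from collections import Counter


def similarity(query_words, weighted_bag):
    """tf-idf weighted sum: query term frequency times the precomputed bag weight."""
    tf_q = Counter(query_words)
    return sum(tf * weighted_bag.get(w, 0.0) for w, tf in tf_q.items())


def rank(query_words, weighted_bags):
    """Return candidate lexemes ordered from most to least similar."""
    scored = [(similarity(query_words, bag), lexeme)
              for lexeme, bag in weighted_bags.items()]
    return [lexeme for score, lexeme in sorted(scored, key=lambda p: (-p[0], p[1]))]


pa_bags = {   # hypothetical precomputed weights, for illustration only
    'Tire': {'tire': 0.9, 'wheel': 0.4, 'rubber': 0.5},
    'Wheel': {'wheel': 0.8, 'axle': 0.6},
    'Motorcycle': {'engine': 0.7, 'wheel': 0.2},
}
spa_bag = ['vehicle', 'tire', 'rubber', 'wheel', 'tire']
print(rank(spa_bag, pa_bags))
```

Because the candidate weights are precomputed, scoring one candidate costs one dictionary lookup per distinct query word.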
The VSM, as opposed to other retrieval models [95], is distance-based, and this proves useful in a semantic matching system as it can address the issue of similarity of meanings, which is not possible with a strictly formal approach to ontologies (see section 4.3).
Having discussed the Semantic Matcher architecture and the intuitions behind it, we can proceed to the system's evaluation, which is the subject of our next section.
As mentioned in the introduction, the implementation of the Semantic Matcher tests the hypothesis that integrating folksonomies (seen as bags of words here) into formal ontologies allows for effective and efficient matching where using ontologies alone would have caused failed or poor matching. Below we will see to what extent our hypothesis is confirmed.
95 Other retrieval models are the Boolean Model, Region Models, the Probabilistic Model, the 2-Poisson Model etc.; see (Hiemstra 2009; Croft et al. 2010: 237-300).
3.2.1 Effectiveness
In this section I describe how the Semantic Matcher was evaluated and how the results are to be interpreted. The evaluation process is inspired by techniques in Information Retrieval but had to diverge from mainstream methodologies due to some special restrictions that I will explain. The SUMO repository of ontologies [96] served as a testbed for objective evaluation. Some degree of subjectivity (in the form of assumptions) was also necessary, but I will try to show that this does not affect the quality of the results.
96 http://sigmakee.cvs.sourceforge.net/viewvc/sigmakee/KBs/
97 Relevance judgements should be done either by the people who asked the questions or by independent judges who have been instructed in how to determine relevance for the application being evaluated. (Croft et al. 2010: 308)
98 For example, precision at rank 5 means: Among the top 5 documents in the ranking, how many were relevant? (the ideal would be 5, in which case we have 100% precision) and recall at rank 5 means: Among all the relevant documents in the test collections, how many were retrieved in the top 5 ranks? (Obviously, here it is impossible to have 100% recall unless the number of relevant documents in the whole collection is at most 5).
example, the lexeme 'Corn' from the SUMO Mid-level ontology version 33 [99] was changed to 'MaizeGrain' in version 34. Since this is an official renaming of the concept, we can be sure that these two lexemes are synonyms; therefore we have excellent relevance judgements if we use the bag created for 'MaizeGrain' as a query and all the bags corresponding to lexemes in version 33 of the ontology as documents: 'Corn' is relevant to 'MaizeGrain'. The second requirement is more difficult to meet, as it is hard to find ontology versions where more than one renaming has occurred. In other words, while we have a set of documents, we cannot have a set of queries with relevance judgements for this particular set of documents. Usually we have one or at most two queries. Another constraint is that for every query there is only one relevant document. For example, for 'MaizeGrain' only 'Corn' is relevant [100]. This means that evaluation measures like recall and precision would be meaningless: for example, if 'Corn' is returned in rank 1, we have 100% recall and 100% precision at rank 1, 100% recall and 50% precision at rank 2 and so on; if it is returned in rank 2, we have 0% recall and 0% precision at rank 1, and 100% recall and 50% precision at rank 2. Even if we find such numbers useful, one query is not enough to assess the quality of the Semantic Matcher. If we take another pair of synonyms (again officially renamed concepts), say, 'TelevisionReceiver' from the Communications ontology version 4 and 'Television' in version 3, we have a different set of documents now (since we are searching through lexemes of a different ontology). All this shows that IR techniques cannot be applied to the letter. What I did instead was take all the guaranteed synonyms from different SUMO ontologies [101], treat the bag of the new
99 http://sigmakee.cvs.sourceforge.net/viewvc/sigmakee/KBs/Midlevelontology.kif?revision=1.33
100 Relevance judgements are usually binary in Information Retrieval (i.e. a document is either relevant or non-relevant; nothing in between). Similarly, in this project, I try to find the one and only exact synonym for the surprising lexeme from lexemes in the Planning Agent's ontology. Sense similarity is only used as a means to predict identity. As we will see in the last chapter, it will be interesting to extend agent communication systems like ORS to handle semantic similarity as opposed to identity. For example, if the PA does not have an exact synonym in its ontology, can it achieve its goals or something close to its goals with similar predicates, classes or individuals? (see section 4.3)
101 I had already excluded cases of correction, in which both terms appeared when some old names had been mistakenly left in the ontology. I also excluded cases where a name was different in the next ontology version but there was no official renaming. This is reasonable since, for instance, the different assertions (subclass Investor CognitiveAgent) and (subclass Investor SocialRole) [from FinancialOntology versions 2 and 3 respectively] do not mean that 'CognitiveAgent' is synonymous with 'SocialRole'.
lexeme as a query and the bags of the lexemes in the ontology prior to the renaming as documents; then see how well the Semantic Matcher can predict synonymy.
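The precision and recall figures quoted for the single-relevant-document case can be checked with a small sketch. The function name `precision_recall_at` is an assumption, and the lexemes 'Wheat' and 'Barley' are hypothetical fillers for the non-relevant ranks.

```python
def precision_recall_at(k, ranking, relevant):
    """Precision and recall over the top k items of a ranking."""
    hits = sum(1 for item in ranking[:k] if item in relevant)
    return hits / k, hits / len(relevant)


relevant = {'Corn'}                         # the only relevant 'document'
corn_first = ['Corn', 'Wheat', 'Barley']    # 'Corn' returned in rank 1
corn_second = ['Wheat', 'Corn', 'Barley']   # 'Corn' returned in rank 2

for ranking in (corn_first, corn_second):
    print([precision_recall_at(k, ranking, relevant) for k in (1, 2)])
```

With only one relevant item, precision at rank k can never exceed 1/k once the item has been found, which is why a rank-based reading of the results is more informative here.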
Below are all the test collections available for evaluation and the system's prediction. The output of the evaluation module can be seen in the appendix.
102 For example, in the Mid-level ontologies that I use for my agent communication scenario (section 3.3) the engine has to search through more than 1,000 different lexemes.
103 ideally in rank 1, but even if they come in rank 2 or 3 (or slightly lower depending on our threshold), it will not be disastrous because the PA can try all of them one by one. In the ORS scenario presented in section 3.3, both of the semantic mismatches that arise have their synonym returned in rank 1 and, for the moment, the agent does not try any other lexemes in a loop even if failure did occur. This is easy to implement in the future but not practically necessary for our demonstration.
14 version87
Test collection 15   RepublicOfGeorgia    Economy.kif version 18             RepublicOfGeorgia = Georgia        1
Test collection 16   RiceGrain            pictureList.kif version 23         RiceGrain = Rice                   2
Test collection 17   ScientificLaw        engineering.kif version 4          ScientificLaw = Law                3
Test collection 18   TelevisionReceiver   Communications.kif version 3       TelevisionReceiver = Television    1
Test collection 19   TurkeyBird           Economy.kif version 25             TurkeyBird = Turkey                1
Test collection 20   VehicleTire          Midlevelontology.kif version 86    VehicleTire = Tire                 1
Test collection 21   WaterVehicle         Midlevelontology.kif version 13    WaterVehicle = Watercraft          1
Table 4: Test collections for the evaluation of the Semantic Matcher
The results in the last column can be represented in the following pie chart, which shows what proportion of the 21 correct matches was returned in ranks 1 to 3 or in lower ranks. As we can see, the majority of the correct lexemes are in rank 1, approximately 1/5 of them come in rank 2 and an equal number come in rank 3. A small portion of them comes in a lower rank.
[Pie chart: Rank 1 57%, Rank 2 19%, Rank 3 19%, Lower ranks 5%]
How are we to interpret these results? This is where some qualitative analysis has to come into play, since the Semantic Matcher output has to be evaluated in the context of ORS. In an agent communication environment it is important that the right lexeme is in a 'high enough' rank. It does not need to be rank 1, since the Planning Agent can try the second option and then the third and so on, if its plan fails [104]. But how high is 'high enough'? Since Planning Agents can perform a number of ontology repairs (if we assume dissimilarity between their ontologies) before they achieve their goals [105], it would be acceptable to set something like rank 3 as a threshold for semantic matching. In practice this means that if the PA has to try three different lexemes before its plan succeeds, it would not be disastrous. In fact, even a more permissive threshold might be possible since the PA does not need to replan [106]; it only needs to try one word after the other, which would take perhaps less than 1 second together with the SPA's responses. Given this reasonable assumption, the graph above tells us that 95% of the 'right' lexemes come in a high enough rank and, among them, a big
104 not currently implemented
105 which is reasonable since the PA and SPAs have different ontologies.
106 although in the current implementation it does; this can easily change in the future
portion comes in rank 1, which is ideal. What is also interesting to see is that results in the first ranks contain words with similar meanings to the 'correct' one. For example, the top 6 results for 'VehicleTire' are 1) Tire (the synonymous one), 2) VehicleWheel, 3) Wheel, 4) MaterialHandlingEquipment, 5) ArtilleryGun, 6) Motorcycle. The full lists can be seen in the Appendix.
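The percentages discussed above can be tied back to the 21 test collections with a short worked calculation. The per-rank counts below are not stated in the text; they are inferred here as the only whole numbers out of 21 that round to 57%, 19%, 19% and 5%, and should be read as an assumption.

```python
# Counts inferred from the reported percentages (assumption, see lead-in).
rank_counts = {'rank 1': 12, 'rank 2': 4, 'rank 3': 4, 'lower': 1}

total = sum(rank_counts.values())
shares = {label: round(100 * n / total) for label, n in rank_counts.items()}

# Share of correct lexemes returned at or above the rank-3 threshold.
within_threshold = round(100 * (total - rank_counts['lower']) / total)

print(total, shares, within_threshold)
```

Under these counts, 20 of the 21 correct lexemes fall within the rank-3 threshold, which is where the 95% figure comes from.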
Another point where some subjectivity is involved is the choice of URIs [107] which annotate the SPA's lexemes. As I explained in section 3.1.1.3, we make this assumption about how ontology engineers might help to disambiguate the words they use. This is something that already happens in RDF(S) and OWL [108] ontologies, so, I believe, it is only a matter of time until this becomes a common practice for first-order ontologies, like the ones derived from SUMO. The URLs were chosen in a systematic way but with some 'filtering': the lexeme which served as a 'query' (e.g. 'DrinkingCup') was tokenised (i.e. 'Drinking Cup') and typed into Google; the URL was chosen from the top 20 results. This process could not be automated because we had to make sure that the web page is appropriate (i.e. static (HTML) as opposed to dynamic (e.g. php), with enough text (typically one paragraph or more), and with a text that somehow describes the meaning of the lexeme [109]).
The above discussion suggests the Semantic Matcher is successful if, given appropriate URLs for 'query' lexemes [110], it can return the right lexeme in a high enough rank. What the 'right' lexeme is can be objectively determined from the renamings in the SUMO repository. 'High enough' and 'appropriate' are subjective judgements, but I showed what these notions might mean in an agent communication context. As we saw in section 3.1.1.3, the 'appropriate' bag for a lexeme does not have to come from a URL. The best practice is to get ontology engineers to provide tag data for the words they
107 URLs in particular
108 although quite often the URIs point to non-retrievable resources
109 For example, if we type the word 'university' into Google, the top-ranked results are very likely to be pages of particular universities rather than pages that describe what a university is.
110 what we would call 'surprising lexemes' in the context of ORS.
use, as this would capture their intended meaning better than URLs can (see section 4.2).
With the above assumptions in mind, the Semantic Matcher produced very satisfactory results. Of course, because of the small data set used for evaluation, it would be too optimistic to infer too much from 95% of correct results in high ranks. However, these results are definitely encouraging and suggest that using Information Retrieval techniques to solve semantic mismatches is on the right track and can be very effective.
3.2.2 Efficiency
As we said in chapter 1, efficiency is very important in agent communication since a Planning Agent might try to achieve a goal by contacting many different agents and repairing its ontology a number of times. This was one of the reasons why reasoning or other 'deep' processing was avoided. To make the semantic matching process faster, the senses (bags) of the lexemes in the PA's ontology and their weights can be precomputed before agent communication (e.g. while the 'ORS plugin' is installed on the PA), even though the process does not take more than 4 seconds for the largest ontologies [111]. The bags with their weighted words are stored in a Python file as a dictionary and can be loaded by the compute_similarity module very fast. The actual matching takes less than 1 second, which is suitable for ORS since it can be computed once during agent interaction even when the PA needs to try the lexeme ranked second or third. This was achieved by using data structures such as sets instead of lists where possible [112], and also by limiting the search space according to whether the matcher is looking for a relation or a class/individual [113].
[111] i.e. the Mid-level ontologies (where bags of more than 2,000 lexemes have to be created). Computational
time is measured on a laptop with an Intel Core 2 processor.
[112] Sets lack position information and contain no duplicates, therefore they are a 'lighter' data structure than
lists.
[113] Classes and individuals are treated as similar (both start with a capital letter) in the SUMO ontologies. Their
only difference is that individuals are instances and not subclasses of the top-level concept 'Entity'.
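The two optimisations described in this section can be sketched as below. The dictionary layout (lexeme mapped to a word-weight dictionary), the names PRECOMPUTED_BAGS, is_relation, candidates_for and shared_words, and the rule that relations start with a lowercase letter are all assumptions made for illustration; only the capitalisation of classes and individuals is stated in the text.

```python
# Hypothetical precomputed bags: lexeme -> {word: weight}.
PRECOMPUTED_BAGS = {
    "Cup": {"drink": 0.8, "handle": 0.5},
    "Artist": {"paint": 0.9, "gallery": 0.4},
    "employs": {"work": 0.7, "hire": 0.6},
}


def is_relation(lexeme):
    # Assumption: relations start with a lowercase letter, while classes
    # and individuals start with a capital letter.
    return lexeme[:1].islower()


def candidates_for(query_lexeme, bags):
    """Limit the search space to lexemes of the same kind as the query."""
    want_relation = is_relation(query_lexeme)
    return {lex: bag for lex, bag in bags.items()
            if is_relation(lex) == want_relation}


def shared_words(bag_a, bag_b):
    """Set intersection of the two vocabularies (no order, no duplicates)."""
    return bag_a.keys() & bag_b.keys()


print(sorted(candidates_for("DrinkingCup", PRECOMPUTED_BAGS)))
```

Because the bags are built once and only looked up afterwards, the cost of matching is dominated by the comparison itself rather than by sense creation.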
The above results seem to support our initial hypothesis that matching can be made
not just possible but also effective and efficient when semi-structured data (here
represented as bags of words, later (section 4.2) related to folksonomies) enter the
semantics of formal ontologies.
[114] In fact, many SPAs could have been used, but this would not provide us with a better demonstration of the
Semantic Matcher.
[115] Although it can easily be extended to support anything written in SUO-KIF
[116] http://sigmakee.cvs.sourceforge.net/viewvc/sigmakee/KBs/Midlevelontology.kif?revision=1.34
syntax, reusing existing SUMO classes and relations and respecting all the type
restrictions specified [117].
Our Planning Agent (called 'Jerry the Bot') wants to be employed as an artist at the
Scottish National Gallery of Modern Art and therefore its goal is to bring about a
state of affairs where the following is true:
employs(scottishNationalGalleryOfModernArt, jerryTheBot).
Before agent interaction begins ORS calls the Semantic Matcher, which precomputes
all the bags of words from the candidate lexemes in Jerry's ontology. To achieve his
goal Jerry plans his course of action (specified in the preconditions and effects of his
action concepts (see Appendix)) and will try to 1) be approved (by changing the
world so that expressingApproval(jerryTheBot) is true) and once this is achieved 2)
be hired (by making appointing(jerryTheBot) true). These actions can be performed
by one or more SPAs. In this scenario it is 'Tom Recruiter Agent' that can offer this
service. Jerry contacts Tom and asks him to perform the first action. Tom starts
submitting queries to Jerry in order to check that he meets all the prerequisites
('preconditions') for being approved as an artist. During this process Jerry receives a
surprising question, so plan execution fails. ORS tries to resolve this by diagnosing
the problem and decides that we need to perform semantic matching because the
word drinkingCup was a surprising lexeme. Then it asks Tom for the URI of
drinkingCup and gives it as input to the Semantic Matcher, which returns a ranking
of the best candidate lexemes from Jerry's ontology. The first candidate (cup) is tried:
the ORS Refinement System replaces cup with drinkingCup in Jerry's ontology. Then
Jerry, who persists in his goal, replans [118] and the first action succeeds. Then he goes
[117] This means that if we run a theorem prover (e.g. Vampire http://www.vprover.org/) on these ontology files,
it should not find any inconsistencies.
[118] Currently ORS replans every time it encounters a mismatched lexeme, although it would be much more
efficient to try each one of the 'candidates' without having to find a new plan. Furthermore, the bags of
weighted words, which can be precomputed at the beginning of the interaction, are computed every time the PA
has to form another plan. This is a limitation of the current ORS implementation but the system can easily be
on to ask Tom to appoint him. Plan execution will fail again because of another
semantic mismatch, but it will be resolved again and in the end Jerry will achieve his
initial goal (see Appendix for ORS output).
The integration of the two systems was done in conjunction with Fiona McNeill, the
creator of ORS. The Semantic Matcher is implemented in Python and is called from
within ORS (written in Prolog) through the Unix shell. The Python modules are
designed to execute their main function with arguments specified externally, as
command-line options.
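The calling convention described above (a main function driven by command-line options) might look roughly like the sketch below. The option names --lexeme, --uri and --top are hypothetical and are not the actual options of the Semantic Matcher; the body of main is a placeholder for fetching the page, building the bag and ranking the candidates.

```python
import argparse


def parse_args(argv=None):
    """Parse the options that an external caller would pass via the shell."""
    parser = argparse.ArgumentParser(description="Semantic matching sketch")
    parser.add_argument("--lexeme", required=True,
                        help="surprising lexeme received from the SPA")
    parser.add_argument("--uri", required=True,
                        help="URI that describes the surprising lexeme")
    parser.add_argument("--top", type=int, default=3,
                        help="number of ranked candidates to print")
    return parser.parse_args(argv)


def main(argv=None):
    args = parse_args(argv)
    # Placeholder: a real module would rank the PA's lexemes here and
    # print them so that the calling process can read the result.
    print(f"matching {args.lexeme} using {args.uri}, top {args.top}")
    return 0


# Example invocation with an explicit argument list instead of sys.argv.
main(["--lexeme", "drinkingCup", "--uri", "http://example.org/drinkingCup"])
```

Keeping all input in command-line options means the caller needs no knowledge of Python beyond being able to start a process and read its output.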
Summary of chapter 3
In this chapter I presented the Semantic Matcher, a system that resolves semantic
mismatches between agents' ontologies during agent communication and I described
how it was integrated into the existing Ontology Repair System (ORS). This new ORS
module was designed in an attempt to meet both implementation challenges (i.e. need
for effectiveness and efficiency; limited access to SPA's ontology etc.) and theoretical
challenges (i.e. construction of intensional meaning in the agent's 'head'). Evaluation
gave us encouraging results and showed that practical problems were adequately
solved; hence, our first goal was achieved. Whether this is an acceptable design at a
theoretical level will be answered in the next chapter.
extended to handle this in the future.
CHAPTER 4 Discussion
In section 2.3.2 we saw that senses, or 'concepts' such as CAT reside in our mind and
act as reference-determining mediators between a word ('cat') and the world (set of
cats in the world). But what is the structure of these representations? What is CAT
composed of?
Many models of conceptual structure have been proposed in the Cognitive Science
and Philosophy of Language literature [119]. Below I will briefly examine four basic
theories: 1) the Classical Theory, 2) the Prototype Theory, 3) the Atomic Theory and 4)
the Dual Theory.
[119] All the theories of conceptual structure examined in this paper have been discussed extensively in both
Philosophy of Language and Cognitive Science.
According to the Classical Theory, which dates back to Plato's dialogues (see
(Laurence and Margolis 1999: 14)) and has been popular for centuries, concepts have
definitional structure, that is they are composed of features that are individually
necessary and jointly sufficient for fixing the denotation. For example BACHELOR
can be represented in the mind as UNMARRIED + MALE. This theory was appealing
because it was in keeping with the Principle of Compositionality [120] and could
determine reference by assembling features: we can determine what entities in the
world belong to the set of bachelors if we check whether they satisfy all of the above
conditions (i.e. being in the intersection of the set of unmarried entities and male entities)
(This is called the 'checklist theory' in (Aitchison 1994)). But, is the Pope a bachelor?
Although he fulfils the requirements, he is not a typical bachelor. Such a mental
representation is very close to formal definitions in ontologies, but if we want to
achieve a language-world connection for agent communication, we should look for
another theory of concepts; one that can achieve the same for humans. To solve
typicality problems, as in the case of the Pope being a 'bachelor', Rosch and Mervis
(Rosch and Mervis 1975) developed the famous Prototype Theory.
In the Prototype Theory concepts point to fuzzy sets in the world, where
membership is graded. For example, one entity can be a proper member of a set,
another entity can be less 'welcome' in the set [121]. Reference is determined by checking
how many features each entity satisfies (e.g. the Pope is MALE and UNMARRIED
but not ELIGIBLE FOR MARRYING), which tells us how well this entity belongs to
the set of things that the concept denotes. One serious problem, however, is that this
theory does not conform to the Principle of Compositionality, since typicality does
not compose (e.g. a typical pet fish is not the intersection of typical pets and typical
[120] According to the Principle of Compositionality, the meaning of a complex expression is a function of the
meanings of its parts plus syntax. This principle, which is fundamental in semantic theory, is attributed to
Frege and was adopted and further developed by Richard Montague (Montague 1970).
[121] This has consequences for classical logic, which presupposes that an assertion is either true or false. In
Prototype Theory we are allowed to say that bachelor(Pope) is, say, 30% true while for a man who is eligible
for marriage but prefers to be single, let's call him George, bachelor(George) can be close to 100% true,
because George enjoys a high status in the set of bachelors while the Pope does not. To handle fuzzy sets, we
need Fuzzy Logic (Zadeh 1965), where there are more than two truth values.
fish, because typical pets are cats, dogs etc.; thus, the meaning of the whole is not a
function of the meanings of its parts). Jerry Fodor (Fodor 1998) attempted to tackle
this problem by introducing the Atomic Theory.
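The contrast between the Classical and the Prototype Theory can be illustrated with a short sketch. It assumes that features are unweighted and that graded membership is simply the proportion of prototype features an entity satisfies; this simplification, the feature sets and the function names are hypothetical and serve only to make the difference visible.

```python
# Classical Theory: a definition is a checklist of necessary and jointly
# sufficient features.
BACHELOR_DEFINITION = {"MALE", "UNMARRIED"}

# Prototype Theory: membership is graded by how many features are satisfied.
BACHELOR_PROTOTYPE = {"MALE", "UNMARRIED", "ELIGIBLE_FOR_MARRYING"}


def classical_member(entity_features, definition):
    """True only if the entity satisfies every feature of the definition."""
    return definition <= entity_features


def prototype_membership(entity_features, prototype):
    """Proportion of prototype features that the entity satisfies (0.0 to 1.0)."""
    return len(prototype & entity_features) / len(prototype)


pope = {"MALE", "UNMARRIED"}
george = {"MALE", "UNMARRIED", "ELIGIBLE_FOR_MARRYING"}

print(classical_member(pope, BACHELOR_DEFINITION))
print(prototype_membership(pope, BACHELOR_PROTOTYPE))
print(prototype_membership(george, BACHELOR_PROTOTYPE))
```

Under the checklist both entities are bachelors, whereas the graded measure places George above the Pope; a weighted version of the same measure would be needed to reproduce specific degrees of truth.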
The Atomic Theory (also known as Conceptual Atomism) posits that concepts have
no structure at all, that is they are atoms. For example, BACHELOR is represented in
the mind as BACHELOR and nothing else. But how is reference determined now that
we have no features to check entities against? According to Fodor (ibid) there is a
causal link between the property exhibited by the set of things in the world and the
concept. For example, when we see a bachelor, their property of belonging to the set
of bachelors causes us to entertain the concept BACHELOR [122]. This is a very attractive
theory because it solves the compositionality problems discussed above (since
complex concepts can be composed of atoms) but it is not as strong as the Prototype
Theory in explaining why a concept, say BIRD, applies better to some entities (e.g.
sparrows) than to others (e.g. penguins). To get round this problem, Laurence and
Margolis (Laurence and Margolis 1999) propose the Dual Theory of conceptual
structure.
[122] Justification for this causal relation can be found in Dretske's Information-based Semantics (Dretske 1981)
and Kripke's 'causal theory of reference' (Kripke 1972): when an event A causes an event B, then B carries
information about A. For example, a broken window carries information about some kind of event which
preceded the breaking, the wrinkles on someone's face carry information about the person's age and so on.
For a discussion of Information-Based Semantics see (Margolis 1998: 349).
Furthermore, we should note that Fodor's initial argument that all concepts should be innate (Fodor et al 1980,
cited in Laurence and Margolis 1999: 63) was abandoned later (Fodor 1998).
explaining our intuitions of 'typicality'. The periphery of the concept can be seen as
an epistemic structure (i.e. a structure which encodes encyclopaedic knowledge) since
our knowledge of what counts as, say, a typical bachelor helps us identify good and
bad examples of BACHELOR.
Now, let's go back to the senses for lexemes that I proposed in the previous chapters.
As mentioned earlier, it is reasonable to expect that ontology engineers in the future
will annotate the lexemes of their formal ontologies with URIs [123] since these
identifiers are already used as names for relations and entities in RDF and OWL.
URIs, if present, can serve as the core of the mental representation (in Laurence and
Margolis' terms (ibid)), since their job is to uniquely and unambiguously identify an
information resource (digital, physical or abstract) (Berners-Lee et al. 1998). In our
context this means that if two different lexemes in two agents' ontologies are
annotated with the same URI, they can be seen as synonymous. In chapter 3 I showed
that lexemes can acquire meaning by being enriched with 'bags of words': these bags
will serve as the periphery of the mental representation, that is the world knowledge
that surrounds the core. An analogy between the Dual Theory and the approach
followed in this paper can be seen in the table below:
Table 5: The structure of the mental representations for 'cat' in the Dual Theory and the
Semantic Matcher
One problem, however, is that the existence of URIs, though reasonable, cannot be
[123] SUMO already has the relation uniqueIdentifier/2, which can take a string (e.g. a URI) as argument 1
and an entity (what I call 'lexeme') as argument 2. However, it is fair to say that annotation with URIs hardly
exists at the moment.
guaranteed for every ontology lexeme. In addition, URIs might not be as unique and
unambiguous as they claim to be [124]. Hence, bags themselves should be able to fix the
denotation. Reference determination might not be perfect or completely
unambiguous but some semantic grounding can be achieved: the larger and more
'appropriate' the bags, the better the semantic grounding [125].
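The division of labour between core and periphery can be sketched as follows. The Sense class and the function names are hypothetical; the Jaccard overlap used for the periphery is a deliberately simple stand-in for a weighted similarity measure and is not the measure used by the Semantic Matcher.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Sense:
    uri: Optional[str] = None      # core: unique identifier, if annotated
    bag: frozenset = frozenset()   # periphery: unordered bag of words


def same_core(a, b):
    """Two lexemes annotated with the same URI are treated as synonymous."""
    return a.uri is not None and a.uri == b.uri


def bag_overlap(a, b):
    """Jaccard overlap of the two bags (a stand-in similarity measure)."""
    union = a.bag | b.bag
    return len(a.bag & b.bag) / len(union) if union else 0.0


def match_score(a, b):
    # Use the core when both lexemes share a URI, otherwise fall back on
    # the periphery.
    return 1.0 if same_core(a, b) else bag_overlap(a, b)


cup = Sense(bag=frozenset({"drink", "handle"}))
drinking_cup = Sense(bag=frozenset({"drink", "tea"}))
print(match_score(cup, drinking_cup))
```

When no URI is available the score depends entirely on the bags, which is why larger and more appropriate bags give better grounding.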
Now that we have seen what model of conceptual structure lies behind our approach
to sense creation within the Semantic Matcher, we can go on and see what
implications such an idea can have for Ontology Engineering.
In section 2.3.3 we saw that bags of words can be seen as little folksonomies and
visualised as tag clouds. Equipping lexemes of formal ontologies with informal
senses amounts to bringing folksonomies inside ontologies. Below I will show that if the
combination of these different kinds of representation is adopted as an Ontology
Engineering practice, it can minimise the problem of symbol grounding and help
agents interoperate more effectively.
Folksonomy is a bottom-up, not centrally controlled classification system in which
structure emerges out of the practice of users labelling digital resources ('objects')
with keywords ('tags'). Vander Wal distinguishes between broad folksonomies and
narrow folksonomies (Vander Wal 2005; see also Weller 2007). The former are
created when a particular object can be tagged by different people so the same tag
[124] For example, two ontology engineers might use the same URIs with a different intended meaning
(ambiguity of URIs), or might create a new URI for objects for which a URI is already available ('synonymy'
of URIs). (see also Hayes and Halpin 2008)
[125] As mentioned in section 3.2.1, 'appropriate' bags are ones that are representative of the lexeme's intended
meaning.
can be assigned more than once. For example a Delicious [126] bookmark about, say,
chocolate can have the word 'recipes' assigned to it 600 times, the word 'chocolate' 578
times, the word 'food' 423 times and so on. This pattern reveals some trends as to
what vocabularies are generally considered appropriate to describe this resource.
Narrow folksonomies, on the other hand, are formed in systems where one object
can be labelled only by its author with distinct tags. For example, a Flickr [127] user can
submit a photograph and annotate it with keywords such as 'surfing', 'waves', 'beach'
and 'summer'. If it is made publicly available, it can be found by other users who
search for photos about 'surfing' or 'waves' and so on.
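The structural difference between the two kinds of folksonomy can be shown in a few lines. The tag counts are the ones from the example above; the normalisation into relative weights and the function name relative_weights are assumptions added for illustration.

```python
from collections import Counter

# Broad folksonomy: many users tag the same object, so counts matter.
broad = Counter({"recipes": 600, "chocolate": 578, "food": 423})

# Narrow folksonomy: only the author tags the object, with distinct tags.
narrow = {"surfing", "waves", "beach", "summer"}


def relative_weights(tag_counts):
    """Turn raw tag counts into weights that sum to 1."""
    total = sum(tag_counts.values())
    return {tag: count / total for tag, count in tag_counts.items()}


weights = relative_weights(broad)
print(max(weights, key=weights.get))
```

A broad folksonomy therefore carries frequency information that can be turned into weights, while a narrow one only records which tags are present.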
Here we are concerned only with broad folksonomies because they have the same
structure as our bags of words. What I suggest is to use broad folksonomies as folk
descriptions of not just digital resources (e.g. Delicious bookmarks) but also physical
and abstract resources (i.e. entities in the objective world or ideas) denoted by lexemes
in a formal ontology. As discussed in section 3.1.1.3, ontology engineers can annotate
their lexemes with tags of their choice or with URLs that point to collaboratively
created broad folksonomies. Alternative or complementary practices could be
annotations with WordNet synsets or URLs which point to natural language text, as
was implemented in the Semantic Matcher. Any bags, or folksonomies, that have been
created by ontology engineers can be enriched with processes in the same spirit as
the Sense Creation algorithm that I described in chapter 3. This is not only a practical
way to achieve some semantic grounding for lexemes in ontologies, but also a
demonstration of how ontologies and folksonomies can work in tandem to facilitate
meaning disambiguation, which has immediate applications in agent
communication. The fact that a semantic matcher with bags of words produces
encouraging results makes such a suggestion for ontology engineering plausible.
[126] http://www.delicious.com/
[127] http://www.flickr.com/
The 'bag of words' model of representing senses for lexemes enables the
implementation of the idea of meaning similarity as opposed to identity, because
even in human language perfect synonymy is impossible. Furthermore, since
different ontologies are created by different humans, their conceptualisations of
ontology terms might be different and can only be compared for similarity, that is
semantic 'distance'. Similarity cannot be measured with analytic tools such as
theorem provers.
In the Vector Space Model we saw how lexemes are ranked from the most to the least
relevant on the basis of how semantically similar they are to the SPA's lexeme. In
section 2.1 we talked about identity: there is one candidate lexeme whose meaning is
identical to that of the SPA's lexeme. This is not contradictory as the ranking algorithm is trying
to predict identity (i.e. same denotation under the same interpretation) on the basis of
similarity (i.e. similar folksonomies around the lexeme).
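Ranking by similarity can be sketched with a minimal Vector Space Model example. Cosine similarity over word-weight dictionaries is assumed here as the comparison measure; the weights and lexemes are hypothetical toy values, and the function names are not those of the Semantic Matcher.

```python
import math


def cosine(bag_a, bag_b):
    """Cosine similarity between two bags given as {word: weight} dictionaries."""
    dot = sum(weight * bag_b[word] for word, weight in bag_a.items() if word in bag_b)
    norm_a = math.sqrt(sum(weight * weight for weight in bag_a.values()))
    norm_b = math.sqrt(sum(weight * weight for weight in bag_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_candidates(query_bag, candidate_bags):
    """Order candidate lexemes from the most to the least similar to the query."""
    scored = [(cosine(query_bag, bag), lexeme) for lexeme, bag in candidate_bags.items()]
    return [lexeme for _, lexeme in sorted(scored, key=lambda pair: pair[0], reverse=True)]


# Hypothetical bags for the SPA's lexeme and three of the PA's lexemes.
query = {"drink": 2.0, "handle": 1.0, "tea": 1.0}
candidates = {
    "artist": {"paint": 2.0, "gallery": 1.0},
    "bottle": {"drink": 1.0, "glass": 1.0},
    "cup": {"drink": 1.5, "tea": 1.0, "saucer": 0.5},
}
print(rank_candidates(query, candidates))
```

The top-ranked lexeme is the prediction of identity; the lower ranks are the alternatives that the PA can fall back on if the first candidate fails.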
However, if ontologies come from completely different sources, we can't assume any
sort of identity unless we happen to find the same URIs. In the future, when
communication between agents with disparate ontologies becomes a tractable task,
semantic similarity will be a very important issue. Two relations might have a
different URI but still be similar. The Planning Agent will have to use some of them
to communicate with the SPA even though they might not have exactly the meaning
originally intended. For example, the PA might want to buy a purple bag with the
action buy(me, purple_bag) but after semantic matching it might ask the SPA to
execute the action buy(me, LilacBag). It might not be exactly what the PA was
intending to buy but still close to it.
identity) as a criterion for matching lexemes might pave the way for systems like ORS
to deal with more heterogeneous ontologies, where meaning identity cannot be
presupposed.
Summary of chapter 4
In this chapter I briefly described four models of conceptual structure and showed
how the Dual Theory provided the theoretical framework for the approach to sense
creation that I followed during the design of the Semantic Matcher. I also
demonstrated how such a design favours the combination of ontologies with
folksonomies and made some suggestions for ontology design in the future. Finally, I
showed how the system I presented supports semantic matching with respect to
similarity and not necessarily identity.
CONCLUDING REMARKS
In the previous chapters we saw that ontology mismatch in an agent communication
environment is inevitable and ORS is the first example of a system that has the
infrastructure (theoretical or implemented) for dealing with it in this way. The most
frequently occurring type of heterogeneity, namely semantic mismatch, was
addressed in this paper, where the Semantic Matcher was presented.
In the course of building this new ORS module, many design decisions had to be
made, the most important being extending the current system to work with genuine
ontologies; hence, the SUMO parser and the SUO-KIF-to-Prolog translator were
implemented. For the purposes of demonstrating agent interaction within ORS, some
slight modifications to the input ontologies had to be performed. This was done with
the addition of some facts and Action Concepts to the SUMO ontologies, always
making sure that the non-ad-hoc nature of the Semantic Matcher is not affected.
The Semantic Matcher was created on the basis of Information Retrieval principles,
treating senses for lexemes in the ontology as 'bags of words'. As we saw earlier, this
was not just an engineering decision but also a proposal for incorporating unordered
sets of words into the semantics of formal ontologies in order to achieve symbol
grounding. This suggestion found theoretical justification in theories of Philosophy
of Language and Cognitive Science and went further to show how folksonomies can
enrich formal ontologies in practice.
Given some reasonable assumptions, the Semantic Matcher gave us very encouraging
results and supported our initial hypothesis that the integration of folksonomies into
formal ontologies can lead to effective and efficient matching, in cases where
ontologies alone would have resulted in failed or poor matching. If nothing else, this
study showed that facilitating communication between agents with disparate
ontologies is not a totally intractable task. The take-home message is that ontology
mismatch deserves to be viewed more optimistically.
REFERENCES
Aitchison, J. (1994) Words in the Mind: An Introduction to the Mental Lexicon, Oxford:
Blackwell.
Akinsola, T.M. (2008) Automated Ontology Evolution, MSc thesis, University of
Edinburgh.
Bach, T.L., Dieng-Kuntz, R. and Gandon, F. (2004) On ontology matching problems
(for building a corporate semantic web in a multi-communities organisation), in
Proceedings of 6th International Conference on Enterprise Information Systems (ICEIS),
Porto, pp. 236-243.
Baral, C. (2010) Reasoning about actions and change: From single agent actions to
multi-agent actions (extended abstract), Proceedings of the 12th International
Conference on the Principles of Knowledge Representation and Reasoning, KR 2010.
Bergman, M. (2009) The Fundamental Importance of Keeping an ABox and TBox
Split, online at http://www.mkbergman.com/489/ontologybestpracticesfordatadrivenapplicationspart2/
Berners-Lee, T., Fielding, R.T. and Masinter, L. (1998) Uniform resource identifiers
(URI): Generic syntax, Internet RFC 2396, August 1998, online at
http://www.ietf.org/rfc/rfc2396.txt
Berners-Lee, T., Hendler, J. and Lassila, O. (2001) The semantic web, Scientific
American, 284(5): 34-43.
Blizard, W.D. (1988) Multiset theory, Notre Dame J. Formal Logic 30(1): 36-66.
Bratman, M.E. (1987) Intentions, Plans, and Practical Reason, Cambridge Mass:
Harvard University Press.
Brown, K. (ed.) (1994, 2006) Encyclopedia of Language & Linguistics, Oxford: Elsevier.
Buchholz, W. (2006) Ontology, in Schwartz, D.G. (ed.) (2006), pp. 694-702.
Candlish, S. (2008) Private language, in Zalta, E.N. (ed.) (2003),
http://plato.stanford.edu/entries/privatelanguage/
Carlsson, M. et al. (2010) SICStus Prolog user's manual, Swedish Institute of
Computer Science, online at http://www.sics.se/sicstus/docs/4.0.7/pdf/sicstus.pdf
Carnap, R. (1947) Meaning and Necessity, Chicago: University of Chicago Press.
Casati, R. (2006) Events, in Zalta, E.N. (ed.) (2003),
http://plato.stanford.edu/entries/events/
Chandrasekaran, B., Josephson, J. and Benjamins, V. (1999) What are ontologies, and
why do we need them?, IEEE Intelligent Systems 14(1): 20-26.
Chomsky, N. (1957) Syntactic Structures, The Hague: Mouton.
Chomsky, N. and Schützenberger, M.P. (1963) The algebraic theory of context-free
languages, in Braffort, P. and Hirschberg, D. (1963) Computer Programming and Formal
Languages, Amsterdam: North-Holland, pp. 118-161.
Cleverdon, C. (1970) Evaluation tests of information retrieval systems, Journal of
Documentation 26(1): 55-67.
Corcho, O. and Gómez-Pérez, A. (2000) A roadmap to ontology specification
languages, in Proceedings of the 12th European Workshop on Knowledge Acquisition,
Modeling and Management, pp. 80-96.
Cohen, S.M. (2008) Aristotle's metaphysics, in Zalta, E.N. (ed.) (2003),
http://plato.stanford.edu/entries/aristotlemetaphysics/
Colucci, S., Di Noia, T., Di Sciascio, E., Donini, F.M. and Mongiello, M. (2006)
Description logic-based information retrieval, in Schwartz, D.G. (ed.) (2006), pp. 105-114.
Croft, W., Metzler, D. and Strohman, T. (2010) Search Engines: Information Retrieval in
Practice, Boston: Addison-Wesley.
Daconta, M.C., Obrst, L.J. and Smith, K.T. (2003) The Semantic Web: A guide to the
future of XML, Web services and knowledge management, Indianapolis, IN: Wiley.
Devlin, K. (1993) The Joy of Sets: Fundamentals of Contemporary Set Theory, New York:
Springer-Verlag.
Dretske, F. (1981) Knowledge and the Flow of Information, Cambridge, Mass: MIT Press.
Enderton, H.B. (2009) Second-order and higher-order logic, in Zalta, E.N. (ed.)
(2003), http://plato.stanford.edu/entries/logichigherorder/
Euzenat, J. and Shvaiko, P. (2007) Ontology Matching, Berlin: Springer-Verlag.
Fellbaum, C. (ed.) (1998) WordNet: An Electronic Lexical Database, Cambridge Mass.:
MIT Press.
Finkelstein, A., Gabbay, D.M., Hunter, A., Kramer, J. and Nuseibeh, B. (1993)
Inconsistency handling in multi-perspective specifications, in European Software
Engineering Conference, pp. 84-99.
Finn, A., Kushmerick, N. and Smyth, B. (2001) Fact or fiction: Content classification
for digital libraries, in DELOS workshop: Personalisation and Recommender Systems in
Digital Libraries (2001).
Fitting, M. (2007) Intensional Logic, in Zalta, E.N. (ed.) (2003),
http://plato.stanford.edu/entries/logicintensional/
Flouris, G., Plexousakis, D. and Antoniou, G. (2006) Evolving ontology evolution, in
Proceedings of the 32nd International Conference on Current Trends in Theory and
Practice of Computer Science (SOFSEM 2006).
Fodor, J.A. (1998) Concepts: Where Cognitive Science Went Wrong, Oxford: Clarendon
Press.
Fortier, J. and Kassel, G. (2006) Organizational Semantic Webs, in Schwartz, D.G.
(ed.) (2006), pp. 772-779.
Frege, G. (1892) On Sense and Reference, in Ludlow, P. (ed.) (1997), pp. 563-583.
Fuller, M. and Zobel, J. (1998) Conflation-based comparison of stemming
algorithms, in Proceedings of the 3rd Australian Document Computing Symposium,
Sydney, 1998.
Gärdenfors, P. and Rott, H. (1995) Belief revision, in Handbook of Logic in Artificial
Intelligence and Logic Programming, Vol. 4, Oxford: Oxford University Press, pp. 35-132.
Genesereth, M. and Fikes, R. (1992) Knowledge Interchange Format version 3.0
reference manual, Logic Group, Report Logic 92(1), Stanford California: Stanford
University.
Ghallab, M., Howe, A., Knoblock, C. and McDermott, D. (1998) PDDL: The planning
domain definition language, Technical Report DCS TR-1165, Yale Center for
Computational Vision and Control.
Giunchiglia, F. and Shvaiko, P. (2003) Semantic matching, in The Knowledge
Engineering Review 18(3): 265-280.
Giunchiglia, F., Shvaiko, P. and Yatskevich, M. (2004) S-Match: an algorithm and an
implementation of semantic matching, Proceedings of the European Semantic Web
Symposium (ESWS), pp. 61-75.
Gómez-Pérez, A., Corcho-Garcia, O. and Fernandez-Lopez, M. (2003) Ontological
Engineering, New York: Springer-Verlag.
Grefenstette, G. and Tapanainen, P. (1994) What is a word, what is a sentence?
Problems of tokenization, in Proceedings of 3rd Conference on Computational
Lexicography and Text Research 1994.
Gruber, T.R. (1992) Ontolingua: A mechanism to support portable ontologies,
Technical Report, Knowledge Systems Laboratory 91-66, Stanford California: Stanford
University.
Gruber, T. (1993) A translation approach to portable ontology specifications,
Knowledge Acquisition 5(2): 199-220.
Gruber, T. (2007) Ontology of folksonomy: A mash-up of apples and oranges,
International Journal on Semantic Web and Information Systems 3(1): 1-11.
Gruber, T. (2009) Ontology, in Ling Liu, L. and Tamer Özsu, M. (eds.) (2009),
Encyclopedia of Database Systems, Springer-Verlag.
Halpin, H., Robu, V. and Shepherd, H. (2007) The complex dynamics of collaborative
tagging, in Proceedings of the 16th International Conference on the World Wide Web,
Banff, pp. 211-220.
Harnad, S. (1990) The symbol grounding problem, Physica D 42: 335-346.
Hawley, K. (2004) Temporal parts, in Zalta, E.N. (ed.) (2003),
http://plato.stanford.edu/entries/temporalparts/
Hayes, P. and Halpin, H. (2008) In defense of ambiguity, International Journal of
Semantic Web and Information Systems 4(3): 1-18.
Heflin, J. (2003) OWL Web Ontology Language: Use cases and requirements, W3C,
online at http://www.w3.org/TR/webontreq/
Hiemstra, D. (2009) Information Retrieval Models, in Goker, A. and Davies, J. (eds)
(2009) Information Retrieval: Searching in the 21st Century, Wiley-Blackwell, pp. 1-19.
Hillenmeyer, M. (2005) Distance Metrics, in Hillenmeyer, M. (2005) Machine
Learning, Stanford University, online at
http://www.stanford.edu/~maureenh/quals/html/ml/node47.html
Hofweber, T. (2004) Logic and Ontology, in Zalta, E.N. (ed.) (2003),
http://plato.stanford.edu/entries/logicontology/
Kalfoglou, Y. (2002) Exploring ontologies, in Chang, S. (ed) Handbook of Software
Engineering and Knowledge Engineering, vol. 1: Fundamentals, World Scientific
Publishing Company.
Kripke, S. (1972) Naming and Necessity, Cambridge, Mass: Harvard University Press.
Krovetz, R. (1993) Viewing morphology as an inference process, in R. Korfhage et
al., Proceedings of 16th ACM SIGIR Conference, Pittsburgh, June 27-July 1 1993, pp. 191-202.
Laurence, S. and Margolis, E. (1999) Concepts and cognitive science, in Margolis, E.
and Laurence, S. (eds.) (1999) Concepts: Core Readings, Cambridge Mass: MIT Press,
pp. 3-81.
Lavrenko, V. (2009) Vector Space Model, online at
http://www.inf.ed.ac.uk/teaching/courses/tts/pdf/vspace2x2.pdf
Lebanon, G., Mao, Y. and Dillon, J. (2007) The locally weighted bag of words
framework for document representation, Journal of Machine Learning Research 8: 2405-2441.
Lee, D.L., Chuang, H. and Seamons, K. (1997) Document ranking and the Vector
Space Model, IEEE Software 14(2): 67-75.
Levenshtein, V. (1965) Binary codes capable of correcting deletions, insertions and
reversals, Doklady Akademii Nauk SSSR 163(4): 845-848, translated into English in
Soviet Physics Doklady 10(8): 707-710.
Lew, M.S., Sebe, N., Djeraba, C. and Jain, R. (2006) Content-based multimedia
information retrieval: State of the art and challenges, Transactions on Multimedia
Computing, Communications and Applications 2(1): 1-19.
Lewis, D. (1998) Naive (Bayes) at forty: The independence assumption in
information retrieval, Proceedings of ECML-98, 10th European Conference on Machine
Learning, Chemnitz, DE: Springer-Verlag, pp. 4-15.
Lovins, J.B. (1968) Development of a stemming algorithm, Mechanical Translation
and Computational Linguistics 11: 22-31.
Ludlow, P. (ed.) (1997) Readings in the Philosophy of Language, Cambridge, Mass.: MIT
Press.
Manna, Z. (1974) Mathematical Theory of Computation, New York: McGraw-Hill.
Manning, C.D., Raghavan, P. and Schütze, H. (2008) Introduction to Information
Retrieval, Cambridge: Cambridge University Press.
Margolis, E. (2006) Concepts, in Zalta, E.N. (ed.) (2003),
http://plato.stanford.edu/entries/concepts/
McNeill, F. (2006) Dynamic Ontology Refinement, PhD thesis, University of Edinburgh.
McNeill, F. and Bundy, A. (2007) Dynamic, automatic, first-order ontology repair by
diagnosis of failed plan execution, International Journal on Semantic Web and
Information Systems, special issue on Ontology Matching 3(3): 1-35.
McNeill, F., Bundy, A. and Walton, C. (2003) Plan execution failure analysis using
plan deconstruction, presented at Planning Special Interest Group, Glasgow,
December 2003.
Miller, G.A., Beckwith, R., Fellbaum, C., Gross, D. and Miller, K. (1993) Introduction
to WordNet: An on-line lexical database, in Fellbaum, C. (ed.) 1998, online at
http://courses.media.mit.edu/2002fall/mas962/MAS962/miller.pdf
Miller, G.A., Fellbaum, C., Kegl, J. and Miller, K. (1988) Wordnet: An electronic
lexical reference system based on theories of lexical memory, Revue Québécoise de
Linguistique 17, 181-211.
Montague, R. (1970) English as a formal language, in Visentini, B. et al. (eds)
Linguaggi nella Società e nella Tecnica, pp. 189-224, Milan: Edizioni di Comunità.
Reprinted in Montague (1974), pp. 188-221.
Niles, I. and Pease, A. (2001) Towards a Standard Upper Ontology, in Proceedings
of the 2nd International Conference on Formal Ontology in Information Systems (FOIS
2001), Welty, C. and Smith, B. (eds), Ogunquit, Maine.
Niles, I. and Pease, A. (2003) Linking lexicons and ontologies: Mapping WordNet to
the Suggested Upper Merged Ontology, in Proceedings of the 2003 International
Conference on Information and Knowledge Engineering (IKE 03), Las Vegas.
Ogden, C.K. and Richards, I.A. (1923) The Meaning of Meaning, New York: Harcourt,
Brace & World Inc.
Pease, A., Niles, I. and Li, J. (2002) The Suggested Upper Merged Ontology: A large
ontology for the Semantic Web and its applications, in Proceedings of the AAAI 2002
Workshop on Ontologies and the Semantic Web, Edmonton, Canada.
Pease, A. (2009) Standard Upper Ontology Knowledge Interchange Format, online
at http://sigmakee.cvs.sourceforge.net/viewvc/sigmakee/sigma/suokif.pdf
Peters, I. (2009) Folksonomies. Indexing and Retrieval in Web 2.0., Berlin: De Gruyter
Saur.
Peirce, C.S. (1931-1958) Collected Papers of C.S. Peirce, ed. by C. Hartshorne, Weiss, P.
and Burks, A., 8 vols., Cambridge, Mass: Harvard University Press.
Porter, M.F. (1980) An algorithm for suffix stripping, Program 14(3): 130-137.
Putnam, H. (1975) The meaning of meaning, in Gunderson (ed.) Language, Mind
and Knowledge, Minnesota Studies in the Philosophy of Science, vol. 7, Minneapolis:
University of Minnesota Press. Reprinted in Mind, Language and Reality, Philosophical
Papers, vol. 2, Cambridge: Cambridge University Press, pp. 215-271.
Qu, W., Hu, W. and Chen, G. (2006) Constructing virtual documents for ontology
matching, in Proceedings of the 15th International World Wide Web Conference,
Edinburgh, pp. 23-31.
Raghavan, V.V. and Wong, S.K.M. (1986) A critical analysis of Vector Space Model for
information retrieval, Journal of the American Society for Information Science 37(5): 279-287.
Ramos, J. (2003) Using TF-IDF to Determine Word Relevance in Document Queries,
First International Conference on Machine Learning, Rutgers University.
Robertson, S. and Spärck Jones, K. (1976) Relevance weighting of search terms,
Journal of the American Society for Information Science 27(3): 129-146.
Rosch, E. and Mervis, C. (1975) Family resemblances: Studies in the internal
structureofcategories,CognitivePsychology7:573605.
Russell, S.J. and Norvig, P. (1995) Artificial Intelligence: A Modern Approach, New
Jersey: Prentice Hall.
Salton, G., Wong, A. and Yang, C.S. (1975) "A vector space model for automatic
indexing", Communications of the ACM 18(11): 613-620.
Salton, G. and McGill, M.J. (1983) Introduction to Modern Information Retrieval, New
York: McGraw-Hill.
Saussure, F. (1916) "Nature of the Linguistic Sign", in Bally, C. and Sechehaye, A.
(eds), Cours de Linguistique Générale, London: McGraw-Hill Education.
Schwartz, D.G. (ed.) (2006) Encyclopedia of Knowledge Management, Hershey, PA: Idea
Group Reference.
Searle, J. (1980) "Minds, brains and programs", Behavioral and Brain Sciences 3(3):
417-457.
Shapiro, S. (2009) "Classical logic", in Zalta, E.N. (ed.) (2003),
http://plato.stanford.edu/entries/logic-classical/
Schmid, H. (2007) "Tokenizing", in Lüdeling, A. and Kytö, M. (eds) Corpus
Linguistics: An International Handbook, Berlin: Mouton de Gruyter.
Spärck Jones, K. (1972) "A statistical interpretation of term specificity and its
application in retrieval", Journal of Documentation 28(1): 11-21.
Speaks, J. (2010) "Theories of Meaning", in Zalta, E.N. (ed.) (2003),
http://plato.stanford.edu/entries/meaning/
Sterling, L. and Shapiro, E. (1994) The Art of Prolog: Advanced Programming Techniques
(2nd edition), Cambridge, Mass: MIT Press.
Steup, M. (2006) "The analysis of knowledge", in Zalta, E.N. (ed.) (2003),
http://plato.stanford.edu/entries/knowledge-analysis/
Vander Wal, T. (2005) "Explaining and showing broad and narrow folksonomies",
online at http://www.vanderwal.net/random/entrysel.php?blog=1635
Vander Wal, T. (2007) "Folksonomy", online at
http://www.vanderwal.net/folksonomy.html
Wallace, R.J. (2008) "Practical reason", in Zalta, E.N. (ed.) (2003),
http://plato.stanford.edu/entries/practical-reason/
Weller, K. (2007) "Folksonomies and ontologies: Two new players in indexing and
knowledge representation", in Proceedings of Online Information, London, pp. 108-115.
Wittgenstein, L. (1953) Philosophical Investigations, translated by G.E.M. Anscombe,
Oxford: Blackwell.
Wooldridge, M. (2009) An Introduction to MultiAgent Systems, New York: John Wiley &
Sons.
Zadeh, L. (1965) "Fuzzy sets", Information and Control 8(3): 338-353.
Zalta, E.N. (ed.) (2003) The Stanford Encyclopedia of Philosophy (online),
http://plato.stanford.edu
APPENDIX
A.1 Glossary
The vocabulary below is defined in the context of the Semantic Matcher. Some terms might have
different meanings across disciplines. Where there is no author citation, the definition is mine, for the
purposes of this work only.
document frequency etc). The structure of a bag of words can be compared to that of
a tag cloud. See also concept, tag cloud.
candidate lexeme: A lexeme in the Planning Agent's ontology which is eligible for
semantic matching, that is, it can be compared to the surprising lexeme for similarity.
See also surprising lexeme.
IR: Information Retrieval
lexeme: A string of characters representing a predicate (relation), class or individual
in an ontology.
needed lexeme: A lexeme that the Planning Agent needs to use in order to be
understood by the Service Providing Agent, but has never been presented to the
former agent and therefore the PA will have to guess. See also surprising lexeme.
PA: See Planning Agent
Planning Agent: An agent who forms plans to achieve a particular goal and requests
services (actions to be performed) from Service Providing Agents. See also Service
Providing Agent.
sense: See concept
Service Providing Agent: An agent who provides services to Planning Agents by
performing actions. See also Planning Agent.
SPA: See Service Providing Agent
surprising lexeme: A lexeme in the Service Providing Agent's ontology that has
appeared in the queries submitted to the Planning Agent (usually while checking the
preconditions of an action) and has surprised the latter, who has never seen it before.
See also candidate lexeme, needed lexeme.
tag cloud: The visualisation of a multiset of words in which more important words
appear in a bigger size. The structure of a tag cloud can be compared to that of a bag
of words. See also concept, bag of words.
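
The relationship between a bag of words and a tag cloud can be illustrated with a minimal Python sketch. It assumes whitespace tokenisation and raw counts as the measure of importance; both are simplifications chosen for illustration, and the names `bag_of_words` and `tag_cloud_sizes` are hypothetical rather than part of the Semantic Matcher.

```python
from collections import Counter

def bag_of_words(text):
    """Return a multiset (word -> count) for a lower-cased, whitespace-split text."""
    return Counter(text.lower().split())

def tag_cloud_sizes(bag, min_size=10, max_size=30):
    """Map each word to a font size that grows linearly with its count."""
    if not bag:
        return {}
    lo, hi = min(bag.values()), max(bag.values())
    span = hi - lo
    sizes = {}
    for word, count in bag.items():
        # When every count is equal, all words get the minimum size.
        scale = (count - lo) / span if span else 0.0
        sizes[word] = round(min_size + scale * (max_size - min_size))
    return sizes

bag = bag_of_words("cup mug cup drink cup mug")
sizes = tag_cloud_sizes(bag)
```

Here the same multiset serves both purposes: the counts are the bag of words, and the derived sizes are what a tag cloud would render.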
user@user:~/ORS/semantic_matching/modules/sem-matching/evaluation$ python evaluation.py
1 NetworkResource
2 Report <---
3 ProcessTask
4 ComputerResource
5 BusNetwork
6 Database
7 CPU
8 NetworkAdapter
9 Server
10 UserName
1 coordinate <---
2 partition
3 range
4 measure
5 located
6 inScopeOfInterest
7 geographicSubregion
8 origin
9 subProcess
10 prevents
1 Cup <---
2 Bottle
3 Beverage
4 Tooth
5 Chewing
6 Birth
7 Toothbrush
8 Dentist
9 Eating
10 DistilledAlcoholicBeverage
1 Conductor <---
2 ResistorElement
3 InsulatorSubstance
4 Electrical
5 Current
6 SemiconductorComponent
7 ElectricalEngineeringMethod
8 Brushless
9 DcMotor
10 ElectricalDrivesDomain
1 lastName <---
2 familyRelation
3 cohabitant
4 legalGuardian
5 stranger
6 acquaintance
7 friend
8 coworker
9 mutualStranger
10 mutualAcquaintance
1 Fishing <---
2 FishAndSeafoodWholesalers
3 FinfishFishing
4 FishingHuntingAndTrapping
5 ShellfishFishing
6 AgricultureForestryFishingAndHunting
7 FishAndSeafoodMarkets
8 OtherMarineFishing
9 DeepSeaPassengerTransportation
10 FinfishFarmingAndFishHatcheries
1 FluidContainer
2 LiquidMixture
3 Spraying
4 Diluting
5 Bubble
6 Combustible <---
7 Drinking
8 LiquidBodySubstance
9 Stirring
10 Liquid
1 PressureControlValve
2 Device
3 Cylinder <---
4 Pressure
5 FluidPowerDomain
6 DirectionalControlValve
7 VolumeControlValve
8 ReliefValve
9 FluidPower
10 Valve
1 beforeTaxIncome
2 afterTaxIncome
3 income <---
4 taxDeferredIncome
5 lender
6 loanForPurchase
7 customer
8 inflationRate
9 monetaryValue
10 issuedBy
1 MattressManufacturing
2 GrantmakingFoundations
3 Information <---
4 HardwareStores
5 HardwareWholesalers
6 SpecialDieAndToolDieSetJigAndFixtureManufacturing
7 OnLineInformationServices
8 AllOtherInformationServices
9 HardwareManufacturing
10 InformationServices
1 CollegesUniversitiesAndProfessionalSchools
2 JuniorColleges <---
3 BarberShops
4 ContinuingCareRetirementCommunities
5 AdministrationOfUrbanPlanningAndCommunityAndRuralDevelopment
6 AdministrationOfHousingProgramsUrbanPlanningAndCommunityDevelopment
7 CommunityHousingServices
8 CommunityCareFacilitiesForTheElderly
9 CommunityFoodAndHousingAndEmergencyAndOtherReliefServices
10 AdministrationOfEducationPrograms
1 Constructing
2 Corn <---
3 IndustrialPlant
4 FinancialService
5 Outdoors
6 GovernmentBuilding
7 Garage
8 Residence
9 PoliceFacility
10 Store
1 Launcher <---
2 ArrowProjectile
3 Bullet
4 Missile
5 ProjectileShell
6 GunBarrel
7 TouristSite
8 Motion
9 AutomaticGun
10 GunPowder
1 ProjectileLauncher <---
2 Projectile
3 Gun
4 WeaponOfMassDestruction
5 Spear
6 Sword
7 Bomb
8 Bullet
9 AutomaticGun
10 GunBarrel
1 Georgia <---
2 GeorgianLari
3 Italy
4 Kazakhstan
5 Ukraine
6 Russia
7 Portugal
8 Substance
9 Indonesia
10 Croatia
1 RiceFarming
2 Rice <---
3 CerealGrain
4 Flour
5 Whiskey
6 Baking
7 PotOrPan
8 HayFarming
9 PeanutFarming
10 PreparedFood
1 MultipolePostulate
2 CircuitTheoryDomain
3 Law <---
4 NewtonsLaw
5 ScienceDomain
6 NaturalSciencesDomain
7 Set
8 Proposition
9 Process
10 Method
1 Television <---
2 TelevisionSystem
3 Radio
4 CableTelevisionSystem
5 CommunicationRadio
6 MobileCellPhone
7 BroadcastingStation
8 Internet
9 TelevisionStation
10 GeopoliticalArea
1 Turkey <---
2 TurkishLira
3 Poultry
4 UnitedStates
5 Meat
6 Canada
7 Seed
8 Ethiopia
9 Brazil
10 AnimalSkin
1 Tire <---
2 VehicleWheel
3 Wheel
4 MaterialHandlingEquipment
5 ArtilleryGun
6 Motorcycle
7 LetterBombAttack
8 VehicleController
9 Ballot
10 Mailing
1 Watercraft <---
2 Ice
3 Submarine
4 WaterMotion
5 Water
6 SwimmingPool
7 FreshWaterArea
8 SalineSolution
9 WaterArea
10 Washing
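
Each block of ten lines above appears to be the ranked candidate list for one test, with `<---` marking the expected match. The following Python sketch shows one way such output could be summarised. It assumes exactly this line layout and that a new list starts whenever the rank resets to 1; the function names and the use of mean reciprocal rank are illustrative assumptions, not a description of what evaluation.py computes.

```python
def marked_ranks(lines):
    """Return, for each ranked list, the rank of the '<---' entry (None if absent)."""
    ranks = []
    for line in lines:
        parts = line.split()
        # Skip prompts, running headers, bare page numbers and blank lines.
        if len(parts) < 2 or not parts[0].isdigit():
            continue
        rank = int(parts[0])
        if rank == 1:
            ranks.append(None)  # a new list starts; nothing marked yet
        if parts[-1] == "<---" and ranks:
            ranks[-1] = rank
    return ranks

def mean_reciprocal_rank(ranks):
    """Average of 1/rank over all lists; lists without a marked entry count as 0."""
    if not ranks:
        return 0.0
    return sum(1.0 / r for r in ranks if r is not None) / len(ranks)

# The first three lines of the first two lists above.
sample = """\
1 NetworkResource
2 Report <---
3 ProcessTask
1 coordinate <---
2 partition
3 range
""".splitlines()

ranks = marked_ranks(sample)
mrr = mean_reciprocal_rank(ranks)
```

On this excerpt the marked entries sit at ranks 2 and 1, giving a mean reciprocal rank of 0.75.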
(=>
  (and
    (instance ?AP Appointing)
    (involvedInEvent ?AP ?AGENT))
  (causesProposition
    (and
      (hasSkill Painting ?AGENT)
      (not (employs ?X ?AGENT))
      (lastName ?LN ?AGENT))
    (and
      (employs ScottishNationalGalleryOfModernArt ?AGENT)
      (instance YourEmploymentCertificate Certificate)
      (titles ?LN YourEmploymentCertificate))))

(=>
  (and
    (instance ?EX ExpressingApproval)
    (involvedInEvent ?EX ?AGENT)
    (instance ?AGENT CognitiveAgent))
  (causesProposition
    (and
      (exists (?P ?PROCESS ?ITEM)
        (instance ?P PaintedPicture)
        (instance ?PROCESS Painting)
        (instance ?ITEM Cup)
        (represents ?P ?ITEM)
        (agent ?PROCESS ?AGENT)
        (result ?PROCESS ?P)
        (contestParticipant MugPaintingContest2010 ?AGENT)
        (involvedInEvent MugPaintingContest2010 ?P)
        (result MugPaintingContest2010 Won)
        (property ?AGENT Won)))
    (and
      (hasSkill Painting ?AGENT))))
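
The glossary defines a lexeme as a string naming a predicate, class or individual. As a rough illustration, the Python sketch below pulls such strings out of a fragment of the first axiom above. It assumes that variables start with `?` and treats a small hand-picked set of symbols as logical operators; the function name `lexemes` and the operator set are assumptions made for this example and are not part of ORS or Sigma.

```python
import re

# Logical operators to ignore; this set is an assumption chosen for the example.
OPERATORS = {"=>", "<=>", "and", "or", "not", "exists", "forall"}

def lexemes(kif_text):
    """Return distinct non-variable, non-operator symbols in order of first appearance."""
    # A token is any run of characters that is neither whitespace nor a parenthesis.
    tokens = re.findall(r"[^\s()]+", kif_text)
    seen = []
    for tok in tokens:
        if tok.startswith("?") or tok in OPERATORS:
            continue
        if tok not in seen:
            seen.append(tok)
    return seen

# Antecedent of the first axiom above.
fragment = """
(and
  (instance ?AP Appointing)
  (involvedInEvent ?AP ?AGENT))
"""
found = lexemes(fragment)
```

For this fragment the sketch yields `instance`, `Appointing` and `involvedInEvent`; it makes no attempt to tell predicates from classes or individuals.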
_________________________________________________
_________________________________________________
|: employs(scottishNationalGalleryOfModernArt,jerryTheBot).
 __________
|          |
| Goal is: | employs(scottishNationalGalleryOfModernArt,jerryTheBot)
|__________|
Translating ...
 __________
|          |
| Goal is: | employs(scottishNationalGalleryOfModernArt,jerryTheBot)
|__________|
Translating ...
answer is class(jerrysCup,drinkingCup)
expressingApproval(jerryTheBot) completed satisfactorily
 __________
|          |
| Goal is: | employs(scottishNationalGalleryOfModernArt,jerryTheBot)
|__________|
Translating ...