
Objective:

In Data Mining, the usefulness of association rules is strongly limited by the huge number of delivered rules. To overcome this drawback, several methods have been proposed in the literature, such as concise itemset representations, redundancy reduction, and post-processing. However, being generally based on statistical information, most of these methods do not guarantee that the extracted rules are interesting for the user. Thus, it is crucial to help the decision-maker with an efficient post-processing step that reduces the number of rules. This paper proposes a new interactive approach to prune and filter discovered rules. First, we propose to use ontologies in order to improve the integration of user knowledge in the post-processing task. Second, we propose the Rule Schema formalism, which extends the specification language proposed by Liu et al. for user expectations. Furthermore, an interactive framework is designed to assist the user throughout the analysis task. Applying our new approach to voluminous sets of rules, we were able, by integrating domain expert knowledge in the post-processing step, to reduce the number of rules to several dozen or fewer. Moreover, the quality of the filtered rules was validated by the domain expert at various points in the interactive process.

PROBLEM DEFINITION

[Figure: Ontology A and Ontology B. Node labels in Ontology A: Account, Personal Account, Current, Savings, Corporate Account, Salary account, Employee Account, Corporate FD Account, Personal. Node labels in Ontology B: Account, Personal Account, Savings Account, Corporate Account, Employee Account, Current Account, Organization, FD, Company.]

Deployment Diagram

Algorithm and Description

Domain knowledge, defined as the user's information about the database, is described in our framework using ontologies. Compared with the taxonomies used in the specification language proposed in [12], ontologies offer a richer knowledge representation model: they extend the single is-a relation of a taxonomy with a set R of relations. In addition, axioms permit new concepts to be defined from information already present in the ontology.

In this scenario, it is fundamental to connect ontology concepts to the database, each concept being connected to one or several items of I, the set of items in the database. To this end, we consider three types of concepts: leaf-concepts, generalized concepts derived from the subsumption relation, and restriction concepts, which only ontologies provide.

The leaf-concepts ($C_0$) are defined as $C_0 = \{c_0 \in C \mid \nexists\, c' \in C,\ c' \leq c_0\}$, that is, the concepts that subsume no other concept. They are connected to the database in the simplest way, each concept of $C_0$ being associated with one item in the database: $f_0 : C_0 \to I,\ \forall c_0 \in C_0,\ \exists i \in I,\ i = f_0(c_0)$.

Generalized concepts ($C_1$) are the concepts that subsume other concepts in the ontology. A generalized concept is connected to the database through its subsumed concepts. This means that, recursively, only the leaf-concepts subsumed by the generalized concept contribute to its database connection.

Restriction concepts are described using logical expressions defined over items and are organized in the $C_2$ subset. As a first attempt, we base the description of these concepts on restrictions over properties available in description logics. Thus, a restriction concept can be connected to a disjunction of items.
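
The three kinds of concept-to-item connections described above can be illustrated with a minimal Python sketch. It is not the framework's implementation: the is-a hierarchy, the item names, the `pays_interest` property, and all function names are hypothetical, chosen only to show how a leaf-concept maps to one item, how a generalized concept collects the items of the leaf-concepts it subsumes, and how a restriction concept selects a disjunction of items.

```python
from typing import Callable, Dict, Set

# Hypothetical toy ontology: child -> parent ("is-a") edges.
# The concept names and the hierarchy are assumptions made for illustration.
IS_A: Dict[str, str] = {
    "PersonalAccount": "Account",
    "CorporateAccount": "Account",
    "Savings": "PersonalAccount",
    "Current": "PersonalAccount",
    "EmployeeAccount": "CorporateAccount",
}

# f0: each leaf-concept is associated with exactly one item of I
# (item names are invented for this sketch).
F0: Dict[str, str] = {
    "Savings": "item_savings",
    "Current": "item_current",
    "EmployeeAccount": "item_employee",
}

# Hypothetical item properties used to express a restriction concept.
ITEM_PROPS: Dict[str, Dict[str, bool]] = {
    "item_savings": {"pays_interest": True},
    "item_current": {"pays_interest": False},
    "item_employee": {"pays_interest": True},
}


def all_concepts() -> Set[str]:
    """Every concept that appears as a child or a parent in IS_A."""
    return set(IS_A) | set(IS_A.values())


def directly_subsumed(concept: str) -> Set[str]:
    """Concepts whose direct parent is `concept`."""
    return {child for child, parent in IS_A.items() if parent == concept}


def leaf_concepts() -> Set[str]:
    """C0: concepts that subsume no other concept."""
    return {c for c in all_concepts() if not directly_subsumed(c)}


def items_of(concept: str) -> Set[str]:
    """Items connected to a concept.

    A leaf-concept yields its single item through F0; a generalized
    concept recursively collects the items of the leaves it subsumes.
    """
    children = directly_subsumed(concept)
    if not children:
        return {F0[concept]}
    items: Set[str] = set()
    for child in children:
        items |= items_of(child)
    return items


def restriction_items(predicate: Callable[[Dict[str, bool]], bool]) -> Set[str]:
    """C2: a restriction concept, read as the disjunction of all items
    whose properties satisfy the given logical condition."""
    return {item for item, props in ITEM_PROPS.items() if predicate(props)}
```

With this toy data, `items_of("Account")` gathers the items of all three leaf-concepts, while `restriction_items(lambda p: p["pays_interest"])` selects only the items flagged as interest-bearing. Storing only child-to-parent edges keeps the sketch short; a real ontology would also carry the other relations of R and its axioms.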