Luk Yin Shun
Chapter 12 Data Mining and Its Applications
12.1 Data Mining
Data Query:
o The output is the query result, which is a subset of the current database(s)
o The output is precise
Data Mining:
o The output is new knowledge, which cannot be found directly in the current database(s)
o The output is fuzzy
Step: Description
Selection: Selecting an appropriate subset of data on which to perform knowledge discovery
Pre-processing: A data cleaning step in which inappropriate data items are removed
Transformation: Transforming the data into a more usable format, including reducing the effective number of data items
Data Mining: Using appropriate data mining techniques to extract patterns from the data
Interpretation: Interpreting the extracted patterns in order to transform them into knowledge, which in turn facilitates ...
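The five steps above can be sketched as a small pipeline. This is only a minimal illustration, not a standard implementation: the records in RAW, the function names, and the choice of counting item pairs as the "data mining" step are all assumptions made up for this example.

```python
from collections import Counter
from itertools import combinations

# Hypothetical raw records: (customer_id, item). None or "" marks bad data.
RAW = [
    (1, "Bread"), (1, "milk"), (2, "bread"), (2, None),
    (2, "milk"), (3, "bread"), (3, ""), (4, "eggs"), (None, "milk"),
]

def select(records, customer_ids):
    """Selection: keep only the subset of records relevant to the task."""
    return [r for r in records if r[0] in customer_ids]

def preprocess(records):
    """Pre-processing: drop records with missing or empty fields."""
    return [(c, i) for c, i in records if c is not None and i]

def transform(records):
    """Transformation: normalise case and group items into one basket per customer."""
    baskets = {}
    for customer, item in records:
        baskets.setdefault(customer, set()).add(item.lower())
    return baskets

def mine(baskets):
    """Data mining: count how often each pair of items occurs in the same basket."""
    pairs = Counter()
    for items in baskets.values():
        pairs.update(combinations(sorted(items), 2))
    return pairs

def interpret(pairs, n_baskets):
    """Interpretation: turn raw counts into readable statements."""
    return [f"{a} and {b} appear together in {count}/{n_baskets} baskets"
            for (a, b), count in pairs.most_common()]

def run_pipeline(records, customer_ids):
    """Run the five steps in order on the given records."""
    baskets = transform(preprocess(select(records, customer_ids)))
    return interpret(mine(baskets), len(baskets))

print(run_pipeline(RAW, {1, 2, 3}))
```

Each function corresponds to one step, so the number of data items shrinks as the data moves through the pipeline: nine raw records become three baskets, and the baskets become a single statement.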
Descriptive Techniques
Clustering:
o unsupervised learning: a machine learning technique without the manual class labelling of inputs
o No predefined classes
Association rules:
o refers to finding frequent patterns, associations or correlations among data sets
o helps discover relations between variables in a data warehouse
Summarization
o the technique that investigates a collection of patterns from raw data and generates a compact description for them
Sequence Discovery
o finds all the frequent patterns that reach a specified support threshold
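The idea of a frequent pattern reaching a support threshold, which underlies both association rules and sequence discovery, can be sketched as follows. This is a minimal brute-force illustration, not an efficient algorithm such as Apriori. The transactions and the function names are invented for this example, and it treats patterns as unordered itemsets; sequence discovery would additionally have to respect the order of events.

```python
from itertools import combinations

# Hypothetical transactions, invented for illustration.
TRANSACTIONS = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]

def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)

def frequent_itemsets(transactions, min_support, max_size=2):
    """Brute-force search for itemsets whose support reaches min_support."""
    items = sorted(set().union(*transactions))
    result = {}
    for size in range(1, max_size + 1):
        for candidate in combinations(items, size):
            s = support(candidate, transactions)
            if s >= min_support:
                result[candidate] = s
    return result

def confidence(antecedent, consequent, transactions):
    """Support of both sides together divided by support of the antecedent."""
    both = set(antecedent) | set(consequent)
    return support(both, transactions) / support(antecedent, transactions)

print(frequent_itemsets(TRANSACTIONS, min_support=0.5))
print(confidence({"butter"}, {"bread"}, TRANSACTIONS))
```

With a threshold of 0.5, the pair (butter, milk) is discarded because it occurs in only one of the four transactions, while the rule "butter implies bread" has confidence 1.0 because every transaction containing butter also contains bread.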
Application Examples: