Documenti di Didattica
Documenti di Professioni
Documenti di Cultura
Completeness:
never breaks a long pattern of any transaction
tree
Method
For each item, construct its conditional pattern-base,
FP-tree
Until the resulting FP-tree is empty, or it contains only
one path (single path will generate all the combinations of its
sub-paths, each of which is a frequent pattern)
Header Table {}
Conditional pattern bases
Item frequency head f:4 c:1
f 4 item cond. pattern base
c 4 c:3 b:1 b:1 c f:3
a 3
b 3 a fc:3
a:3 p:1
m 3 b fca:1, f:1, c:1
p 3 m:2 b:1 m fca:2, fcab:1
p:2 m:1 p fcam:2, cb:1
February 17, 2020 Data Mining: Concepts and Techniques 22
Properties of FP-tree for Conditional
Pattern Base Construction
Node-link property
For any frequent item ai, all the possible frequent
patterns that contain ai can be obtained by following
ai's node-links, starting from ai's head in the FP-tree
header
Prefix path property
To calculate the frequent patterns for a node ai in a
path P, only the prefix sub-path of ai in P need to be
accumulated, and its frequency count should carry the
same count as node ai.
February 17, 2020 Data Mining: Concepts and Techniques 23
Step 2: Construct Conditional FP-tree
c:3
f:3
am-conditional FP-tree
c:3 {}
Cond. pattern base of “cm”: (f:3)
a:3 f:3
m-conditional FP-tree
cm-conditional FP-tree
{}
m-conditional FP-tree
February 17, 2020 Data Mining: Concepts and Techniques 27
Principles of Frequent Pattern
Growth
Pattern growth property
Let be a frequent itemset in DB, B be 's
conditional pattern base, and be an itemset in B.
Then is a frequent itemset in DB iff is
frequent in B.
“abcdef ” is a frequent pattern, if and only if
“abcde ” is a frequent pattern, and
“f ” is frequent in the set of transactions containing
“abcde ”
70
Run time(sec.)
60
50
40
30
20
10
0
0 0.5 1 1.5 2 2.5 3
Support threshold(%)
100
Runtime (sec.)
80
60
40
20
0
0 0.5 1 1.5 2
Support threshold (%)
February 17, 2020 Data Mining: Concepts and Techniques 31
Presentation of Association Rules
(Table Form )
Level-by-level independent
Level-cross filtering by k-itemset
Level-cross filtering by single item
Controlled level-cross filtering by single item
Back
February 17, 2020 Data Mining: Concepts and Techniques 40
Reduced Support
Multi-level mining with reduced support
Level 1 Milk
min_sup = 5%
[support = 10%]
Back
February 17, 2020 Data Mining: Concepts and Techniques 41
Multi-level Association: Redundancy
Filtering
1. Binning
2. Find frequent
predicateset
3. Clustering
4. Optimize
February 17, 2020 Data Mining: Concepts and Techniques 51
Limitations of ARCS
· confidence
Succinctness
Anti-monotonicity Monotonicity
Convertible constraints
Inconvertible constraints
February 17, 2020 Data Mining: Concepts and Techniques 69
Property of Constraints:
Anti-Monotone
S v, { , , } yes
vS no
SV no
SV yes
SV partly
min(S) v no
min(S) v yes
min(S) v partly
max(S) v yes
max(S) v no
max(S) v partly
count(S) v yes
count(S) v no
count(S) v partly
sum(S) v yes
sum(S) v no
sum(S) v partly
avg(S) v, { , , } convertible
(frequent constraint) (yes)
February 17, 2020 Data Mining: Concepts and Techniques 71
Example of Convertible Constraints:
Avg(S) V
min(S.Price ) v is succinct
Optimization:
If C is succinct, then C is pre-counting prunable. The
S v, { , , } Yes
vS yes
S V yes
SV yes
SV yes
min(S) v yes
min(S) v yes
min(S) v yes
max(S) v yes
max(S) v yes
max(S) v yes
count(S) v weakly
count(S) v weakly
count(S) v weakly
sum(S) v no
sum(S) v no
sum(S) v no
avg(S) v, { , , } no
(frequent constraint) (no)
February 17, 2020 Data Mining: Concepts and Techniques 74
Chapter 6: Mining Association
Rules in Large Databases
Association rule mining
Mining single-dimensional Boolean association rules
from transactional databases
Mining multilevel association rules from transactional
databases
Mining multidimensional association rules from
transactional databases and data warehouse
From association mining to correlation analysis
Constraint-based association mining
Summary