Documenti di Didattica
Documenti di Professioni
Documenti di Cultura
UNIV/POLTEK
DIGITAL TALENT
SCHOLARSHIP
2019
digitalent.kominfo.go.id 1
LOGO
UNIV/POLTEK
Sesi 29
Algoritma Asosiasi
Big Data Analytics
2
digitalent.kominfo.go.id
LOGO
UNIV/POLTEK
Artificial Intelligence
Machine Learning
Association Rules
digitalent.kominfo.go.id
LOGO
UNIV/POLTEK
Outline
1. Introduction
2. A two-step process
3. Applications
6. Example
7. Representation of D
8. Definitions cont’d
10. Example
digitalent.kominfo.go.id
LOGO
UNIV/POLTEK
Introduction
• Unsupervised task.
digitalent.kominfo.go.id
LOGO
UNIV/POLTEK
Applications
• Collaborative filtering
• Web organization
• Symptoms-diseases associations
• Supervised classification
digitalent.kominfo.go.id
LOGO
UNIV/POLTEK
A two-step process
Example:
B r e a d ^ B u t t e r is a strong rule
digitalent.kominfo.go.id
LOGO
UNIV/POLTEK
Definitions
• Transaction dataset: D = { ( t i d , X t i d ) / t i d e T , X t i d c I}
digitalent.kominfo.go.id
LOGO
UNIV/POLTEK
Example
D
item name tid transaction
a coffee 1 a b
b milk 2 a c
c butter 3 c d
d bread 4 bed
5 abed
I=
T=
V =
digitalent.kominfo.go.id
LOGO
UNIV/POLTEK
Example
I = {a, b , c , d }
T = {1, 2, 3, 4, 5}
digitalent.kominfo.go.id
LOGO
UNIV/POLTEK
Example
digitalent.kominfo.go.id
LOGO
UNIV/POLTEK
Example
digitalent.kominfo.go.id
LOGO
UNIV/POLTEK
Definitions cont’d
• Mapping t:
t : P (I) — P (T)
X — t(X) = {tid E T \3Xtid, ( tid, Xtid ) EV A X C X tid }
• Mapping i:
i : P (T) —— P (I)
Y — i(Y) = {x E I\V(tid, Xtid) E V, tid E Y ^ x E Xtid}
digitalent.kominfo.go.id
LOGO
UNIV/POLTEK
Definitions cont’d
F = { X C I| supp(X)> MinSupp}
digitalent.kominfo.go.id
LOGO
UNIV/POLTEK
Example
MinSupp=40%
digitalent.kominfo.go.id
LOGO
UNIV/POLTEK
BFS and DFS
abcd
digitalent.kominfo.go.id
LOGO
UNIV/POLTEK
Apriori pseudo-algorithm
digitalent.kominfo.go.id
LOGO
UNIV/POLTEK
Example
D
tid transaction
1 ab
Minsupp=2/5 (40%)
2 ac
3 cd
4 bcd
5 abcd
Ci Fi
Itemset Support Itemset Support
a 3/5 a 3/5
b 3/5 b 3/5
c 4/5 c 4/5
d 3/5 d 3/5
C2 C2
F2
Itemset Itemset Support
ab Itemset Support
ab 2/5
ac ab 2/5
ad ac 2/5
ac 2/5
bc ad 1/5 bc 2/5
bd bc 2/5
cd bd 2/5
bd 2/5 cd 3/5
cd 3/5
C3 C3
Scan
F3
Itemset Itemset Support
Itemset Support
abc -T abc 1/5
of D bcd 2/5
bcd bcd 2/5
digitalent.kominfo.go.id
LOGO
UNIV/POLTEK
Apriori bottleneck
1. Billions of transactions,
3. Tera-bytes of data.
digitalent.kominfo.go.id
LOGO
UNIV/POLTEK
Representation of D
Row-wise Column-wise
abcd
1 1 2 3
2 4 3 4
5 5 4 5
5
Boolean
a b c d
1 1 1 0 0
2 1 0 1 0
3 0 0 1 1
4 0 1 1 1
5 1 1 1 1
digitalent.kominfo.go.id
LOGO
UNIV/POLTEK
Definitions cont’d
• Generate rules:
(l - C) ^ C
,, . . supp(l)
conf((l — C) ^ C) = > MinConf
supp(l — C)
Property
digitalent.kominfo.go.id
LOGO
UNIV/POLTEK
Example
Minconf=60%
digitalent.kominfo.go.id
LOGO
UNIV/POLTEK
Example
Minconf=60%
digitalent.kominfo.go.id
LOGO
UNIV/POLTEK
Probabilistic Interpretation
Brin et al. 97
R: A —► C
P(A A C)
conf (A ^ C) = P(C|A)
P(A)
digitalent.kominfo.go.id
LOGO
UNIV/POLTEK
Support-Confidence: cons
Strong rule?
digitalent.kominfo.go.id
LOGO
UNIV/POLTEK
Support-Confidence: cons
digitalent.kominfo.go.id
LOGO
UNIV/POLTEK
Other evaluation Measures
P (A A C) s u p p ( A U C)
Interest(A ^ C)
P(A) x P(C) supp(A) x supp(C)
conf (A ^ C) conf (C ^ A)
Interest(A ^ C)
supp(C ) supp(A)
digitalent.kominfo.go.id
LOGO
UNIV/POLTEK
Other evaluation Measures
coffee coffee
tea 0.89 2
tea 1.03 0.66
digitalent.kominfo.go.id
LOGO
UNIV/POLTEK
Multi-dimensional rules
• One-dimensional rules:
• Multi-dimensional rules:
digitalent.kominfo.go.id
LOGO
UNIV/POLTEK
Post-processing of AR
digitalent.kominfo.go.id
LOGO
UNIV/POLTEK
Implementations
digitalent.kominfo.go.id
LOGO
UNIV/POLTEK
FP algorithms
HMine, Cofi)
digitalent.kominfo.go.id
LOGO
UNIV/POLTEK
Uniform notion of item
to literals:
item = (a t t r i b u t e , v a l u e )
1. categorical, forex. ( c o l o r , b l u e ) ,
digitalent.kominfo.go.id
LOGO
UNIV/POLTEK
Example
D : people
id age married? #cars
1 23 no 1
2 25 yes 1
3 29 no 0
4 34 yes 2
5 38 yes 2
Examples of frequent itemsets
itemset support
(age, 20..29) 3
(age, 30..39) 2
(married?, yes) 3
(married?, no) 2
(#cars, 1) 2
(#cars, 2) 2
(age, 30..39),(married?, yes)
Examples of rules 2
rule support confidence
(age, 30..39) et (married?, yes) —> (#cars, 2) 40% 100%
(age, 20..29) —> (#cars, 1) 60% 66.6%
digitalent.kominfo.go.id
LOGO
UNIV/POLTEK
Quantitative AR_
Question: Mining Quantitative AR is not a simple extension of
mining categorical AR. why?
• Infinite search space: In Boolean AR, the Ariori property
allows to prune the search space efficiently, but we do explore
the whole space of hypothesis (lattice of itemsets), which is
IMPOSSIBLE for Quantitative AR.
digitalent.kominfo.go.id
LOGO
UNIV/POLTEK
Approaches to mine QARs
• Discretization-based approaches
• Distribution-based approaches
• Optimization-based approaches
digitalent.kominfo.go.id
LOGO
UNIV/POLTEK
Approaches to mine QARs
Discretization-based approaches
• A pre-processing step
• Lent et al., 1997; Miller and Yang, 1997; Srikant and Agrawal,
1996; Wang et al., 1998
digitalent.kominfo.go.id
LOGO
UNIV/POLTEK
Approaches to mine QARs
Distribution-based approaches
digitalent.kominfo.go.id
LOGO
UNIV/POLTEK
Approaches to mine QARs
Optimization-based approaches
• Numerical attributes are optimized during the mining process
• Fukuda et al., 96, Rastogi and Shim 99, Brin et al. 2003.
Techniques inspired from image segmentation.
Gain(A ^ B) = Supp(AB) — MinConf * Supp(A)
Form of the rules restricted to 1 or 2 numerical attributes.
digitalent.kominfo.go.id
LOGO
UNIV/POLTEK
Approaches to mine QARs
Optimization-based approaches
U i
Domain of Attribute^
ii ui
l i _ui
ii ui l i _u
li ui
i i u ii
i
ui
1 mutation.
H ui
digitalent.kominfo.go.id
LOGO
UNIV/POLTEK
Approaches to mine QARs
digitalent.kominfo.go.id
LOGO
UNIV/POLTEK
QuantMiner
http://quantminer.github.io/QuantMiner/
digitalent.kominfo.go.id
LOGO
UNIV/POLTEK
QuantMiner
digitalent.kominfo.go.id
LOGO
UNIV/POLTEK
References
digitalent.kominfo.go.id
LOGO
UNIV/POLTEK
IKUTI KAMI
digitalent.kominfo
digitalent.kominfo
DTS_kominfo
Digital Talent Scholarship 2019
digitalent.kominfo.go.id 45
digitalent.kominfo.go.id