Association Rule Mining
Data Mining 2/2553. CE, KMITL
1. Trace the results of using the Apriori algorithm on the grocery shop data with a support
threshold of 33.34% and a confidence threshold of 60%. Show the candidate and frequent itemsets
for each database scan. Enumerate all the final frequent itemsets. Also indicate the association
rules that are generated, highlight the strong ones, and sort them by confidence.
Transaction ID   Items
T1               Hot Dogs, Buns, Ketchup
T2               Hot Dogs, Buns
T3               Hot Dogs, Coke, Chips
T4               Chips, Coke
T5               Chips, Ketchup
T6               Hot Dogs, Coke, Chips
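A hand trace of problem 1 can be checked against a small program. The following is a minimal sketch of the level-wise Apriori search (scan, join, prune) over the grocery transactions, not a reference implementation; the function names `support` and `apriori` are illustrative choices, and support is measured as a fraction of the six transactions.

```python
from itertools import combinations

# Grocery-shop transactions T1..T6 from the table above.
transactions = [
    {"Hot Dogs", "Buns", "Ketchup"},
    {"Hot Dogs", "Buns"},
    {"Hot Dogs", "Coke", "Chips"},
    {"Chips", "Coke"},
    {"Chips", "Ketchup"},
    {"Hot Dogs", "Coke", "Chips"},
]

def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    return sum(itemset <= t for t in transactions) / len(transactions)

def apriori(transactions, min_support):
    """Return {frozenset: support} for all frequent itemsets (level-wise search)."""
    items = sorted({item for t in transactions for item in t})
    candidates = [frozenset([item]) for item in items]  # C1: every single item
    frequent = {}
    k = 1
    while candidates:
        # One database scan: keep the candidates that meet the threshold (Lk).
        level = {c: support(c, transactions) for c in candidates}
        level = {c: s for c, s in level.items() if s >= min_support}
        frequent.update(level)
        # Join step: unions of two frequent k-itemsets that have exactly k+1 items.
        joined = {a | b for a in level for b in level if len(a | b) == k + 1}
        # Prune step: every k-subset of a candidate must itself be frequent.
        candidates = [c for c in joined
                      if all(frozenset(s) in level for s in combinations(c, k))]
        k += 1
    return frequent

frequent_itemsets = apriori(transactions, min_support=0.3334)
for itemset, s in sorted(frequent_itemsets.items(),
                         key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), round(s, 3))
```

Note that a threshold of 33.34% sits just above 2/6, so an itemset must appear in at least three of the six transactions to count as frequent.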
2. Trace the results of using the Apriori algorithm on the computer shop data with a support
threshold of 70% and a confidence threshold of 80%. Show the candidate and frequent itemsets
for each database scan. Enumerate all the final frequent itemsets. Also indicate the association
rules that are generated, highlight the strong ones, and sort them by confidence.
Transaction ID   Items
T1               Tripod, Lens, Bag
T2               Camera, Lens, Bag
T3               Camera, Tripod, Lens, Memory card
T4               Camera, Tripod, Lens, Bag
T5               Lens, Memory card, Bag
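The rule-generation half of problem 2 can be sketched in the same spirit. The code below is an illustrative sketch only: it finds frequent itemsets by brute-force enumeration (acceptable for five transactions, but without Apriori's pruning), then derives every rule whose confidence meets the threshold and sorts the rules by confidence. The helper names `count` and `rules_from` are assumptions made for this example, and item names are spelled out as "Memory card" and "Bag".

```python
from itertools import combinations

# Computer-shop transactions T1..T5 from the table above.
transactions = [
    {"Tripod", "Lens", "Bag"},
    {"Camera", "Lens", "Bag"},
    {"Camera", "Tripod", "Lens", "Memory card"},
    {"Camera", "Tripod", "Lens", "Bag"},
    {"Lens", "Memory card", "Bag"},
]

def count(itemset):
    """Number of transactions containing every item in itemset."""
    return sum(set(itemset) <= t for t in transactions)

def rules_from(itemset, min_confidence):
    """Yield (lhs, rhs, confidence) for each strong rule lhs -> rhs within itemset."""
    items = sorted(itemset)
    for r in range(1, len(items)):
        for lhs in combinations(items, r):
            rhs = tuple(i for i in items if i not in lhs)
            # confidence = support(lhs and rhs together) / support(lhs)
            confidence = count(items) / count(lhs)
            if confidence >= min_confidence:
                yield lhs, rhs, confidence

# Brute-force enumeration of frequent itemsets (support threshold 70%).
all_items = sorted(set().union(*transactions))
n = len(transactions)
frequent = [c for k in range(1, len(all_items) + 1)
            for c in combinations(all_items, k)
            if count(c) / n >= 0.70]

# Strong rules (confidence threshold 80%), highest confidence first.
strong = [rule for itemset in frequent for rule in rules_from(itemset, 0.80)]
strong.sort(key=lambda rule: rule[2], reverse=True)
for lhs, rhs, confidence in strong:
    print(lhs, "->", rhs, f"confidence={confidence:.2f}")
```

Computing confidence from raw counts rather than from rounded support fractions avoids spurious misses when a rule sits exactly on the threshold.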
3. Describe the importance of the support and confidence thresholds in finding association rules.
What would be the most appropriate values for them?
Association Rule Mining with WEKA
Apriori works with categorical values only. Therefore, if a dataset contains numeric attributes,
they need to be converted into nominal ones before applying the Apriori algorithm. Hence, data
preprocessing must be performed. Repeat homework 2 (Data Preprocessing) if you don't know how to
deal with numeric-to-nominal conversion.
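To make the numeric-to-nominal idea concrete, here is a minimal sketch of equal-width binning, one common way to discretize a numeric attribute. It is not WEKA's own code (in WEKA the unsupervised Discretize filter on the Preprocess tab serves this purpose); the function name `discretize`, the labels, and the sample values are all hypothetical.

```python
def discretize(values, labels):
    """Map numeric values to nominal labels using equal-width bins."""
    lo, hi = min(values), max(values)
    # Fall back to width 1.0 when all values are equal, to avoid dividing by zero.
    width = (hi - lo) / len(labels) or 1.0
    nominal = []
    for v in values:
        # Clamp so the maximum value lands in the last bin, not one past it.
        index = min(int((v - lo) / width), len(labels) - 1)
        nominal.append(labels[index])
    return nominal

# Hypothetical numeric attribute (not taken from any course data set).
temperatures = [64, 68, 70, 75, 81, 85]
print(discretize(temperatures, ["cool", "mild", "hot"]))
```

Equal-width bins are simple but sensitive to outliers; equal-frequency binning is a common alternative when values cluster unevenly.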
weather.nominal.arff
1. Load weather.nominal.arff into a text editor and analyze the attribute types and values.
2. Is this dataset appropriate for association rule mining? If not, modify it. You may use
WEKA's Preprocess capability.
3. Apply the Apriori algorithm to the dataset.
a. Go to the Associate tab.
b. Choose Apriori as the associator.
c. Accept all default values. You may click on the More button to see the synopsis for the
different parameters.
d. Click on the Start button to run.
4. Study the output in the right panel. It should look similar to the following:
Apriori
=======
Minimum support: 0.15
Minimum metric : 0.9
Number of cycles performed: 17
Generated sets of large itemsets:
Size of set of large itemsets L(1): 12
Size of set of large itemsets L(2): 47
Size of set of large itemsets L(3): 39
Size of set of large itemsets L(4): 6
Best rules found:
1.
2.
3.
4.
5.
6.
7.
8.
9.
10.
5. Can you explain what the output says?
6. Try varying the parameter values, for example minimum support, minimum confidence, and
number of rules.
7. What do you find?
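Steps 1 and 2 come down to checking whether every attribute in the ARFF header is nominal. As a rough sketch, the header can also be inspected programmatically: in ARFF, a nominal attribute lists its values in braces, while numeric attributes are declared with a type keyword such as `numeric`. The parser below is a simplification (it does not handle quoted attribute names containing spaces), and the sample header is hypothetical rather than the contents of weather.nominal.arff.

```python
import re

def arff_attributes(text):
    """Return [(name, type_spec)] from the @attribute lines of an ARFF header."""
    attributes = []
    for line in text.splitlines():
        line = line.strip()
        if line.lower().startswith("@data"):
            break  # the header ends where the data section starts
        match = re.match(r"@attribute\s+(\S+)\s+(.+)", line, re.IGNORECASE)
        if match:
            attributes.append((match.group(1), match.group(2).strip()))
    return attributes

def is_nominal(type_spec):
    """Nominal attributes list their values in braces, e.g. {yes, no}."""
    return type_spec.startswith("{")

# Hypothetical header fragment, used only to exercise the parser.
sample = """@relation example
@attribute colour {red, green, blue}
@attribute weight numeric
@data
red,1.5
"""
for name, spec in arff_attributes(sample):
    print(name, "nominal" if is_nominal(spec) else spec)
```

Any attribute reported as non-nominal would need to be discretized or removed before Apriori is applied.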
WEKA's Apriori (ref: web.mac.com)
The default values for Number of rules, the decrease for Minimum support (delta factor), and
minimum Confidence are 10, 0.05, and 0.9. Rule Support is the proportion of examples covered by
both the LHS and the RHS, while Confidence is the proportion of examples covered by the LHS that
are also covered by the RHS. So if a rule's LHS and RHS together cover 50% of the cases, the rule
has 0.5 support; if the LHS of a rule covers 200 cases and the RHS covers 50 of these, the
confidence is 0.25. With default settings Apriori tries to generate 10 rules by starting with a
minimum support of 100% and iteratively decreasing support by the delta factor until the lower
bound on minimum support (0.1 by default) is reached or the required number of rules with at least
minimum confidence has been generated. If we examine Weka's output, a Minimum support of 0.15
indicates the minimum support reached in order to generate the 10 rules with the specified minimum
metric, here a confidence of 0.9. The number of itemsets of each size is displayed; e.g. there
are 6 four-item sets having the required minimum support. By default rules are sorted by
confidence, and any ties are broken based on support. The number preceding ==> indicates the
number of cases covered by the LHS, and the value following the rule is the number of cases
covered by both the LHS and the RHS. The value in parentheses is the rule's confidence.
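The arithmetic in the paragraph above can be sketched in a few lines. This is a simplified model rather than WEKA's actual loop: `support_schedule` and `confidence` are hypothetical helper names, and the lower bound of 0.1 is taken as WEKA's default for the lower bound on minimum support.

```python
def support_schedule(upper=1.0, lower=0.1, delta=0.05):
    """Yield the minimum-support values from upper down to lower in delta steps."""
    steps = round((upper - lower) / delta)
    for i in range(steps + 1):
        # Round to two decimals to hide floating-point noise such as 0.1499999.
        yield round(upper - i * delta, 2)

def confidence(lhs_count, both_count):
    """Proportion of LHS-covered cases that are also covered by the RHS."""
    return both_count / lhs_count

schedule = list(support_schedule())
print(schedule.index(0.15))   # number of delta-sized decreases from 1.0 to 0.15
print(confidence(200, 50))    # the worked example from the text
```

Going from 1.0 down to 0.15 in steps of 0.05 takes 17 decreases, which is consistent with the "Number of cycles performed: 17" line in the sample output.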
bank.arff
1. Load bank.arff into a text editor and analyze the attribute types and values.
2. Is this dataset appropriate for association rule mining? If not, modify it. You may use
WEKA's Preprocess capability.
3. Apply the Apriori algorithm to the dataset.
4. Study the output in the right panel.
5. Check the output for various different sets of parameters.
6. Is it what you expected?
marketbasket.arff
1. Perform similar steps against marketbasket.arff.
You don't have to turn in anything. However, be prepared to discuss your results and findings in
class individually. I will randomly call on you to give explanations.