
G.M. VEDAK INSTITUTE OF TECHNOLOGY, TALA


DEPARTMENT OF INFORMATION TECHNOLOGY
A
Final Year Project
On
“Data Mining Using Association Rule Based On APRIORI Algorithm”
Under the guidance of
Prof. R. Singh
Prepared by:
1. Sushant Joshi
2. Rachna Rai
3. Rushikesh Sawant
4. Shubham Shirke
 Contents
1. Introduction
2. Literature Survey
3. Existing System
4. Problem Statement
5. Proposed System
6. Requirements
7. Advantages
8. Diagrams
9. Conclusion
 Introduction
• Data mining is an important method for discovering hidden (novel), useful, valid and
understandable knowledge in massive databases.
• Data mining is the process of analyzing data from different perspectives and summarizing
it into useful information that can be used to predict future trends or performance.
• The ultimate goal of data mining is to recognize meaningful patterns and make predictions.
Association mining is an important component of data mining.
• Association Mining: Association mining is one of the most popular approaches to data mining.
It uses association rules, an important class of methods for finding regularities/patterns
in data.
• Association mining has been used in many application domains. One of the best known is
business, where discovering purchase patterns or associations between products is very
useful for decision making and effective marketing.
• Definition of an Association Rule: Association rule mining involves uncovering unknown
interdependencies in the data and finding the rules that relate items to each other.
Association rules were originally introduced for point-of-sale (POS) systems in supermarkets.
The left-hand side of a rule is called the antecedent. The right-hand side of a rule is
called the consequent.
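As a minimal illustration of the antecedent/consequent terminology, the sketch below computes the two
standard measures of a rule, support and confidence, on a tiny made-up basket data set. The class name
`RuleMetrics`, its method names and the sample transactions are assumptions made for illustration; they
are not taken from the project.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class RuleMetrics {

    // Number of transactions that contain every item of the given itemset.
    static int supportCount(List<Set<String>> transactions, Set<String> itemset) {
        int count = 0;
        for (Set<String> t : transactions) {
            if (t.containsAll(itemset)) {
                count++;
            }
        }
        return count;
    }

    // support(A -> C): fraction of all transactions that contain A and C together.
    static double support(List<Set<String>> transactions,
                          Set<String> antecedent, Set<String> consequent) {
        Set<String> both = new HashSet<>(antecedent);
        both.addAll(consequent);
        return (double) supportCount(transactions, both) / transactions.size();
    }

    // confidence(A -> C): among transactions containing A, the fraction that also contain C.
    static double confidence(List<Set<String>> transactions,
                             Set<String> antecedent, Set<String> consequent) {
        int antecedentCount = supportCount(transactions, antecedent);
        if (antecedentCount == 0) {
            return 0.0; // rule never applies
        }
        Set<String> both = new HashSet<>(antecedent);
        both.addAll(consequent);
        return (double) supportCount(transactions, both) / antecedentCount;
    }

    // Hypothetical sample data: four supermarket baskets.
    static List<Set<String>> sampleTransactions() {
        return List.of(
                Set.of("bread", "milk"),
                Set.of("bread", "butter", "milk"),
                Set.of("bread", "butter"),
                Set.of("milk"));
    }

    public static void main(String[] args) {
        List<Set<String>> tx = sampleTransactions();
        Set<String> antecedent = Set.of("bread"); // left-hand side
        Set<String> consequent = Set.of("milk");  // right-hand side
        System.out.println("support    = " + support(tx, antecedent, consequent));
        System.out.println("confidence = " + confidence(tx, antecedent, consequent));
    }
}
```

For the rule bread -> milk on this sample, two of the four baskets contain both items (support 2/4) and
two of the three baskets with bread also contain milk (confidence 2/3).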
Literature Survey
• One work proposes an efficient algorithm that integrates the confidence measure into the process
of mining frequent item sets, which may substantially improve the performance of association rule
mining by reducing the search space. The experimental results show that the proposed algorithm is
effective in reducing the number of discovered rules compared with the Apriori algorithm.
• Many algorithms have been proposed that mine association rules using support and confidence as
constraints. One method, based on the support value, improves the performance of the Apriori
algorithm by minimizing the number of candidates generated and removing infrequent candidates at a
checkpoint, which in turn reduces the storage and time required to calculate the support of
candidates.
• Another work proposes an improved algorithm based on a combination of data division and dynamic
item-set counting. It addresses the two main problems of the classical Apriori algorithm: first,
the repeated scanning of the transactional database, and second, the generation of a large number
of candidate sets. In data division, the transactional database is divided into n parts that do
not intersect each other.
• In the first scan, all the frequent sets of each division are mined; these are called local
frequent sets. In the second scan, the whole database is scanned again to obtain the support of
all candidate item-sets and then decide the global frequent item-sets. After data division,
dynamic item-set counting is used to decide the candidate item-sets before each database scan.
So the whole process needs only two scans of the entire database.
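The two-scan idea can be sketched roughly as follows. This is a simplified illustration under stated
assumptions, not the surveyed algorithm itself: local frequent item-sets are found by brute-force subset
enumeration instead of Apriori, dynamic item-set counting is left out, and the names `PartitionMining`,
`countSubsets` and `globalFrequent` as well as the sample data are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class PartitionMining {

    // Brute-force stand-in for a local miner: counts every non-empty subset of each transaction.
    // Only suitable for short transactions (2^n subsets per transaction).
    static Map<Set<String>, Integer> countSubsets(List<Set<String>> transactions) {
        Map<Set<String>, Integer> counts = new HashMap<>();
        for (Set<String> t : transactions) {
            List<String> items = new ArrayList<>(t);
            for (int mask = 1; mask < (1 << items.size()); mask++) {
                Set<String> subset = new HashSet<>();
                for (int i = 0; i < items.size(); i++) {
                    if ((mask & (1 << i)) != 0) {
                        subset.add(items.get(i));
                    }
                }
                counts.merge(subset, 1, Integer::sum);
            }
        }
        return counts;
    }

    // minSupport is a relative threshold in [0, 1], applied per partition and globally.
    static Set<Set<String>> globalFrequent(List<List<Set<String>>> partitions, double minSupport) {
        // Scan 1: the local frequent item-sets of every partition become global candidates.
        Set<Set<String>> candidates = new HashSet<>();
        int totalTransactions = 0;
        for (List<Set<String>> part : partitions) {
            totalTransactions += part.size();
            for (Map.Entry<Set<String>, Integer> e : countSubsets(part).entrySet()) {
                if (e.getValue() >= minSupport * part.size()) {
                    candidates.add(e.getKey());
                }
            }
        }
        // Scan 2: count each candidate over the whole database and keep the global winners.
        Set<Set<String>> frequent = new HashSet<>();
        for (Set<String> candidate : candidates) {
            int count = 0;
            for (List<Set<String>> part : partitions) {
                for (Set<String> t : part) {
                    if (t.containsAll(candidate)) {
                        count++;
                    }
                }
            }
            if (count >= minSupport * totalTransactions) {
                frequent.add(candidate);
            }
        }
        return frequent;
    }

    // Hypothetical sample: a four-transaction database split into two disjoint parts.
    static List<List<Set<String>>> samplePartitions() {
        return List.of(
                List.of(Set.of("a", "b"), Set.of("a", "c")),
                List.of(Set.of("a", "b"), Set.of("b", "c")));
    }

    public static void main(String[] args) {
        System.out.println(globalFrequent(samplePartitions(), 0.5));
    }
}
```

The sketch relies on the property that an item-set frequent in the whole database must be frequent in at
least one partition, so collecting the local frequent sets in the first scan cannot miss a global one.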
Existing System
• The existing system requires a huge amount of human resources to analyse
products. Based on manual analysis, products are mapped to other matching
products.
 Problem Statement

• In the existing system, associated data is mapped manually.
• Human interaction is needed to map the related data.
• This process is tedious and time-consuming.
• It requires a large investment of money and is prone to human error.
 Proposed System
• The proposed work employs transaction merging and a Frequent Count table to
find the significant frequent items. First, it merges the similar transactions in the
database and stores the merged transactions in main memory. It then reads
the merged transactions one by one from main memory and updates the Frequent
Count table accordingly.
• To find the frequent itemsets for any threshold value, it scans the Frequent Count
table, not the database. The Frequent Count table holds the frequency count of each
itemset, not the total support count of that itemset. The frequency count of
an itemset is the number of direct occurrences of that itemset in the transactional
database D.
• The total count of a particular itemset X is calculated by checking whether it is a
subset of each of the larger itemsets in the Frequent Count table. If the total count
of an itemset is greater than or equal to ST (Support Threshold), the
itemset is included in FI (Frequent Itemsets).
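One possible reading of the Frequent Count table is sketched below. It assumes that "similar"
transactions means identical transactions, that a "direct occurrence" of an itemset is a transaction
consisting of exactly that itemset, and that the total count of X is its own frequency count plus the
frequency counts of all larger table entries that contain X. The class and method names and the sample
database are hypothetical.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class FrequentCountTable {

    // Frequency count: how many transactions consist of exactly this itemset.
    private final Map<Set<String>, Integer> table = new HashMap<>();

    // Step 1: merge identical transactions into (itemset, multiplicity) pairs held in memory.
    static Map<Set<String>, Integer> mergeTransactions(List<Set<String>> database) {
        Map<Set<String>, Integer> merged = new HashMap<>();
        for (Set<String> t : database) {
            merged.merge(new HashSet<>(t), 1, Integer::sum);
        }
        return merged;
    }

    // Step 2: read the merged transactions one by one and update the table.
    FrequentCountTable(List<Set<String>> database) {
        for (Map.Entry<Set<String>, Integer> e : mergeTransactions(database).entrySet()) {
            table.merge(e.getKey(), e.getValue(), Integer::sum);
        }
    }

    // Direct occurrences of x as a whole transaction (0 if x never occurs on its own).
    int frequencyCount(Set<String> x) {
        return table.getOrDefault(x, 0);
    }

    // Total count of x: sum over every table entry that contains x, including x itself.
    int totalCount(Set<String> x) {
        int total = 0;
        for (Map.Entry<Set<String>, Integer> e : table.entrySet()) {
            if (e.getKey().containsAll(x)) {
                total += e.getValue();
            }
        }
        return total;
    }

    // x belongs to FI when its total count reaches the support threshold ST.
    boolean isFrequent(Set<String> x, int supportThreshold) {
        return totalCount(x) >= supportThreshold;
    }

    // Hypothetical sample database D with one repeated transaction.
    static List<Set<String>> sampleDatabase() {
        return List.of(
                Set.of("a", "b", "c"),
                Set.of("a", "b"),
                Set.of("a", "b"),
                Set.of("b", "c"),
                Set.of("a"));
    }

    public static void main(String[] args) {
        FrequentCountTable fct = new FrequentCountTable(sampleDatabase());
        Set<String> x = Set.of("a", "b");
        System.out.println("frequency count = " + fct.frequencyCount(x));
        System.out.println("total count     = " + fct.totalCount(x));
        System.out.println("frequent at ST=3: " + fct.isFrequent(x, 3));
    }
}
```

Because the table is built once, a different support threshold only requires another pass over the
table entries, not over the database.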
 Requirements
 Hardware requirements:
• Processor – Pentium IV
• Speed – 1.1 GHz
• RAM – 1 GB (min)
• Hard Disk – 100 GB
Software requirements:
• Operating System : Windows 7
• Technology : Java, Servlet, JSP, HTML5 / CSS3, Bootstrap, jQuery
• Database : MySQL
• Tool : Eclipse
 Advantages
1. This helps us learn the interests of the user, so that offers can be
provided accordingly.
2. Customers can easily make purchasing decisions.
3. It also saves customers the time they would spend on market analysis.
Block Diagram
Use Case Diagram
Flow Diagram
Conclusion

Association rule mining has a wide range of applications, such as market
basket analysis, suspicious e-mail detection, library management and many other areas.
The conventional algorithm for association rule discovery proceeds in two steps.
In the first step, all frequent item sets are found. A frequent item set is an item
set that is contained in at least the minimum-support number of transactions. In the
second step, the association rules whose confidence is at least the minimum
confidence are generated. We surveyed the existing association rule mining
techniques that use the Apriori algorithm.
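
The two steps can be sketched in Java as follows. This is a compact illustration of the classical
level-wise approach under simplifying assumptions (in-memory data, an absolute minimum support count,
no optimisations); the class name `AprioriSketch`, its method names and the sample database are
hypothetical and not taken from the project.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

final class AprioriSketch {

    static int count(List<Set<String>> db, Set<String> itemset) {
        int n = 0;
        for (Set<String> t : db) {
            if (t.containsAll(itemset)) n++;
        }
        return n;
    }

    // Apriori pruning: every subset obtained by dropping one item must be frequent.
    static boolean allSubsetsFrequent(Set<String> candidate, Set<Set<String>> level) {
        for (String item : candidate) {
            Set<String> subset = new HashSet<>(candidate);
            subset.remove(item);
            if (!level.contains(subset)) return false;
        }
        return true;
    }

    // Step 1: find all item sets contained in at least minCount transactions, level by level.
    static Map<Set<String>, Integer> frequentItemsets(List<Set<String>> db, int minCount) {
        Map<Set<String>, Integer> frequent = new HashMap<>();
        Set<Set<String>> candidates = new HashSet<>();
        for (Set<String> t : db) {
            for (String item : t) candidates.add(Set.of(item)); // level-1 candidates
        }
        while (!candidates.isEmpty()) {
            Set<Set<String>> level = new HashSet<>();
            for (Set<String> c : candidates) {
                int n = count(db, c);
                if (n >= minCount) {
                    frequent.put(c, n);
                    level.add(c);
                }
            }
            // Join frequent k-item sets into (k+1)-candidates and prune them.
            candidates = new HashSet<>();
            for (Set<String> a : level) {
                for (Set<String> b : level) {
                    Set<String> union = new HashSet<>(a);
                    union.addAll(b);
                    if (union.size() == a.size() + 1 && allSubsetsFrequent(union, level)) {
                        candidates.add(union);
                    }
                }
            }
        }
        return frequent;
    }

    // Step 2: from each frequent item set, emit rules with confidence >= minConfidence.
    static List<String> rules(Map<Set<String>, Integer> frequent, double minConfidence) {
        List<String> out = new ArrayList<>();
        for (Map.Entry<Set<String>, Integer> e : frequent.entrySet()) {
            List<String> items = new ArrayList<>(new TreeSet<>(e.getKey()));
            // Each mask selects a non-empty proper subset as the antecedent.
            for (int mask = 1; mask < (1 << items.size()) - 1; mask++) {
                Set<String> antecedent = new TreeSet<>();
                Set<String> consequent = new TreeSet<>();
                for (int i = 0; i < items.size(); i++) {
                    ((mask & (1 << i)) != 0 ? antecedent : consequent).add(items.get(i));
                }
                // The antecedent is itself frequent, so its count is already in the map.
                double confidence = (double) e.getValue() / frequent.get(antecedent);
                if (confidence >= minConfidence) out.add(antecedent + " -> " + consequent);
            }
        }
        Collections.sort(out);
        return out;
    }

    // Hypothetical sample database of four baskets.
    static List<Set<String>> sampleDatabase() {
        return List.of(Set.of("bread", "milk"), Set.of("bread", "butter", "milk"),
                Set.of("bread", "butter"), Set.of("milk"));
    }

    public static void main(String[] args) {
        Map<Set<String>, Integer> frequent = frequentItemsets(sampleDatabase(), 2);
        System.out.println(rules(frequent, 0.8));
    }
}
```

With a minimum support count of 2 and a minimum confidence of 0.8 on this sample, the only rule that
survives is butter -> bread, since both baskets containing butter also contain bread.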
THANK YOU
