
Ghada H. El-Khawaga
Marwa M. El-Sadeeq
2007

What is data mining ?

Why data mining?

Data mining types

Data mining tasks

Knowledge discovery in databases (KDD) processes

Data mining processes

Data mining techniques

Data mining and Data warehousing

Data Mining System Components

Data Mining Applications


Data Mining Tools

Non-trivial extraction of implicit, previously unknown,
and potentially useful information from data.

A step in the knowledge discovery process consisting of
particular algorithms (methods) that, under some acceptable
objective, produce a particular enumeration of patterns
(models) over the data.

Data volumes are too large for classical analysis approaches:
Large number of records
High-dimensional data

Leverage organizations' data assets

Only a small portion of the collected data is ever analyzed

Data that may never be analyzed continues to be collected,
at great expense, out of fear that something which may
prove important in the future is missing.

As databases grow, supporting the decision-making process
with traditional query languages becomes infeasible

Query formulation problem

Predictive data mining: produces a model of the system
described by the given data. It uses some variables or
fields in the data set to predict unknown or future values
of other variables of interest.

Descriptive data mining: produces new, nontrivial
information based on the available data set. It focuses on
finding patterns describing the data that can be
interpreted by humans.
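
The contrast between the two types can be illustrated with a minimal sketch. The data points and the function names `fit_line` and `describe` are assumptions invented for this example; least-squares regression is only one possible predictive method, and a numeric summary is only one simple form of description.

```python
from statistics import mean

# Hypothetical toy data: (x, y) pairs invented for illustration only.
records = [(1.0, 2.1), (2.0, 3.9), (3.0, 6.2), (4.0, 7.8)]

def fit_line(pairs):
    """Predictive: least-squares fit of y = slope * x + intercept."""
    xs = [x for x, _ in pairs]
    ys = [y for _, y in pairs]
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in pairs)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar

def describe(pairs):
    """Descriptive: summarize the values that were already observed."""
    ys = [y for _, y in pairs]
    return {"count": len(ys), "min": min(ys), "max": max(ys), "mean": mean(ys)}

slope, intercept = fit_line(records)
predicted = slope * 5.0 + intercept  # estimate y for an unseen x = 5.0
summary = describe(records)
```

The predictive part yields a value for a record that is not in the data set, while the descriptive part only restates what the existing records contain in a form a person can read.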

Data processing [descriptive]
Prediction [predictive]
Regression [predictive]
Clustering [descriptive]
Classification [predictive]
Link analysis/associations [descriptive]
Evolution and deviation analysis [predictive]
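
As one illustration of a descriptive task from the list above, clustering can be sketched with a small one-dimensional k-means (Lloyd's algorithm). The values, the starting centers, and the function name `kmeans_1d` are assumptions chosen for the example, not something given in the slides.

```python
def kmeans_1d(values, centers, iterations=10):
    """Lloyd's algorithm on 1-D data, starting from the given centers."""
    for _ in range(iterations):
        groups = [[] for _ in centers]
        for v in values:
            # Assign each value to the index of its nearest center.
            nearest = min(range(len(centers)), key=lambda i: abs(v - centers[i]))
            groups[nearest].append(v)
        # Move each center to the mean of its group; keep it if the group is empty.
        centers = [sum(g) / len(g) if g else c for g, c in zip(groups, centers)]
    return centers, groups

# Hypothetical values forming two visibly separate groups.
values = [1.0, 1.5, 2.0, 10.0, 11.0, 12.0]
centers, groups = kmeans_1d(values, centers=[0.0, 5.0])
```

No labels are supplied: the grouping emerges from the data alone, which is what makes the task descriptive rather than predictive.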

Statistical methods
Case-based reasoning
Neural networks
Decision trees
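
To give a feel for one of these techniques, the sketch below builds a one-level decision tree (a "stump") on a single numeric attribute. The samples and the function name `best_split` are hypothetical, and the rule is deliberately restricted to the form "value > threshold means True"; a full decision-tree learner would consider many attributes, both directions, and recursive splits.

```python
def best_split(samples):
    """Return (threshold, errors) for the rule 'value > threshold -> True'."""
    values = sorted({v for v, _ in samples})
    # Candidate thresholds are the midpoints between neighbouring values.
    candidates = [(a + b) / 2 for a, b in zip(values, values[1:])]
    best = None
    for t in candidates:
        # Count the samples that the rule would misclassify.
        errors = sum((v > t) != label for v, label in samples)
        if best is None or errors < best[1]:
            best = (t, errors)
    return best

# Hypothetical labelled samples: (attribute value, class label).
samples = [(20, False), (35, False), (50, True), (65, True), (80, True)]
threshold, errors = best_split(samples)
```

On this toy data the chosen threshold separates the two classes without error; on real data the error count would usually be nonzero.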

Data warehousing + data mining =
increased performance of the decision-making process
+
knowledgeable decision makers

SQL Vs. Data mining Vs. OLAP

Data Mining for Financial Data Analysis
Data Mining for the Telecommunications Industry
Data Mining for the Retail Industry
Data Mining in Healthcare and Biomedical Research
Data Mining in Science and Engineering

The function of the data mining system is to assign scores
to various profiles.

Data Mart
Data Mining System (Processing)
Operational Data Store
Scoring Software
Reporting System
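
The idea of assigning scores to profiles can be sketched as follows. The slides do not say how scores are computed, so the weighted sum, the attribute names, and the weights are all assumptions made for illustration; in practice the weights would come from a trained model.

```python
# Hypothetical weights; in a real system a trained model would supply these.
WEIGHTS = {"late_payments": 2.0, "years_as_customer": -0.5}

def score(profile, weights=WEIGHTS):
    """Weighted sum of the profile's attributes; missing attributes count as 0."""
    return sum(w * profile.get(name, 0) for name, w in weights.items())

# Hypothetical profiles keyed by an identifier.
profiles = {
    "A": {"late_payments": 3, "years_as_customer": 4},
    "B": {"late_payments": 0, "years_as_customer": 10},
}
scores = {pid: score(p) for pid, p in profiles.items()}
ranked = sorted(scores, key=scores.get, reverse=True)  # highest score first
```

A reporting component could then list the profiles in `ranked` order, with the highest-scoring profile first.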

Data Mining For Financial Data Analysis

In the banking industry, data mining is used for:


1- predicting credit fraud
2- evaluating risk
3- performing trend analysis
4- analyzing profitability
5- helping with direct marketing campaigns

In financial markets, data mining with neural networks is
used for:
1- forecasting stock prices
2- predicting commodity prices
3- forecasting financial disasters

Data Mining For Telecommunications Industry


- Data-mining applications help answer strategic questions
such as:
1- How does one retain customers and keep them loyal
when competitors offer special deals and reduced rates?
2- When is a high-risk investment, such as new fiber optic
lines, acceptable?
3- How does one predict whether customers will buy
additional products like cellular services, call waiting,
or basic services?
4- What characteristics differentiate our products from those of
our competitors?

Data Mining For The Retail Industry


- The retail industry is a major application area for data
mining since it collects huge amounts of data on sales,
customer-shopping history, goods transportation,
consumption patterns, and service records.
- Retailers are interested in creating data-mining models to
answer questions such as:
1- What are the best types of advertisements to reach
certain segments of customers?
2- What is the optimal timing at which to send mailers?
3- What types of products can be sold together?
4- How does one retain profitable customers?
5- What are the significant customer segments that
buy products?
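
The question of which products can be sold together is commonly approached with association (market-basket) analysis. The sketch below counts item pairs and derives support and confidence; the baskets and the helper names `support` and `confidence` are hypothetical, and a production system would typically use a dedicated algorithm such as Apriori rather than enumerating every pair.

```python
from collections import Counter
from itertools import combinations

# Hypothetical transactions, each a set of purchased items.
baskets = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]

item_counts = Counter()
pair_counts = Counter()
for basket in baskets:
    item_counts.update(basket)
    # Sort so that each pair is always stored in the same order.
    pair_counts.update(combinations(sorted(basket), 2))

def support(a, b):
    """Fraction of all baskets that contain both items."""
    return pair_counts[tuple(sorted((a, b)))] / len(baskets)

def confidence(antecedent, consequent):
    """Fraction of baskets with the antecedent that also contain the consequent."""
    return pair_counts[tuple(sorted((antecedent, consequent)))] / item_counts[antecedent]
```

For instance, `confidence("butter", "bread")` measures how often a basket that contains butter also contains bread, which is one way to shortlist candidate product pairings.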

Data Mining In Healthcare and Biomedical Research


- Storing patients' records in electronic format and the
development of medical-information systems have made a large
amount of clinical data available online. Regularities and
surprising events extracted from these data by data-mining
methods are important in assisting clinicians to make informed
decisions, thereby improving health services.
- Data mining has been used in many successful medical
applications, including data validation in intensive care, the
monitoring of children's growth, the analysis of diabetic
patients' data, and the monitoring of heart-transplant patients.

Data Mining In Science and Engineering


- There are a few important cases of data-mining applications
in engineering problems. For example, Pavilion Technologies'
Process Insights, an application-development tool that combines
neural networks, fuzzy logic, and statistical methods, was used
to develop chemical manufacturing and control applications
to reduce waste, improve product quality, and increase
plant throughput.

Data Mind
Agent Base/Marketer
DB Miner
Decision Series
IBM Intelligent Miner
Data Mining Suite
Darwin (now part of Oracle)
Business Miner
Data Engine

Agent Base/Marketer
It is based on emerging intelligent-agent technology.
It can access data from all major sources, and it runs on Windows 95,
Windows NT, and the Solaris operating system.

Business Miner
It is a single-strategy, easy-to-use tool based on decision trees.
It can access data from multiple sources including Oracle, Sybase,
SQL Server, and Teradata.
It runs on all Windows platforms.

Data Engine
It is a multiple-strategy data-mining tool for data modeling,
combining conventional data-analysis methods with fuzzy technology,
neural networks, and advanced statistical techniques.
It works on the Windows platform.

Difficult to use
Needs an expert to run the tool
Difficult to add new functionality
Difficult to interface
Short lifetime
Limited number of algorithms
Needs a lot of resources

