Data Mining in SAS

What is data mining?


Data mining is the process of extracting useful information from large data sets. It involves
analyzing large volumes of data for patterns with the help of one or more software tools.
Data mining is also known as Knowledge Discovery in Databases (KDD). In essence, it analyzes
a huge amount of data and then extracts the meaningful part. Data mining tools predict future
trends and behavioral patterns, which enables businesses to make better decisions and improve
their profits.

The tasks of data mining

Data mining tasks fall into two models: the ‘Predictive Model’ and the ‘Descriptive Model’.

1. PREDICTIVE MODEL

A predictive model is used to estimate likely outcomes. Each model is made up of predictors,
that is, variables that are likely to influence future results. Once data has been collected for
the relevant predictors, a statistical model is formulated. Predictive modeling is further divided
into tasks such as classification, prediction, time series analysis and regression.

CLASSIFICATION

• Classification is the first of these tasks. It predicts the class labels of new, unseen
data, based on a set of data tuples whose class labels are known.
• For example, predicting whether or not a new customer will buy a computer in the store.
• Classification is a two-step process: the first step is constructing (training) a
model or classifier, and the second step is testing the accuracy of that classifier.
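
The two steps can be illustrated with a minimal sketch. It is written in Python rather than SAS, and the customer data, the function names and the choice of a 1-nearest-neighbour classifier are all assumptions made for illustration, not something the article prescribes.

```python
# Hypothetical toy data: (age, income in thousands) -> does the customer buy a computer?
train = [((25, 30), "no"), ((35, 60), "yes"), ((45, 80), "yes"),
         ((22, 20), "no"), ((50, 90), "yes"), ((30, 25), "no")]
test = [((28, 28), "no"), ((40, 75), "yes"), ((48, 85), "yes"), ((24, 22), "no")]


def train_classifier(labelled):
    """Step 1: build the model. For 1-nearest-neighbour, training only stores the tuples."""
    return list(labelled)


def predict(model, point):
    """Return the class label of the stored tuple closest to `point`."""
    def dist2(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    nearest = min(model, key=lambda row: dist2(row[0], point))
    return nearest[1]


def accuracy(model, labelled):
    """Step 2: fraction of held-out tuples whose predicted label matches the known label."""
    correct = sum(1 for features, label in labelled if predict(model, features) == label)
    return correct / len(labelled)


model = train_classifier(train)
print("accuracy on test tuples:", accuracy(model, test))
print("new customer (38, 70):", predict(model, (38, 70)))
```

The test tuples are kept separate from the training tuples so that the accuracy figure reflects data the classifier has not seen.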

PREDICTION

• Prediction is the second task. It predicts continuous-valued functions rather than
class labels.
• It uses regression analysis or a log-linear model, which also helps in estimating missing
values for the given attributes.
• For example, predicting the amount a customer can afford to spend on a computer.
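
As a rough illustration of predicting a continuous value with regression, the sketch below fits a straight line by ordinary least squares. It is Python rather than SAS, and the income and budget figures, as well as the names `fit_line`, `incomes` and `budgets`, are made up for the example.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data: customer income (thousands) and the amount spent on a computer.
incomes = [20, 30, 40, 50, 60]
budgets = [400, 600, 800, 1000, 1200]

slope, intercept = fit_line(incomes, budgets)

# Predict the amount a customer with an income of 45 can afford.
predicted = slope * 45 + intercept
print(predicted)
```

The same fitted line could be used to fill in a missing budget value for a record whose income is known.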

2. DESCRIPTIVE MODEL

A descriptive model uses statistical analysis to describe what has happened in the past, whereas
forecasting techniques are used to learn about the future. The descriptive model exploits the
historical data stored in the database in order to produce a precise report. It is further
divided into tasks such as clustering, summarization, association rules and sequence discovery.

CLUSTERING

• Clustering is the task of grouping similar objects. It is unsupervised learning, also
known as learning from observation.
• Clustering algorithms should follow the principle of maximizing intra-cluster similarity and
minimizing inter-cluster similarity: objects in the same cluster should be alike, and objects
in different clusters should differ.
• Three popular families of clustering algorithms are K-means clustering, hierarchical
clustering and density-based clustering.
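
A minimal K-means sketch is shown here to make the grouping idea concrete. It is Python rather than SAS, and the points, the starting centroids and the function names are assumptions chosen for illustration. Real implementations usually pick the starting centroids more carefully.

```python
def dist2(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, centroids, iterations=10):
    """Alternate between assigning points to the nearest centroid and recomputing centroids."""
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            idx = min(range(len(centroids)), key=lambda i: dist2(p, centroids[i]))
            clusters[idx].append(p)
        new_centroids = []
        for cluster, old in zip(clusters, centroids):
            if cluster:
                # Mean of each coordinate over the points in the cluster.
                new_centroids.append(tuple(sum(c) / len(cluster) for c in zip(*cluster)))
            else:
                new_centroids.append(old)  # keep an empty cluster's centroid unchanged
        if new_centroids == centroids:
            break  # assignments have stabilised
        centroids = new_centroids
    return centroids, clusters


# Hypothetical 2-D points forming two visible groups.
points = [(1, 1), (1, 2), (2, 1), (8, 8), (9, 8), (8, 9)]
centroids, clusters = kmeans(points, centroids=[(1, 1), (8, 8)])
print(clusters)
```

No class labels are supplied anywhere, which is what makes this unsupervised: the groups come only from the distances between the points.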

ASSOCIATION RULE MINING

• Association rule mining works on a set of records, each containing some number of items
from a given collection.
• It produces dependency rules that predict the occurrence of an item based on the
occurrence of other items.
• Association rule mining has two steps: the first step is finding the frequent item sets in
the given database, and the second is generating strong association rules from those
frequent item sets.
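
The two steps can be sketched as follows. This is a brute-force Python illustration, not SAS code and not an optimized algorithm such as Apriori. The transactions, the thresholds and the function names are hypothetical, and "strong" is taken here to mean meeting a minimum confidence.

```python
from itertools import combinations

# Hypothetical market-basket records.
transactions = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"milk", "butter"},
    {"bread", "milk", "butter"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    return sum(1 for t in transactions if itemset <= t) / len(transactions)


def frequent_itemsets(transactions, min_support):
    """Step 1: enumerate item sets by size and keep those meeting min_support."""
    items = sorted(set().union(*transactions))
    result = {}
    for size in range(1, len(items) + 1):
        found = False
        for combo in combinations(items, size):
            s = support(frozenset(combo), transactions)
            if s >= min_support:
                result[frozenset(combo)] = s
                found = True
        if not found:
            break  # no frequent set of this size, so no larger set can be frequent
    return result


def strong_rules(frequent, min_confidence):
    """Step 2: split each frequent set into antecedent -> consequent, keep confident rules."""
    rules = []
    for itemset, sup in frequent.items():
        for r in range(1, len(itemset)):
            for antecedent in combinations(sorted(itemset), r):
                antecedent = frozenset(antecedent)
                confidence = sup / frequent[antecedent]
                if confidence >= min_confidence:
                    rules.append((set(antecedent), set(itemset - antecedent), confidence))
    return rules


frequent = frequent_itemsets(transactions, min_support=0.5)
rules = strong_rules(frequent, min_confidence=0.7)
for antecedent, consequent, confidence in rules:
    print(antecedent, "->", consequent, round(confidence, 2))
```

Confidence is the support of the whole item set divided by the support of the antecedent, so a rule such as bread -> milk measures how often milk appears in the records that contain bread.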

SAS training in Noida is provided by SkyWebcom, with real-time projects that enhance the skills
of every trainee. SkyWebcom is the foremost IT training institute, where the SAS course is
extensive and training is offered by professionals with 32 years of experience. SkyWebcom is the
best SAS training institute in Delhi/NCR, where training is given with great lab facilities and 100%
placement assistance in top companies.