Abstract- Liver disease has become one of the most prominent diseases in our country. It accounts for about 2.4% of deaths per year in India. Predicting liver disease at an early stage is a challenge, and if it is not diagnosed early it becomes very hard to cure later on. Machine learning has helped a great deal in the medical field. In this paper, we estimate which attributes are important and which are not. The classification techniques are applied to the training dataset. The main aim of the paper is to apply various machine learning algorithms, such as random forest, SVM (Support Vector Machine) and logistic regression, to the dataset and thus identify whether a patient has liver disease or not.

Index Terms- Liver disease, classification, dataset, random forest, SVM, logistic regression.

I. INTRODUCTION

The Internet is full of data, and people give their opinions about everything and anything. It is extremely difficult to make a correct choice when there is a large amount of data and no good method for making a decision. We need a method to extract the sentiment from the data and use it to make a sensible choice; sentiment analysis addresses this problem. Sentiment analysis is a kind of text classification based on the sentimental orientation (SO) of the opinions a text contains. Sentiment analysis of product reviews has recently become very popular in text mining and computational linguistics research.

... functioning properly, and the person may suffer from liver cancer. It cannot be reversed, but it can be stopped if the consumption of alcohol is stopped.

II. LITERATURE SURVEY

This part consists of the papers that were surveyed:

P. Rajeswari, G. Sophia Reena et al. [2010] introduced classification based on liver diagnosis. The training dataset was collected from the UCI repository and consists of 345 instances and 7 distinct attributes. The results were obtained by applying the Naïve Bayes, K-star and FT tree algorithms. Of these three, the FT tree algorithm was fast, with an accuracy of 97.10%. Based on this result, the FT tree algorithm is the best of the three [2].

Sa’diyah Noor Novita Alfisahrin, Teddy Mantoro et al. [2013] set out to predict whether a patient is suffering from liver disease or not on the basis of 10 attributes. The algorithms used were Naïve Bayes, decision tree and NB tree. The results show that the NB tree algorithm has the highest accuracy, although the Naïve Bayes (NB) algorithm is the first to give results. As further study, the accuracy of the NB tree algorithm is to be improved by finding the most suitable factors for predicting whether a patient has liver disease or not [3].
... above 94 percent accuracy for every type of liver disorder [6].

P.Thangarajul, R. Mehala et al. [2015] analyzed liver disease patient data using the particle swarm optimization (PSO) algorithm with K-Star classification, in two ways, to classify the presence or absence of disease. This algorithm improves accuracy compared with existing classification algorithms. The PSO-KStar algorithm is the best algorithm for liver disorder classification, as it improved prediction accuracy. With respect to understandability, transformability and accuracy, the best data mining algorithm is PSO-KStar, with 100% accuracy [7].

Onwodi Gregory [2015] introduced two liver patient datasets, which were used to build classification models for predicting liver disorder. Eleven data mining classification algorithms were applied to the datasets, and their performance was compared with respect to accuracy, recall and precision. Based on those results, the FT tree algorithm was the best, with 78% accuracy, 86.4% sensitivity, 77.5% precision and 38.2% specificity [8].

III. IMPLEMENTATION

i) DATASET

Due to data preprocessing, it becomes easier to process the data, which gives better results. An example of data preprocessing is filling the null values in the dataset.
The three techniques used in data preprocessing are:
● Rescale data
● Binarize the data
● Standardize the data

iii) Machine learning algorithms used:

A. SVM (Support Vector Machine)

This algorithm aims to find the hyperplane in an N-dimensional space that classifies the data points distinctly. Hyperplanes are the boundaries that separate the data points into classes. Various hyperplanes are created, and the hyperplane with the largest margin is chosen to classify. The larger the margin, the smaller the error.

B. Random Forest algorithm

It is a flexible, easy-to-use algorithm in the field of machine learning. It is a supervised algorithm and can be used for both classification and regression. Random forest reduces the variance that disturbs the results. The outputs of the individual trees are combined through majority voting, which smooths out the variance and increases the accuracy of the results.
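
As a rough illustration of two ideas from this section, the three preprocessing techniques (rescale, binarize, standardize) and the majority vote that a random forest takes over its trees, the following Python sketch uses only the standard library. The function names, the sample values and the threshold are illustrative assumptions; they are not taken from the paper's implementation or its dataset.

```python
from collections import Counter
from statistics import mean, pstdev


def rescale(values):
    """Min-max rescale the values to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:  # constant column: avoid division by zero
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def binarize(values, threshold):
    """Map each value to 1 if it exceeds the threshold, otherwise 0."""
    return [1 if v > threshold else 0 for v in values]


def standardize(values):
    """Shift and scale the values to zero mean and unit population std. dev."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:  # constant column: avoid division by zero
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


def majority_vote(tree_predictions):
    """Return the class label predicted by the largest number of trees."""
    return Counter(tree_predictions).most_common(1)[0][0]


# Hypothetical feature column (made-up values, not from the dataset).
column = [2.0, 4.0, 6.0, 8.0]
scaled = rescale(column)
flags = binarize(column, threshold=5.0)
z_scores = standardize(column)

# Hypothetical votes of five trees for one patient (1 = disease, 0 = no disease).
label = majority_vote([1, 0, 1, 1, 0])
```

In practice a library implementation of these steps would normally be used; the sketch only makes the arithmetic behind each step explicit.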
i) Accuracy

It is the ratio of the correct predictions to the total number of predictions.

Formula of accuracy is:

Accuracy = (number of correct predictions) / (total number of predictions)

ii) Recall or sensitivity

iii) Precision

V. CONCLUSION

In this paper, various algorithms, such as random forest, SVM and logistic regression, are applied to the dataset. These algorithms gave different accuracies on the dataset. On analyzing the results, it is seen that logistic regression turns out to be the best algorithm, with an accuracy of 71.42%.
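
The evaluation measures named in this paper (accuracy, recall or sensitivity, and precision) can be illustrated with a short Python sketch. The text defines only accuracy, so the recall and precision formulas below are the standard textbook definitions in terms of true/false positives and negatives, assumed here rather than quoted from the paper. The confusion-matrix counts are made up for illustration and are not the paper's results.

```python
def accuracy(tp, tn, fp, fn):
    """Correct predictions divided by the total number of predictions."""
    total = tp + tn + fp + fn
    return (tp + tn) / total if total else 0.0


def recall(tp, fn):
    """Share of actual positive cases that were predicted positive (sensitivity)."""
    return tp / (tp + fn) if (tp + fn) else 0.0


def precision(tp, fp):
    """Share of predicted positive cases that were actually positive."""
    return tp / (tp + fp) if (tp + fp) else 0.0


# Hypothetical counts: true positives, true negatives, false positives, false negatives.
tp, tn, fp, fn = 40, 30, 10, 20

acc = accuracy(tp, tn, fp, fn)   # (40 + 30) / 100
rec = recall(tp, fn)             # 40 / (40 + 20)
prec = precision(tp, fp)         # 40 / (40 + 10)
```

Reporting recall and precision next to accuracy matters for medical data, because a classifier can reach a high accuracy on an imbalanced dataset while still missing many patients who have the disease.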