
GRD Journals- Global Research and Development Journal for Engineering | Volume 5 | Issue 5 | April 2020

ISSN- 2455-5703

Comparative Study of Classification of Cancerous Profile using Deep Learning and Classification Algorithms
Snehal Mastud, Shweta Pandit, Ankita Mungekar, Prof. Sachin Desai
Department of Computer Engineering
Smt. Indira Gandhi College of Engineering

Abstract
In this paper, we present a comparative study of machine learning (ML) and deep learning (DL) approaches employed in the
modelling of cancer progression. We give a comparative analysis of ML and DL techniques and their classification algorithms.
Our system also includes a website with which hospital lab technicians can easily detect the type of cancer.
Keywords- SVM (Support Vector Machine), ML (Machine Learning), DL (Deep Learning), ANN (Artificial Neural
Network), CNN (Convolutional Neural Network), Naïve Bayes, Decision Tree

I. INTRODUCTION
We aim to create a system that can detect whether a person has cancer. We have implemented ML and DL techniques and their
classification algorithms. The implemented machine learning algorithms are SVM, Naïve Bayes, and Decision Tree; the
implemented deep learning algorithms are ANN and CNN. These algorithms are applied to a dataset taken from Kaggle, and each
algorithm reports its accuracy as a percentage.
Our system also includes a website that lab technicians can use to predict cancer. Using the website, two types of cancer
can be detected: malignant cancer and metastatic cancer.

A. Support Vector Machine Algorithm


Support Vector Machine (SVM) is a supervised machine learning algorithm that can be used for both classification and regression
challenges. However, it is mostly used in classification problems. In this algorithm, each data item is plotted as a point in
n-dimensional space (where n is the number of features), with the value of each feature being the value of a particular coordinate.
Support vectors are simply the coordinates of individual observations. In this paper, the SVM takes the training data as input and
produces a decision value for the testing data. The method proceeds as follows: load the dataset; classify the features (attributes)
based on class labels; then estimate the candidate support value. While instances remain (instances != null), the support value is
set to the similarity between each instance in the attribute, and the total error value is then found. If any instance < 0, the
estimated decision value = Support Value / Total Error. This is repeated for all points until none remain. We have also calculated
the entropy and Gini index.
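
As an illustration of the geometric picture above (data items as points, support vectors as the observations closest to the boundary), the following is a minimal sketch rather than the authors' implementation. The six 2-D points and the hyperplane parameters `w` and `b` are assumptions chosen by hand so that the hyperplane is the maximum-margin separator for this toy data; a real SVM would learn them from the training set.

```python
# Toy 2-D data: each item is a point whose coordinates are its feature values.
# The points, labels, w and b are illustrative assumptions, not taken from the paper.
points = [(1.0, 1.0), (2.0, 1.0), (1.0, 2.0), (4.0, 4.0), (5.0, 4.0), (4.0, 5.0)]
labels = [-1, -1, -1, 1, 1, 1]

# Hand-derived maximum-margin hyperplane for this data: 0.4*x1 + 0.4*x2 - 2.2 = 0.
w = (0.4, 0.4)
b = -2.2


def decision_value(x):
    """Signed score w.x + b; its sign gives the predicted class."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def predict(x):
    return 1 if decision_value(x) >= 0 else -1


predictions = [predict(x) for x in points]

# Support vectors lie exactly on the margin, where |w.x + b| = 1.
support_vectors = [x for x in points if abs(abs(decision_value(x)) - 1.0) < 1e-9]

for x, y in zip(points, predictions):
    print(x, "decision value:", round(decision_value(x), 3), "class:", y)
print("support vectors:", support_vectors)
```

Only the three points on the margin determine the boundary; moving any of the other points slightly would leave the hyperplane unchanged.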

B. Artificial Neural Network


An Artificial Neural Network (ANN) is an interconnected group of nodes that uses a computational model for information processing.
It changes its structure based on external or internal information that flows through the network. An ANN can be used to model
complex relationships between inputs and outputs and to find patterns in data.
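
To make the idea of interconnected nodes concrete, here is a minimal sketch of a forward pass through a network with one hidden layer. The layer sizes, weights, biases and input features are all hypothetical values picked for illustration; in a trained network they would be learned from the data.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def dense(inputs, weights, biases):
    """One fully connected layer: each row of `weights` feeds one output node."""
    return [sigmoid(sum(w * x for w, x in zip(row, inputs)) + bias)
            for row, bias in zip(weights, biases)]


# Hypothetical, hand-picked parameters (training would normally set these).
hidden_weights = [[0.5, -0.2, 0.1], [-0.3, 0.8, 0.4]]  # 3 inputs -> 2 hidden nodes
hidden_biases = [0.0, 0.1]
output_weights = [[1.2, -0.7]]                          # 2 hidden nodes -> 1 output
output_biases = [0.05]

features = [0.6, 0.1, 0.9]  # one illustrative input sample
hidden = dense(features, hidden_weights, hidden_biases)
probability = dense(hidden, output_weights, output_biases)[0]
print(f"predicted probability: {probability:.3f}")
```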

C. Naïve Bayes Algorithm


Naïve Bayes is a relatively simple machine learning technique based on a probability model, Bayes' theorem. It belongs to the
family of probabilistic classifiers in machine learning that apply Bayes' theorem with a strong statistical independence
assumption between the features.
$$P(h_k \mid x_j) = \frac{P(x_j \mid h_k)\, P(h_k)}{\sum_{i=0}^{n} P(x_j \mid h_i)\, P(h_i)}, \quad 0 < k < n + 1; \; i, j, k \in \mathbb{Z} \qquad (2)$$
This classification technique analyses the relationship between each feature and the class for each instance to derive a
conditional probability for the relationships between the feature values and the class.
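
The following sketch shows how such conditional probabilities can be estimated from counts and combined with Bayes' theorem under the independence assumption. It is an illustration under stated assumptions, not the authors' code: the six binary-feature rows, the class names and the add-one smoothing are all choices made for this example.

```python
from collections import Counter, defaultdict


def train_naive_bayes(rows, labels):
    """Count class frequencies and per-class feature-value frequencies."""
    class_counts = Counter(labels)
    # feature_counts[(feature_index, label)][value] = number of occurrences
    feature_counts = defaultdict(Counter)
    for row, label in zip(rows, labels):
        for i, value in enumerate(row):
            feature_counts[(i, label)][value] += 1
    return class_counts, feature_counts


def posterior(row, class_counts, feature_counts):
    """Return P(class | row) for every class."""
    total = sum(class_counts.values())
    scores = {}
    for label, count in class_counts.items():
        score = count / total  # prior P(h_k)
        for i, value in enumerate(row):
            # P(feature_i = value | h_k), with add-one smoothing over 2 binary values
            score *= (feature_counts[(i, label)][value] + 1) / (count + 2)
        scores[label] = score
    evidence = sum(scores.values())  # normalising constant P(x_j)
    return {label: s / evidence for label, s in scores.items()}


# Hypothetical training data: two binary features per sample.
rows = [(1, 1), (1, 0), (1, 1), (0, 0), (0, 1), (0, 0)]
labels = ["positive", "positive", "positive", "negative", "negative", "negative"]

class_counts, feature_counts = train_naive_bayes(rows, labels)
result = posterior((1, 1), class_counts, feature_counts)
print(result)
```

For the sample (1, 1) the unnormalised scores are 0.5 × 0.8 × 0.6 = 0.24 for "positive" and 0.5 × 0.2 × 0.4 = 0.04 for "negative", so the posterior for "positive" is 0.24 / 0.28 = 6/7.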

All rights reserved by www.grdjournals.com 18


Comparative Study of Classification of Cancerous Profile using Deep Learning and Classification Algorithms
(GRDJE/ Volume 5 / Issue 5 / 004)

D. Decision Tree Algorithm


Decision tree learning uses a decision tree (as a predictive model) to go from observations about an item (represented in the
branches) to conclusions about the item's target value (represented in the leaves).
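
The entropy and Gini index mentioned earlier are the impurity measures commonly used to choose the splits of a decision tree. The sketch below computes both and compares two candidate splits; the label lists are hypothetical and the helper names are introduced only for this example.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in Counter(labels).values())


def gini(labels):
    """Gini impurity of a list of class labels."""
    total = len(labels)
    return 1.0 - sum((c / total) ** 2 for c in Counter(labels).values())


def weighted_impurity(left, right, impurity):
    """Impurity of a split: size-weighted average over the two branches."""
    n = len(left) + len(right)
    return len(left) / n * impurity(left) + len(right) / n * impurity(right)


# Hypothetical node with 4 positive and 4 negative samples.
parent = ["pos"] * 4 + ["neg"] * 4

# Split A separates the classes imperfectly, split B separates them perfectly.
split_a = (["pos"] * 3 + ["neg"], ["pos"] + ["neg"] * 3)
split_b = (["pos"] * 4, ["neg"] * 4)

print("parent entropy:", entropy(parent), "parent gini:", gini(parent))
print("split A gini:", weighted_impurity(*split_a, gini))
print("split B gini:", weighted_impurity(*split_b, gini))
```

A tree-building algorithm would prefer split B here because it yields the lower weighted impurity.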

E. Convolutional Neural Network


A Convolutional Neural Network (CNN) comprises one or more convolutional layers (often with a subsampling step) followed by
one or more fully connected layers, as in a standard multilayer neural network. The architecture of a CNN is designed
to take advantage of the 2D structure of an input image (or other 2D input such as a speech signal).
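
A minimal sketch of the two building blocks named above, a convolution over a 2D input followed by a subsampling (max pooling) step, is given below. The 5×5 image and the 2×2 edge-detecting kernel are illustrative assumptions; in a real CNN the kernel values are learned.

```python
def conv2d(image, kernel):
    """Valid (no padding) 2D cross-correlation, as used in CNN layers."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [[sum(image[r + i][c + j] * kernel[i][j]
                 for i in range(kh) for j in range(kw))
             for c in range(out_w)]
            for r in range(out_h)]


def max_pool(feature_map, size=2):
    """Non-overlapping max pooling (the subsampling step)."""
    return [[max(feature_map[r + i][c + j]
                 for i in range(size) for j in range(size))
             for c in range(0, len(feature_map[0]) - size + 1, size)]
            for r in range(0, len(feature_map) - size + 1, size)]


# Hypothetical 5x5 image: dark on the left, bright on the right.
image = [
    [0, 0, 1, 1, 1],
    [0, 0, 1, 1, 1],
    [0, 0, 1, 1, 1],
    [0, 0, 1, 1, 1],
    [0, 0, 1, 1, 1],
]
kernel = [[-1, 1], [-1, 1]]  # responds to a dark-to-bright vertical edge

feature_map = conv2d(image, kernel)  # 4x4 map, non-zero only at the edge
pooled = max_pool(feature_map)       # 2x2 map after subsampling
print(feature_map)
print(pooled)
```

The pooled output would then be flattened and passed to the fully connected layers.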

II. LITERATURE SURVEY


Classification of Cancerous Profiles using Machine Learning Algorithms [3]: Many existing methods are available for lung cancer
identification. The type of treatment recommended for an individual is influenced by various factors such as the cancer type, the
severity of the cancer (stage) and, most importantly, genetic heterogeneity. In such a complex environment, targeted drug
treatments are likely to be unresponsive or to respond differently. Hence, there is a need to analyse cancer data to predict optimal
treatment options. Analysis of such profiles can help to predict and discover potential drug targets and drugs. The main aim of
that paper is to provide a machine learning based classification technique for cancerous profiles.
A Comparative Study of Machine Learning Algorithms applied to Predictive Breast Cancer Data [4]: Diagnostic errors are
the most frequent non-operative medical errors. Diagnosis should be more data-driven than trial-and-error. Machine learning
provides techniques for classification and regression which can be used for solving diagnostic problems in different
medical domains. Predictive analysis of fatal ailments like cancer using existing data can serve as a diagnosis tool for doctors. The
paper aims at a comparative study of machine learning algorithms on a predictive breast cancer dataset. The algorithms compared,
Artificial Neural Networks (ANN), k-Nearest Neighbors (kNN) and Bayesian Network Classifiers, are supervised
learning algorithms widely used for classification and were chosen for their diversity. Based on analysis of this data,
Artificial Neural Networks are better at classification, with 97.4% accuracy, than kNN and Bayesian classifiers.

III. METHODOLOGY

A. Classification
Step 1: Load the data into Python for classification.
Step 2: Pre-process the data if required.
Step 3: Split the data into training and testing sets.
Step 4: Run the previously stated algorithms on the dataset.
Step 5: Record the accuracy of each algorithm.
Step 6: Compare the algorithms by accuracy and select the one with the highest accuracy.
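
The six steps above can be sketched as follows. This is a self-contained illustration using only the Python standard library, not the authors' code: the synthetic two-class data stands in for the Kaggle dataset, and the two toy classifiers (`MajorityClass`, `NearestCentroid`) are hypothetical stand-ins for SVM, Naïve Bayes, Decision Tree, ANN and CNN.

```python
import random

# Step 1: load the data. Synthetic two-class points stand in for the real dataset.
rng = random.Random(42)
rows = ([[rng.gauss(2, 1), rng.gauss(2, 1)] for _ in range(40)]
        + [[rng.gauss(8, 1), rng.gauss(8, 1)] for _ in range(40)])
labels = [0] * 40 + [1] * 40


# Step 2: pre-process. Min-max scale every feature column to [0, 1].
# (Scaling before the split keeps the sketch short; in real use, fit the
# scaler on the training set only.)
def min_max_scale(data):
    columns = list(zip(*data))
    lows = [min(col) for col in columns]
    highs = [max(col) for col in columns]
    return [[(v - lo) / (hi - lo) for v, lo, hi in zip(row, lows, highs)]
            for row in data]


# Step 3: split into training and testing sets.
def train_test_split(data, targets, test_ratio=0.25, seed=0):
    indices = list(range(len(data)))
    random.Random(seed).shuffle(indices)
    cut = int(len(indices) * (1 - test_ratio))
    train, test = indices[:cut], indices[cut:]
    return ([data[i] for i in train], [targets[i] for i in train],
            [data[i] for i in test], [targets[i] for i in test])


# Step 4: the algorithms (toy stand-ins sharing a fit/predict interface).
class MajorityClass:
    def fit(self, data, targets):
        self.label = max(set(targets), key=targets.count)
        return self

    def predict(self, data):
        return [self.label for _ in data]


class NearestCentroid:
    def fit(self, data, targets):
        self.centroids = {}
        for label in set(targets):
            members = [r for r, t in zip(data, targets) if t == label]
            self.centroids[label] = [sum(col) / len(members) for col in zip(*members)]
        return self

    def predict(self, data):
        def dist(a, b):
            return sum((x - y) ** 2 for x, y in zip(a, b))
        return [min(self.centroids, key=lambda lab: dist(r, self.centroids[lab]))
                for r in data]


# Step 5: accuracy in percent.
def accuracy(predicted, actual):
    return 100.0 * sum(p == a for p, a in zip(predicted, actual)) / len(actual)


x_train, y_train, x_test, y_test = train_test_split(min_max_scale(rows), labels)
models = {"MajorityClass": MajorityClass(), "NearestCentroid": NearestCentroid()}
results = {name: accuracy(model.fit(x_train, y_train).predict(x_test), y_test)
           for name, model in models.items()}

# Step 6: compare and keep the most accurate algorithm.
best = max(results, key=results.get)
for name, acc in results.items():
    print(f"{name}: {acc:.1f}%")
print("best:", best)
```

Because every model exposes the same `fit`/`predict` interface, adding another algorithm to the comparison only requires a new entry in `models`.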

B. User Interface
Step 1: Create a User interface
Step 2: Enter user ID and password.
Step 3: Click on the submit button.

C. System Requirement

1) Hardware Requirements
Workstation with a minimum of 4 GB RAM, a Core i5 processor (or equivalent), and 16 GB or more of hard disk space

2) Software Requirements
Python 3.6.3
Anaconda (Jupyter Notebook)
MySQL
PyCharm IDE
Web browser (Google Chrome preferred)

D. Functional Requirements
Technologies: Python
Platform: Python Application
Frameworks/APIs:


The following frameworks/APIs are required:
Sklearn
Pandas
NumPy
Tkinter
Matplotlib
Python CV

E. Objectives of the Project


We aim to achieve the following through this project:
– Provide an intelligent and interactive system for detecting cancer accurately.
– Evaluate whether recent methods that improve algorithm performance and accuracy in a distributed environment can be used to
identify cancer.

IV. FLOWCHART

V. RESULT AND CONCLUSION


The proposed system, planned after extensive research during the literature survey, includes the following feature: implementation
of ML and DL algorithms on the dataset.


Fig. 1: Result Accuracy

The bar chart compares the accuracy of the algorithms. SVM, ANN and CNN achieved the highest accuracy at 90%, followed by
Decision Tree at 80% and Naïve Bayes at 60%. Both DL algorithms (ANN and CNN) reached the top accuracy, whereas among the ML
algorithms only SVM matched them; on this basis we found the DL algorithms to be more accurate overall than the ML algorithms.
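
The comparison can be checked with a few lines of arithmetic on the accuracies reported above. The grouping of algorithms into ML and DL families follows the Introduction; the variable names are introduced only for this sketch.

```python
# Accuracies as reported in the text (percent).
reported = {"SVM": 90, "ANN": 90, "CNN": 90, "Decision Tree": 80, "Naïve Bayes": 60}
family = {"SVM": "ML", "Decision Tree": "ML", "Naïve Bayes": "ML",
          "ANN": "DL", "CNN": "DL"}

# Algorithms tied for the highest accuracy.
best = max(reported.values())
top_models = sorted(name for name, acc in reported.items() if acc == best)

# Mean accuracy per family.
mean_by_family = {}
for fam in ("ML", "DL"):
    scores = [acc for name, acc in reported.items() if family[name] == fam]
    mean_by_family[fam] = sum(scores) / len(scores)

print("top models:", top_models)
for fam, mean in mean_by_family.items():
    print(f"{fam} mean accuracy: {mean:.1f}%")
```

The DL mean is 90%, while the ML mean is (90 + 80 + 60) / 3 ≈ 76.7%, even though SVM individually ties with ANN and CNN.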

A. Website Snapshots
After entering user ID and password.


REFERENCES
[1] Pirooznia, M., Yang, J.Y., Yang, M.Q. et al. (2008). A comparative study of different machine learning methods on microarray gene expression data.
BMC Genomics, 9, S13. https://doi.org/10.1186/1471-2164-9-S1-S13
[2] Danso, S.O., Atwell, E.S. and Johnson, O. (2013). A comparative study of machine learning methods for verbal autopsy text classification.
IJCSI International Journal of Computer Science Issues, 10(6). ISSN 1694-0784.
[3] Yaramala Sushma, Vagolu S Prasad Babu and Vanitha Kakollu (2019). Classification of Cancerous Profiles using Machine Learning Algorithms.
International Journal of Engineering Trends and Technology, 67(3), 99-101.
[4] Potdar, K. and Kinnerkar, R. (2016). A Comparative Study of Machine Learning Algorithms applied to Predictive Breast Cancer Data.
International Journal of Science and Research (IJSR), 5, 1550.
[5] Er, O., Yumuşak, N. and Temurtas, F. (2010). Chest diseases diagnosis using artificial neural networks. Expert Systems with Applications,
37, 7648-7655. https://doi.org/10.1016/j.eswa.2010.04.078
