“CREDIT CARD FRAUD DETECTION SYSTEM USING MACHINE LEARNING”
A project report submitted to
Chhattisgarh Swami Vivekanand Technical University, Bhilai (C.G.), India
We, the undersigned, solemnly declare that the Minor Project work entitled
“CREDIT CARD FRAUD DETECTION SYSTEM USING MACHINE
LEARNING” is based on our own work carried out during the course of our study
under the supervision of Mr. Abhishek Saw.
We assert that the statements made and conclusions drawn are an outcome of the
project work. We further declare that, to the best of our knowledge and belief,
the report does not contain any part of any work which has been submitted for the
award of any other degree/diploma/certificate in this University/deemed
University of India or any other country.
.…………………………
(Signature of the Candidate)
.…………………………
(Signature of the Candidate)
……………………………
(Signature of the Candidate)
Department of Computer Science and Engineering
__________________ __________________
RAIPUR INSTITUTE OF TECHNOLOGY, RAIPUR
Chhatauna, Mandir Hasaud, Raipur (C.G.)
Signature of Guide
……………………………
ACKNOWLEDGEMENT
The pleasure, the achievement, the glory, the satisfaction, the reward, the appreciation
and the construction of our project cannot be thought of without the few who,
apart from their regular schedule, spared their valuable time. A number of persons
contributed, either directly or indirectly, to shaping and achieving the desired
outcome.
We express our sincere thanks to our supervisor, Asst. Prof. Abhishek Saw,
Department of Computer Science & Engineering, Raipur Institute of
Technology, Raipur, for his valuable guidance, suggestions and the help required for
executing the project work from time to time. Without his direction and motivation, it
would have been nearly impossible for us to achieve the planned targets. We also
thank him for providing us with an opportunity to develop this project. Through his
timely advice, constructive criticism and supervision, he was an inspiration for us.
Last but not least, we are truly thankful to our parents for always
encouraging us in our studies, and to our friends who directly or indirectly
helped us in this work.
ABSTRACT
Financial fraud is an ever-growing menace with far-reaching consequences in the financial
industry. Data mining has played an imperative role in the detection of credit card fraud in
online transactions. Credit card fraud detection, which is a data mining problem, is
challenging for two major reasons: first, the profiles of normal and fraudulent behaviour
change constantly, and second, credit card fraud data sets are highly skewed. The
performance of fraud detection in credit card transactions is greatly affected by the sampling
approach applied to the dataset, the selection of variables and the detection technique(s)
used. This paper investigates the performance of naïve Bayes, k-nearest neighbour and
logistic regression on highly skewed credit card fraud data. The dataset of credit card
transactions is sourced from European cardholders and contains 284,807 transactions. A
hybrid technique of under-sampling and oversampling is carried out on the skewed data. The
three techniques are applied to the raw and preprocessed data. The work is implemented in
Python. The performance of the techniques is evaluated based on accuracy, sensitivity,
specificity, precision, Matthews correlation coefficient and balanced classification rate. The
results show that the optimal accuracies for the naïve Bayes, k-nearest neighbour and logistic
regression classifiers are 97.92%, 97.69% and 54.86% respectively. The comparative results
show that k-nearest neighbour performs better than the naïve Bayes and logistic regression
techniques.
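The evaluation measures listed in the abstract can all be derived from the four confusion-matrix counts. The following is a minimal sketch, not the code used in this project: the function name `classification_metrics` and the example counts are hypothetical, and the balanced classification rate is assumed here to be the mean of sensitivity and specificity.

```python
import math


def classification_metrics(tp, fp, tn, fn):
    """Compute evaluation metrics from confusion-matrix counts.

    tp/fn count fraudulent transactions (caught/missed),
    tn/fp count legitimate transactions (passed/wrongly flagged).
    """
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0  # recall on fraud
    specificity = tn / (tn + fp) if (tn + fp) else 0.0  # recall on legitimate
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    # Matthews correlation coefficient; defined as 0 when the denominator is 0
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    # Balanced classification rate (assumed: mean of the two class recalls)
    bcr = (sensitivity + specificity) / 2
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "mcc": mcc,
        "bcr": bcr,
    }


# Hypothetical counts for illustration only (not results from this report)
m = classification_metrics(tp=80, fp=20, tn=9880, fn=20)
for name, value in m.items():
    print(f"{name}: {value:.4f}")
```

With these made-up counts the accuracy is 99.6% although one fraud in five is missed, which illustrates why accuracy alone is a weak measure on highly skewed data and why the other measures are reported alongside it.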
CHAPTER 1
Introduction
The PwC global economic crime survey of 2016 suggests that approximately 36% of
organizations experienced economic crime. There is therefore a clear need to
solve the problem of credit card fraud detection. The task of fraud detection often
boils down to outlier detection, in which a dataset is scanned to find
potential anomalies in the data. In the past, this was done by employees who
checked all transactions manually. With the rise of machine learning, artificial
intelligence, deep learning and other relevant fields of information technology, it
has become feasible to automate this process and to save much of the intensive
labor that is put into detecting credit card fraud. In the following sections, our
machine learning based approach, implemented in Python, is explained.
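As a rough illustration of what scanning a dataset for outliers can mean, the sketch below flags transaction amounts that lie far from the median. It is a simple statistical stand-in (the modified z-score based on the median absolute deviation), not the method used in this project; the function name `flag_outliers`, the threshold of 3.5 and the sample amounts are assumptions made for the example.

```python
import statistics


def flag_outliers(amounts, threshold=3.5):
    """Return the indices of values whose modified z-score exceeds the threshold."""
    median = statistics.median(amounts)
    # Median absolute deviation: a spread estimate that outliers barely affect
    mad = statistics.median([abs(a - median) for a in amounts])
    if mad == 0:
        return []  # all values (nearly) identical, nothing stands out
    flagged = []
    for index, amount in enumerate(amounts):
        modified_z = 0.6745 * (amount - median) / mad
        if abs(modified_z) > threshold:
            flagged.append(index)
    return flagged


# Hypothetical transaction amounts; the last one is unusually large
amounts = [12.5, 8.0, 15.2, 9.9, 11.3, 14.1, 10.7, 950.0]
print(flag_outliers(amounts))
```

A rule like this only looks at one variable at a time, which is one reason learned models that combine many features are preferred for real transaction data.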
CHAPTER 2
Introduction to Project and Working:
Due to the rise and acceleration of e-commerce, there has been a tremendous increase in
the use of credit cards for online shopping, which has led to a high amount of credit
card fraud. In the era of digitalization, identifying credit card fraud is a necessity.
Fraud detection involves monitoring and analyzing the behavior of various users in
order to estimate, detect or avoid undesirable behavior. In order to detect credit
card fraud effectively, we need to understand the various technologies,
algorithms and types involved in detecting credit card fraud. An algorithm can
differentiate fraudulent transactions from legitimate ones. To find fraud, the
algorithms need to be given a dataset along with knowledge of which transactions are
fraudulent. They then analyze the dataset and classify all transactions.
• Android IDLE
• Django Framework
• PyCharm IDE
• Laptop or PC
• Wi-Fi
CHAPTER 4
2. Flow Diagram
Screenshot:
CHAPTER 5
Conclusion and Future Enhancement:
Methodology:
We use the Random Forest algorithm and the Local Outlier Factor to detect
fraudulent credit card transactions in the dataset. The given dataset is labelled.
To analyse the efficiency of the algorithms, we apply a split function to the dataset,
which divides it into training data and testing data. The proportion of data
assigned to each part is up to the user, who can decide how much data to use for
training and testing as needed. Training data is the data passed to the model for
building its logic. After the model is trained with the training data, the testing
data is passed to the model to check the efficiency of the algorithms. Here we have
used 80% of the total credit card transactions for training and the remaining 20%
for testing. The selected 80% is used to train the fraud detection module, which
defines its logic for dealing with further transactions; the algorithm used can be
either Random Forest or Local Outlier Factor. Once training of the module is
complete, the testing data is passed to it.
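The workflow described above can be sketched with scikit-learn as follows. This is a minimal illustration under stated assumptions rather than the project's actual code: the data is a synthetic, imbalanced stand-in generated with `make_classification`, the split is stratified so that both parts keep the same fraud ratio, and the Local Outlier Factor is run in novelty mode and fitted on legitimate training transactions only. All parameter values are illustrative.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import LocalOutlierFactor

# Synthetic stand-in for the labelled transactions (assumption): about 3% "fraud"
X, y = make_classification(
    n_samples=2000, n_features=10, weights=[0.97, 0.03], random_state=42
)

# 80% of the data for training, 20% for testing, as in the text
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

# Random Forest: supervised, learns from the labelled training data
rf = RandomForestClassifier(n_estimators=100, random_state=42)
rf.fit(X_train, y_train)
rf_pred = rf.predict(X_test)

# Local Outlier Factor: unsupervised; with novelty=True it can score unseen
# data. It is fitted on legitimate transactions only, and predict() returns
# -1 for outliers, which are mapped here to the fraud label 1.
lof = LocalOutlierFactor(n_neighbors=20, novelty=True)
lof.fit(X_train[y_train == 0])
lof_pred = (lof.predict(X_test) == -1).astype(int)

print("Random Forest accuracy:", accuracy_score(y_test, rf_pred))
print("Local Outlier Factor accuracy:", accuracy_score(y_test, lof_pred))
```

Changing `test_size` changes the proportion of data held back for testing, which corresponds to the user-chosen split described in the text.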