
0 Title

Mining the Telecommunication Data for Fraud Detection on Call Detail Record
1 Introduction
What is Fraud?

◉ Wrongful or criminal deception intended to result in financial or personal gain.
◉ Fraudster
◉ Vulnerability
2 Literature Review
Literature Review

◉ Reviewed around 79 papers related to the topic.


◉ Little material was available because telecom data is rarely published.
◉ Since January 2019, the topic has received renewed attention.
3 Telecommunication Fraud
What is Fraud in Call Detail Record?
◉ Subscription Fraud
○ Fraudsters obtain an account without intention to pay the bill.
◉ Superimposed Fraud
○ Fraudsters take over a legitimate account.
◉ Pricing Fraud
○ Fraudsters pay less than the amount charged.
4 Problem Statement
Problem Statement

◉ The losses to telephone companies are measured in billions of US dollars.
◉ Fraud negatively impacts a telephone company's:
○ Shareholder perceptions
○ Finances
○ Customer relations
○ Marketing
◉ The Pakistani telecom industry faces the same threats.
5 Research Questions
Research Questions

◉ Q 1: How can we detect the system's vulnerability to possible fraud?
◉ Q 2: How can we increase the accuracy of fraud detection in telecommunication data?
◉ Q 3: Can we predict whether the same fraud will occur in the future?
◉ Q 4: Will the new approach help in making decisions about preventing fraud in the future?
6 Data
Data

◉ A telecom network generates more than 70 different types of call detail records.
○ Example:
■ Mobile originated call record
■ Mobile terminated call record
■ Mobile originated SMS record etc.
◉ Focus of this thesis
○ Mobile originated call record
Dataset

◉ Comes from a mobile telecommunications network.
○ Slightly modified to comply with the company's privacy policy.
◉ Covers about 30 minutes of call detail records at peak hours from radar.
◉ Attributes = 57
◉ Tuples = 4 million plus
◉ Fraudulent = 27
◉ Non Fraudulent = 73
Data Snapshot
7 Methodology
Overview of the Detection Process
Methodology

◉ JNN
○ A neural network we designed from a traditional neural network and named the Jahanzeb Neural Network.
○ 17 hidden layers are used for training.
○ Used for fraud detection.
◉ JCNN
○ A neural network we designed from a traditional convolutional neural network and named the Jahanzeb Convolutional Neural Network.
○ 101 hidden layers are used for training.
○ Used for vulnerability prediction.
Methodology

◉ The methodology works in three parts:
○ Pre-processing
○ Fraud detection
○ Predictions about the vulnerability
Methodology – Pre-processing

◉ Before pre-processing, we first extract the features and then select which ones to work on.
○ Feature Extraction
■ Mutual Information
■ Input: Dataset
■ Output: Feature names
○ Feature Selection
■ Principal Component Analysis
■ Input: Extracted features
■ Output: The worth of each feature, expressed as an intensity value.
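
The mutual information step can be illustrated with a minimal sketch. It assumes discrete feature values and a binary fraud label; the feature names `payment_flag` and `weekday_flag` and all values are hypothetical and do not come from the thesis dataset. The PCA step is not shown.

```python
from collections import Counter
from math import log2


def mutual_information(xs, ys):
    """Mutual information I(X; Y) in bits between two discrete sequences."""
    n = len(xs)
    px = Counter(xs)            # marginal counts of X
    py = Counter(ys)            # marginal counts of Y
    pxy = Counter(zip(xs, ys))  # joint counts of (X, Y)
    mi = 0.0
    for (x, y), count in pxy.items():
        p_joint = count / n
        # p(x, y) / (p(x) * p(y)) written with raw counts
        mi += p_joint * log2(p_joint * n * n / (px[x] * py[y]))
    return mi


# Hypothetical toy data: 1 = fraudulent, 0 = non-fraudulent.
labels = [1, 1, 0, 0, 1, 0, 0, 1]
features = {
    "payment_flag": [1, 1, 0, 0, 1, 0, 0, 1],  # identical to the label
    "weekday_flag": [0, 1, 0, 1, 0, 1, 0, 1],  # independent of the label
}

scores = {name: mutual_information(col, labels) for name, col in features.items()}
ranked = sorted(scores, key=scores.get, reverse=True)
```

Features with a higher score share more information with the fraud label, so `ranked` lists the candidate features from most to least informative.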
Methodology – Pre-processing

◉ Pre-processing starts by selecting data according to three features:
○ Payments
○ Caller Identification
○ Call Duration
◉ Processed by Support Vector Machine (SVM)
○ Gives a prediction based on these three potentially fraudulent features.
○ Input: Three features (Payment, Caller Identification, Call Duration)
○ Output: 0 = clean data, fetch next value; 1 = suspicious data, forward to MLP.
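
The routing rule above (0 = fetch next value, 1 = forward to MLP) can be sketched as follows. This is an illustration only: the `CallRecord` fields, the weights, and the bias are hypothetical stand-ins for a trained linear SVM, and the features are assumed to be scaled to comparable ranges.

```python
from dataclasses import dataclass


@dataclass
class CallRecord:
    # Hypothetical, already-scaled versions of the three selected features.
    payment: float
    caller_id_score: float
    call_duration: float


# Hypothetical weights and bias standing in for a trained linear SVM.
WEIGHTS = (1.5, 2.0, 0.5)
BIAS = -2.0


def svm_decision(record):
    """Linear SVM decision rule: 1 (suspicious) if w.x + b > 0, else 0 (clean)."""
    x = (record.payment, record.caller_id_score, record.call_duration)
    score = sum(w * v for w, v in zip(WEIGHTS, x)) + BIAS
    return 1 if score > 0 else 0


def route(records):
    """Skip records labelled 0 and collect records labelled 1 for the MLP stage."""
    return [r for r in records if svm_decision(r) == 1]


records = [
    CallRecord(0.1, 0.2, 0.3),  # score = 0.7 - 2.0 < 0, treated as clean
    CallRecord(0.9, 0.8, 0.7),  # score = 3.3 - 2.0 > 0, treated as suspicious
]
suspicious = route(records)
```

Only the records in `suspicious` would move on to the fraud detection stage; clean records are skipped.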
Methodology – Fraud detection

◉ Multilayer Perceptron (MLP)
○ Calculates the Mean Squared Error (MSE) for each iteration that contains suspicious activity.
○ Input: SVM identified suspicious data.
○ Output: Mean Squared Error
◉ Jahanzeb Neural Network (JNN)
○ Analyses the same data and determines the intensity of fraud (manipulated data) in all three features.
○ Input: MSE and original data.
○ Output: Feed Forward, Result to Naïve Bayes (NB)
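
For reference, the Mean Squared Error is the average of the squared differences between target and predicted values. The sketch below uses hypothetical target and prediction values; it shows the standard formula, not the MLP itself.

```python
def mean_squared_error(targets, predictions):
    """MSE = (1 / n) * sum((t_i - p_i) ** 2)."""
    if len(targets) != len(predictions):
        raise ValueError("targets and predictions must have the same length")
    if not targets:
        raise ValueError("at least one value is required")
    squared_errors = ((t - p) ** 2 for t, p in zip(targets, predictions))
    return sum(squared_errors) / len(targets)


# Hypothetical labels (1 = fraudulent) and network outputs for four records.
targets = [1.0, 0.0, 1.0, 1.0]
predictions = [0.5, 0.5, 1.0, 0.0]
mse = mean_squared_error(targets, predictions)  # (0.25 + 0.25 + 0 + 1) / 4
```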
Methodology – Fraud detection

◉ Naïve Bayes (NB)
○ Decides whether the data is fraudulent and, if so, with what accuracy.
○ Input: SVM identified suspicious data, Feed Forward, Result from JNN
○ Output: Fraud detection and accuracy
◉ Sigmoid Function and Entropy
○ Used to verify and identify the fraudulent features, with a prediction.
○ Input: Fraudulent data with accuracy from JNN and NB
○ Output: Tell values
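
The two functions named on this slide have standard definitions, sketched below. How the thesis combines them is not specified here, so the example only assumes that a raw score is squashed into a probability by the sigmoid and that binary entropy measures how uncertain that probability is.

```python
from math import exp, log2


def sigmoid(z):
    """Logistic function 1 / (1 + e^-z), written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + exp(-z))
    e = exp(z)
    return e / (1.0 + e)


def binary_entropy(p):
    """Entropy in bits of a two-outcome event with probability p."""
    if p <= 0.0 or p >= 1.0:
        return 0.0  # a certain outcome carries no uncertainty
    return -(p * log2(p) + (1.0 - p) * log2(1.0 - p))


# Hypothetical raw score of 0.0 for one feature.
p_fraud = sigmoid(0.0)
uncertainty = binary_entropy(p_fraud)
```

A probability near 0 or 1 gives an entropy near 0 bits (a confident verdict), while a probability of 0.5 gives the maximum of 1 bit.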
Methodology – Predictions about vulnerability
◉ Jahanzeb Convolutional Neural Network (JCNN) and
Sigmoid Function
○ Will provide prediction about vulnerability of each feature.
○ Input: Feed Forward, SVM Data, JNN output and entropy value
○ Output: Prediction about the vulnerability.
Working Methodology
8 Results
Fraud Detection Methods Accuracy

Method                    Accuracy
SVM                       76%
J48                       56.8%
Decision Tree             83%
Outlier Detection C4.5    81%
K-means                   78.1%
MLP                       89.2%
YOLO_V3                   88.31%
JCNN                      89.19%

Fraud Detection Comparison

Method                    Accuracy of Algorithm in Fraud Detection
RCNN                      87.8%
Additive Regression       83%
CNN                       91%
Capsule                   93%
JNN                       93.2%
Vulnerability Detection Methods Accuracy

Comparison Table for Vulnerability Detection

Method                    Accuracy
SVM                       88%
MLP, J48                  87.8%
Decision Tree             83%
SVM, C4.5                 91%
K-means, SVM, MLP         93%
MLP, SVM, CNN             93.2%
Fst.Net                   89.90%
YOLO                      89.92%
JCNN                      93.7%


9 Future Concerns and Work
Future Concerns and Work

◉ Fraud keeps changing form because technology changes continuously.
◉ Researchers can improve the accuracy of the fraud detection system.
◉ Researchers can direct their efforts toward detecting fraud related to SMS, internet service, etc.
10 Conclusion
Conclusion

◉ We improved the accuracy of the fraud detection system from 88.31% to 89.19%.
◉ Our proposed methodology increased the accuracy of vulnerability prediction from 89.92% to 93.7%.
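
To put the two gains side by side, the short calculation below expresses each one in percentage points and as a relative change. It uses only the four accuracy figures quoted above; the helper name `improvement` is illustrative.

```python
def improvement(before, after):
    """Return (gain in percentage points, gain relative to the baseline in %)."""
    absolute = after - before
    relative = absolute / before * 100.0
    return absolute, relative


# Accuracy figures quoted in the conclusion.
fraud_abs, fraud_rel = improvement(88.31, 89.19)
vuln_abs, vuln_rel = improvement(89.92, 93.7)

print(f"Fraud detection: +{fraud_abs:.2f} points ({fraud_rel:.2f}% relative)")
print(f"Vulnerability:   +{vuln_abs:.2f} points ({vuln_rel:.2f}% relative)")
```

The fraud detection gain is roughly 0.88 percentage points and the vulnerability gain roughly 3.78 percentage points.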
Thanks!
