Credit Card Fraud Detection Using Machine Learning Techniques: A Comparative Analysis
Authorized licensed use limited to: UNIVERSIDADE DE SAO PAULO. Downloaded on June 14,2023 at 00:30:21 UTC from IEEE Xplore. Restrictions apply.

Abstract—Financial fraud is an ever-growing menace with far-reaching consequences in the financial industry. Data mining has played an imperative role in the detection of credit card fraud in online transactions. Credit card fraud detection, which is a data mining problem, is challenging for two major reasons: first, the profiles of normal and fraudulent behaviour change constantly, and second, credit card fraud data sets are highly skewed. The performance of fraud detection in credit card transactions is greatly affected by the sampling approach applied to the dataset, the selection of variables and the detection technique(s) used. This paper investigates the performance of naïve Bayes, k-nearest neighbour and logistic regression on highly skewed credit card fraud data. The dataset of credit card transactions is sourced from European cardholders and contains 284,807 transactions. A hybrid technique of under-sampling and over-sampling is carried out on the skewed data. The three techniques are applied to the raw and preprocessed data. The work is implemented in Python. The performance of the techniques is evaluated based on accuracy, sensitivity, specificity, precision, Matthews correlation coefficient and balanced classification rate. The results show that the optimal accuracies for the naïve Bayes, k-nearest neighbour and logistic regression classifiers are 97.69%, 97.92% and 54.86% respectively. The comparative results show that k-nearest neighbour performs better than the naïve Bayes and logistic regression techniques.

Keywords—credit card fraud; data mining; naïve bayes; decision tree; logistic regression; comparative analysis

I. INTRODUCTION
Financial fraud is an ever-growing menace with far-reaching consequences in the finance industry, corporate organizations and government. Fraud can be defined as criminal deception with the intent of acquiring financial gain. High dependence on internet technology has led to an increase in credit card transactions. As credit card transactions become the most prevalent mode of payment for both online and offline transactions, the credit card fraud rate also accelerates. Credit card fraud can be either inner card fraud or external card fraud. Inner card fraud occurs as a result of consent between cardholders and the bank, using a false identity to commit fraud, while external card fraud involves the use of a stolen credit card to obtain cash through dubious means. Much research has been devoted to the detection of external card fraud, which accounts for the majority of credit card frauds. Detecting fraudulent transactions using traditional methods of manual detection is time consuming and inefficient, and the advent of big data has made manual methods even more impractical. Financial institutions have therefore turned their attention to recent computational methodologies to handle the credit card fraud problem.
Data mining is one notable method used in solving the credit card fraud detection problem. Credit card fraud detection is the process of classifying transactions into two classes: legitimate (genuine) and fraudulent [1]. It is based on analysis of a card's spending behaviour. Many techniques have been applied to credit card fraud detection: artificial neural network [2], genetic algorithm [3, 4], support vector machine [5], frequent itemset mining [6], decision tree [7], migrating birds optimization algorithm [8] and naïve Bayes [9]. A comparative analysis of logistic regression and naïve Bayes is carried out in [10]. The performance of Bayesian and neural networks [11] is evaluated on credit card fraud data. Decision tree, neural networks and logistic regression are tested for their applicability in fraud detection [12]. The paper [13] evaluates two advanced data mining approaches, support vector machines and random forests, together with logistic regression, as part of an attempt to better detect credit card fraud, while neural network and logistic regression are applied to the credit card fraud detection problem in [14]. A number of challenges are associated with credit card fraud detection, namely: fraudulent behaviour profiles are dynamic, that is, fraudulent transactions tend to look like legitimate ones; credit card transaction datasets are rarely available and are highly imbalanced (or skewed); optimal features (variables) must be selected for the models; and a suitable metric is needed to evaluate the performance of techniques on skewed credit card fraud data. Credit card fraud detection performance is greatly affected by the type of sampling approach used, the selection of variables and the detection technique(s) used.
This study investigates the effect of hybrid sampling on the fraud detection performance of naïve Bayes, k-nearest
neighbour and logistic regression classifiers on highly skewed credit card fraud data.
This paper seeks to carry out a comparative analysis of credit card fraud detection using naïve Bayes, k-nearest neighbour and logistic regression techniques on highly skewed data, based on the accuracy, sensitivity, specificity and Matthews correlation coefficient (MCC) metrics. This paper extends the handling of highly imbalanced credit card fraud data in [33]. The imbalanced dataset used in this study, which contains about 0.172% fraud transactions, is sampled in a hybrid approach. The positive class (fraud) is oversampled while the negative class (legitimate) is under-sampled by the same number of times to achieve two distributions of 34:66 and 10:90. The three techniques are applied to the data. The performance of the three techniques is compared based on accuracy, sensitivity, specificity, Matthews Correlation Coefficient (MCC) and balanced classification rate.
The rest of this paper is organized as follows: Section II gives a detailed review of credit card fraud, feature selection, detection techniques and performance comparison. Section III describes the experimental setup, including the data pre-processing and the three classifier methods for credit card fraud detection. Section IV reports the experimental results and discusses the comparative analysis. Section V concludes the comparative study and suggests future areas of research.

II. RELATED WORKS
Classification of credit card transactions is mostly a binary classification problem. Here, a credit card transaction is either a legitimate transaction (negative class) or a fraudulent transaction (positive class). Fraud detection is generally viewed as a data mining classification problem, where the objective is to correctly classify credit card transactions as legitimate or fraudulent [6].

A. Credit Card Fraud
Credit card frauds have been partitioned into two types, inner card fraud and external fraud [12, 15], while a broader classification into three categories has also been made, that is, traditional card related frauds (application, stolen, account takeover, fake and counterfeit), merchant related frauds (merchant collusion and triangulation) and Internet frauds (site cloning, credit card generators and false merchant sites) [16]. It is reported in [17] that the total fraud losses of banks and businesses around the world reached more than USD 16 billion in 2014, an increase of nearly USD 2.5 billion over the previous year's recorded losses, meaning that 5.6 cents of every USD 100 was fraudulent, the report concluded.
Credit card transaction data are mainly characterized by an unusual phenomenon: legitimate and fraudulent transactions tend to share the same profile. Fraudsters learn new ways to mimic the spending behaviour of a legitimate card (or cardholder). Thus, the profiles of normal and fraudulent behaviour are constantly changing. This inherent characteristic leads to a decrease in the number of true fraudulent cases identified in a pool of credit card transaction data, leading to a distribution that is highly skewed towards the negative class (legitimate transactions). The credit card data investigated in [18] contains 20% positive cases, that in [19] contains 0.025% positive cases and that in [8] contains below 0.005% positive cases. In the data used in this study, the positive class (frauds) accounts for 0.172% of all transactions. A number of sampling approaches have been applied to highly skewed credit card transaction data. A random sampling approach is used in [18, 20], with experimental results indicating that an artificial 50:50 distribution of fraud/non-fraud training data generates classifiers with the highest true positive rate and a low false positive rate. The paper [8] uses stratified sampling to under-sample the legitimate records to a meaningful number. It experiments on 50:50, 10:90 and 1:99 distributions of fraud to legitimate cases and reports that the 10:90 distribution has the best performance (regarding the performance comparisons on the 1:99 set) as it is closest to the real distribution of frauds and legitimates. Stratified sampling is also applied in [21]. In this study, a hybrid of under-sampling the negative cases and oversampling the positive cases is carried out in order to preserve valuable patterns from the data.

B. Feature (Variables) selection
The basis of credit card fraud detection lies in the analysis of the cardholder's spending behaviour. This spending profile is analysed using an optimal selection of variables that capture the unique behaviour of a credit card. The profiles of both legitimate and fraudulent transactions tend to change constantly. Thus, an optimal selection of variables that strongly differentiates the two profiles is needed to achieve efficient classification of credit card transactions. The variables that form the card usage profile and the techniques used affect the performance of credit card fraud detection systems. These variables are derived from a combination of the current transaction and the past transaction history of a credit card. They fall under five main variable types, namely all-transactions statistics, regional statistics, merchant type statistics, time-based amount statistics and time-based number of transactions statistics [19].
The variables that fall under the all-transactions statistics type depict the general usage profile of the card. The variables under the regional statistics type show the spending habits of the card, taking into account the geographical regions. The variables under the merchant statistics type show the usage of the card in different merchant categories. The variables of the time-based statistics types identify the usage profile of the card with respect to usage amounts versus time ranges or frequencies of usage versus time ranges. Most of the literature focuses on the cardholder profile rather than the card profile. It is evident that a person can operate two or more credit cards for different purposes and can therefore exhibit a different spending profile on each card. In this study, the focus is on the card rather than the cardholder, because one credit card can only exhibit a unique spending profile while a cardholder can exhibit multiple behaviours on different cards. A total of 30 variables are used in [18], 27 variables in [19], and 20 variables are reduced to 16 relevant ones in [6].
C. Credit card Fraud Detection
As the credit card becomes the most general mode of payment (for both online and regular purchases), the fraud rate tends to accelerate. Detecting fraudulent transactions using traditional methods of manual detection is time consuming and inaccurate, and the advent of big data has made these manual methods even more impractical. Financial institutions have therefore turned to intelligent techniques. These intelligent fraud detection techniques comprise computational intelligence (CI)-based techniques. Statistical fraud detection methods have been divided into two broad categories: supervised and unsupervised [22]. In supervised fraud detection methods [13], models are estimated from samples of fraudulent and legitimate transactions in order to classify new transactions as fraudulent or legitimate, while in unsupervised fraud detection, outlier transactions are detected as potential instances of fraudulent transactions. A detailed discussion of supervised and unsupervised techniques is found in [23]. Quite a number of studies on a range of techniques have been carried out on the credit card fraud detection problem. These techniques include, but are not limited to, neural network models (NN), Bayesian network (BN), intelligent decision engines (IDE), expert systems, meta-learning agents, machine learning, pattern recognition, rule-based systems, logistic regression (LR), support vector machine (SVM), decision tree, k-nearest neighbour (kNN), meta-learning strategy and adaptive learning. Some related works on comparative studies of credit card fraud detection techniques are presented next.

D. Comparative study
A study of the issues and results associated with credit card fraud detection using meta-learning is presented in [18]. This study investigates which distribution of frauds and non-frauds leads to better performance and which learning algorithms work best within a meta-learning strategy. The results show that, given a skewed distribution in the original data, artificially more balanced training data leads to better classifiers. It demonstrates how meta-learning can be used to combine different classifiers and maintain, and in some cases improve, the performance of the best classifier. Multiple algorithms for fraud detection are investigated in [24], and the results indicate that an adaptive solution can provide fraud filtering and case ordering functions for reducing the number of final-line fraud investigations necessary. A comparison of logistic regression and naïve Bayes is presented in [10]. The results of the analysis show that even though the discriminative logistic regression algorithm has a lower asymptotic error, the generative naïve Bayes classifier may converge more quickly to its (higher) asymptotic error. A few cases are reported in which logistic regression underperformed naïve Bayes, but this is observed primarily in particularly small datasets. Another comparative study on credit card fraud detection, using Bayesian and neural networks, is reported in [11]. The results indicate that the Bayesian network performs better than the neural network in detecting credit card fraud.
Back-propagation (BP), together with naïve Bayesian (NB) and C4.5 algorithms, are applied to skewed data partitions derived from minority oversampling with replacement [25]. The study shows that innovative use of naïve Bayesian (NB), C4.5 and back-propagation (BP) classifiers to process the same partitioned numerical data has the potential to achieve better cost savings. An adaptive and robust model learning method that is highly adaptive to concept changes and robust to noise is presented in [26]. The classifiers' weights are computed by a logistic regression technique, which ensures good adaptability. Three different classification methods, decision tree, neural networks and logistic regression, are tested for their applicability in fraud detection [12]. The results show that the proposed neural network and logistic regression classifiers outperform the decision tree in solving the problem under investigation. A fusion approach using Dempster–Shafer theory and Bayesian learning for detecting credit card fraud is proposed in [27]. The results show that the use of Bayesian learning brings the false positive rate down to values close to 5%.
Detection of credit card fraud using decision trees and support vector machines is investigated in [28], and the results show that the proposed decision tree classifiers outperform the SVM approaches in solving the problem under investigation. As the training data scales, the detection accuracy of the SVM-based models equals that of the decision tree based models, but the SVM-based models fall short in the number of frauds detected. The paper [13] evaluates the performance of logistic regression alongside two advanced data mining approaches, support vector machines and random forests, in credit card fraud detection. The study shows that logistic regression maintained similar performance with different levels of under-sampling, while SVM performance tended to increase with a lower proportion of fraud in the training data. Logistic regression shows appreciable performance, often surpassing that of the SVM models with different kernels. In another study, classification models based on Artificial Neural Networks (ANN) and Logistic Regression (LR) are developed and applied to the credit card fraud detection problem [14] using highly skewed data. The results show that the proposed ANN classifiers outperform the LR classifiers in solving the problem under investigation. The logistic regression classifiers tend to overfit the training data as it increases, which is attributed to a lack of adequate sampling in that work. A comparative assessment of supervised data mining techniques for fraud prevention is presented in [29]. The techniques evaluated are decision tree, neural network and naïve Bayes classifiers. It is reported that neural network classifiers are suitable for larger databases only and take a long time to train. Bayesian classifiers are more accurate, much faster to train and suitable for different sizes of data, but are slower when applied to new instances.
A meta-classification strategy is applied to improve credit card fraud detection in [30]. The approach consists of 3 base classifiers constructed using the decision tree, naïve Bayesian and k-nearest neighbour algorithms. Using the naïve Bayesian algorithm as the meta-level algorithm to combine the
base classifier predictions, the result shows a 28% improvement in performance. The paper [31] sheds light on performance evaluation based on the correct and incorrect instances of data classification using naïve Bayes and decision tree. The results show that the efficiency and accuracy of J48 are better than those of naïve Bayes [31]. In [19], a new comparison measure that realistically represents the monetary gains and losses due to fraud detection shows that including the real cost, by creating a cost-sensitive system using a Bayes minimum risk classifier, gives rise to much better fraud detection results in the sense of higher savings.

III. EXPERIMENTAL SET UP AND METHODS
This section describes the dataset used in the experiments and the three classifiers under study, namely the Naïve Bayes, k-Nearest Neighbour and Logistic Regression techniques. The stages involved in generating the classifiers are: collection of data, preprocessing of data, analysis of data, training of the classifier algorithm and testing (evaluation). During the preprocessing stage, the data is converted into a usable format and sampled. A hybrid of under-sampling (the negative cases) and over-sampling (the positive cases) is carried out to achieve two sets of data distributions. For the analysis stage, feature selection and reduction has already been carried out on the dataset using PCA. The training stage is where the classifier algorithms are developed and fed with the processed data. The experiments are evaluated using the True Positive, True Negative, False Positive and False Negative rates. The performance of the classifiers is compared based on accuracy, sensitivity, specificity, precision, Matthews correlation coefficient and balanced classification rate.

A. Dataset
The dataset is sourced from the ULB Machine Learning Group and its description is found in [32]. The dataset contains credit card transactions made by European cardholders in September 2013. It presents transactions that occurred over two days, consisting of 284,807 transactions. The positive class (fraud cases) makes up 0.172% of the transaction data. The dataset is therefore highly unbalanced and skewed towards the negative class. It contains only numerical (continuous) input variables, which are the result of a Principal Component Analysis (PCA) feature transformation yielding 28 principal components. Together with the 'time' and 'amount' features, a total of 30 input features are utilized in this study. The details and background information of the features cannot be presented due to confidentiality issues. The 'time' feature contains the seconds elapsed between each transaction and the first transaction in the dataset. The 'amount' feature is the transaction amount. The feature 'class' is the target class for the binary classification; it takes the value 1 for a positive case (fraud) and 0 for a negative case (non fraud).

B. Hybrid Sampling of dataset
Data pre-processing is carried out on the data. A hybrid of under-sampling and over-sampling is carried out on the highly unbalanced dataset to achieve two sets of distribution (10:90 and 34:66) for analysis. This is done by stepwise addition and subtraction of a data point interpolated between existing data points until an over-fitting threshold is reached.

$$ PC_{new} = PC + \sum_{i=1}^{n} i \tag{1} $$

$$ NC_{new} = NC - \sum_{i=1}^{n} i \tag{2} $$

$$ n = \operatorname{mod}\big((NC / PC) / 2\big) \tag{3} $$

where PCnew is the new number of positive data point instances, NCnew is the new number of negative data points, n is the modulus of the ratio (NC/PC) of the number of negative class to positive class data points, and PC and NC are the numbers of positive and negative class data points in the imbalanced dataset respectively.

C. Naïve Bayes Classifier
Naïve Bayes is a statistical approach based on Bayesian theory, which chooses the decision with the highest probability. Bayesian probability estimates unknown probabilities from known values. It also allows prior knowledge and logic to be applied to uncertain statements. The technique assumes conditional independence among the features in the data. The Naïve Bayes classifier is based on the conditional probabilities (4) and (5) of the binary classes (fraud and non fraud).

$$ P(c_i \mid f_k) = \frac{P(f_k \mid c_i)\, P(c_i)}{P(f_k)} \tag{4} $$

$$ P(f_1, \ldots, f_n \mid c_i) = \prod_{k=1}^{n} P(f_k \mid c_i), \quad k = 1, \ldots, n; \; i = 1, 2 \tag{5} $$

where n represents the maximum number of features (30), P(ci | fk) is the probability of feature value fk being in class ci, P(fk | ci) is the probability of generating feature value fk given class ci, and P(ci) and P(fk) are the probability of occurrence of class ci and the probability of feature value fk occurring respectively. The classifier performs the binary classification based on the Bayesian classification rule.

If P(c1 | fk) > P(c2 | fk) then the classification is C1
If P(c1 | fk) < P(c2 | fk) then the classification is C2

Ci is the target class for classification, where C1 is the negative class (non fraud cases) and C2 is the positive class (fraud cases).

D. K-Nearest Neighbour Classifier
The k-nearest neighbour technique is an instance-based learning method which carries out its classification based on a similarity measure, such as the Euclidean, Manhattan or Minkowski distance functions. These measures work well with continuous variables; the Minkowski distance generalises the other two. The Euclidean distance measure is used in this study for the kNN classifier. The Euclidean distance (Dij) between two input vectors (Xi, Xj) is given by:
$$ D_{ij} = \sqrt{\sum_{k=1}^{n} \left( X_{ik} - X_{jk} \right)^{2}}, \quad k = 1, 2, \ldots, n \tag{6} $$

For every data point in the dataset, the Euclidean distance between the input data point and the current point is calculated. These distances are sorted in increasing order and the k items with the lowest distances to the input data point are selected. The majority class among these items is found and the classifier returns it as the classification for the input point. Parameter tuning for k is carried out for k = 1, 3, 5, 7, 9, 11, and k = 3 showed optimal performance. Thus, k = 3 is used in the classifier.

E. Logistic Regression Classifier
Logistic regression uses a functional approach to estimate the probability of a binary response based on one or more variables (features). It finds the best-fit parameters to a nonlinear function called the sigmoid. The sigmoid function (σ) and the input (x) to the sigmoid function are shown in (7) and (8).

$$ \sigma(x) = \frac{1}{1 + e^{-x}} \tag{7} $$

$$ x = w_0 z_0 + w_1 z_1 + \ldots + w_n z_n \tag{8} $$

The vector z is the input data and w holds the best-fit coefficients; the two are multiplied element by element and summed to give a single number, which determines the classification of the target class. If the value of the sigmoid is more than 0.5, it is considered a 1; otherwise, it is a 0. An optimization method is used to train the classifier and find the best-fit parameters. The gradient ascent (9) and modified stochastic gradient ascent optimization methods were both tried in order to evaluate their performance on the classifier.

$$ w := w + \alpha \nabla_{w} f(w) \tag{9} $$

where the parameter α is the magnitude of movement along the gradient ∇w f(w). The steps are continued until a stopping criterion is met. The optimization methods are investigated (for 50 to 1000 iterations) to find out whether the parameters converge, that is, whether they reach a steady value or keep changing. At 100 iterations, steady parameter values are achieved.
Stochastic gradient ascent incrementally updates the classifier as new data comes in rather than all at once. It starts with all weights set to 1. Then, for every instance in the dataset, the gradient is calculated and the weight vector is updated by the product of alpha and the gradient. Finally, the weight vector is returned. Stochastic gradient ascent is used in this study because, given the large size of the data, it updates the weights using only one instance at a time, thus reducing computational complexity.

IV. PERFORMANCE EVALUATION AND RESULTS
Four basic metrics are used in evaluating the experiments, namely the True Positive (TPR), True Negative (TNR), False Positive (FPR) and False Negative (FNR) rates.

$$ TPR = \frac{TP}{P} \tag{10} $$

$$ TNR = \frac{TN}{N} \tag{11} $$

$$ FPR = \frac{FP}{N} \tag{12} $$

$$ FNR = \frac{FN}{P} \tag{13} $$

where TP, TN, FP and FN are the numbers of true positive, true negative, false positive and false negative test cases classified, while P and N are the total numbers of positive and negative class cases under test. True positives are cases classified as positive which are actually positive. True negatives are cases rightly classified as negative. False positives are cases classified as positive which are actually negative. False negatives are cases classified as negative which are actually positive.
The performance of the naïve Bayes, k-nearest neighbour and logistic regression classifiers is evaluated based on accuracy, sensitivity, specificity, precision, Matthews correlation coefficient (MCC) and balanced classification rate. These evaluation metrics are employed because of their relevance in evaluating imbalanced binary classification problems.

$$ Accuracy = \frac{TP + TN}{TP + FP + TN + FN} \tag{14} $$

$$ Sensitivity = \frac{TP}{TP + FN} \tag{15} $$

$$ Specificity = \frac{TN}{FP + TN} \tag{16} $$

$$ Precision = \frac{TP}{TP + FP} \tag{17} $$

$$ MCC = \frac{(TP \cdot TN) - (FP \cdot FN)}{\sqrt{(TP + FP)(TP + FN)(TN + FP)(TN + FN)}} \tag{18} $$

$$ BCR = \frac{1}{2} \left( \frac{TP}{P} + \frac{TN}{N} \right) \tag{19} $$

Sensitivity (Recall) gives the accuracy of classification on positive (fraud) cases. Specificity gives the accuracy of classification on negative (legitimate) cases. Precision gives the accuracy among cases classified as fraud (positive). The Matthews Correlation Coefficient (MCC) is an evaluation metric for binary classification problems. MCC is used mainly with unbalanced data sets because its evaluation takes TP, FP, TN and FN into account. The MCC value lies between -1 and +1; a value of +1 represents excellent classification while a value of -1 represents total disagreement between classification and observation. Balanced classification rate is the average of sensitivity and specificity, where specificity is the proportion of negatives which are classified as negatives [33].
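As an illustration of equations (10) to (19), the following Python sketch computes each metric from raw confusion-matrix counts. It is a minimal sketch rather than the code used in the study: the function name `classification_metrics` and the example counts are hypothetical, and the MCC denominator is taken under a square root, as in the standard definition of the coefficient.

```python
from math import sqrt


def classification_metrics(tp, tn, fp, fn):
    """Compute the rates and metrics of equations (10)-(19) from counts."""
    p = tp + fn  # total positive (fraud) cases under test
    n = tn + fp  # total negative (legitimate) cases under test
    tpr = tp / p  # equation (10); identical to sensitivity (15)
    tnr = tn / n  # equation (11); identical to specificity (16)
    mcc_den = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "tpr": tpr,
        "tnr": tnr,
        "fpr": fp / n,                                   # equation (12)
        "fnr": fn / p,                                   # equation (13)
        "accuracy": (tp + tn) / (tp + fp + tn + fn),     # equation (14)
        "sensitivity": tpr,
        "specificity": tnr,
        # Guard against division by zero when nothing is predicted positive.
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "mcc": (tp * tn - fp * fn) / mcc_den if mcc_den else 0.0,
        "bcr": 0.5 * (tpr + tnr),                        # equation (19)
    }


# Hypothetical counts, chosen only to exercise the formulas.
metrics = classification_metrics(tp=90, tn=180, fp=10, fn=20)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Because TPR and FNR share the denominator P, and TNR and FPR share the denominator N, each pair sums to one; this gives a quick sanity check on any table of reported rates.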
A. Results
In this study, three classifier models based on naïve Bayes, k-nearest neighbour and logistic regression are developed. To evaluate these models, 70% of the dataset is used for training while 30% is set aside for validation and testing. Accuracy, sensitivity, specificity, precision, Matthews correlation coefficient (MCC) and balanced classification rate are used to evaluate the performance of the three classifiers. The accuracy of the classifiers for the original 0.172:99.828 dataset distribution and for the sampled 10:90 and 34:66 distributions is presented in Tables 1, 2 and 3 respectively.
An observation of the metric tables shows that there is significant improvement from the sampled dataset distribution of 10:90 to 34:66 for the accuracy, sensitivity, specificity, Matthews correlation coefficient and balanced classification rate of the classifiers. This shows that hybrid sampling (under-sampling and over-sampling) on a highly imbalanced dataset greatly improves the performance of binary classification. The true positive, true negative, false positive and false negative rates of the classifiers for the un-sampled and sampled data distributions are shown in Tables 4, 5 and 6. Logistic regression is the only technique that did not show an improvement in false negative rate from the 10:90 to the 34:66 data distribution. However, it showed the overall best performance in the un-sampled distribution.

Metrics                            Naïve Bayes   k-Nearest Neighbour   Logistic Regression
Accuracy                           0.9752        0.9715                0.3639
Sensitivity                        0.8210        0.8285                0.7155
Specificity                        0.9754        1.0000                0.2939
Precision                          0.0546        1.0000                0.1678
Matthews Correlation Coefficient   +0.2080       +0.8950               +0.0077
Balanced Classification Rate       0.8975        0.9143                0.5047

TABLE 3. Accuracy result for 34:66 data distribution
Metrics                            Naïve Bayes   k-Nearest Neighbour   Logistic Regression
Accuracy                           0.9769        0.9792                0.5486
Sensitivity                        0.9514        0.9375                0.5833
Specificity                        0.9896        1.0000                0.5313
Precision                          0.9786        1.0000                0.3836
Matthews Correlation Coefficient   +0.9478       +0.9535               +0.1080
Balanced Classification Rate       0.9705        0.9688                0.5573

TABLE 4. Basic metric rates for un-sampled data distribution
Metrics                            Naïve Bayes   k-Nearest Neighbour   Logistic Regression
True Positive Rate                 0.8072        0.8835                0.9767

TABLE 6. Basic metric rates for 34:66 data distribution
Metrics                            Naïve Bayes   k-Nearest Neighbour   Logistic Regression
True Positive Rate                 0.9514        0.9375                0.5833
False Positive Rate                0.0104        0.0000                0.4688
True Negative Rate                 0.9896        1.0000                0.5313
False Negative Rate                0.0486        0.0625                0.4167
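The values reported for the 34:66 distribution can be cross-checked against the metric definitions. The Python sketch below, offered as a worked check rather than as part of the study, recomputes the balanced classification rate from the true positive and true negative rates in Table 6 and compares it with the value in Table 3; it also checks that TPR + FNR and TNR + FPR each sum to one. The dictionary layout, the function name `consistency_report` and the tolerance are assumptions made for illustration; the tolerance only allows for four-decimal rounding in the published figures.

```python
# Rates transcribed from Table 6 (34:66 data distribution).
TABLE_6 = {
    "naive_bayes":         {"tpr": 0.9514, "fpr": 0.0104, "tnr": 0.9896, "fnr": 0.0486},
    "knn":                 {"tpr": 0.9375, "fpr": 0.0000, "tnr": 1.0000, "fnr": 0.0625},
    "logistic_regression": {"tpr": 0.5833, "fpr": 0.4688, "tnr": 0.5313, "fnr": 0.4167},
}

# Balanced classification rates transcribed from Table 3.
TABLE_3_BCR = {"naive_bayes": 0.9705, "knn": 0.9688, "logistic_regression": 0.5573}

TOLERANCE = 2e-4  # assumed allowance for rounding to four decimals


def consistency_report(rates, reported_bcr, tol=TOLERANCE):
    """Recompute BCR per classifier and test the complementary-rate identities."""
    report = {}
    for name, r in rates.items():
        bcr = 0.5 * (r["tpr"] + r["tnr"])  # average of sensitivity and specificity
        report[name] = {
            "bcr": bcr,
            "bcr_matches": abs(bcr - reported_bcr[name]) <= tol,
            "tpr_plus_fnr_is_one": abs(r["tpr"] + r["fnr"] - 1.0) <= tol,
            "tnr_plus_fpr_is_one": abs(r["tnr"] + r["fpr"] - 1.0) <= tol,
        }
    return report


report = consistency_report(TABLE_6, TABLE_3_BCR)
for name, row in report.items():
    print(name, row)
```

With the values above, every check evaluates to True under this tolerance, so the rates in Table 6 agree with the balanced classification rates in Table 3.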
B. Comparative Performance
The performance evaluation of the three classifiers for the 34:66 data distribution is shown in Figure 1. Performance was better on this data distribution. The k-nearest neighbour technique showed superior performance across the evaluation metrics used. It reached the highest value for specificity and precision (that is, 1.0) in both data distributions, because the kNN classifier recorded no false positives in the classification. The Naïve Bayes classifier outperformed kNN only in accuracy for the 10:90 data distribution. The logistic regression classifier showed the weakest performance of the three classifiers evaluated. However, there was significant improvement in performance between the two sets of sampled data distribution. Since not all related works evaluated accuracy, sensitivity, specificity, precision, Matthews correlation coefficient and balanced classification rate, they are compared with this study on the basic true positive and false positive rates. Figures 2 and 3 show the TPR and FPR evaluation of the proposed Naïve Bayes, kNN and LR classifiers against other related works. The related works are referenced using their reference number delimited within square brackets “[ ]”.

Figure 3. TPR and FPR evaluation of k-nearest neighbour classifiers
*TPR = True Positive Rate
*FPR = False Positive Rate
*Proposed kNN = Proposed k-nearest neighbor classifier
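The link between zero false positives and the two perfect scores reported for kNN can be made explicit. The short worked step below assumes the standard confusion-matrix definitions; the symbols TP, FP, TN and FN are the usual counts and are introduced here for illustration.

```latex
\[
\text{Specificity} = \frac{TN}{TN + FP}, \qquad
\text{Precision} = \frac{TP}{TP + FP}
\]
\[
FP = 0 \;\Longrightarrow\;
\text{Specificity} = \frac{TN}{TN} = 1, \qquad
\text{Precision} = \frac{TP}{TP} = 1
\quad (TN > 0,\ TP > 0).
\]
```

Sensitivity, TP / (TP + FN), does not depend on FP, which is consistent with kNN scoring 1.0 on specificity and precision while its true positive rate stays below 1.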
Logistic Regression) are trained on real-life credit card transaction data, and their performances on credit card fraud detection are evaluated and compared based on several relevant metrics.
2. The highly imbalanced dataset is sampled in a hybrid approach where the positive class is oversampled and the negative class under-sampled, achieving two sets of data distributions.
3. The performances of the three classifiers are examined on the two sets of data distributions using accuracy, sensitivity, specificity, precision, balanced classification rate and Matthews correlation coefficient metrics.

Performance of classifiers varies across different evaluation metrics. Results from the experiment show that the kNN classifier performs best on all metrics evaluated except accuracy in the 10:90 data distribution. This study shows the effect of hybrid sampling on the performance of binary classification of imbalanced data. Future research could examine meta-classifiers and meta-learning approaches for handling highly imbalanced credit card fraud data. The effects of other sampling approaches could also be investigated.

Acknowledgment

We wish to acknowledge Nwaiwu John C for his effort in the experimentation carried out and Pozzolo et al [32] for the source and description of the credit card fraud data.

References

[1] Maes, S., Tuyls, K., Vanschoenwinkel, B., and Manderick, B. (2002). Credit card fraud detection using Bayesian and neural networks. Proceeding International NAISO Congress on Neuro Fuzzy Technologies.
[2] Ogwueleka, F. N. (2011). Data Mining Application in Credit Card Fraud Detection System. Journal of Engineering Science and Technology, Vol. 6, No. 3, pp. 311-322.
[3] RamaKalyani, K., and UmaDevi, D. (2012). Fraud Detection of Credit Card Payment System by Genetic Algorithm. International Journal of Scientific & Engineering Research, Vol. 3, Issue 7, pp. 1-6, ISSN 2229-5518.
[4] Meshram, P. L., and Bhanarkar, P. (2012). Credit and ATM Card Fraud Detection Using Genetic Approach. International Journal of Engineering Research & Technology (IJERT), Vol. 1, Issue 10, pp. 1-5, ISSN 2278-0181.
[5] Singh, G., Gupta, R., Rastogi, A., Chandel, M. D. S., and Riyaz, A. (2012). A Machine Learning Approach for Detection of Fraud based on SVM. International Journal of Scientific Engineering and Technology, Volume No. 1, Issue No. 3, pp. 194-198, ISSN 2277-1581.
[6] Seeja, K. R., and Zareapoor, M. (2014). FraudMiner: A Novel Credit Card Fraud Detection Model Based on Frequent Itemset Mining. The Scientific World Journal, Hindawi Publishing Corporation, Volume 2014, Article ID 252797, pp. 1-10, http://dx.doi.org/10.1155/2014/252797
[7] Patil, S., Somavanshi, H., Gaikwad, J., Deshmane, A., and Badgujar, R. (2015). Credit Card Fraud Detection Using Decision Tree Induction Algorithm. International Journal of Computer Science and Mobile Computing (IJCSMC), Vol. 4, Issue 4, pp. 92-95, ISSN 2320-088X.
[8] Duman, E., Buyukkaya, A., and Elikucuk, I. (2013). A novel and successful credit card fraud detection system implemented in a Turkish bank. In Data Mining Workshops (ICDMW), 2013 IEEE 13th International Conference on (pp. 162-171). IEEE.
[9] Bahnsen, A. C., Stojanovic, A., Aouada, D., and Ottersten, B. (2014). Improving credit card fraud detection with calibrated probabilities. In Proceedings of the 2014 SIAM International Conference on Data Mining (pp. 677-685). Society for Industrial and Applied Mathematics.
[10] Ng, A. Y., and Jordan, M. I. (2002). On discriminative vs. generative classifiers: A comparison of logistic regression and naive Bayes. Advances in Neural Information Processing Systems, 2, 841-848.
[11] Maes, S., Tuyls, K., Vanschoenwinkel, B., and Manderick, B. (2002). Credit card fraud detection using Bayesian and neural networks. In Proceedings of the 1st International NAISO Congress on Neuro Fuzzy Technologies (pp. 261-270).
[12] Shen, A., Tong, R., and Deng, Y. (2007). Application of classification models on credit card fraud detection. In Service Systems and Service Management, 2007 International Conference on (pp. 1-4). IEEE.
[13] Bhattacharyya, S., Jha, S., Tharakunnel, K., and Westland, J. C. (2011). Data mining for credit card fraud: A comparative study. Decision Support Systems, 50(3), 602-613.
[14] Sahin, Y., and Duman, E. (2011). Detecting credit card fraud by ANN and logistic regression. In Innovations in Intelligent Systems and Applications (INISTA), 2011 International Symposium on (pp. 315-319). IEEE.
[15] Chaudhary, K., and Mallick, B. (2012). Credit Card Fraud: The study of its impact and detection techniques. International Journal of Computer Science and Network (IJCSN), Volume 1, Issue 4, pp. 31-35, ISSN 2277-5420.
[16] Bhatla, T. P., Prabhu, V., and Dua, A. (2003). Understanding credit card frauds. Cards Business Review #2003-1, Tata Consultancy Services.
[17] The Nilson Report. (2015). U.S. Credit & Debit Cards 2015. David Robertson.
[18] Stolfo, S., Fan, D. W., Lee, W., Prodromidis, A., and Chan, P. (1997). Credit card fraud detection using meta-learning: Issues and initial results. In AAAI-97 Workshop on Fraud Detection and Risk Management.
[19] Bahnsen, A. C., Stojanovic, A., Aouada, D., and Ottersten, B. (2013). Cost sensitive credit card fraud detection using Bayes minimum risk. In Machine Learning and Applications (ICMLA), 2013 12th International Conference on (Vol. 1, pp. 333-338). IEEE.
[20] Pun, J. K. F. (2011). Improving Credit Card Fraud Detection using a Meta-Learning Strategy (Doctoral dissertation, University of Toronto).
[21] Sahin, Y., Bulkan, S., and Duman, E. (2013). A cost-sensitive decision tree approach for fraud detection. Expert Systems with Applications, 40(15), 5916-5923.
[22] Bolton, R. J., and Hand, D. J. (2001). Unsupervised profiling methods for fraud detection. Conference on Credit Scoring and Credit Control, Edinburgh.
[23] Kou, Y., Lu, C-T., Sinvongwattana, S., and Huang, Y-P. (2004). Survey of Fraud Detection Techniques. In Proceedings of the 2004 IEEE International Conference on Networking, Sensing & Control, Taipei, Taiwan, March 21-23.
[24] Wheeler, R., and Aitken, S. (2000). Multiple algorithms for fraud detection. Knowledge-Based Systems, 13(2), 93-99. Elsevier.
[25] Phua, C., Alahakoon, D., and Lee, V. (2004). Minority report in fraud detection: classification of skewed data. ACM SIGKDD Explorations Newsletter, 6(1), 50-59.
[26] Chu, F., Wang, Y., and Zaniolo, C. (2004). An adaptive learning approach for noisy data streams. In Data Mining, 2004. ICDM'04. Fourth IEEE International Conference on (pp. 351-354). IEEE.
[27] Panigrahi, S., Kundu, A., Sural, S., and Majumdar, A. K. (2009). Credit card fraud detection: A fusion approach using Dempster–Shafer theory and Bayesian learning. Information Fusion, 10(4), 354-363.
[28] Sahin, Y., and Duman, E. (2011). Detecting Credit Card Fraud by Decision Trees and Support Vector Machines. Proceedings of International Multi-Conference of Engineers and Computer Scientists (IMECS 2011), Mar. 16-18, Hong Kong, Vol. 1, pp. 1-6, ISBN 978-988-18210-3-4, ISSN 2078-0966 (Online).
[29] Sherly, K. K. (2012). A comparative assessment of supervised data mining techniques for fraud prevention. TIST. Int. J. Sci. Tech. Res, 1(16).
[30] Pun, J., and Lawryshyn, Y. (2012). Improving credit card fraud detection using a meta-classification strategy. International Journal of Computer Applications, 56(10).
[31] Patil, T. R., and Sherekar, S. S. (2013). Performance analysis of Naive Bayes and J48 classification algorithm for data classification. International Journal of Computer Science and Applications, 6(2), 256-261.
[32] Pozzolo, A. D., Caelen, O., Johnson, R. A., and Bontempi, G. (2015). Calibrating Probability with Undersampling for Unbalanced Classification. In Symposium on Computational Intelligence and Data Mining (CIDM), IEEE.
[33] Fahmi, M., Hamdy, A., and Nagati, K. (2016). Data Mining Techniques for Credit Card Fraud Detection: Empirical Study. In Sustainable Vital Technologies in Engineering and Informatics BUE ACE1, pp. 1-9, Elsevier Ltd.
[34] Islam, M. J., Wu, Q. M. J., Ahmadi, M., and Sid-Ahmed, M. A. (2007). Investigating the Performance of Naive-Bayes Classifiers and K-Nearest Neighbor Classifiers. IEEE, International Conference on Convergence Information Technology, pp. 1541-1546.