
Credit Card Fraud Detection Using Machine Learning As Data Mining Technique

Ong Shu Yee, Saravanan Sagadevan and Nurul Hashimah Ahamed Hassain Malim
School of Computer Sciences, Universiti Sains Malaysia, Penang, Malaysia.
[email protected]

Abstract—The rapid growth of online transactional activities has raised the number of fraud cases all over the world and causes tremendous losses to individuals and the financial industry. Although many criminal activities occur in the financial industry, credit card fraud is among the most prevalent and the most worrying for online customers. Thus, countering fraud through data mining and machine learning is one of the prominent approaches introduced by scholars intending to prevent the losses caused by these illegal acts. Primarily, data mining techniques were employed to study the patterns and characteristics of suspicious and non-suspicious transactions based on normalized and anomalous data. On the other hand, machine learning (ML) techniques were employed to predict suspicious and non-suspicious transactions automatically by using classifiers. Therefore, the combination of machine learning and data mining techniques was able to identify genuine and non-genuine transactions by learning the patterns of the data. This paper discusses supervised classification using the Bayesian network classifiers K2, Tree Augmented Naïve Bayes (TAN), and Naïve Bayes, together with the Logistic and J48 classifiers. After preprocessing the dataset using normalization and Principal Component Analysis, all the classifiers achieved more than 95.0% accuracy, an improvement over the results attained before preprocessing the dataset.

Index Terms—Credit Card; Data Mining; Fraud Detection; Machine Learning.

I. INTRODUCTION

According to the Global Payments Report 2015, the credit card was the most used payment method globally in 2014 compared to other methods such as e-wallet and bank transfer [1]. These huge transactional services are often eyed by cyber criminals as a means to conduct fraudulent activities. Credit card fraud is defined as the unauthorized usage of a card, unusual transaction behavior, or transactions on an inactive card [2]. In general, there are three categories of credit card fraud, namely conventional frauds (e.g. stolen, fake and counterfeit cards), online frauds (e.g. false/fake merchant sites), and merchant related frauds (e.g. merchant collusion and triangulation) [3].

In the past couple of years, credit card breaches have been trending alarmingly. According to the Nilson Report, global credit card fraud losses reached $16.31 billion in 2014 and are estimated to exceed $35 billion in 2020 [4]. Therefore, it is necessary to develop credit card fraud detection techniques as a countermeasure to combat these illegal activities. In general, credit card fraud detection is known as the process of identifying whether transactions are genuine or fraudulent. As data mining and machine learning techniques are widely used to counter cyber-criminal cases, scholars have often embraced those approaches to study and detect credit card fraud activities.

Data mining is known as the process of gaining interesting, novel and insightful patterns as well as discovering understandable, descriptive and predictive models from large-scale data collections [5, 6]. The ability of data mining techniques to extract useful information from large-scale data using statistical and mathematical techniques assists credit card fraud detection by differentiating the characteristics of common and suspicious credit card transactions. While data mining focuses on discovering valuable intelligence, machine learning is rooted in learning that intelligence and developing its own model for purposes such as classification or clustering.

The application of machine learning techniques spreads widely throughout computer science domains such as spam filtering, web searching, ad placement, recommender systems, credit scoring, drug design, fraud detection, stock trading, and many other applications. Machine learning classifiers operate by building a model from example inputs and using it to make predictions or decisions, rather than following strictly static program instructions. There are many different types of machine learning approaches intended to solve heterogeneous problems. As this study focused on classification, the discussion that follows is restricted to this topic. Machine learning classification refers to the process of learning to assign instances to predefined classes. Formally, there are several types of learning such as supervised, semi-supervised, unsupervised, reinforcement, transduction and learning to learn [7]. As the interest of this study was supervised machine learning classification, the remaining methods are not elaborated further. In most classification studies, supervised learning is favoured over other methods because the classes of the instances can be controlled through human intervention. In supervised learning, the classes of the instances are labeled prior to feeding them into classifiers. Then, by using certain evaluation metrics, the performances of the classifiers can be measured.

In the case of credit card fraud detection, binary classification was employed because the instances were labeled as fraud and non-fraud. The inputs were transformed into Boolean vectors $x = (x_1, \ldots, x_q)$, where $x_j = 1$ if the $j$th characteristic appeared in the instance and $x_j = 0$ otherwise. A classifier was trained on a set of examples $(x_i, y_i)$, where $x_i = (x_{i1}, \ldots, x_{iq})$ was an observed input and $y_i$ was the corresponding output of the classifier. The rest of the paper is organized into background studies, research methodology, results, discussions and conclusions.
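The Boolean encoding of inputs described in the introduction can be sketched in Python. This is a minimal illustration under stated assumptions, not the authors' implementation (the study itself used WEKA): the characteristic names in `CHARACTERISTICS` and the helper names `encode` and `build_training_set` are hypothetical and do not come from the paper.

```python
from typing import Iterable, List, Set, Tuple

# Hypothetical characteristic names, for illustration only; the paper does not
# list the exact Boolean features that were used.
CHARACTERISTICS: List[str] = [
    "pin_mismatch",
    "address_mismatch",
    "blacklisted_country",
    "amount_over_threshold",
]


def encode(observed: Set[str]) -> List[int]:
    """Return x = (x_1, ..., x_q) with x_j = 1 if characteristic j is present."""
    return [1 if name in observed else 0 for name in CHARACTERISTICS]


def build_training_set(
    records: Iterable[Tuple[Set[str], bool]]
) -> List[Tuple[List[int], int]]:
    """Turn (observed characteristics, is_fraud) records into (x_i, y_i) pairs."""
    return [
        (encode(observed), 1 if is_fraud else 0)
        for observed, is_fraud in records
    ]


if __name__ == "__main__":
    # Two made-up records: one labeled fraud, one labeled non-fraud.
    records = [
        ({"pin_mismatch", "blacklisted_country"}, True),
        (set(), False),
    ]
    for x_i, y_i in build_training_set(records):
        print(x_i, y_i)
```

Each pair produced by `build_training_set` corresponds to one labeled instance $(x_i, y_i)$, with $y_i = 1$ standing for fraud and $y_i = 0$ for non-fraud.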

e-ISSN: 2289-8131 Vol. 10 No. 1-4 23

Journal of Telecommunication, Electronic and Computer Engineering

II. BACKGROUND STUDIES

Data mining and machine learning are popular methods to study and combat credit card fraud. A large number of studies have exploited the strength of data mining and machine learning to prevent credit card fraud. Based on a Self-Organizing Map and a Neural Network, the study in [8] obtained a Receiver Operating Characteristic (ROC) of over 95.00% for fraud cases with no false alarms. The Hidden Markov Model (HMM) has also been applied in credit card fraud detection with a low percentage of false alarms [9]. However, the transitions between states and the probability calculations in HMM are very costly and intensive. Furthermore, rather than using single classifiers, some credit card fraud detection studies used meta-learners based on supervised learning. Stolfo et al. investigated a credit card fraud detection system using four types of algorithms, namely Iterative Dichotomiser 3 (ID3), Classification and Regression Tree (CART), Ripper and Bayes, as base learners and tested them with heterogeneous data distributions [10]. Based on a 50% / 50% distribution of instances (fraud and non-fraud), the study found that meta-learning using Bayes as a base learner obtained a higher true positive rate than the other meta-learners. However, even though the 50% / 50% distribution yields good results, it does not reflect real world circumstances, where genuine credit card transactions far outnumber non-legitimate transactions. Researchers have also tested other types of meta-learning classifiers such as Adaboost, Logitboost, Bagging and Dagging, with interesting outcomes [11].

Through our literature studies, the Bayesian Network is one of the classifier types that have been widely applied to detect fraud in the credit card industry. Maes et al. examined the true positives and false positives produced by a Bayesian Belief Network and an Artificial Neural Network when classifying credit card fraud instances. The study found that the Bayesian network performed approximately 8% better than the Artificial Neural Network and claimed that the former's processing time is shorter than the latter's [12]. Rather than using traditional classification methods, the investigation in [13] performed cost sensitive credit card fraud detection based on the Bayes Minimum Risk technique. The study measured the performances of Logistic Regression (LR), C4.5 and Random Forest (RF), and showed that adjusting the probabilities with the Bayes Minimum Risk classifier on RF classification yielded consistently better results than LR and C4.5.

Throughout our observation and analysis of previous studies, Bayesian Network classifiers have become one of the popular classifier types that are widely used to classify credit card fraud data. Therefore, this study attempted to investigate classification by several Bayesian classifiers, namely K2, Tree Augmented Naïve Bayes (TAN), and Naïve Bayes. Moreover, this study also measured the performances of Logistic Regression and J48 based on the proposed methodology. A brief discussion of the Bayesian Network classifier and the proposed classifiers is given below.

A. Bayesian Network Classifier

A Bayesian Network is a threshold-based model that computes the sum of the output accumulated from child nodes. The reason behind the creation of such a model is the ability of child nodes to operate independently without interrupting other child nodes and to influence, in particular, the probability of the root node. Basically, a Bayesian Network $A = \langle N, B, \Theta \rangle$ consists of a directed acyclic graph (DAG) $\langle N, B \rangle$ over a set of random variables, where each node $n \in N$ represents a variable of the data and each arc $b \in B$ between nodes represents a probabilistic dependency. A Bayesian network is able to compute the conditional probability of a node based on given values assigned to other nodes. Bayesian Networks have several advantages, such as the ability to handle incomplete inputs and the learning of causal relationships [17]. As illustrated in Figure 1, there are minor differences between Naïve Bayes, TAN and the general framework of a Bayesian Network.

Naïve Bayes is a very popular classifier as it is simple, efficient and performs well on real world problems. Naïve Bayes is a probabilistic classifier based on Bayes' rule with strong independence assumptions. In simple terms, Naïve Bayes (NB) follows a probabilistic "independent feature model": it assumes that the presence or absence of a particular feature of a class is unrelated to the presence or absence of other features. K2, as one of the Bayesian type classifiers, uses scoring functions to compute the joint probability of any instantiation of all the variables in a belief network as the product of probabilities [18]. In WEKA, the K2 classifier uses hill climbing to develop the Bayesian beliefs. The TAN classifier, on the other hand, uses a Bayesian scoring function to develop the Bayesian belief. As illustrated in Figure 1, the TAN classifier allows arcs between the children of the classification node $x_c$. Therefore, the TAN classifier is able to compute the probability from each child and eventually identify the appropriate classes of the children based on the computed probability. Although the information channeled by TAN looks richer than that of Naïve Bayes, no study was found that investigated the performance of TAN in the credit card fraud detection domain.

Compared to Naïve Bayes, which is a generative model, Logistic is a discriminative classifier that predicts the class probability using a direct functional form. Logistic uses conditional probability and iterative estimation in order to estimate the classes of the instances. J48 is an open source Java implementation in WEKA of the C4.5 algorithm, which was developed by Ross Quinlan to generate a decision tree from a set of labeled input data [19]. J48 is a predictive machine learning classifier that determines the target values of new samples based on various attribute values in the data. The internal nodes of the decision tree represent the different attributes or features, while the branches between nodes denote the possible values that these attributes could take in the observed samples. The terminal nodes depict the final classification of the target value.

Figure 1: Illustration of Naïve Bayes, TAN and General BN structures

III. RESEARCH HYPOTHESIS

Based on the review of past studies, two main conclusions are drawn on the evaluation of credit card fraud detection investigations. The first conclusion is that credit card data plays essential roles in identifying fraudulent and


non-fraudulent characteristics. However, obtaining real credit card fraud data is very hard due to record privacy and sensitivity. Therefore, to mimic the real data, the authors of this study used dummy data created by manipulating certain features that were expected to have a significant impact on fraud detection. For instance, a transaction could be treated as suspicious if the customer entered a wrong PIN, if the actual or shipping address differed from the billing address, or if the transaction followed previous transactions too closely in date and time and involved a large sum. Furthermore, some countries such as Yugoslavia, Lithuania, and Pakistan have a very high number of fraud incidents with unverifiable addresses. Based on such indicators, the data was developed using several attributes such as credit card number, reference number, terminal id, actual pin, entered pin, transaction amount, transaction date and time, location, billing address and shipping address. Those attributes are the common variables used to study credit card fraud activities. The data was developed manually using a spreadsheet and a GNU auto data generation script derived from generatedata.com. The instances were labeled as fraud based on the presence of correlations among the attributes as stated in Table 1. The remaining combinations were defined as non-fraud.

Table 1
Rules to Labeling Fraud Instances (T=TRUE, F=FALSE)

Case  Time Interval  Similar Pin  Blacklisted Country  Exceed Threshold  Similar Address  Fraud
1     T              T            T                    T                 T                T
2     T              T            T                    T                 F                T
3     T              T            T                    F                 F                T
4     T              T            F                    T                 F                T
5     T              T            F                    F                 F                T
6     T              F            T                    T                 T                T
7     T              F            T                    F                 F                T
8     T              F            T                    T                 F                T
9     T/F            T/F          T/F                  T/F               T/F              T

In order to evaluate the validity of the dummy data, the first experiment was conducted to verify the authenticity of the corpus to be used in credit card fraud detection. The second conclusion is that most of the previous studies attempted to use heterogeneous types of classifiers to measure performance on detecting genuine and non-genuine transactions. With the intention of contributing further to the body of knowledge, the second experiment was conducted to evaluate the performances of the proposed classifiers in the classification of credit card fraud activities. Therefore, the first and second hypotheses, which reflect the former and latter experiments, are stated as follows:

Hypothesis (1): The dummy dataset that was created based on suspicious behaviors can be used for classification in data mining.

Hypothesis (2): The performances on the dataset that has undergone data preprocessing are better than on the raw dataset.

IV. RESEARCH METHODOLOGY

Figure 2: A simple illustration on the flow of methodology in this work

In the classification process, a prominent data mining and machine learning tool, namely WEKA, was used to measure the performances of the classifiers. WEKA is one of the prominent open source tools that are widely used to study many real world problems such as sentiment analysis, personality detection, spam filtering, and fraud detection. The classification was run using 10-fold cross validation. The 10-fold cross validation technique is widely applied in data mining and machine learning studies because training and testing are carried out over the entire dataset. Through 10-fold cross validation, the dataset was split into ten parts, each part was held out in turn, and eventually the average results were computed. In other words, each data point in the dataset was used once for testing and 9 times for training.

Then, in order to measure the performances of the classifiers, several evaluation metrics were employed in this study. Primarily, the output of the metrics depended on the counts of True Positive (TP), True Negative (TN), False Positive (FP) and False Negative (FN). TP refers to the number of fraud transactions predicted as fraud, while FP is the number of legal transactions predicted as fraud. TN refers to the number of legal transactions predicted as legal, while FN is the number of fraud transactions predicted as legal. This study evaluated the performances of the classifiers using True Positive Rate (TPR), False Positive Rate (FPR), Precision, Recall, F-Measure, and Accuracy. The description and formula for each evaluation metric are defined in Table 2.

Table 2
The Formula of Metrics Used in the Study

Metric                      Formula and Description
True Positive Rate (TPR)    TPR = TP / (TP + FN)
False Positive Rate (FPR)   FPR = FP / (FP + TN)
Precision                   Precision = TP / (TP + FP)
Recall                      Recall = TP / (TP + FN)
F-Measure                   F-Measure = 2TP / (2TP + FP + FN)
Accuracy                    Accuracy = (TP + TN) / (TP + TN + FP + FN)
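The formulas in Table 2 can be sketched in Python as follows. This is a minimal illustration and not the WEKA implementation used in the study: the function name `evaluation_metrics`, the choice to return 0.0 for an empty denominator, and the example counts are assumptions made here. Fraud is treated as the positive class.

```python
from typing import Dict


def evaluation_metrics(tp: int, tn: int, fp: int, fn: int) -> Dict[str, float]:
    """Compute the Table 2 metrics from confusion-matrix counts.

    tp: fraud predicted as fraud      fp: legal predicted as fraud
    tn: legal predicted as legal      fn: fraud predicted as legal
    """

    def ratio(numerator: int, denominator: int) -> float:
        # Assumption of this sketch: an empty denominator yields 0.0.
        return numerator / denominator if denominator else 0.0

    return {
        "tpr": ratio(tp, tp + fn),
        "fpr": ratio(fp, fp + tn),
        "precision": ratio(tp, tp + fp),
        "recall": ratio(tp, tp + fn),
        "f_measure": ratio(2 * tp, 2 * tp + fp + fn),
        "accuracy": ratio(tp + tn, tp + tn + fp + fn),
    }


if __name__ == "__main__":
    # Made-up counts for illustration; they are not results from the paper.
    for name, value in evaluation_metrics(tp=8, tn=85, fp=5, fn=2).items():
        print(f"{name}: {value:.4f}")
```

Note that TPR and Recall share the same formula in Table 2, so the two values always coincide.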

The overview of the research methodology is illustrated in Figure 2.

The following paragraphs elaborate on data transformation and data reduction. Generally, data transformation and data reduction are referred to as the data pre-processing phase, where the raw data is cleaned and


transformed into appropriate forms (or standardized) to be evaluated and fed into machine learners. The data transformation process involves activities such as normalization, smoothing, aggregation, attribute construction and generalization of the data. Data reduction, in turn, reduces the number of attributes through techniques such as data cube aggregation, removal of irrelevant attributes and Principal Component Analysis. For instance, during data transformation, the format of the transaction date and time was standardized into a uniform state so that machine learners could interpret it consistently as date and time attributes. Then, the Principal Component Analysis technique was employed to detect anomalous transactions. Principal Component Analysis is a method that transforms correlated variables into a smaller number of uncorrelated attributes called Principal Components. The objective of applying the method was to identify and reduce the dimensionality of the dataset and to discover new meaningful underlying attributes. The advantage of Principal Component Analysis is that, when the dimensions of the data are reduced using eigenvectors, the loss of information is insignificant. Furthermore, the losses could be traced back by decompressing the eigenvalue.

V. RESULTS & EVALUATION

This study used two datasets in the experiments: the raw dataset and a new dataset created from it by data transformation and data reduction.

A. Results and Analysis for Experiment 1

For Experiment 1, the raw dummy dataset was used to evaluate the integrity of the data for credit card fraud detection. The result (see Table 3) showed that the TPR (75.0%), precision (73.0%), recall (75.0%), F-Measure (68.5%) and accuracy (84.0%) of TAN were the highest among the classifiers evaluated. The minimal FPR of TAN showed its ability to process the raw data better than the other classifiers, even though its processing time was longer than that of K2, Naïve Bayesian, and Logistic. This could be due to heavy processes such as finding the probability and creating the tree model, which caused the processing of the data to take longer. J48, which like TAN is based on a tree model, achieved TPR (73.0%), precision (69.4%), recall (67.5%), and F-Measure (67.4%) values that were slightly lower than those of TAN. Moreover, J48 was also slower than TAN, although the processes involved in TAN were heavier and more costly than those in J48. From the authors' point of view, the underperformance of Logistic, Naïve Bayesian and K2 showed that raw data with a high amount of noise affects modeling and evaluation. As the worst performer, K2 obtained very poor results in terms of TPR (31.0%), precision (21.0%), recall (32.0%), F-Measure (32.2%) and accuracy (41.8%). Even though some of the classifiers obtained poor results, the ability of the learners to classify the data showed the reliability of the dummy data for testing credit card fraud detection. To further improve the classification results, in Experiment 2 the raw dummy dataset was passed through the data transformation and data reduction techniques mentioned above.

Table 3
Results of Classification Using Raw Dummy Dataset

Metric                       K2     Naïve Bayesian  TAN    Logistic  J48
True Positive Rate (%)       31.0   50.3            75.0   60.3      73.0
False Positive Rate (%)      69.0   49.7            25.0   39.7      27.0
Precision (%)                21.0   45.7            73.0   44.7      69.4
Recall (%)                   32.0   60.3            75.0   47.8      67.5
F-Measure (%)                32.2   34.3            68.5   44.9      67.4
Processing Speed (seconds)   10.0   10.0            56.0   25.0      84.0
Accuracy                     41.8   53.7            84.0   67.3      80.0

B. Results and Analysis for Experiment 2

The second experiment used the data that was filtered with normalization and Principal Component Analysis. In Experiment 2, all five classifiers showed better results than in Experiment 1. All the classifiers achieved an accuracy of more than 95.0% with better processing speed than in Experiment 1. The minimal FPR showed that the preprocessing techniques employed in this study had increased the reliability of the data by removing the unusable attributes. The results of J48 and Logistic showed that both classifiers gained maximum strength upon preprocessing of the dataset. K2 showed a huge classification improvement compared to the previous experiment: the classifier achieved an increase of almost 195.80% in TPR (from 31.0% to 91.7%) after the data transformation and data reduction process. Furthermore, besides the improvement in TPR, precision, recall, F-Measure and accuracy, the processing speed of all the classifiers also improved significantly compared to the previous experiment. The authors were curious about the attributes that were removed during data preprocessing, hence the cleaned dataset was inspected. During the inspection, the authors noticed that the terminal_id attributes were reduced significantly. Based on the results shown in Table 4, the hypothesis of Experiment 2 was confirmed: the performances of the classifiers on the preprocessed dataset are better than on the raw dataset.

Table 4
Results of Classification Using Transformed Dataset

Metric                       K2     Naïve Bayesian  TAN    Logistic  J48
True Positive Rate (%)       91.7   99.6            99.7   100.0     100.0
False Positive Rate (%)      8.3    0.4             0.3    0.0       0.0
Precision (%)                92.6   95.6            98.4   100.0     100.0
Recall (%)                   91.7   99.6            99.6   100.0     100.0
F-Measure (%)                95.7   89.3            99.0   100.0     100.0
Processing Speed (seconds)   2.0    2.0             30.0   5.0       32.0
Accuracy                     95.8   96.7            99.7   100.0     100.0

C. Discussion and Future Work

The detection of credit card fraud using data mining and machine learning techniques has become one of the reliable approaches to counter this illegal activity. However, gathering real time credit card fraud data is very hard. Therefore, to mimic the real data, the development of dummy data may assist the detection process. However, the creation and credibility of dummy data must be ascertained prior to conducting the classification processes. Based on the results from Experiment 1, the credibility of the data could be


ensured by noting the ability of WEKA to produce non-zero results. Generally, WEKA is not able to process data that is highly unstructured and returns N/A (Not Applicable) results, errors, or freezes during the modeling process. However, this did not happen with our dummy dataset. Furthermore, the dummy dataset was developed based on attributes commonly used for credit card fraud detection and was created automatically by using GNU data generation scripts. Then, as always emphasized by many data mining researchers, the preprocessing of the raw dataset is an essential factor in improving the classification results. This was confirmed by observing the differences between the results of Experiment 1 and Experiment 2: data transformation and data reduction significantly improved the classification performances in Experiment 2. As mentioned earlier, the strength of Principal Component Analysis, which reduced the dimensionality without losing much of the information from the attributes, was one of the major factors that improved the classification process. Therefore, we believe that Principal Component Analysis is a filtering approach worth considering and using in credit card fraud detection processes. Our classification process also showed that the classifiers K2, Naïve Bayesian, TAN, Logistic and J48 were able to classify and predict credit card fraud activities better if the data was preprocessed using reliable filtering techniques. Moreover, after the dimensionality of the raw data was reduced by using Principal Component Analysis, the authors of this study found that the terminal_id attributes were largely reduced. Therefore, we assumed that terminal_id information contributes less to credit card fraud detection. However, the investigation of credit card hacking based on physical methods (e.g. hardware stressing) has to use terminal_id attributes as the reference to identify the illegal activity.

In the future, this study will attempt to explore more credit card fraud detection using real time data. Then, since the Bayesian Network classifiers showed better results, comparisons with other types of classifiers, such as hyperplane based classifiers, may contribute further to the body of knowledge.

VI. CONCLUSION

This paper tested classification metrics using five classifiers, namely the Bayesian classifiers Naïve Bayes, K2 and TAN, together with Logistic and J48. The evaluations were conducted using two datasets: a dummy dataset that represented the characteristics of credit card data, and a newly transformed dataset produced using data normalization and Principal Component Analysis. Overall, all the classifiers achieved significantly better results after being fed with filtered data.

ACKNOWLEDGMENT

We are grateful to Universiti Sains Malaysia for providing support to this work.

REFERENCES

[1] WorldPay. (2015, Nov). Global payments report preview: your definitive guide to the world of online payments. Retrieved September 28, 2016, from http://offers.worldpayglobal.com/rs/850-JOA-856/images/GlobalPaymentsReportNov2015.pdf.
[2] Federal Trade Commission. (2008). Consumer sentinel network data book for January - December 2008. Retrieved Oct 20, 2016, from https://www.ftc.gov/.
[3] Bhatla, T.P., Prabhu, V., and Dua, A. (2003). Understanding credit card frauds. Cards Business Review #2003-1, Tata Consultancy Services.
[4] The Nilson Report. (2015). Global fraud losses reach $16.31 Billion. Edition: July 2015, Issue 1068.
[5] Y. Sahin and E. Duman, “Detecting credit card fraud by decision trees and support vector machines”, Proceedings of the International Multi-Conference of Engineers and Computer Scientists 2011 Vol I, IMECS 2011, March 2011.
[6] Elkan, C. (2001). Magical thinking in data mining: lessons from COIL challenge 2000. Proc. of SIGKDD01, 426-431.
[7] Mohammed, J. Zaki., & Wagner, Meira Jr. (2014). Data mining and analysis: fundamental concepts and algorithms. Cambridge University Press. ISBN 978-0-521-76633-3.
[8] F. N. Ogwueleka. (2011). Data mining application in credit card fraud detection system. Journal of Engineering Science and Technology, Vol. 6, No. 3 (2011) 311 - 322.
[9] V. Bhusari & S. Patil. (2011). Application of hidden markov model in credit card fraud detection. International Journal of Distributed and Parallel Systems (IJDPS) Vol.2, No.6.
[10] S.J. Stolfo, D.W. Fan, W. Lee, A.L. Prodromidis, and P.K. Chan. (1998). Credit card fraud detection using meta-learning: issues and initial results, Proc. AAAI Workshop AI Methods in Fraud and Risk Management, pp. 83-90.
[11] Sen, Sanjay Kumar., & Dash, Sujatha. (2013). Meta learning algorithms for credit card fraud detection. International Journal of Engineering Research and Development Volume 6, Issue 6, pp. 16-20.
[12] Maes, Sam, Tuyls Karl, Vanschoenwinkel Bram & Manderick, Bernard. (2002). Credit card fraud detection using bayesian and neural networks. Proc. of 1st NAISO Congress on Neuro Fuzzy Technologies. Havana.
[13] A.C. Bahnsen, Aleksandar, Stojanovic., D. Aouada & Bjorn, Ottersten. (2013). Cost sensitive credit card fraud detection using bayes minimum risk. 12th International Conference on Machine Learning and Applications.
[14] Amlan Kundu, Suvasini Panigrahi, Shamik Sural and Arun K. Majumdar. (2009). Credit card fraud detection: a fusion approach using dempster–shafer theory and bayesian learning. Special Issue on Information Fusion in Computer Security, Vol. 10, Issue No. 4, pp.354-363.
[15] Lam, Bacchus (1994). Learning bayesian belief networks: an approach based on the MDL principle. Computational Intelligence, Vol. 10, Issue No. 3, pp.269–293.
[16] M. Mehdi, S. Zair, A. Anou and M. Bensebti (2007). A bayesian networks in intrusion detection systems. International Journal of Computational Intelligence Research, Vol. 3, Issue No. 1, pp.0973-1873.
[17] R.Najafi & Afsharchi, Mohsen. (2012). Network intrusion detection using tree augmented naive-bayes. The Third International Conference on Contemporary Issues in Computer and Information Sciences (CICI) 2012.
[18] G. Cooper, E. Herskovits (1992). A bayesian method for the induction of probabilistic networks from data. Machine Learning. 9(4):309-347.
[19] Quinlan, J. R. (1993). C4.5: Programs for Machine Learning. Morgan Kaufmann Publishers.
[20] Friedman, N. and Goldszmidt, M. (1996). Building classifiers using bayesian networks. Proc. 13th National Conference on Artificial Intelligence. Vol. 2, pp 1277-1284.
[21] Friedman, N., Geiger, D. and Goldszmidt, M. (1997). Bayesian network classifiers. Machine Learning, Vol. 29, pp 131-163. Kluwer Academic Publishers, Boston.

e-ISSN: 2289-8131 Vol. 10 No. 1-4 27

